Connected Thoughts – Thiago Almeida

June 29, 2009

Using the BizTalk WCF-SQL Adapter to load a flat file into a SQL Server 2008 table

Filed under: BizTalk — Thiago Almeida @ 4:17 pm

While preparing for my webcast on the WCF-SQL Adapter as part of a series of webcasts* I ran into a requirement that is quite frequent with BizTalk solutions: merging the contents of a flat file with a SQL Server database table. I wrote a post a while ago on how call the same stored procedure multiple times without a loop shape with the ‘old’ BizTalk SQL adapter, and this post also covers that with the new WCF-SQL Adapter. The source code will be made available as soon as the webcasts are out, and I’ll update this to reflect that.

I have a flat file that contains a list of products. I need to load the contents of this flat file into a SQL Server 2008 table using BizTalk Server 2009 and the WCF-SQL Adapter. The table might already have some of the products in the flat file, and in this case the product row should be updated.

The data in my sample flat file was extracted from the Adventure Works sample database in the SQL Server 2008 samples in Codeplex, which gave me 504 products to play with.

 

Flat File and Debatching

As you might already know, messages with multiple items in them (multiple Products in this case) coming into BizTalk can be disassembled and debatched on their way in by the disassembler pipeline components. In this case, since it is a flat file that we are receiving, we will use the flat file disassembler component that comes out of the box with BizTalk.

In this post I want to go over loading the file into SQL Server in two ways: one by splitting the Product items into individual messages and loading them individually with the WCF-SQL Adapter; and another by not splitting the Product items and sending one single message to the WCF-SQL Adapter with all the products.

For that I created two flat file schemas, and both look like the below:

image

On one of the schemas the Product node has its ‘Max Occurs’ set to ‘unbounded’. The other schema has the Product node’s ‘Max Occurs’ set to 1.  This property is what tells the flat file disassembler pipeline component if it should debatch the Products or not.

I created two BizTalk pipelines to handle the two different schemas. I dragged the flat file disassembler pipeline component to the disassemble stage of each pipeline, and selected the appropriate schema for each.

image image image

 

SQL Server Table and Stored Procedure

On the SQL Server database side, we have a table called Product (what a surprise!) with the following columns:

image

We are going to call a stored procedure for each line in the flat file to load each product. An easy way to either insert the product if it doesn’t exist in the table or update it if it exists is to use the MERGE statement that is new in SQL Server 2008. So all we have in our stored procedure is the following:

CREATE PROCEDURE [dbo].[ADD_PRODUCT]

@ProductShortDescription varchar(50), @ProductFullDescription varchar(max), @UOM nchar(10), @UnitPrice money

AS BEGIN

SET NOCOUNT ON;

–Use merge statement to either insert or update product based on product short description

MERGE INTO Product AS Target

USING (SELECT @ProductShortDescription, @ProductFullDescription, @UOM, @UnitPrice)

AS Source(ProductShortDescription, ProductFullDescription, UOM, UnitPrice)

ON (Target.ProductShortDescription = Source.ProductShortDescription)

WHEN matched THEN

UPDATE SET ProductFullDescription = Source.ProductFullDescription, UOM = Source.UOM, UnitPrice = Source.UnitPrice

WHEN not matched THEN

INSERT (ProductShortDescription, ProductFullDescription, UOM, UnitPrice)

VALUES (Source.ProductShortDescription, Source.ProductFullDescription, Source.UOM, Source.UnitPrice);

END

WCF-SQL Adapter Schemas

To add the SQL Server schemas used by the WCF-SQL Adapter from the BizTalk solution you can right click on the BizTalk project, select Add, and then ‘Add Generated Items’. From there you can either choose the ‘Add Adapter Metadata’ or the ‘Consume Adapter Service’ options. They will both bring the ‘Consume Adapter Service’ wizard where you can connect to the target SQL Server database and select what items and operations you want to consume. In our case we are only interested on the ADD_PRODUCT strongly typed stored procedure:

image

This will give you a schema like the following for the ADD_PRODUCT stored procedure:

image

Note that the ADD_PRODUCT node is the root node, and therefore can only exist once in the XML instances for this schema. This is the schema we are going to map to for the debatched Product information we get from the flat file schema with a max occurs of 1.  It is a straight map then from the single product flat file schema to the single stored procedure schema:

image

That takes care of mapping the products when the flat file is being debatched into single product messages. Now what do we do about mapping all the products in the flat file to only one XML that is sent to the WCF-SQL Adapter? Here’s where the WCF-SQL Adapter’s composite operations come in handy. The composite operations in the adapter have been described on Richard Seroter’s book (free sample chapter on the WCF-SQL Adapter) and Brian Loesgen’s blog. I created a new schema with a root node ‘Request’ and a second root node ‘RequestResponse’. The first root node name isn’t really important, as long as the second root node name is the same as the first with a ‘Respose’ suffix. I then added the single ADD_PRODUCT schema as an XSD Import to my composite schema. This allows me to create an unbounded record under the ‘Request’ node and change it to have a data structure type of ns0:ADD_PRODUCT, and an unbounded record under the ‘RequestResponse’ node and change it to have a data structure type of ns0:ADD_PRODUCTResponse.

image

This allows us to map from the non debatched flat file schema to the schema created above:

image

 

Calling the WCF-SQL Adapter

After deploying the solution I created two receive ports and two respective receive locations – one of them configured with the debatching pipeline and the single product map, and the other configured with the single file pipeline and the composite operation map.

I then created two one way send ports with the WCF-Custom Adapter and the sqlBinding, each with a filter for one of the receive ports. The send port that filters on the debatched single product insert receive port is configured as follows, with the TypedProcedure/dbo/ADD_PRODUCT action:

image

The send port that filters on the single file with multiple products and composite operation map is configured as follows, with the CompositeOperation action:

image

Both  send ports had a binding type of sqlBinding of course, with the default values (make sure useAmbientTransaction is enabled so that the stored procedure calls are inside a transaction):

image

 

Transactions and Conclusion

So when we debatch the Products flat file on the way in we end up with multiple concurrent calls to the stored procedure via the WCF-SQL Adapter, each in its own transaction:

image

When we map the entire file to the composite schema we end up with one transaction that wraps around all the stored procedure calls:

image 

If we monitor the Transactions/sec for the database we see barely any activity when we use the single file method:

image

If we use the debatch multiple message method we some spikes as the multiple transaction to the database are made:

image

As expected the single file method performs much faster for loading the 504 rows into the table. By placing a datetime column on the products table I could see the difference from the first insert to the last is only 254 milliseconds. With the debatch method BizTalk goes through the debatched records at a much slower pace  taking around 16 seconds to load them all, since it has to map each debatched message, route multiple messages to the send port, create multiple transactions against SQL Server, etc.

After looking into it a bit more I also noticed that for the debatched scenario the message delivery throttling and message publishing throttling were kicking off for the BizTalk host loading the messages into SQL Server. By simply changing the number of samples that the host should base its throttling decision on to something over the 504 records being inserted the time for the debatched inserts went down to 4 seconds from the 16 seconds mentioned above:

image image

The debatch method is still useful in many situations – if you need to perform extra steps for each message in the batch, or if your DBAs require one transaction for each stored procedure call, etc.

As I mentioned at the beginning, the source code will be made available as soon as the BizTalk 2009 webcasts are out, and I’ll update this post to reflect that.

*EDIT*

Links for this blog post and the full SQL Adapter webcast: Video, Slides, Source Code

 

Regards,

Thiago Almeida

 

* BizTalk 2009 Light and Easy Webcast SeriesMick Badran asked many BizTalk/CSD MVPs and Industry experts (around the globe) to share their knowledge and expertise. We came up with a series of Webcasts/Presentations and Demo Code for the community around many aspects of BizTalk 2009 – from new features in BAM, SharePoint, SQL Adapter… to Orchestration Performance. Brilliant! Stay tuned as the bits are being finalized.

Blog at WordPress.com.