on 2013 May 09 9:27 AM
We have a custom process for loading data into a table of our database that is currently written using an INSERT statement that loads one row of data at a time. I want to speed that up by using the INSERT INTO ... SELECT ... FROM OPENSTRING construct, passing a BLOB containing the data for a block of rows as a parameter.
Before anyone tells me to use SABulkCopy: we've tested it, and we get better performance loading data the way we're currently doing it, one row at a time. I suspect SABulkCopy is not using the OPENSTRING construct.
In any case, my assumption is that the data in the BLOB must be laid out as an array of rows, with the layout of each row matching the layout of the columns in the table. What I need is documentation on how to format the data. I'm guessing that C# ints are 4-byte integers in little-endian order, longs are 8-byte integers in little-endian order, and so on. I think varchars are a 2-byte length followed by as many characters as there are in the string. But how are uniqueidentifiers, DateTimes, and DateTimeOffsets stored in the BLOB?
It's not that hard... it's called openSTRING for a reason (hint: it deals with strings 🙂).
CREATE TABLE t (
   a BIGINT NOT NULL PRIMARY KEY,
   b INTEGER NOT NULL,
   c VARCHAR ( 10 ) NOT NULL,
   d DECIMAL ( 11, 2 ) NOT NULL,
   e DATE NOT NULL,
   f TIMESTAMP NOT NULL );

BEGIN
   DECLARE @b LONG VARCHAR;
   SET @b = STRING (
      '1,101,''aaa'',123.45,2013-05-09,2013-05-09 16:11:16.057\x0d\x0a',
      '2,202,''bbb'',234.56,2013-05-10,2013-05-11 16:11:16.063\x0d\x0a',
      '3,303,''ccc'',456.78,2013-05-11,2013-05-11 16:11:16.069\x0d\x0a' );
   INSERT t
   SELECT * FROM OPENSTRING ( VALUE @b )
      WITH ( TABLE t ( a, b, c, d, e, f ) )
      OPTION ( DELIMITED BY ',' ) AS openstring_t;
   COMMIT;
   SELECT * FROM t ORDER BY a;
END;

a                    b           c          d             e          f
-------------------- ----------- ---------- ------------- ---------- -----------------------
                   1         101 aaa               123.45 2013-05-09 2013-05-09 16:11:16.057
                   2         202 bbb               234.56 2013-05-10 2013-05-11 16:11:16.063
                   3         303 ccc               456.78 2013-05-11 2013-05-11 16:11:16.069
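Since OPENSTRING parses delimited text rather than a binary row layout, the client side only has to format the block of rows as comma-delimited lines. A minimal sketch of that formatting step (in Python for illustration; the helper names `quote` and `build_payload` are mine, not from this thread, and the quoting of string values assumes OPENSTRING's default LOAD-style quote handling):

```python
def quote(value):
    # SQL-style quoting: wrap in single quotes, double any embedded quotes.
    return "'" + str(value).replace("'", "''") + "'"

def build_payload(rows):
    # Format each row as comma-delimited fields, matching
    # OPTION ( DELIMITED BY ',' ), with the \x0d\x0a row delimiter
    # used in the STRING() example above.
    lines = []
    for row in rows:
        fields = [quote(v) if isinstance(v, str) else str(v) for v in row]
        lines.append(",".join(fields) + "\x0d\x0a")
    return "".join(lines)

rows = [
    (1, 101, "aaa", 123.45),
    (2, 202, "b'b", 234.56),
]
payload = build_payload(rows)
print(payload)
```

The resulting string would then be passed as the parameter that the INSERT ... SELECT ... FROM OPENSTRING ( VALUE ? ) statement consumes.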
See also: Pushing OPENSTRING and CROSS APPLY
Keep in mind that while this may be easier, it may not be memory friendly or faster in the end. Using up more memory in your .NET program to parse, re-generate, and send strings to the database server takes away memory the database server could use to actually insert the rows into the database pages.
This is also why when MobiLink sends rows down to a remote, both the database server and the synchronization client dynamically manage memory to try and balance these two conditions. MobiLink also uses a packed binary format to efficiently transfer the rows.
Jeff: This has been running in production now for a few months. Performance is great and definitely faster than inserting the data one row at a time. We've had to limit the amount of memory that the database server uses to begin with, or other applications do not perform well.
Plus, by sending data for many rows in one I/O operation, we end up reducing the total time doing I/O. That is, the time it takes to upload N rows in one I/O is less than the time it takes to upload N rows one at a time. It's been a big win for us.