Improving program performance using DBLIB?

Former Member

My company has developed software that needs to load data transferred to clients from a server using XMPP. The client uses SQL Anywhere for its database; the server uses a different database product entirely.

Our process works by keeping track of which rows of data have been transferred from one machine to the other for various data types that a client can subscribe to. When a client is added to the system for the first time, all of the data in the server to date gets downloaded. From then on, only the rows that change get sent.

Our code is written in C# and we transmit batches of data over the wire from the server to the client. On the client, we iterate over the rows in the batch one at a time and run a stored procedure that performs an "upsert" operation on a table. This is generally done using an INSERT ... ON EXISTING UPDATE statement, but in a couple of cases it is a bit more complex.
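
To give a rough idea, the heart of one of these stored procedures is essentially a statement like the following (the table, column, and procedure names are made up for illustration):

CREATE PROCEDURE upsert_client_data( IN in_pk INTEGER, IN in_payload LONG VARCHAR )
BEGIN
    -- insert the row, or update it if the primary key already exists
    INSERT INTO client_data ( pk, payload )
    ON EXISTING UPDATE
    VALUES ( in_pk, in_payload );
END;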

We've built the application and it's running, but we're not happy with the performance of loading data of one particular type. There are millions of rows of that type that have to be downloaded to the client from the server and inserted into the SA database on the first run. Our code is already running as fast as it can loading data one row at a time. We need to find a way to speed it up, preferably by sending all of the rows in the batch to the database in one I/O.

I've been kicking around the idea of writing a DLL in C to speed this process up. The C# code would convert the objects in a batch into an array of structs and marshal that array to a function in the DLL. The C DLL would then create a temporary table and somehow bulk load all of that data into it. Finally, it would execute a stored procedure that updates the main table using all of the rows in the batch in a single operation, probably via a MERGE statement or a cursor.
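
Roughly speaking, with placeholder table and column names, the SQL side of that would look something like:

DECLARE LOCAL TEMPORARY TABLE batch_stage (
    pk      INTEGER PRIMARY KEY,
    payload LONG VARCHAR
) NOT TRANSACTIONAL;

-- ... bulk load the marshalled batch into batch_stage here ...

MERGE INTO client_data
USING batch_stage
   ON client_data.pk = batch_stage.pk
WHEN MATCHED THEN
    UPDATE SET payload = batch_stage.payload
WHEN NOT MATCHED THEN
    INSERT ( pk, payload ) VALUES ( batch_stage.pk, batch_stage.payload );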

I can do all of this, except that I don't know how to do the bulk copy (BCP) into the temporary table. I've read the documentation on Embedded SQL and nothing in there pops out at me as a way to do this. When I search the documentation for "BCP", all I get back are references to the LOAD TABLE statement and the INPUT statement. The former requires the data to be in a file on the file system and the latter is for Interactive SQL only. Not what I'm looking for.

Can anyone point me in the right direction?

P.S. It's important that the actions of the C DLL participate in the same transaction used by the C# code. Can that be done?

VolkerBarth
Contributor

Doesn't that sound like a solution for some sync technology like MobiLink (which would batch inserts/updates by default, AFAIK)?

VolkerBarth
Contributor

Just to ask: Are you aware that as of v12, the INSERT ... VALUES statement can handle multiple rows in one step by specifying

INSERT ON EXISTING UPDATE ... VALUES (row-1-values), (row-2-values),...

That might be a way to batch inserts on the SQL side, as well, without having to use temporary tables.
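
For example, with a made-up two-column table:

INSERT INTO t1 ( pk, c1 )
ON EXISTING UPDATE
VALUES ( 1, 100 ), ( 2, 200 ), ( 3, 300 );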


Using the OPENSTRING function, i.e.

INSERT ... SELECT FROM OPENSTRING(FILE, ...)

comes to my mind, too.
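
Something along these lines, say, assuming a comma-delimited file and a made-up two-column table:

INSERT INTO t1 ( pk, c1 )
ON EXISTING UPDATE
SELECT pk, c1
FROM OPENSTRING( FILE 'c:\\temp\\batch.csv' )
     WITH ( pk INT, c1 INT )
     OPTION ( DELIMITED BY ',' ) AS batch_rows;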

Former Member

Yes, I know that INSERT can insert multiple rows of data, but the problem is how to get that to work from C#. You'd need a parameter for every column of every row being inserted, which would cap the batch size at roughly `MaxNumberOfParameters / NumberOfColumns` rows (for example, at 20 columns per row, 500 rows would already need 10,000 parameters). I don't know what that first number is, but we're talking about operating on at least 500 rows in a batch, maybe more.

Regarding MobiLink, I don't think we have a license, and the "Mobi" in the name made me think of cell phones. This is a PC application, so it never occurred to me to look at it.

Finally, thanks for the information on the OPENSTRING function. I didn't know that existed.

VolkerBarth
Contributor
  • As to MobiLink: It can certainly also be used to synchronize between different server-class boxes, and MobiLink clients are available on "normal" platforms like Windows, Linux, and Solaris, too.
  • As to the C# question: Sorry, I don't know - I thought you would generate "hard-coded" SQL statements for performance reasons, so parameter issues would not apply...

Accepted Solutions (1)

johnsmirnios
Participant

LOAD TABLE does have a "USING CLIENT FILE" clause that will allow it to read from a file on the client machine: http://dcx.sybase.com/index.html#sa160/en/dbreference/load-table-statement.html*d5e60003

Take note of some of the requirements in the 'Privileges' section though:

When loading from a file on a client computer:

  • READ CLIENT FILE privilege is also required.
  • Read privileges are required on the directory being read from.
  • The allow_read_client_file database option must be enabled.
  • The read_client_file secure feature must be enabled.
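
A minimal example might look something like this (table name, file name, and load options are just placeholders):

LOAD TABLE batch_stage ( pk, payload )
USING CLIENT FILE 'c:\\temp\\batch.csv'
FORMAT TEXT
DELIMITED BY ','
ESCAPES OFF;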

From ESQL, you can also do wide "puts" to insert multiple rows in one request. I'm not sure if the .net driver has a way of exposing it natively -- it might. See: http://dcx.sybase.com/index.html#1201/en/dbreference/put-statement.html

regdomaratzki
Product and Topic Expert

I added an answer that shows a bulk insert sample using our .NET provider.

Former Member

John:

Thanks! We'll be looking at the PUT statement in ESQL. We intend to create a temporary table, PUT all of the data for all of the rows in the batch into the temporary table in one operation, and then MERGE the rows in the temporary table into the production table. Hopefully, this will be faster than inserting the data one row at a time using SACommand.

Answers (2)

regdomaratzki
Product and Topic Expert

Have you tried using the SABulkCopy class in the SQL Anywhere .NET provider? The following sample code (lacking any error checking for simplicity) cranks in 1000 rows at once.

using System;
using System.Data;
using iAnywhere.Data.SQLAnywhere;

namespace bulkInsert
{
    class Program
    {
        static void Main(string[] args)
        {
            SAConnection conn = new SAConnection( "Data Source=dba_sql_12" );
            conn.Open();

            // Assume the following table definition :
            // CREATE TABLE t1 ( pk integer primary key, c1 integer )

            DataTable table = new DataTable("t1");
            table.Columns.Add(new DataColumn("pk", typeof(int))); 
            table.Columns.Add(new DataColumn("c1", typeof(int)));

            for( int i=0; i<1000; i++ ) {
                table.Rows.Add(i, i);
            }

            using(SABulkCopy bulkCopy = new SABulkCopy(conn))
            {
                bulkCopy.BulkCopyTimeout = 600; // in seconds
                bulkCopy.DestinationTableName = "t1";
                bulkCopy.WriteToServer(table);
            }            
            conn.Close();
        }
    }
}

Here's another sample that uses a DataReader instead of a DataTable as the input to bulkCopy.WriteToServer(). In your case, the reader could be one you opened on the connection to the source database.

using System;
using System.Data;
using iAnywhere.Data.SQLAnywhere;

namespace bulkInsert
{
    class Program
    {
        static void Main(string[] args)
        {
            SAConnection conn = new SAConnection( "Data Source=dba_sql_12" );
            conn.Open();

            SACommand cmd = new SACommand("select row_num, row_num+1 from sa_rowgenerator(1,1000)", conn );
            SADataReader reader = cmd.ExecuteReader();

            using(SABulkCopy bulkCopy = new SABulkCopy(conn))
            {
                bulkCopy.BulkCopyTimeout = 600; // in seconds
                bulkCopy.DestinationTableName = "t1";
                bulkCopy.WriteToServer(reader);
            }            
            conn.Close();
        }
    }
}
Former Member

Reg:

Actually, we looked at that when we first started using SQL Anywhere last summer, and the performance was abysmal. We believe SABulkCopy actually sends the data one row at a time rather than sending all of the rows in the batch in one I/O. The performance we got from it was actually worse than when we inserted the data one row at a time ourselves.

johnsmirnios
Participant

There was a recent bug fix for SABulkCopy which describes an 'OutOfMemoryException' but the underlying problem was described as

"SABulkCopy was loading the whole source table before copying to destination table" and it was fixed by making all WriteToServer methods respect batch size.

Versions fixed: 12.0.1.3790, 11.0.1.2871, 10.0.1.4346

It sounds plausible that "loading the whole source table" could be a performance issue but I'm not familiar enough with the code to comment with any certainty.

Former Member

I have revisited the performance of SABulkCopy and it's still slower than the code we've written to insert rows into a table one row at a time. And in the test case, the table has no indexes.

To be clear, our code connects to the database, creates an SACommand with a fixed INSERT INTO (...) VALUES (?, ?, ...) statement. It then creates one SAParameter for each column and adds them to the SACommand's Parameters collection. We then start looping over the data, change the values of the SAParameters, and call ExecuteNonQuery.

If I'm not mistaken, this is the fastest way to perform the operation short of a bulk load. But SABulkCopy is slower. I thought the purpose of a bulk load was to load the data faster than could be done using an INSERT statement.

VolkerBarth
Contributor

Have you tried using SELECT FROM OPENSTRING(...)? That should avoid any parameter handling, since the data can be specified via a file or a blob variable...
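
For example (a made-up table again, with the batch passed in as a string variable rather than a file):

CREATE VARIABLE @batch LONG VARCHAR;
SET @batch = '1,100\n2,200\n3,300';

INSERT INTO t1 ( pk, c1 )
ON EXISTING UPDATE
SELECT pk, c1
FROM OPENSTRING( VALUE @batch )
     WITH ( pk INT, c1 INT )
     OPTION ( DELIMITED BY ',' ) AS batch_rows;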

Breck_Carter
Participant

You may find some ideas in some of the articles about LOAD TABLE and BCP on my blog; here's a sampling...

UNLOAD and LOAD Via Named Pipes

UNLOAD To A Named Pipe Beats UNLOAD COMPRESSED

In-Memory Downloading from SQL Server (1)