Hello,
For a bit of context, I'm using ASE 16.0 SP03, an up-to-date copy of libsybdrvodb.so, Python 3.8.10, and an up-to-date copy of pyodbc ( https://github.com/mkleehammer/pyodbc )
I found a strange issue when running a simple "select" query using pyodbc's "?" placeholder syntax. When I run this query directly in a sqsh console, it executes *very* quickly:
select * from files where filename='pgb.197901.bz2';
Adaptive Server cpu time: 0 ms. Adaptive Server elapsed time: 0 ms.
However, when I run that exact same query in a python script, it takes over a minute! The exact syntax I'm using is this:
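[Editor's note: the code block that originally followed this sentence did not survive the post. The sketch below is only a hedged reconstruction of the kind of call being described; the DSN name is hypothetical, while the table, column, and value come from the sqsh query above.]

```python
# Hedged reconstruction of the slow pyodbc call (the original snippet was lost).
# "sampledsn" is a made-up DSN name; substitute your own connection details.
SQL = "select * from files where filename = ?"

def run_query(dsn="DSN=sampledsn;"):
    import pyodbc  # deferred import; requires the SAP ASE ODBC driver to be installed
    conn = pyodbc.connect(dsn)
    try:
        cur = conn.cursor()
        cur.execute(SQL, ['pgb.197901.bz2'])  # '?' placeholder syntax
        return cur.fetchall()
    finally:
        conn.close()
```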
I have found one other workaround which doesn't work perfectly here: https://community.sap.com/t5/technology-q-a/ase-15-5-poor-performance-from-odbc/qaq-p/12120748 , which suggests adding "DYNAMICPREPARE=0;" to my connection string. That indeed speeds up performance! When I use this connection string:
I'm not sure what to do. Am I missing something here? Is there some way to speed up our queries, so they don't run orders of magnitude more slowly, that *doesn't* break our existing stored procedures? Am I doing something strange in my connection information? Is there some argument I'm missing?
Thank you for your help!
Hi,
I tried to reproduce with newer versions, but I didn't see any performance issues.
$ python3 --version
Python 3.11.2
$ strings $SYBASE/DataAccess64/ODBC/lib/libsybdrvodb.so | grep "SAP ASE"
SAP ASE ODBC Driver/16.1.00.00/16.1 GA/P/x86_64/Linux/drv161/Build 1/64-bit/Normal/Fri Dec 20 UTC 15:11:35 2024/SQLLEN8
$ setenv PYTHONPATH /work/pyodbc-5.1.0/build/lib.linux-x86_64-cpython-311
$ vi odbc.ini
[sampledsn]
Driver = /my/sybase/DataAccess64/ODBC/lib/libsybdrvodb.so
Server = asehost
Port = 5000
UserID = sa
Password = password
$ setenv ODBCINI /my/sybase/DataAccess64/ODBC/odbc.ini
$ cat test.py
import pyodbc
import time

conn = pyodbc.connect("DSN=sampledsn;", dynamicprepare=1)
print("Successfully established connection")
cur = conn.cursor()
start_time = time.perf_counter()
cur.execute("select * from sysusers where name = ?", ['guest'])
end_time = time.perf_counter()
duration = end_time - start_time
while True:
    row = cur.fetchall()
    if not row:
        break
    print("%s" % (row[0]))
print("Duration: %f" % duration)
cur.close()
conn.close()
With dynamicprepare=1:
$ vi test.py
$ python3 test.py
Successfully established connection
(-1, 2, 0, 'guest', None, 0)
Duration: 0.002088
With dynamicprepare=0:
$ vi test.py
$ python3 test.py
Successfully established connection
(-1, 2, 0, 'guest', None, 0)
Duration: 0.001405
I was, however, able to reproduce the dynamicprepare issue.
1> create proc test @V1 varchar(10) as select * from sysusers where name = @V1
2> go
1> test guest
2> go
suid uid gid name
environ
user_status
----------- ----------- ----------- ------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------
-1 2 0 guest
NULL
0
(1 row affected)
(return status = 0)
Changing the query:
$ vi test.py
#cur.execute("select * from sysusers where name = ?", ['guest'])
cur.execute("exec test @V1=?", ['guest'])
$ python3 test.py
Successfully established connection
Traceback (most recent call last):
File "/my/sybase/OCS-16_1/sample/python/test.py", line 9, in <module>
cur.execute("exec test @V1=?", ['guest'])
pyodbc.ProgrammingError: ('42000', "[42000] [SAP][ASE ODBC Driver][Adaptive Server Enterprise]Incorrect syntax near '?'.\n (102) (SQLExecDirectW)")
With dynamicprepare=1:
The driver doesn't actually convert the query into a dynamic prepared statement; it sends the language command with the literal '?' still in the text:
LANGUAGE Token (0x21); variable length.
Length [4]: 16
Status [1]: PARAMETERIZED (0x01)
Text Length [0]: [15]
Text [15]: "exec test @V1=?"
PARAMFMT Token (0xEC); variable length.
Length [2]: 14
Number of Params [2]: 1
Param 1
Name Length [1]: 0
Status [1]: PARAM_NULLALLOWED (0x20)
User Type [4]: 35
Data Type [1]: LONGBINARY
Length [4]: 255
Locale Length [1]: 0
PARAMS Token (0xD7); variable length.
Param 1
Length [4]: 10
Param data [10]: "guest"
(as bytes) [10]: 0x67007500650073007400
Capture Record Header
Source [4]: RESPONSE (0x00000002)
Length [4]: 72
PDU Header
TDS Packet Type [1]: BUF_RESPONSE (0x04)
Status [1]: BUFSTAT_EOM (0x01)
Length [2]: 72
Channel [2]: 0
Packet No. [1]: 0
Window [1]: 0
EED Token (0xE5); variable length.
Length [2]: 52
Message Number [4]: 102
Message State [1]: 181
Message Class [1]: 15
SQL State Length [1]: 5
SQL State [5]: "42000"
Status [1]: NO_EED (0x00)
Transaction State [2]: TDS_NOT_IN_TRAN (0x0000)
Message Length [2]: 27
Message Text [27]: "Incorrect syntax near '?'.
"
Server Name Length [1]: 4
Server Name [4]: "asename"
Stored Proc. Name Length [1]: 0
Line Number [2]: 1
Interestingly enough, when you use dynamicprepare=0 the driver substitutes a named parameter:
LANGUAGE Token (0x21); variable length.
Length [4]: 22
Status [1]: PARAMETERIZED (0x01)
Text Length [0]: [21]
Text [21]: "exec test @V1=@dr_ta0"
PARAMFMT Token (0xEC); variable length.
Length [2]: 21
Number of Params [2]: 1
Param 1
Name Length [1]: 7
Name [7]: "@dr_ta0"
Status [1]: PARAM_NULLALLOWED (0x20)
User Type [4]: 35
Data Type [1]: LONGBINARY
Length [4]: 255
Locale Length [1]: 0
PARAMS Token (0xD7); variable length.
Param 1
Length [4]: 10
Param data [10]: "guest"
(as bytes) [10]: 0x67007500650073007400
I did get an error about chained mode with my proc, so I had to add "autocommit=1" to the connection string, but it succeeds if you use this command:
cur.execute("test @V1=?", ['guest'])
Note: we remove the exec part of the query.
Let me know if you see performance issues with the example above and if this workaround is successful for the dynamicprepare=0.
Hope this helps,
Ryan
One other note for investigation purposes: you noticed that the "SELECT" statement didn't perform differently between your dynamicprepare=1 and dynamicprepare=0 tests. In some extra testing I've found that the difference only shows up when querying large tables (mine has 83 million rows), which might make a difference. My table *has* an index on the column I'm querying, and running "set showplan on" before my query reveals that the reason it runs more slowly is that the plan says "Positioning at index start", whereas running the same query in sqsh says "Positioning by key". To be clear, the slow-running "SELECT" query *does* use the index, which really surprised me! I'm not too familiar with what that "positioning" statement means, and the plans are otherwise identical. I don't really know what to make of this, but it could be useful for further investigation.
Whoops, I entered a bad answer here by mistake. There's no way to delete it.
"... runs more slowly is because the plan says 'Positioning at index start', whereas running the same query in sqsh says 'Positioning by key'."
Positioning by key means the index was used to quickly zero in on the desired row(s) based on a distinct search value (eg, key='abc').
Positioning at index start implies an index scan (as opposed to a table scan) and will occur when (in this case) the only search clause is on the index key but the search value is not a distinct value (eg, key like '%abc%', or key=substring('abc',1,3)). In this situation ASE is left with scanning the entire table and applying the search clause against every row; if there's an index on the search column and ASE determines the index resides on fewer pages than the raw data then it will opt for an index scan. In your case ASE is opting to scan the index (fewer pages than scanning the entire table) while applying the search clause against every key value in the index ... this will (usually) be faster than a full table scan but it will still be quite a bit slower than a query that can benefit from Positioning by key.
Why is ASE opting for an index scan? This would require knowing what exactly ASE is seeing as the search value (submitted by your python script). This is where the previous comment about the DBA and MDA tables comes into play ... have the DBA capture the actual SQL being submitted by your application. [Or use the trace method provided by Ryan.] Having the actual (likely mangled) search value in hand should get you a step closer to determining if there's an issue with the way you're constructing the query or an issue with the library you're using.
Does the 'slow' python query eventually return the same results as the sqsh query? or does it return no results?
Here's a SAP Note about a similar behavior, but with cached statements (automatically precompiled SQL statements) instead of prepared statements: https://me.sap.com/notes/2077161/E
Query takes a long time to complete due to bad table scan when statement cache or literal autoparam is on
ASE query showplan shows that optimizer is choosing a table scan instead of more efficient index scan when statement cache and literal autoparam are on.
Optimizer chooses the more efficient index scan when literal autoparam is off
Query contains multiple copies of the same literal value
Environment
SAP Adaptive Server Enterprise (ASE) 15.7
SAP Adaptive Server Enterprise (ASE) 16.0
Cause
When literal autoparam is used, the optimizer cannot assume that the literal values are all the same, as different values could be substituted for the next execution and the cached plan will need to be reasonable for both circumstances. When literal autoparam is not active, the optimizer can generate a better plan based on knowing the values of the literals.
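That trade-off can be illustrated with a toy model (not ASE internals): a plan cache keyed by the SQL text after literal substitution. With autoparam on, different literal values collapse to the same cache key and must share one generic plan; with it off, each literal gets its own value-specific plan.

```python
import re

# Toy model of a statement cache (NOT how ASE actually works internally):
# plans are cached by SQL text; with literal autoparam, quoted literals are
# rewritten to placeholders first, so different values share one generic plan.
plan_cache = {}

def get_plan(sql, autoparam):
    key = re.sub(r"'[^']*'", "?", sql) if autoparam else sql
    if key not in plan_cache:
        # Without autoparam the optimizer sees the literal and can tailor the
        # plan to that value; with autoparam the value is unknown at compile
        # time, so the plan must be reasonable for any value.
        plan_cache[key] = "generic plan" if autoparam else "plan for " + sql
    return plan_cache[key]
```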
The issue has been identified and logged under SAP CR 644531. Its description reads "When statement cache or literal autoparam is on, the query chooses a bad table scan that takes a long time to complete. With statement cache off, or literal autoparam off, the query chooses the index."
Resolution
There is currently no target Fix Pack or Service Pack scheduled for SAP CR 644531. It is being reviewed by our Product Group. Subscribe to this KBA to be automatically notified when SAP CR 644531 is fixed.
Workarounds are:
set statement_cache off
or
sp_configure "literal autoparam", 0
Ahhh, you updated your comment while I was typing mine.
-------------
In OP's case I'd opt for set statement_cache off; select ...; set statement_cache on.
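That session-scoped workaround could be wrapped in a small helper, sketched below under the assumption that `cursor` is any pyodbc-style cursor (the helper name is made up for illustration):

```python
# Sketch: disable the statement cache only around the one problematic query,
# then restore it, so the change stays scoped to this session and statement.
def execute_uncached(cursor, sql, params=()):
    cursor.execute("set statement_cache off")
    try:
        cursor.execute(sql, params)
        return cursor.fetchall()
    finally:
        cursor.execute("set statement_cache on")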
Disabling configs (eg, literal autoparam) at the server level should be thoroughly tested, as said changes could adversely affect other queries, users and/or applications. ("Duh, Mark!")
Maybe it's the issue where a precompiled SQL statement (either via a "prepare" call or with ASE statement caching) has to use a different query plan because it can't optimize for a specific value of the filename.
Since DYNAMICPREPARE=0 in your connection string shouldn't affect calling existing stored procs (according to Chris), it sounds like that should be safe to use.
Also, try running the good query with "set showplan on" to see what index it's using, then add an index hint to the cursor.execute() call. Eg.
cursor.execute("select * from files (index myindex) where filename=?", ['pgb.197901.bz2'])
Based on https://github.com/mkleehammer/pyodbc/issues/214 it sounds like a statement is automatically prepared when cursor.execute() is called, and subsequent cursor.execute() calls with similar queries will automatically reuse the same prepared statement. If this is true, and if auto-prepare on cursor.execute() is on, and you execute a large number of queries that can't be recognized as similar, you stand the chance of creating a huge number of unused prepared-statement memory structures in the ASE server. I once had a large ASE server overload because of this.
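A toy model of that re-prepare behavior (not pyodbc's actual implementation) shows how distinct SQL texts would each force a new server-side prepared statement, while a single parameterized text is reused:

```python
# Toy model (NOT pyodbc's real code): a cursor re-prepares only when the SQL
# text differs from the previous execute, mirroring the behavior described in
# pyodbc issue #214. prepare_count stands in for server-side statement
# structures that would accumulate in ASE.
class ToyCursor:
    def __init__(self):
        self._last_sql = None
        self.prepare_count = 0

    def execute(self, sql, params=None):
        if sql != self._last_sql:
            self.prepare_count += 1
            self._last_sql = sql
```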