cancel
Showing results for 
Search instead for 
Did you mean: 

SA12 UDF in select list evaluation time

Former Member
3,427

We are trying to upgrade our commercial application from SA9 to SA12. But there is serious performance problem. We are using query like:

SELECT name, getPrice(productId, _priceType_) FROM products ORDER BY name

priceType is provided by the application. The table is indexed by name. There is over 100k products. Function is evaluated in ~1ms, uses few other tables.

In SA9 query was performed in <1s, the function was called only for fetched rows. User can see first screen of product list in 1s.

In SA12 the function is called for each row in the table, and next during fetch again. So the query is executed in 2 minutes. User has to wait 2 minutes to view product list. This is independent of Optimization_goal option. Even for value-sensitive cursor type, where evaluating function for each row before fetch makes no sense, server performs evaluation.

Is there any way to force SA12 to behave like SA9? I know we can found workaround for this case, but there is over 100 of such queries in the application. It is significant cost to change everything.

Breck_Carter
Participant

Are you sure all 100 queries experience the same degradation of performance, or are you guessing at that?

User-defined functions have always been dangerous to performance, requiring care in usage, and this has increased over time. As a result function performance is sometimes a big problem when moving from version to version. The best approach is to look at the graphical plan with statistics to see what is going on. It may be possible to make changes (e.g., materialized views) that don't involve rewriting every query.

Former Member
0 Kudos

No, I wasn't checking all queries. But all tested work as described. The plan is very simple. If "order by" is by indexed column this is simple "scan table using index". Query inside the function is listed as subquery. Still I wonder: why evaluation of UDF is performed for each table row for value-sensitive cursor? The results cannot be used during fetch, so it is wasting time. Even if I turn off prefetching and reduce size of fetch buffers, still evaluation is performed. First version of our application was using ASA 5, and UDF was always evaluated during fetch, unless used for sorting or grouping.

Breck_Carter
Participant
0 Kudos

FWIW: "Rows cannot be prefetched for value-sensitive cursors. This requirement may affect performance." http://dcx.sybase.com/index.html#1201/en/dbprogramming/keyset-cursor-sqlapp.html

Breck_Carter
Participant
0 Kudos

What do you mean by "The results cannot be used during fetch"?

The rules for value-sensitive cursors may have (probably have?) changed over time, and they are VERY complex IMO: http://dcx.sybase.com/index.html#1201/en/dbprogramming/keyset-cursor-sqlapp.html

In particular, it seems that any changes made since the cursor was opened must be detected: "If a row was updated or may have been updated since the cursor was opened, SQL Anywhere returns a SQLE_ROW_UPDATED_WARNING when the row is fetched." ...that implies (to me, at least) the entire result set must be evaluated before the first fetch.

Perhaps someone who knows the internal workings can chime in 🙂

Former Member
0 Kudos

How simple is the function? Does it contain a single expression setting the return value?

Former Member
0 Kudos

Yes, you're right. Old values for value-sensitive cursor can be useful.
I was testing other cursor types too, but I will check again descriptions, maybe I can find any clues.

After check: sensitive cursor should be the best. But no success.
I was trying with DYNAMIC SCROLL cursor with DMRF=YES.

Former Member
0 Kudos

6 x if, 3 x select. 3 procedure calls (very simple procedure). Some assignments and single exit point (return local variable).

VolkerBarth
Contributor

I guess Elmi asked whether the UDF may be inlined - and that seems not possible because of the fact that is does comprise much more than a single SELECT INTO or RETURN statement...

VolkerBarth
Contributor
0 Kudos

FWIW, here's a link to a detailed whitepaper by Ani Nica called Query Processing Based on SQL Anywhere 12.0.1 Architecture - it has some information on UDFs, too.

Former Member
0 Kudos

Yes, that is what I meant, Volker, and you are right that a UDF with an IF statement would not be inlined. I'll need to do some more testing.

VolkerBarth
Contributor
0 Kudos

Another wild guess: I would assume that this UDF is a deterministic one. Does it make a difference when you change it to "NOT DETERMINISTIC" (or vice versa)? - It may have an impact whether the function's return values will be cached internally or not (though in my understanding, this caching would take place "on the fly" when processing each separate function call and not by an intial run to materialize the whole request...)

VolkerBarth
Contributor
0 Kudos

FWIW: Sensitive (aka DYNAMIC SCROLL) cursors should not prefetch with the default setting of the PREFETCH option ("conditional"), so specifying DMRF=YES should not be necessary by default.


When using sensitive cursors, are you sure the database server could apply them as desired (i.e. there was no warning that a different cursor type than requested was used, I guess with SQLCODE 121 SQLE_CURSOR_OPTIONS_CHANGED)?

I'm asking as I agree on your understanding that a sensitive cursor should not allow the evaluation of all rows before any row is returned... - still puzzled...

Former Member
0 Kudos

I was checking both. Behavior is same
The function (as a matter of fact called procedure) can have side effects, there is CREATE VARIABLE statement. Actually not in the execution path.
Can it cause server to mark function as not deterministic, against explicit declaration?

Former Member
0 Kudos

There is no warning.

VolkerBarth
Contributor
0 Kudos

that implies (to me, at least) the entire result set must be evaluated before the first fetch.

I would assume that the warning will only be displayed for already (pre-)fetched rows (so the cursor may internally buffer the previous row contents), thus not requiring a pre-evaluation of the whole result set. At least that's what the cited warning SQLE_ROW_UPDATED_WARNING (SQLCODE 104) does tell:

You fetched a row from a SCROLL cursor. The row was previously fetched from the same cursor, and one or more columns in the row has been updated since the previous fetch. The column(s) updated may or may not be fetched by the cursor; this warning just indicates that the row from the table has been updated. If the cursor involves more than one table, a row from one or more of the tables has been updated.

VolkerBarth
Contributor
0 Kudos

In SA12 the function is called for each row in the table, and next during fetch again.

Do you mean the function is called twice for each row in the result set (once initially in the discussed undesired "full evaluation" just before the first fetch, and a second time during the real fetch of the according row)?

Can you add a MESSAGE statement within the function (say, noting the according productId) to document the funcion call sequence?

Former Member
0 Kudos

Yes, there are 2 passes, each in order defined in ORDER BY.
That means by index, but not by primary key

VolkerBarth
Contributor
0 Kudos

What API are you using? Could it be the query is executed twice, and the first run is just needed to describe the result set?

(Note: That's wild wild wild guessing... - just as DBISQL seems to perform a second query request to describe the needed result set... - I do not claim at all that this would apply here in any way)

Former Member
0 Kudos

For testing purposes I have prepared SQL procedure. In the procedure there is FOR statement. This is the easiest way to control cursor type and fetching.
In the production environment embedded SQL is used.
There is no difference.
Simple call in DBISQL works similarly. But I will try to investigate this a bit. Maybe this is really Describe? But: if so, how to avoid it?

VolkerBarth
Contributor
0 Kudos

Have you tried testing with dbisqlc instead of DBISQL? (I don't claim it will make a difference here, though AFAIK it tends to have less overhead...)

Former Member
0 Kudos

No difference

Former Member
0 Kudos

Description and preparation is rather fast. Execution itself is time consuming.

VolkerBarth
Contributor
0 Kudos

BTW: What isolation level are you using (as that has implications for cursors)?

Do you note differences in locking (via sa_locks or the like) between v9 and v12?

Accepted Solutions (0)

Answers (5)

Answers (5)

Former Member

In version 10.0, expression evaluation in SQLA changed to be generally more efficient. A consequence is that in some cases expressions are evaluated before they are needed -- this reduces the overhead needed to track which expressions are currently valid. For the case of absolute and relative fetches, the changes that generally improve performance cause a degradation as they lead to evaluations of potentially expensive UDF or subquery expressions that are not retrieved by the client. In the case that the Row_counts option is on, an absolute fetch is used to count the rows in the cursor unless the query optimizer can determine a good estimate of the number of rows. For example, "SELECT * FROM T" can usually estimate the number of rows returned based on the number of rows in T. Various features of the query (predicates in the WHERE clause, use of computed columns or joins) may affect whether the estimate is considered sufficiently accurate to avoid this fetch.

In order to improve the performance in these cases, a server change has been made to avoid evaluating select expressions for rows skipped by a relative or absolute fetch (including those for the purpose of the Row_counts option). The change is expected to be available in a future SP with build number 4089 or higher (version 12) or 1858 or higher (version 16). The associated release notes may reference engineering case # 760144.

Please see also this same answer posted to a related question.

Breck_Carter
Participant

Glenn Paulley on UDFs (until recently he was head of the SQL Anywhere query engine development team): http://glennpaulley.ca/conestoga/2014/02/set-level-operations-do-matter/

Former Member
0 Kudos

I've found the reason of this strange behavior.
Product table contains computed column referencing another table. Removing this column causes function to be evaluated only during fetch.

VolkerBarth
Contributor
0 Kudos

Well, and that has not happened with SA9?

Breck_Carter
Participant
0 Kudos

Oops, I just accidentally deleted ODL_Sybase's comment when trying to delete my own! Sorry!

ODL_Sybase's comment talked about how the symptom has returned, and it required ROW_COUNTS to be on to see the bad behavior.

Breck_Carter
Participant
0 Kudos

ROW_COUNTS has a place of honor in the 2008 article How To Make SQL Anywhere Slow, and it has always been a bad idea, in SQL Anywhere 16, in ASA 9 (see below), even in SQL Anywhere 5 (see below).

===== ASA 9 =====

ASA Database Administration Guide 
Database Options 
Alphabetical list of options

ROW_COUNTS option [database]

Function 
Specifies whether the database will always count the number of rows in a query when it is opened.

Allowed values 
ON, OFF

Scope 
Can be set for an individual connection or for the PUBLIC group. Takes effect immediately.

Default 
OFF

Description 
If this option is set to OFF, the row count is usually only an estimate. 
If this option is set to ON, the row count is always accurate.

Warning 
When ROW_COUNTS is set to ON, it may take significantly longer to execute queries. 
In fact, it will usually cause Adaptive Server Anywhere to execute the query twice, 
doubling the execution time.

===== SQL Anywhere 5 =====

ROW_COUNTS

Specifies whether the database will always count the number of rows in a query 
when it is opened. If this option is off, the row count, the row count is usually 
only an estimate.  If this option is on, the row count is always accurate but 
opening queries may take significantly longer.
Breck_Carter
Participant
0 Kudos

The question "How do I [satisfy some requirement] without turning ROW_COUNTS on?" has been asked in the past... and it could be asked again if you don't want to search.

Former Member
0 Kudos

Documentation say about "row_counts":
When row_counts is set to On, it may take significantly longer to execute queries. In fact, it will usually cause SQL Anywhere to execute the query twice, doubling the execution time.
It is Ok, but why all fields are evaluated to count rows?

Former Member
0 Kudos

Yes, we probably can manually execute "select count(*)" or "select count(distinct ...)" when GROUP BY used.
There is little effort to expand aliases used in WHERE clause, but still much less than rewriting queries totally.

Former Member
0 Kudos

By the way: cannot find this question. Can you post a link?

VolkerBarth
Contributor
0 Kudos

but why all fields are evaluated to count rows?

That's still a mystery to me, too - as you state, the bad performance has to do with computed columns, and computed columns should not be re-calculated just because they are queried, cf. this doc topic because that would make them simply superfluous...

Breck_Carter
Participant
0 Kudos

Please post this as a new question; this thread is becoming unmanageable 🙂

justin_willey
Participant
0 Kudos

Your overall symptoms (but not necessarily all the details) aren't too dis-similar form some of the difficulties we have been having in migrating from v10.0.1 to v16.0.0. Certainly they have revolved around unnecessary repeated execution of UDFs that doesn't happen in v10.0.1.

We have been seeing steady incremental improvements with these issues over the last year and so I'd strongly recommend trying your application on a recent version of 16.0.0.

I'm afraid I can't speak for v12 as while we encountered the same problems in v12.0.1 some time ago, we haven't been testing against any updates since v16 came out.

MarkCulp
Participant
0 Kudos

What is your optimization goal setting set to? i.e. what is output of

select connection_property( 'optimization_goal' );

If it is set to ALL-ROWS then the query engine may choose to materialize the entire result set before returning the first row back to the client.

Try setting the optimization_goal to FIRST-ROW.

Breck_Carter
Participant
0 Kudos

The OP did say "This is independent of Optimization_goal option."

MarkCulp
Participant
0 Kudos

Ok, missed that sentence. Ignore my answer.... it was a long shot guess 😉

Breck_Carter
Participant

Are there any cursor gurus lurking about? ...this question cries out for an Alpha Plus to answer it 🙂

VolkerBarth
Contributor
0 Kudos

That's an attempt to "prefetch" those query processing gurus, isn't it?

Can't tell whether it has been successful by now:(