
Java Performance is awful in SQL Anywhere 12 - part 2

Former Member

I am not looking to re-code my Java functions as Transact-SQL ...

I only included that code sample as a demonstration of the simplest Java code I've written and embedded in the database.

I have much more complicated code that cannot be rewritten ...

The crux of my question is in the overhead cost of using Java functions in version 12 (with an external JVM) versus version 9 (with an internal JVM).

How can I improve THAT ?

::Siger Matt Edit below - added code from previous question::

Just to keep the code from the other question on the same page:

CREATE FUNCTION "dbo"."GetNumerics"( in inputString varchar(500) ) 
returns varchar(500)
external name 'com.gemtrak.general.StandardFunctions.getNumerics (Ljava/lang/String;)Ljava/lang/String;' language java

package com.gemtrak.general;

public class StandardFunctions
{
    public static String getNumerics(String inputString)
    {
        StringBuffer input = new StringBuffer(inputString);
        StringBuffer output = new StringBuffer();
        int theLength = input.length();
        char c;

        for ( int i=0; i<theLength; ++i )
        {
            c = input.charAt(i);
            if ( c >= '0' && c <= '9' )
            {
                output.append(c);
            }
        }

        return output.toString();
    }
}

It may look like an END statement is missing, but a function declared with EXTERNAL NAME ... LANGUAGE JAVA has no SQL body; this is a straight copy/paste from the original question.

Called by:

select dbo.getnumerics(isnull(telephone1,'')) t1, * into #x from locinfo where loctype = 'C'
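To put the discussion below in perspective, the function body itself is trivial. A stand-alone micro-benchmark (a hypothetical harness, not part of the original post; the class name and iteration count are illustrative) shows the digit-stripping scan is cheap in isolation, so any large slowdown observed from SQL must come from the per-call plumbing between the server and the external JVM, not from the Java logic:

```java
// Hypothetical micro-benchmark: runs the same digit-stripping logic as the
// embedded function, outside the database, to estimate the cost of the
// string scan alone (no server, no JDBC, no external-environment hop).
public class GetNumericsBench {

    // Same logic as dbo.GetNumerics above (StringBuilder instead of
    // StringBuffer, since no synchronization is needed here).
    static String getNumerics(String inputString) {
        StringBuilder output = new StringBuilder();
        for (int i = 0; i < inputString.length(); ++i) {
            char c = inputString.charAt(i);
            if (c >= '0' && c <= '9') {
                output.append(c);
            }
        }
        return output.toString();
    }

    public static void main(String[] args) {
        final int iterations = 1_000_000;
        long start = System.nanoTime();
        String result = "";
        for (int i = 0; i < iterations; ++i) {
            result = getNumerics("(555) 123-4567");
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(result); // digits only: "5551234567"
        System.out.println(elapsedMs + " ms for " + iterations + " calls");
    }
}
```

On typical hardware the per-call cost of the scan is well under a microsecond, which is why the overhead discussed in the answers dominates the total time.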

Accepted Solutions (0)

Answers (1)


Former Member

There were a number of very compelling reasons for us to move to a de-coupled Java VM beginning with SQL Anywhere version 10, some of which have been discussed above. To those, I would add the ability to run any version of the Sun (now Oracle) JRE to suit the needs of the application, and the fact that, in the new architecture, an issue with the Java VM itself or with the Java function cannot bring down or hang the server (or, at a minimum, a worker thread).

Going to a de-coupled model while still supporting JDBC calls from the Java function back into the server required us to develop some fairly sophisticated infrastructure inside the server kernel; these same mechanisms are used for the CLR and other external language environments as well. For Java procedures, the interface between the JVM and the server is JDBC, and JDBC is a very heavyweight interface compared to the in-process, custom-built JRE we had embedded in ASA Version 6.

So the creation and deletion of that inter-process infrastructure is certainly more expensive than when the Java VM ran inside the database server; there is absolutely no question about that. Indeed, the Java program posted above is a perfect illustration of that overhead, as it would be if similar logic were embedded in a SQL user-defined function (even though a SQL UDF actually executes "in process"); see my post on that subject for the gory details.

As Jeff mentioned, there may be some configuration settings you can employ to make the performance marginally better. However, I would not expect significant performance gains from doing so.
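Since configuration can only trim the per-call cost, one common way to attack it directly is to amortize it: pass many values to the external environment in one call instead of one call per row. The sketch below is purely illustrative (the class name, method name, and delimiter convention are assumptions, not a SQL Anywhere API); it strips digits from every item of a delimited batch in a single Java invocation:

```java
import java.util.regex.Pattern;

// Hypothetical batching helper: instead of one external-environment round
// trip per row, the caller concatenates many values with a delimiter, makes
// one call, and splits the delimited result back apart on the SQL side.
public class BatchedFunctions {

    public static String getNumericsBatch(String batch, String delimiter) {
        StringBuilder out = new StringBuilder();
        // Pattern.quote so the delimiter is treated literally, not as a
        // regex; limit -1 preserves trailing empty items.
        String[] items = batch.split(Pattern.quote(delimiter), -1);
        for (int n = 0; n < items.length; ++n) {
            if (n > 0) {
                out.append(delimiter);
            }
            for (int i = 0; i < items[n].length(); ++i) {
                char c = items[n].charAt(i);
                if (c >= '0' && c <= '9') {
                    out.append(c);
                }
            }
        }
        return out.toString();
    }
}
```

On the SQL side the batch could be assembled with something like the LIST() aggregate and split back apart afterwards, at the cost of extra SQL plumbing; whether that pays off depends on how many rows are funneled through each call. The delimiter must of course be a character that cannot appear in the data.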

VolkerBarth
Contributor
0 Kudos

...as it would if similar logic was embedded in an SQL user-defined function...

Are you saying the overhead of a SQL UDF is - in general - somewhat similar to that of an external Java call? (That would come as a surprise to me.)

Or does it just mean that any UDF is "impossible to optimize" and will therefore have a bad influence on the query optimizer's choice of plans - particularly when the UDF call appears in a SELECT list or WHERE clause and gets called multiple times per query?

Former Member
0 Kudos

Both. The cost of invoking a SQL user-defined function is orders of magnitude greater than the cost of, for example, reading a row from a disk page during a sequential scan. That's what my 2008 blog post meant to illustrate. The optimization point is valid, too - it is (virtually) impossible to determine the cost/selectivity of a predicate involving a user-defined function. The server tries to estimate it by memoizing statistics from prior executions, but significant variance in the function's parameters, or in the function's execution, has the potential to significantly affect the real cost of the chosen execution plan.

VolkerBarth
Contributor
0 Kudos

Ah, I see - and it seems very understandable that calling a function involves much more overhead than reading a row.

However, I would still think that a SQL UDF (or stored procedure) should be significantly "cheaper" to call than any external function/procedure... At least that has been my experience over the years, and therefore I generally prefer to write SQL functions over external ones when the SQL language features allow for it...

Former Member
0 Kudos

I will have to disagree with the point "I would generally prefer to write SQL functions over external ones if the SQL language features allow for that..." The power and elegance of Java over Transact-SQL is such that I certainly prefer to write in Java ... and it's a shame that we have essentially been locked out of writing a sizable amount of Java due to this major performance degradation. I think Sybase should take a hard look at how to bring the penalty down to a 20-30% degradation, NOT the 90% degradation I've experienced ...

VolkerBarth
Contributor
0 Kudos

Well, I was talking about the programmer-friendly Watcom-SQL dialect, not Transact-SQL :) - and about code that can be handled in SQL in a reasonable and maintainable way (so perhaps not too much string manipulation and the like...)

For the Java performance, I certainly see your point but that's not my playground...