Slow performance calling ResultSet.next() for a count

Former Member

Hi,

I have a table with over 500k rows, and when I execute a query that is supposed to return a count for a subset of the data (28k rows), the call to ResultSet.next() takes over 5 seconds. I don't see any delay when calling executeQuery(), only when calling ResultSet.next() to get the actual count value. What's more interesting is that when I actually need to fetch the values and use

select top x start at y ...
the performance of ResultSet.next() is much better, even though I'm fetching multiple columns.
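For context, the paged query takes roughly this form (the column list, page size, offset, and ordering here are illustrative, not the exact statement):

select top 20 start at 101 ld.list_data_id, ld.value, ld.description
from list_data_flat ld
where ld.lists_id = 384 and ld.level = 0
order by ld.sort_order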

I'm running UltraliteJ 12.0.1.3726 on Android on a Galaxy Nexus device. The schema for the table in question is as follows:

CREATE TABLE "dba"."LIST_DATA_FLAT" ( 
"LIST_DATA_ID" integer NOT NULL 
, "LANGUAGE_ID" integer NOT NULL 
, "VALUE" varchar(512) NOT NULL 
, "DESCRIPTION" varchar(1024) NULL 
, "ATTR_0" varchar(512) NULL 
, "ATTR_1" varchar(512) NULL 
, "ATTR_2" varchar(512) NULL 
, "ATTR_3" varchar(512) NULL 
, "ATTR_4" varchar(512) NULL 
, "ATTR_5" varchar(512) NULL 
, "ATTR_6" varchar(512) NULL 
, "ATTR_7" varchar(512) NULL 
, "ATTR_8" varchar(512) NULL 
, "ATTR_9" varchar(512) NULL 
, "ATTR_OBJECT" long binary NULL 
, "TIMESTAMP_CREATED" timestamp DEFAULT CURRENT UTC TIMESTAMP NOT NULL 
, "TIMESTAMP_UPDATED" timestamp NOT NULL 
, "LISTS_ID" integer NOT NULL 
, "BACKEND_ID" varchar(128) NULL 
, "PARENT_ID" integer NOT NULL 
, "SORT_ORDER" integer NOT NULL 
, "LEVEL" integer NOT NULL 
, PRIMARY KEY ( "LIST_DATA_ID" asc
, "LANGUAGE_ID" asc ) 
)

FYI LISTS_ID is a foreign key.

The query that I'm running is this:

select count(distinct(ld.list_data_id))
from list_data_flat ld
where ld.lists_id = 384 AND ld.level = 0

Am I reaching the boundaries of Ultralite/UltraliteJ performance with that much data or is there something wrong with my schema?

Any hints/advice is much appreciated,

Ross

Accepted Solutions (1)

philippefbertrand
Participant

Please confirm you posted the correct plan.

I'm guessing that because of the distinct it has to build a temp table which, given the volume of rows, is just beyond what the device can do.

If you wish to investigate this further, I suggest opening a support case.

Answers (1)

philippefbertrand
Participant

An index on lists_id would help.

executeQuery() just builds the query. The first next() is when it goes through the entire table and does the count. The first next() on your top x query only has to get to the first row, which may or may not be easy to find depending on your data.

You can use dbisql to play with your queries and view the resulting plans. Index scans are good; full table scans are not.
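For reference, the single-column index suggested here would look something like this (the index name is illustrative):

CREATE INDEX "LIST_DATA_FLAT_LISTS" ON "dba"."LIST_DATA_FLAT" ( "LISTS_ID" )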

Former Member

Hi Philippe,

Thanks for the clarification. I have an index on lists_id already since it is a foreign key. I looked at the execution plan in dbisql and it does indeed do an index scan. This is what I get:

group-single[filter[index-scan(LIST_DATA_FLAT,LISTS)]]

philippefbertrand
Participant

I'm not familiar with the Galaxy Nexus device, so I can't comment on its performance. How much memory is in use?

The plan does look very good. Perhaps an index on (lists_id, level) would help if there are many different levels. It really depends on the distribution of your data. Each extra index slows down insert/update/delete operations and requires more space. Did you use the default max hash size for indexes when you created your database?
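A sketch of the composite index being suggested (again, the index name is just illustrative):

CREATE INDEX "LIST_DATA_FLAT_LISTS_LEVEL" ON "dba"."LIST_DATA_FLAT" ( "LISTS_ID", "LEVEL" )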

Former Member

The device has 1GB of memory with 300MB free; the app itself uses about 40MB. The hash size is the default of 4 - I haven't touched that, as I'm not sure how it would affect the rest of the db performance. I've tried running the same scenario with an Ultralite db on a PC (exact same schema) using custom JNI and had the same result, so I doubt it has anything to do with the device. The table is mostly used for querying, with very few updates done to it, so I'll try adding the new index on (lists_id, level) and see if it does anything.

Former Member

Also, the db is started with cache_size set to 4MB - not sure if that matters or not.

philippefbertrand
Participant

Compare the count of rows when you remove the level = 0 predicate, and you will see how many extra rows you are processing without the index.
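Concretely, the comparison being suggested is something like:

select count(distinct(ld.list_data_id)) from list_data_flat ld where ld.lists_id = 384 and ld.level = 0
select count(distinct(ld.list_data_id)) from list_data_flat ld where ld.lists_id = 384

If the two counts are close, the level predicate is not filtering out many rows, and a (lists_id, level) index won't change much.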

Former Member

I've tried adding the index on (lists_id, level) but it did not affect the query performance one bit. The data for lists_id = 384 all has level = 0, so there are no extra entries with any other level. Does it make sense to open a case with SAP/Sybase for this, or is my number of rows just a bit much for UltraliteJ?

philippefbertrand
Participant

Given the distribution of the data, it makes sense that the extra index didn't help. What happens if you try a count(*) instead of the count you have? (I don't expect a difference, but it's worth the experiment just to make sure.)

What is the page size in your database?

Former Member

The page size of the database is 4k. I've run some tests using different types of count queries and got pretty different values depending on the query:

select count(*) from list_data_flat ld where ld.lists_id = 384 and ld.level = 0 >>> 1632 ms
select count(ld.list_data_id) from list_data_flat ld where ld.lists_id = 384 and ld.level = 0 >>> 707 ms
select count(distinct(ld.list_data_id)) from list_data_flat ld where ld.lists_id = 384 AND ld.level = 0 >>> 7202 ms

select count(*) from list_data_flat ld where ld.lists_id = 384 and ld.level = 0 >>> 1592 ms
select count(ld.list_data_id) from list_data_flat ld where ld.lists_id = 384 and ld.level = 0 >>> 761 ms
select count(distinct(ld.list_data_id)) from list_data_flat ld where ld.lists_id = 384 AND ld.level = 0 >>> 10155 ms

The third query is what is being executed right now by the app. It looks like the second query is the fastest; I'll have to figure out how to account for non-distinct rows if I were to use it.

philippefbertrand
Participant

But list_data_id is the primary key, so it should always be distinct! (Your original post didn't include the distinct part.)

Former Member

Wow, I feel like an idiot. My apologies - I managed to post the wrong schema AND SQL, which of course changes things dramatically given your comments. I've updated the schema above with the proper one; the PK is (LIST_DATA_ID, LANGUAGE_ID), which is why I'm using distinct(list_data_id) in my count query.
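In other words, with the (LIST_DATA_ID, LANGUAGE_ID) primary key the same list_data_id appears once per language, so without the distinct the count would be inflated by the number of languages. As an aside, if every list_data_id has a row for each language (an assumption, not something confirmed above), the distinct could be avoided entirely by counting within a single language, which the earlier timings suggest would be much cheaper (the language id here is illustrative):

select count(*)
from list_data_flat ld
where ld.lists_id = 384 and ld.level = 0 and ld.language_id = 1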