The work table is this:
SELECT client_ref AS account_ref INTO #latest_account FROM accounts WHERE client_ref<>'';
The ultimate query (simplified to perform COUNTs instead of listing the fields) is this:
SELECT count(ci2.COLLECTIONS_ITEM_ID) as ci_count,
       count(vcig.ALLOCATED_GROUP_ID) as vcig_count
  FROM #latest_account AS l
  JOIN collections_item AS ci2 ON ci2.collections_account_id=l.account_ref
  JOIN v_collections_item_group AS vcig on vcig.collections_item_id = ci2.collections_item_id
The view is defined as follows:
SELECT CI.COLLECTIONS_ITEM_ID,
       SLAG.AUTOKEY as ALLOCATED_GROUP_ID,
       SLAG.COMPONENT_BALANCE as DEBT_BALANCE,
       SLAG.INTEREST_BALANCE as INTEREST_BALANCE,
       SLAG.GROUP_BALANCE as TOTAL_BALANCE
  FROM COLLECTIONS_ITEM as CI
  JOIN DBA.EVENT_TRANS as ET
       on ET.PROCEDURE_CODE = 'DR'
      and ET.JOB_NO = CI.COLLECTIONS_ACCOUNT_ID
      and ET.SOURCE_ITEM_ID = CI.COLLECTIONS_ITEM_ID
  JOIN SL_ALLOCATED_GROUP as SLAG
       on SLAG.ORIGINAL_TRANS_ID = ET.AUTOKEY
 WHERE ET.TRANS_TYPE = 'C'
If I change the first query to use TOP xxx, I get these timings:
TOP 20,000 = 1.81 secs, ci_count=88,494
TOP 30,000 = 2.67 secs, ci_count=124,607
TOP 31,000 = 2.61 secs, ci_count=127,804
TOP 32,000 = aborted after 5 minutes
TOP 31,800 = 2.75 secs, ci_count=130,126
TOP 31,825 = aborted after 5 minutes

I had to run the complete job yesterday for the end-users report (278,000 records in #latest_account) - here is my log:

18/08/2014 09:02:20 !Executing: SELECT client_ref AS account_ref...
18/08/2014 09:02:42 !277,896 rows affected.
18/08/2014 09:02:42 !Executing: CREATE INDEX ix ON #latest_account (CLIENT_REF, ACCOUNT_REF)
18/08/2014 09:02:43 !277,896 rows affected.
18/08/2014 09:02:43 !Executing: DROP TABLE IF EXISTS #report
18/08/2014 09:02:43 !-1 rows affected.
18/08/2014 09:02:43 !Executing: SELECT a.client_ref AS ... INTO #report
19/08/2014 03:45:26 !524,073 rows affected.

Does anyone have any ideas why this is taking so long?
(see the attached plans fast.saplan and slow.saplan)
It looks like the optimizer has picked two different plans based on the estimated number of rows in #latest_account. When the estimated number of rows is 32,000 it puts #latest_account (l) higher in the tree, and does a full sequential scan on ci2... that's bad, because ci2 has 1,141,845 rows.
When the estimated number of rows in #latest_account is 31,000 it pushes that table down to the bottom, does a sequential scan on that table (which is OK because it is small), and does an index scan on ci2... much better than a sequential scan.
The optimizer picks plans dynamically, when the statement is executed, so the numbers of rows in different tables can have a big effect. Sometimes, it does the wrong thing.
One BRUTE FORCE technique that sometimes works is to force the optimizer to always use an index that it uses in the fast plan.
If you click on the ci2 box in the fast plan, you will see it says this...
Index Scan
Scan ci2 using index COLLECTIONS_ACCOUNT
Try changing this:
JOIN collections_item AS ci2
to this, to force that index to always be used:
JOIN collections_item AS ci2 WITH ( FORCE INDEX ( COLLECTIONS_ACCOUNT ) )
That technique is dangerous, because you might be telling it to do a bad thing in other cases (like maybe when #latest_account has millions of rows).
Other problems...
Your plans show only the estimates, not the actual results. In the future, try clicking on the Statistics level dropdown and picking "Detailed and node statistics".
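If I remember the built-in functions correctly, you can also capture the plan from SQL in dbisql; here is a minimal sketch (one-argument forms only, because the optional arguments that control the statistics level vary by version - check the 11.0.1 docs):

-- Sketch: ask the server to describe its plan for the problem statement,
-- either as text (EXPLANATION) or as the XML the graphical plan viewer
-- reads (GRAPHICAL_PLAN); the second result can be saved from dbisql as
-- a .saplan file and attached here.
SELECT EXPLANATION(
   'SELECT count(*)
      FROM #latest_account AS l
      JOIN collections_item AS ci2
           ON ci2.collections_account_id = l.account_ref' );

SELECT GRAPHICAL_PLAN(
   'SELECT count(*)
      FROM #latest_account AS l
      JOIN collections_item AS ci2
           ON ci2.collections_account_id = l.account_ref' );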
Your query seems to be using a view. It's hard to understand the plan without seeing the view. ...you did show us the view earlier 🙂
It is also hard to understand the plan without seeing the tables. There might be a completely different way to write the query, to make it always run fast, maybe even faster than the fast query.
A later EBF of 11.0.1 might help, or better yet, a newer version of SQL Anywhere... if you upgrade, be sure to try taking out the FORCE INDEX clause, it may be making things worse.
fast
----
DROP TABLE IF EXISTS #latest_account;

SELECT TOP 31000 file_no AS account_ref
  INTO #latest_account
  FROM collections_account
 WHERE client_ref<>'';

SELECT count(ci2.COLLECTIONS_ITEM_ID) as ci_count,
       count(vcig.ALLOCATED_GROUP_ID) as vcig_count
  FROM #latest_account AS l
  JOIN collections_item AS ci2 ON ci2.collections_account_id=l.account_ref
  JOIN v_collections_item_group AS vcig on vcig.collections_item_id = ci2.collections_item_id;

slow
----
DROP TABLE IF EXISTS #latest_account;

SELECT TOP 32000 file_no AS account_ref
  INTO #latest_account
  FROM collections_account
 WHERE client_ref<>'';

SELECT count(ci2.COLLECTIONS_ITEM_ID) as ci_count,
       count(vcig.ALLOCATED_GROUP_ID) as vcig_count
  FROM #latest_account AS l
  JOIN collections_item AS ci2 ON ci2.collections_account_id=l.account_ref
  JOIN v_collections_item_group AS vcig on vcig.collections_item_id = ci2.collections_item_id;
FORCE doesn't work - it gives this error...
Could not execute statement.
Syntax error near 'FORCE' on line 5
SQLCODE=-131, ODBC 3 State="42000"
Line 1, column 1
As I've never forced hints before, I have tried placing the WITH before and after the ON, and even without the ON...
Here is a bit more information about the tables involved:
DESCRIBE collections_item:

Column,Type,Nullable,Primary Key
'COLLECTIONS_ITEM_ID','integer',0,1
'COLLECTIONS_ACCOUNT_ID','integer',0,0
'INVOICE_REF','varchar(30)',1,0
'SOURCE_REF','varchar(30)',1,0
'SOURCE_DATE','date',1,0
'SOURCE_TIME','time',1,0
'INVOICE_DATE','date',0,0
'ORDER_REF','varchar(30)',1,0
'NARRATIVE','varchar(50)',1,0
'NET_VALUE','numeric(19,4)',1,0
'VAT_RATE','numeric(18,2)',1,0
'VAT_VALUE','numeric(19,4)',1,0
'DELAY_DAYS','integer',1,0
'EFFECTIVE_DATE','date',1,0
'VAT_CODE','varchar(5)',1,0
'GROSS_VALUE','numeric(19,4)',0,0
'TIMESTAMP','timestamp',1,0
'DISPUTED','char(1)',1,0
'DISPUTE_CODE','varchar(10)',1,0
'STATUS','varchar(12)',1,0
'REMOVE_BALANCE','char(1)',1,0
'LITIGATION_SWITCH_ADDED','char(1)',1,0
'REMOVE_DEBT','numeric(19,4)',1,0
'REMOVE_INTEREST','numeric(19,4)',1,0
'DUE_DATE','date',1,0
'DISPUTE_DATE','date',1,0
'DISPUTE_NOTE','varchar(150)',1,0
'EXPECTED_CLIENT_BALANCE','numeric(19,4)',1,0
'EXPECTED_CLIENT_BALANCE_DATE','date',1,0
1,141,845 rows with index (FOREIGN) COLLECTIONS_ACCOUNT ON collections_account_id
DESCRIBE event_trans:

Column,Type,Nullable,Primary Key
'AUTOKEY','integer',0,1
'EVENT_HISTORY','integer',0,0
'PROCEDURE_CODE','varchar(3)',0,0
'JOB_NO','integer',0,0
'REFERENCE','varchar(25)',1,0
'NARRATIVE','varchar(45)',0,0
'DEBIT','numeric(18,2)',1,0
'CREDIT','numeric(18,2)',1,0
'TRANS_TYPE','varchar(3)',0,0
'NEW_BALANCE','numeric(19,4)',1,0
'ALLOC_METHOD_USED','varchar(3)',1,0
'PAYMENT_TYPE','varchar(4)',1,0
'SOURCE_CURRENCY','varchar(3)',1,0
'SOURCE_DEBIT','numeric(19,4)',1,0
'SOURCE_CREDIT','numeric(19,4)',1,0
'TIMESTAMP','timestamp',1,0
'PRINCIPAL_BALANCE','numeric(19,4)',1,0
'FEES_BALANCE','numeric(19,4)',1,0
'COSTS_BALANCE','numeric(19,4)',1,0
'INTEREST_BALANCE','numeric(19,4)',1,0
'EXCHANGE_RATE','numeric(18,6)',1,0
'SORT_CODE','varchar(8)',1,0
'BANK_ACCOUNT','varchar(8)',1,0
'ALLOCATION_DONE','char(1)',1,0
'IMPORT_EXPORT_STATUS','varchar(1)',1,0
'PAYER_NAME','varchar(30)',1,0
'PAYMENT_IN_FULL','char(1)',1,0
'EFFECTIVE_DATE','date',1,0
'SOURCE_ITEM_ID','integer',1,0
'APPLY_TO','integer',1,0
'INVOICE_DATE','date',1,0
'CREDIT_TO_TRANS_TYPE','varchar(3)',1,0
'DISPUTED','char(1)',1,0
'CREATE_NEW_GROUP','varchar(1)',1,0
'REMIT_DONE','char(1)',1,0
'REMIT_COMMISSIONABLE','numeric(19,4)',1,0
'REMIT_COMMISSION_RATE','numeric(18,2)',1,0
'REMIT_DATE','date',1,0
'REMIT_COMMISSION','numeric(19,4)',1,0
'REMIT_AMOUNT','numeric(19,4)',1,0
'REMIT_RUN_ID','integer',1,0
'REMIT_VAT_RATE','numeric(18,2)',1,0
'REMIT_VAT_ON_COMMISSION','numeric(19,4)',1,0
'REMIT_AMOUNT_GROSS','numeric(19,4)',1,0
'REMIT_AMOUNT_NET','numeric(19,4)',1,0
'REMIT_RESIDUAL_INVOICE','numeric(19,4)',1,0
'MANUALLY_ALLOCATED','char(1)',1,0
'SKIP_BILL_CHARGES','char(1)',1,0
'INPUT_TRANS_ID','integer',1,0
'INCLUDES_VAT_RATE','numeric(18,2)',1,0
'INCLUDES_VAT_AMOUNT','numeric(19,4)',1,0
'INTEREST_RATE_CODE','varchar(5)',1,0
'COLLECTIONS_ONLY_FLAG','char(1)',1,0
'LAST_COLLECTOR','varchar(3)',1,0
'LAST_COLLECTOR_ACTION_DATE','date',1,0
9,097,271 rows with index (FOREIGN) job ON procedure_code,job_no and INDEX i_source_item_id ON source_item_id
#latest_account will currently have 650k rows but will continue to grow. I am using EBF 2960, which AFAIK is the last one for 11.0.1.
I am trying to generate better plans (Detailed and node statistics) and will post them when it eventually comes back...
When you report a syntax error, it helps if you show the exact statement that raises the error (and possibly the variants you have tried).
Given that, I guess Breck's sample has somewhat mixed up the FORCE INDEX vs. WITH hint clauses and should read:
JOIN collections_item AS ci2 FORCE INDEX (COLLECTIONS_ACCOUNT)
or
JOIN collections_item AS ci2 WITH (INDEX (COLLECTIONS_ACCOUNT))
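For context, here is a sketch of how the first variant would splice into the problem query (same tables and index name as above; verify the exact placement against the 11.0.1 docs):

-- Sketch only: the index hint follows the table's correlation name and
-- precedes the ON condition; COLLECTIONS_ACCOUNT is the index named in
-- the fast plan.
SELECT count(ci2.COLLECTIONS_ITEM_ID) as ci_count,
       count(vcig.ALLOCATED_GROUP_ID) as vcig_count
  FROM #latest_account AS l
  JOIN collections_item AS ci2 FORCE INDEX ( COLLECTIONS_ACCOUNT )
       ON ci2.collections_account_id = l.account_ref
  JOIN v_collections_item_group AS vcig
       ON vcig.collections_item_id = ci2.collections_item_id;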
> I am using EBF 2960, which AFAIK is the last one for 11.0.1.
Nope, there are several newer ones, up to 11.0.1.3158 (for Windows) - cf. Breck's EBF overview here.
Re Syntax error - sorry, I ran out of characters! Didn't think to post a second comment - doh... but you were right - using FORCE INDEX executed the statement - but I'm still waiting for the result set (15 mins so far) - looks like I'm going to need to re-write it!
Re EBFs - I have tried to get access to the latest patches for 11.0.1 to see if there are any that may help, but the website is refusing me access even though I registered with service.sap.com ages ago...
PS. I am also still waiting for the "Detailed and node statistics" plan to finish... (4 hours so far)
I have found this helps: CREATE INDEX ix ON #latest_account (ACCOUNT_REF)
without the need to FORCE. Maybe the optimizer floated the temp table back to the top since it now had an index it could join to. Nevertheless, it has reduced the time taken from 18 hours to ~14 mins (and then, on the second run, 3.5 mins!).
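For anyone who finds this later, the sequence that ended up fast for me looks roughly like this (using the simplified one-column work table from my original post):

DROP TABLE IF EXISTS #latest_account;

-- Build the work table (simplified one-column version from the question).
SELECT client_ref AS account_ref
  INTO #latest_account
  FROM accounts
 WHERE client_ref <> '';

-- Index the temporary table so the optimizer has something to join on,
-- instead of falling back to a sequential scan of collections_item.
CREATE INDEX ix ON #latest_account ( account_ref );

SELECT count(ci2.COLLECTIONS_ITEM_ID) as ci_count,
       count(vcig.ALLOCATED_GROUP_ID) as vcig_count
  FROM #latest_account AS l
  JOIN collections_item AS ci2
       ON ci2.collections_account_id = l.account_ref
  JOIN v_collections_item_group AS vcig
       ON vcig.collections_item_id = ci2.collections_item_id;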
I am interested in getting the patch change logs / updating our server to the latest one, though. What do I need to do to do this?
> CREATE INDEX ix ON #latest_account (ACCOUNT_REF)
Hm, that shows we have not yet really discussed the contents of that one-column table: as you are simply selecting without grouping etc. in your original description (copied below), will the temporary table contain duplicate values?
SELECT client_ref AS account_ref INTO #latest_account FROM accounts WHERE client_ref<>'';
If so, it might help to copy only the unique values, possibly pre-ordered with an ORDER BY in the SELECT, and you could then use a "real" PK definition instead of an index...
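A sketch of what that could look like, using a declared temporary table so that a real PRIMARY KEY can be defined (the varchar(30) column type is only a guess - use the actual type of client_ref - and whether this beats your plain indexed #table is something to measure):

-- Assumption: account_ref fits in varchar(30); adjust to the real type.
DECLARE LOCAL TEMPORARY TABLE latest_account (
   account_ref varchar(30) NOT NULL PRIMARY KEY
) NOT TRANSACTIONAL;

-- Copy only the distinct values, pre-ordered as suggested above.
INSERT INTO latest_account ( account_ref )
SELECT DISTINCT client_ref
  FROM accounts
 WHERE client_ref <> ''
 ORDER BY client_ref;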
> I am interested in getting...
If you have problems accessing the SAP Marketplace or the EBF page, I'd recommend having a look at the FAQs here on the tag "ebf", or asking a separate question...
The one-column #latest_account is actually a two-column table... I cut/simplified the query down to the bare minimum to illustrate the slowness problem - here's the proper query in production:
-- FOR A CLIENT (WHO HAS MULTIPLE CLIENT CODES), PULL OUT THE 'LATEST' ACCOUNT BY THEIR REF
SELECT
o.CLIENT_REF,
-- PREFER OPEN ACCOUNTS OVER CLOSED OR LAST CLOSED ONE
(SELECT TOP 1 OUR_REF
   FROM collections_account
  WHERE CLIENT_REF=o.CLIENT_REF
    AND COLLECTIONS_CLIENT_CODE IN ('2412','2455','2456','2426','2427','2428','2441','2442','2443','2451','2448')
  ORDER BY IF DATE_CLOSED IS NULL THEN 0 ELSE 1 ENDIF, FILE_NO DESC) AS ACCOUNT_REF
INTO #latest_account
FROM collections_account AS o
WHERE ISNULL(o.CLIENT_REF,'')<>'' AND o.COLLECTIONS_CLIENT_CODE IN ('2412','2455','2456','2426','2427','2428','2441','2442','2443','2451','2448') AND o.STATUS<>'CL6'
GROUP BY CLIENT_REF
...so as you can see, I do use GROUP BY...
I couldn't add a PRIMARY index to a TEMPORARY TABLE, and I don't bother with the ORDER BY as, AFAIK, that is only used to order the end result set; it isn't needed here anyway because later I use the temp table as the main FROM table with an ORDER BY (again, I simplified the second part into a simple COUNT query to illustrate the problem).
Ah, I see. Well, the question whether account_ref is unique is still open - if it is, you might add a UNIQUE index instead of a normal one, which will give the optimizer more clues.
FWIW, you are right that ORDER BY is primarily used for display purposes, but if you use SELECT ... INTO (or INSERT ... SELECT) it helps to insert the new values in sorted order, which may (or may not) lead to clustered storage of these new rows - without having to use a further index to do so.
> I couldn't add a PRIMARY index to a TEMPORARY TABLE
Of course you are correct, that is not possible for local temporary tables (including those created automatically by SELECT INTO #temp).
> Breck's sample has somewhat mixed up the FORCE INDEX vs. WITH hint clauses
I do that EVERY time... why isn't it in the list of characteristic errors? 🙂
Plus, posting code without testing it is an open invitation to Murphy's Law... sigh.
> why isn't it in the list of characteristic errors?
Possibly because we have all learnt (particularly from Glenn, I bet) that query hints are better avoided, if possible:
Having two different syntactical variants of index hints ensures you have to look up the exact syntax every time, and ask yourself whether you reaaaaaally want/need to use them :)
Volker, thanks for your comments...
With CREATE INDEX it takes 3m 25s to SELECT INTO the final work file; with CREATE UNIQUE INDEX it takes 3m 23s. But as (only I would know) the account_ref is unique, I'll use a UNIQUE index...
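i.e. the variant I'm switching to (a sketch, same index name and column as before):

-- Plain index: 3m 25s for the subsequent SELECT INTO.
-- CREATE INDEX ix ON #latest_account ( ACCOUNT_REF );

-- Unique index: 3m 23s, and it also tells the optimizer ACCOUNT_REF is unique.
CREATE UNIQUE INDEX ix ON #latest_account ( ACCOUNT_REF );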
re: ...may or may not lead to a clustered storage...without having to use a further index... As it's not a definite yes, I'll stick with my CREATE INDEX [that takes 1 second] 😉