on 2017 Nov 15 9:31 AM
User defined functions seem to be believing that NULL = NULL after being called multiple times.
I can reproduce this in all the builds of v16 I can find (including 2546), different windows versions, machines etc. The ANSINULL database option is ON (the default).
Run this script in isql:
CREATE TABLE Test1 (PK CHAR(20) NOT NULL, Col CHAR(12) NULL, PRIMARY KEY ( PK ASC ));insert into Test1 (PK) values ('Blah1'); commit;
CREATE INDEX Test1_Col ON Test1 ( Col ASC );
create or replace FUNCTION TestFn (in val char(12)) returns char(20) NOT DETERMINISTIC begin declare rv char(20); select first PK into rv from Test1 where Col = val; return(rv); end;
If you run this query:
select TestFn(null);
the first 10 times it produces the correct result, "NULL", since NULL should not equate to NULL. After that it matches the first record it finds - in this case there's only one record so you get 'Blah1' (ie it matches NULL = NULL).
This is as stripped down as I can get the reproducible (the original was spotted in a far more complex situation). You have to use a function, running a straight SELECT in isql does not have issues. The index appears to be the key - no index and everything works fine.
Different queries seem to fail after 10 calls, the number of records in the table doesn't seem relevant either - the example above with 20000 records still fails on the eleventh go.
Data type doesn't seem relevant either, the above example works the same with Col as char, int or date.
Request clarification before answering.
Thanks for reporting this. I've confirmed that this is indeed a bug with plan caching, as Volker suggested. In v16, plans for statements in procedures are built for 10 training executions and a reusable plan built on the 11th execution. The reusable plan that is being built on the 11th iteration appears to be semantically incorrect for an equality predicate on a nullable column that uses an index---I'm currently investigating why. In v17 the semantic incorrectness bug still exists, but the logic for deciding when to cache plans has changed which makes the repro less deterministic.
As a workaround, you should disable plan caching by setting option max_plans_cached=0 at the connection, user, or database level.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Yes and no.
When functions are marked as NOT DETERMINISTIC, the optimizer will not cache plans for the FUNCTION CALL---i.e. it impacts optimization of statements that REFERENCE expressions marked as non-deterministic.
In this case, the plan being cached is for a statement within the function definition itself. The execution of statements WITHIN the function are not impacted by the NOT DETERMINISTIC keyword. Granted, in this particular function it is the query itself that is the source of the non-determinism due to the FIRST keyword without an ORDER BY. Once we cache a particular plan, the query execution will be deterministic as long as we are running that cached plan. However, that is acceptable because the NOT DETERMINISTIC keyword is essentially a warning that execution MAY be non-deterministic---not a promise that it will be.
The bug is specific to reusable plans built by the optimizer bypass, so another workaround is to use HINT(FORCE OPTIMIZATION) to disable the optimizer bypass (and plan caching) for specific statements.
From a few simple experiments, I've reproduced the bug in v11.0.1 GA (early 2009) but not in v10.0.1.4157 (although it's possible it was present in v10 but with different repro conditions).
The fix for this bug has been submitted to the v16 (build 2606+) and v17 (build 4104+) codelines and should appear in future ebfs exceeding those build numbers.
User | Count |
---|---|
77 | |
30 | |
10 | |
8 | |
8 | |
7 | |
7 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.