cancel
Showing results for 
Search instead for 
Did you mean: 

'insert on existing update' deletes rows when having unique index

4,445

How to swap two unique values in a table?
This might be a general problem. Doing this with a 'insert on existing update' statement leads to an very unexpected result, when there is a unique index on the column. See example below. Our original problem is to synchronize a list of values in one table with the list of values from another table. Deleting rows is done by a simple delete statement, which works fine. Inserting new rows or updating works also fine if the values are still unique.
But updating and inserting combined with exchanging unique values is a problem.

(SQL Anywhere 16)
Is this a known problem? Does anybody already have any experience with that? Is there another lean solution for this 'swap' problem?

Thanks for any comments.

Problem:

create table T1 (PK integer primary key, Name text );  
insert into T1 (PK, Name) values ( 1, 'A' );  
insert into T1 (PK, Name) values ( 2, 'B' );
insert into T1 (PK, Name) values ( 3, 'C' );

create table T2 (PK integer primary key, Name text );
insert into T2 (PK, Name) values ( 1, 'B' );
insert into T2 (PK, Name) values ( 2, 'A' );
insert into T2 (PK, Name) values ( 3, 'X' );

create unique index if not exists UniqueName on T1 (Name asc);

-- Swap the values of Name of teh first 2 rows in T1
insert into T1 (PK, Name) on existing update select PK, Name from T2;

select * from T1;
select * from T2;

-- --> Row 1 on T1 is missing. Was deleted silently without any error !!
-- --> Update of row 3 in T1 works.

Accepted Solutions (1)

Accepted Solutions (1)

Former Member

It is a bug in INSERT ... ON EXISTING UPDATE. I have filed a bug report (engineering case 761289).

As @Volker suggests, the MERGE statement works well as a workaround (shown below).

When an UPDATE statement causes a uniqueness violation error during processing it might be a problem resolved by later rows. The offending row is removed from the base table and moved to a "HOLD TABLE". After all rows are processed by the UPDATE, the rows in the HOLD TABLE are inserted into the base table and constraints are re-checked to see if the UPDATE resolved the problem after completing. See the help topics for Locks during updates and Data changes using UPDATE for more details.

When an INSERT ... ON EXISTING UPDATE statement encounters a uniqueness error, the row is also moved to the HOLD TABLE but there is no subsequent insert from the HOLD TABLE into the base table, leading to an unexpected delete.

A workaround is to use the MERGE statement, for example:

MERGE INTO T1 USING T2 ON PRIMARY KEY
WHEN MATCHED THEN UPDATE
WHEN NOT MATCHED THEN INSERT


HOLD TABLE processing of MERGE statements is not affected by this issue.

In version 8.0 through 9.0, the uniqueness violation during the INSERT...ON EXISTING statement would generate an error instead of deleting the row. In versions 10.0 and above the row is moved to the HOLD TABLE but not re-added to the base table, leading to the missing row. The MERGE statement is generally preferred to the INSERT..ON EXISTING as it clearer what the operation should do (e.g., which index is used) and offers more flexibility in defining the actions. I do admit I had to review the syntax in the documentation for MERGE while the INSERT..ON EXISTING is stuck in my mind, but I do prefer the MERGE.

VolkerBarth
Contributor

I do admit I had to review the syntax in the documentation for MERGE while the INSERT..ON EXISTING is stuck in my mind...

Wow, that's good to know - I'll feel less embarrassed then each time I have to look that up:)

VolkerBarth
Contributor
0 Kudos

When an UPDATE statement causes a uniqueness violation error during processing it might be a problem resolved by later rows. The offending row is removed from the base table and moved to a "HOLD TABLE". After all rows are processed by the UPDATE, the rows in the HOLD TABLE are inserted into the base table and constraints are re-checked to see if the UPDATE resolved the problem after completing. See the help topics for Locks during updates and Data changes using UPDATE for more details.

Is that the reason why a "unique value swap" is allowed by the server but will lead to problems for SQL Remote as the single UPDATE swap statement will be replicated as several one-row UPDATE operations that will then lead to UQ violations? (OK, that would be a better question on its own...). I'm thinking of an UPDATE (say, colUnique would be a candidate key) like

UPDATE myTable
  SET colUnique = 3 - colUnique
  WHERE colUnique BETWEEN 1 AND 2


In my understanding, that will be replicated as

UPDATE myTable SET colUnique = 2 WHERE pk = <first pk>;
UPDATE myTable SET colUnique = 1 WHERE pk = <second pk>;  -- error

Thanks for that answer, the explanation and for filing the bug report.
As an acceptable workaround, we found, that using an UPDATE statement instead of the INSERT...ON EXISTING UPDATE works fine. No deleted row.
So replacing the INSERT...ON EXISTING UPDATE by an UPDATE followed by a separate INSERT statement does it correctly.
I will try the MERGE statement also.
update T1 set T1.Name = T2.Name from T1 join T2 on (T1.PK = T2.PK);

Breck_Carter
Participant

> The MERGE statement is generally preferred to the INSERT..ON EXISTING as it clearer what the operation should do (e.g., which index is used) and offers more flexibility in defining the actions

If you need flexibility (say, doing complex ETL stuff) then MERGE is a candidate.

However, flexibility does NOT necessarily imply "generally preferred". In the general case (say, 80% of the time) folks need a simple UPSERT operation, and they are generally NOT confronted with rare challenges like swapping unique column values.

Generally speaking, simplicity is preferred over complexity, and MERGE is nothing if not complex. The only reason anyone has to look up UPSERT is because of the unfortunate syntax (INSERT ON EXISTING UPDATE)... having the benefit of time, HANA got it right.

Aside: The statement "HANA got it right" is open to debate. The examples for UPSERT in the HANA docs (currently found here) make UPSERT look good (e.g., UPSERT T VALUES (1, 1)) but the documentation is both unclear and contradictory. FWIW HANA doesn't have MERGE; it does have MERGE DELTA but that's a Monty Python feature ( something completely different 🙂

Former Member

I should say that MERGE is generally preferred by me. I do enjoy when the server takes care of tedious typing for me and does "what I want" but with INSERT ON EXISTING I feel it takes too many liberties and does things that are not what I expected. Further, the INSERT is really doing a hidden join between the input table and target table, and that join can only be executed by effectively an index-nested-loops algorithm as each input row is considered for insertion. The MERGE statement has an obvious join (but doesn't require you do a lot of typing if you just want primary key matching). If the statement is mostly going to update existing rows , then the MERGE can be substantially faster by using another join algorithm.

It might be interesting that the INSERT ... ON EXISTING syntax was added to expose an underlying "upsert" feature used only for SQL Remote. That underlying feature is still used and it has a few extra bells and whistles (such as detecting RECORDING_CONFLICT_DETECTED). The INSERT .. ON EXISTING is a good feature but I prefer the way the MERGE statement gives me the power without either requiring too much extra typing or guessing too much at my intention. By keeping the simple case in my commonly used tools, I'm ready when I do get to a trickier situation (what happens if there is a default or computed column that is part of the primary key? What happens when I don't want to join on the primary key?).

I prefer to use MERGE even in simple cases but your preferences my differ.

VolkerBarth
Contributor
0 Kudos

One might add that MERGE is SQL standard while INSERT...ON EXISTING is not, however, in my (certainly not too deep!) understanding it's only the SQL Anywhere specific additions and defaults that cause the really complex MERGE statement to "look easy" and thereby make it useful for simpler cases. Therefore the "it's a standard!" claim should not be overrated:)

Answers (0)