cancel
Showing results for 
Search instead for 
Did you mean: 

How do I delete duplicate rows using MERGE?

Former Member
5,355

I'm in the process of changing how we load client data into our web site database (v11.0.1.2044). We use staging tables (zz prefix) for the initial load. After that, I use the MERGE feature to insert new rows or update existing one. The client data may not contain rows that were previously inserted (i.e. the record was deleted in the client database). To deal with this, I first delete rows in the master table that no longer exist in the staging table. In the end, I am left with the following:

//delete rows not present in new data set
DELETE FROM fd_master 
WHERE fd_master.fd_entity = '0100'
  AND fd_master.fd_id NOT IN (SELECT f.fd_id 
                                   FROM zz_fd_master f
                               WHERE f.fd_entity = '0100') ;
COMMIT;

//insert and update funds
MERGE INTO fd_master
   USING zz_fd_master ON fd_master.fd_entity = zz_fd_master.fd_entity
                         AND fd_master.fd_id = zz_fd_master.fd_id 
                         AND zz_fd_master.fd_entity = '0100'
   WHEN NOT MATCHED THEN INSERT
   WHEN MATCHED THEN UPDATE;
COMMIT;

Is there a way to use MERGE so that I could combine these two statements? Thanks, Tom

Accepted Solutions (0)

Answers (1)

Answers (1)

MarkCulp
Participant

The MERGE statement can not modify rows in the target table that do not match any row in the source query - it can only insert new rows or update matching rows from the source table - and therefore you will not be able to delete rows from your target (aka master) table using MERGE directly.

If you really don't want to run two updates (delete followed by merge) and want to use a single merge statement, then you will need to construct a source table that consist of all rows that you wish to insert/update/delete. This can be done using a UNION ALL.

E.g. something like:

MERGE
 INTO fd_master ( _col-list_ ) as ttable
USING ( select 'new' as zz_src, * 
          from zz_fd_master
         union all
        ( select 'old' as zz_src, *
            from fd_master
           where fd_id NOT IN (
                  SELECT f.fd_id
                    FROM zz_fd_master f
                   WHERE f.fd_entity = '0100')
        )
      ) as stable
   ON ttable.fd_id = stable.fd_id

-- rows that exists in old but not in new   
 WHEN MATCHED AND stable.zz_src = 'old'
 THEN DELETE

-- rows that exists in old and new - update the row    
 WHEN MATCHED AND stable.zz_src = 'new'
 THEN UPDATE SET _fill-in-the-cols-that-need-updated_

-- rows that exists in new and not old - insert
 WHEN NOT MATCHED
 THEN INSERT ( _col-list_ ) values ( _col-list_ ) -- exists in old and new

You will need to fill in a few pieces according to your table schema and requirements.

I'm not sure this single statement would run any faster then running two separate queries? It will likely depend on the size of your tables.

Breck_Carter
Participant
0 Kudos

For one thing, it moves all the deletes and insert/updates into a single transaction, from two transactions before. Question: in the original implementation, what makes MERGE more attractive than INSERT ON EXISTING UPDATE?

MarkCulp
Participant

@Breck: If all you are doing in inserting/updating rows from a source into the target table and matching primary key, then INSERT ON EXISTING UPDATE ON EXISTING UPDATE is much easier to write and read and does the job (and is what I would use), but if you are doing anything more complicated then it can't be done... so that is where MERGE is very useful. I've used MERGE in a number of applications but still fall back to INSERT ON EXISTING UPDATE for the simple cases.

Former Member
0 Kudos

The staging tables we use (those with a zz prefix) do not have primary keys because our client data often has dupes. I don't think you can use INSERT ON EXISTING UPDATE if the table does not have a primary key.

MarkCulp
Participant
0 Kudos

@Tom: Correct. ON EXISTING UPDATE requires a primary key.