Tuesday, April 14, 2015

Using the MERGE Statement

Introduction

We are born naked, wet, hungry and dying. Then things get worse....

Did you ever need to take a list or table of rows and insert or update a target table depending on the prior existence of the row? We have all run into this at one time or another in our careers. If you have not had the pleasure of doing this you haven't lived yet! We call it an "upsert" for short and it stands for UPDATE or INSERT. Ever write a huge block of code like IF EXISTS(...)...ELSE...? There is an easier way. I now introduce to you the MERGE statement.


The What

So what is this magic that you speak of wizard? The MERGE statement was introduced into SQL in Server 2008. It will perform INSERT, UPDATE and even DELETE from a source to a destination. The definition of the structure of the merge statement can be scary to look at but you will find after using it a few times it's really quite easy to get the hang of using.


[ WITH <common_table_expression> [,...n] ]
MERGE 
    [ TOP ( expression ) [ PERCENT ] ] 
    [ INTO ] <target_table> [ WITH ( <merge_hint> ) ] [ [ AS ] table_alias ]
    USING <table_source> 
    ON <merge_search_condition>
    [ WHEN MATCHED [ AND <clause_search_condition> ]
        THEN <merge_matched> ] [ ...n ]
    [ WHEN NOT MATCHED [ BY TARGET ] [ AND <clause_search_condition> ]
        THEN <merge_not_matched> ]
    [ WHEN NOT MATCHED BY SOURCE [ AND <clause_search_condition> ]
        THEN <merge_matched> ] [ ...n ]
    [ <output_clause> ]
    [ OPTION ( <query_hint> [ ,...n ] ) ]    
;

<target_table> ::=
{ 
    [ database_name . schema_name . | schema_name . ]
  target_table
}

<merge_hint>::=
{
    { [ <table_hint_limited> [ ,...n ] ]
    [ [ , ] INDEX ( index_val [ ,...n ] ) ] }
}

<table_source> ::= 
{
    table_or_view_name [ [ AS ] table_alias ] [ <tablesample_clause> ] 
        [ WITH ( table_hint [ [ , ]...n ] ) ] 
  | rowset_function [ [ AS ] table_alias ] 
        [ ( bulk_column_alias [ ,...n ] ) ] 
  | user_defined_function [ [ AS ] table_alias ]
  | OPENXML <openxml_clause> 
  | derived_table [ AS ] table_alias [ ( column_alias [ ,...n ] ) ] 
  | <joined_table> 
  | <pivoted_table> 
  | <unpivoted_table> 
}

<merge_search_condition> ::=
    <search_condition>

<merge_matched>::=
    { UPDATE SET <set_clause> | DELETE }

<set_clause>::=
SET
  { column_name = { expression | DEFAULT | NULL }
  | { udt_column_name.{ { property_name = expression
                        | field_name = expression }
                        | method_name ( argument [ ,...n ] ) }
    }
  | column_name { .WRITE ( expression , @Offset , @Length ) }
  | @variable = expression
  | @variable = column = expression
  | column_name { += | -= | *= | /= | %= | &= | ^= | |= } expression
  | @variable { += | -= | *= | /= | %= | &= | ^= | |= } expression
  | @variable = column { += | -= | *= | /= | %= | &= | ^= | |= } expression
  } [ ,...n ] 

<merge_not_matched>::=
{
    INSERT [ ( column_list ) ] 
        { VALUES ( values_list )
        | DEFAULT VALUES }
}

<clause_search_condition> ::=
    <search_condition>

<search condition> ::=
    { [ NOT ] <predicate> | ( <search_condition> ) } 
    [ { AND | OR } [ NOT ] { <predicate> | ( <search_condition> ) } ] 
[ ,...n ] 

<predicate> ::= 
    { expression { = | < > | ! = | > | > = | ! > | < | < = | ! < } expression 
    | string_expression [ NOT ] LIKE string_expression 
  [ ESCAPE 'escape_character' ] 
    | expression [ NOT ] BETWEEN expression AND expression 
    | expression IS [ NOT ] NULL 
    | CONTAINS 
  ( { column | * } , '< contains_search_condition >' ) 
    | FREETEXT ( { column | * } , 'freetext_string' ) 
    | expression [ NOT ] IN ( subquery | expression [ ,...n ] ) 
    | expression { = | < > | ! = | > | > = | ! > | < | < = | ! < } 
  { ALL | SOME | ANY} ( subquery ) 
    | EXISTS ( subquery ) } 

<output_clause>::=
{
    [ OUTPUT <dml_select_list> INTO { @table_variable | output_table }
        [ (column_list) ] ]
    [ OUTPUT <dml_select_list> ]
}

<dml_select_list>::=
    { <column_name> | scalar_expression } 
        [ [AS] column_alias_identifier ] [ ,...n ]

<column_name> ::=
    { DELETED | INSERTED | from_table_name } . { * | column_name }
    | $action

I told you it was scary! There are all kinds of possible ways to use this. If you ask me though, the very definition is what intimidates people and scares them out of learning to use it. Let's look at some simple examples of how to use this next so it's not quite so scary.


The How

We're going to need two tables to test this out with and some data inside them. Start up SSMS and your handy TestDB and open a query window. Cut and paste this into it, run it and you will have a source and a target table. The source is the table of products that came in on the shipping truck and need to be added to inventory. The target is the table that holds the current inventory of your store.

USE [TestDB]
GO

-- Create some test tables
CREATE TABLE [dbo].[tblProduct] 
( [ID] int IDENTITY
 ,[ProductName] varchar(50)
 ,[Quantity] int
)  

CREATE TABLE [dbo].[NewShipment]
(
 [ID] int IDENTITY
 ,[ProductName] varchar(50)
 ,[Quantity] int
)

-- Add some inventory
INSERT INTO [dbo].[tblProduct]
( [ProductName]
 ,[Quantity]
)
VALUES
('Left Handed Monkey Wrench',1),
('Right Handed Hammer',2),
('Invisible Paint',5),
('Drop Cloth',2),
('Wratchet Wrench',1),
('1/2 Inch Drive Socket Set',4)

-- Add some product to the shipment that just came in.
INSERT INTO [dbo].[NewShipment]
( [ProductName]
 ,[Quantity]
)
VALUES
('Left Handed Monkey Wrench',3),
('Wratchet Wrench',2),
('1/2 Inch Drive Socket Set',1),
('Coffee Cup',10)

Now that we have a products table and a shipment table, let's update our product inventory with the latest shipment that came in. We have some additions to the existing inventory and we added a line of Coffee Cups with a quantity of 10. Here is the statement heavily commented to show what is going on.

USE [TestDB]
GO

--Show initial table content
SELECT * FROM [dbo].[tblProduct];

--Merge the shipment
MERGE [dbo].[tblProduct] AS [Products] --Target
USING [dbo].[NewShipment] AS [Shipment] --Source.
 ON [Products].[ProductName] = [Shipment].[ProductName] --Join condition
--Match Condition. Add to existing inventory.
WHEN MATCHED THEN  
  UPDATE SET [Products].[Quantity] = [Products].[Quantity] + [Shipment].[Quantity] -- 
--No Match Condition. Create new inventory record.
WHEN NOT MATCHED THEN  
  INSERT ([ProductName],[Quantity]) VALUES ([Shipment].[ProductName], [Shipment].[Quantity]);

--Show outcome
SELECT * FROM [dbo].[tblProduct];

You'll notice when the statement finishes there are two result sets. The first was the content of the inventory table before the merge and the second is the content of the table after the merge completed. You'll notice our counts increased as they should have and we have added the new coffee Cup inventory to our table as well.

What about DELETE? It will also delete items in the merge. I left it out for clarity. To remove items during a merge we use the condition WHEN NOT MATCHED BY SOURCE THEN DELETE. This causes any values not in the source table to be deleted from the target table. The statement WHEN NOT MATCHED BY TARGET THEN DELETE is also valid. If we have the statement below and we execute it against the tables, what do you think the outcome will be?

MERGE [dbo].[tblProduct] AS [Products] --Table to hold final output 
USING [dbo].[NewShipment] AS [Shipment] --Table values to merge into above.
 ON [Products].[ProductName] = [Shipment].[ProductName] --Join condition
--Match Condition. Add to existing inventory.
WHEN MATCHED THEN  
  UPDATE SET [Products].[Quantity] = [Products].[Quantity] + [Shipment].[Quantity] -- 
--No Match Condition. Create new inventory record.
WHEN NOT MATCHED THEN  
  INSERT ([ProductName],[Quantity]) VALUES ([Shipment].[ProductName], [Shipment].[Quantity])
--If not matched in SOURCE, delete it.
WHEN NOT MATCHED BY SOURCE THEN DELETE;


Conclusion

I hope this has helped to demonstrate the basic use of the MERGE statement for you and given you some ideas to improve future code you have to write or refine existing code from long ago. Just remember these few little gotcha's and you will be fine:

  • Duplicate rows in the target table will give you a duplicate row error. Use the GROUP BY clause or eliminate duplicates rows in the target before attempting a merge.If you need the duplicate rows in the target and use a GROUP BY clause it will apply the action to ALL duplicate rows in the target.
  • Avoid filtering rows with the ON clause of the join. Use only the necessary values to provide the needed join or you could end up with some strange outcomes. 
  • You can use additional conditional clauses on the WHEN operations of the merge. An example would be like: WHEN MATCHED AND ([Products].[Quantity] > 5) THEN... This would only operate on matches where the Quantity > 5.
Happy coding!




No comments:

Post a Comment