Hello I have to tables which have the same structure Table1 and Table2 (#Product_ID varchar(10), #Num_Year char(4), #Num_Month char(2), Qty_Forecast numeric). I would like to insert into Table1 all the rows from Table2 that it does not already has. Which of the following statements is better and why? First statement: insert into Table1 select Producto_ID, Num_Year, Num_Month, Qty_Forecast from Table2 where Product_ID + Num_Year + Num_Month not in ( select Product_ID + Num_Year + Num_Month from Table1 ) Second statement: insert into Table1 select Producto_ID, Num_Year, Num_Month, Qty_Forecast from Table2 left join Table1 on Table1.Product_ID=Table2.Product_ID and Table1.Num_Year=Table2.Num_Year and Table1.Num_Month=Table2.Num_Month where Table1.Product_ID is null and Table1.Num_Year is null and Table1.Num_Month is null Thanks a lot for you help.
I suggest running both statements with Show Execution Plan, compare what the server is doing and the query plan cost. My strong hunch is that the second one is better, because it can use indexes where the first cannot.
I would second this. RDBMS will mostly try to flatten queries into JOIN operations anyway. So if you can express express a query with in a JOIN way, it is (in most cases) the "better" choice. -- Frank Kalis Moderator Microsoft SQL Server MVP Webmaster:http://www.insidesql.de
A small change is required in your second statement. Use the Outer join for your query to succeed. The current query will insert zero records. insert into Table1 selectProducto_ID, Num_Year, Num_Month, Qty_Forecast from Table2 left Outer join Table1 on -- Changed... Table1.Product_ID=Table2.Product_ID and Table1.Num_Year=Table2.Num_Year and Table1.Num_Month=Table2.Num_Month where Table1.Product_ID is null and Table1.Num_Year is null and Table1.Num_Month is null I executed both queries in three iterations (ofcourse resetting to original state before next iteration) and found same performance results in terms of CPU, reads, durations. Any thoughts? Thanks and Regards, Bhushan http://www.metacafe.com/watch/365213/wonder_what_you_are
Better late than never, but one thing you should never do is concatenating columns for criteria ... where Product_ID + Num_Year + Num_Month not in (select Product_ID + Num_Year + Num_Month from Table1) ... because this causes the server to ignore all appropriate indexes. The syntax you're looking for is a correlated subquery, which is pretty much the same subquery, but with fields from the main query in the WHERE clause of the subquery: insert into Table1 select Product_ID, Num_Year, Num_Month, Qty_Forecast from Table2 where Table2.Product_ID not in (select x.ProductID FROM table1 x WHERE x.Num_Year = Table2.Num_Year AND x.Num_Month = Table2.Num_Month) ... or a NOT EXISTS correlated subquery: where NOT EXISTS (select x.ProductID FROM table1 x WHERE x.ProductID = table2.ProductId x.Num_Year = Table2.Num_Year AND x.Num_Month = Table2.Num_Month) Performance can vary depending on data distribution, but the LEFT JOIN syntax is probably a safe bet.
Where do the stars under the User name come from? For e.g. Adriaan has 5 stars. Thanks and Regards, Bhushan http://www.metacafe.com/watch/365213/wonder_what_you_are
That's the forum indication for how much a user has posted. -- Frank Kalis Microsoft SQL Server MVP Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com Webmaster:http://www.insidesql.de
Bhushan, I was not so much responding to your posting, as to the original poster. Sorry if I seemed to jump all over you.