SQL Server T-SQL Tuning – NOT IN and NOT Exists

Following on from the previous TSQL performance tuning article in this series, we now focus on the important topic of T-SQL “Not IN” And “Not Exists “

Not in and Not Exists commands :

In some circumstances we will need to select/update/delete orphan data – that is data does not exists in the major table but still exists in other interrelated tables

The response to such a circumstance will often be to utilize the “Not in” and “Not Exists” commands. This can however have a major impact on performance as the Not IN command requires a check to be made on each record individually resulting in a strain on resources and can also lead to locks particularly when updating or deleting a large volume of data .

Alternatives to NOT IN and NOT Exists

SQL Server 2008 introduced the Merge command for bulk insert / update / delete operations which can be used with the Except clause to minimize its performance hit.

Examples:

First we will create two tables :

use [Workshops]
Create table Mytable1 (companyname varchar (100) )
Create table Mytable2 (companyname varchar (100) )

Select/update/delete orphaned data using Not in command :

SELECT companyname FROM Mytable1 where companyname not in (select  companyname from Mytable2)
UPDATE Mytable1 SET companyname =N'Company_Name'  where companyname not in (select  companyname from Mytable2)
DELETE  Mytable1 FROM Mytable1 where companyname not in (select  companyname from Mytable2)

Improved performance using Merge and Except :

merge Mytable1 T using (select companyname from mytable1 except select companyname from mytable2 )S on t.companyname=s.companyname
when  matched  then update SET companyname=N'Company_Name' ;
merge Mytable1 T using (select companyname from mytable1 except select companyname from mytable2 )S on t.companyname=s.companyname
when  matched  then delete ;
SELECT * FROM MYTable1 S where   not exists (select 1 from  MYTable1 inner join MYTable2 on MYTable1.companyname=MYTable2.companyname and MYTable1.companyname=s.companyname)

Note that this also applies Not Exists :

SELECT * FROM MYTable1 S where   not exists (select 1 from  MYTable1 inner join MYTable2 on MYTable1.companyname=MYTable2.companyname and MYTable1.companyname=s.companyname)

Healthy T-SQL practices to Select orphaned data using Except Command

select companyname from mytable1 except select companyname from mytable2

In the next article of this series we will look at optimizing the performance of Views.



Related Articles :

33 Responses to “SQL Server T-SQL Tuning – NOT IN and NOT Exists”

  1. I Like the way you write your articles.
    thanks for writing your blog.

  2. I have used Merge succefully with one execption, delete. There are a few cases when Merge does not do exactly what you think it will.

    • Merge is one of the new rich 2008 T-SQL Codes that can apply for most of cases but let me know about some samples of these cases that don’t apply for Merge to be able to help in them

  3. valuable information…

  4. I really enjoy your blog. Much informative!!

  5. It would be nice to see some test results which back up your assertions, using sample data with hundreds of thousands of rows. You have really only provided some alternative syntax. Here’s one of mine …

    delete t1
    from Mytable1 t1
    where not exists (
    select 1
    from MyTable2 t2
    where t2.CompanyName = t1.CompanyName
    )

    I’d be very interested to find out whether MERGE beats this.

    • Thx for your comment …I didn’t add concession details about a certain realistic case to support the efficiency of Merge and Except commands rather than NOT IN or NOT exists commands but they actually differed much on both aspect resources Consumption and execution time once I tried them within heavy DWH and data migrations solutions running over HUGE TB DBs …..You can try them and let me know your feedback

      In addition, I consider largely DB stress powers so using Merge Command with Except can be much powerful if replacing Not IN or Not Exists and you can justify that using any DB stress tool like SQLQueryStress

  6. Yes, but why don’t you just use:
    SELECT Mytable1.companyname FROM Mytable1
    left join Mytable2 on Mytable1.companyname=Mytable2.companyname
    where Mytable2.companyname is null

    • It will not satisfy the same business logic precisely coz Mytable2.companyname is null is just seeking the data of MYtable2 having companyname is null which isn’t our case here actually but our case here seeks the data of MYtable2 having companyname values mismatching with companyname values of Mytable1

      Please let me know if any further help is needed

      • Not a fan of except Reply January 16, 2013 at 5:09 pm

        I would caution on the use of EXCEPT. On large sets of data, it can overflow your log file.

        As for LEFT JOIN, it is perfectly acceptable. I don;t see where it doesn’t meet the need.

        Using your NOT EXISTS example:
        – give me everything in table 1 that is not in table 2.
        SELECT
        *
        FROM
        MYTable1 S
        WHERE
        NOT EXISTS ( SELECT
        1
        FROM
        MYTable1
        INNER JOIN MYTable2
        ON MYTable1.companyname = MYTable2.companyname
        AND MYTable1.companyname = s.companyname )

        is equivilent to
        SELECT *
        FROM MyTable1 t1
        LEFT JOIN MyTable2 t2
        ON t1.CompanyName=t2.CompanyName
        OR — handle NULL on both sides.
        (t1.CompanyName IS NULL
        AND t2.CompanyName IS NULL)
        WHERE t2.CompanyName is null — no matching record for table1 in table2

        The left join is also cleaner to read.

  7. Your not exists example is overly complex, the equivalent to the except is

    SELECT * FROM MYTable1 as MT1 where not exists (select * from MYTable2 as MT2 where MT2.companyname=MT1.companyname)

    Which produces the same execution plan and performance as the example for except.

  8. You have not posted the improvement. I doubt that merge will improve the queries you have posted.

    • Thx for your comment …You are right , I didn’t post adequate performance benchmarks about a certain realistic case but it actually differed much on both aspect resources Consumption and execution time once I tried Merge & Except commands within heavy DWH and data migrations solutions running over HUGE TB DBs …..You can try them and let me know your feedback

      In addition, I consider largely DB stress powers so using Merge Command with Except can be much powerful if replacing Not IN or Not Exists and you can justify that using any DB stress tool like SQLQueryStress

      Once again …Thx for your follow and comments

  9. I am sorry
    *my clients still use SQL Server 2000/2005*

    • If so , you can use the below DML query instead of Merge ;

      Delete t from Mytable1 t inner join (select companyname from mytable1 except select companyname from mytable2 )S on t.companyname=s.companyname

      But it does worth much thinking about the upgrade to 2008 R2 to enjoy with the rich powers added by Merge commands that can afford efficiently bulk DML operations

      Let me know if any further help is needed

  10. I see claims of performance and would like to caution people that, unless there is a test harness for a good number of rows that proves the performance, “It Depends”.

  11. Extremely usefull. Thanks!!

  12. Please beware nulls are not handled the same in not exists and except
    select 1 as a,null b into #t1
    select 1 as a,null as b into #t2

    select * from #t1
    except
    select * from #t2

    select * from #t1 a where not exists
    (select 1 from #t2 b where a.a=b.a
    and a.b=b.b)

    drop table #t1
    drop table #t2

    • This justifies more the accuracy and simplicity of Except commands but even not exists can reach the same accuracy if using isnull or coalesce functions

      select 1 as a,null b into #t1
      select 1 as a,null as b into #t2

      select * from #t1
      except
      select * from #t2

      select * from #t1 a where not exists
      (select 1 from #t2 b where a.a like b.a
      and isnull (a.b,0) = isnull (b.b,0))

      select * from #t1 a where not exists
      (select 1 from #t2 b where a.a like b.a
      and coalesce (a.b,0) = coalesce (b.b,0))

      drop table #t1
      drop table #t2

      • That depends on how much of a purist you are but normally null is not equal to null as you don’t know its value and as this tends to be the ANSII standard then this is a bug ;).

  13. Very helpful article !!!!!!!! THX

  14. I really Interested in that!!! Thx

  15. just to make you all aware with the MERGE statement on SQL 2008 + 2008 R2

    http://www.sqlperformance.com/2013/02/t-sql-queries/another-merge-bug

  16. Very helpful and useful! THX

  17. very informative article !

  18. Very helpful!

  19. Much interesting article….

  20. very interesting and momentum blog .Thanks

  21. your blog is very interesting and momentum .Thanks

  22. Very good tip! Especially in dealing with very large amounts of data in a DW schema.

  23. Very helpful blog!

  24. I really appreciate your many helpful posts on TSQL. Thank you!

Software Reviews | Book Reviews | FAQs | Tips | Articles | Performance Tuning | Audit | BI | Clustering | Developer | Reporting | DBA | ASP.NET Ado | Views tips | | Developer FAQs | Replication Tips | OS Tips | Misc Tips | Index Tuning Tips | Hints Tips | High Availability Tips | Hardware Tips | ETL Tips | Components Tips | Configuration Tips | App Dev Tips | OLAP Tips | Admin Tips | Software Reviews | Error | Clustering FAQs | Performance Tuning FAQs | DBA FAQs |