NOT IN Clause | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

NOT IN Clause

Hello Friends,<br />I have a query using NOT IN Clause. NOT IN should be avoided for performance.<br />Help me to rewrite this query(want to avoid NOT IN clause, not getting idea). I searched in the forum, but i am unable to find exact solution. Please Help me [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br /><br />I am giving simple example<br />—————————————————————–<br />SELECT <br />sum(sal) <br />FROM <br />emp <br />WHERE<br />deptid NOT IN(’10’,’20’,’30’,’40’,’50’,’60’,’70’)<br />—————————————————————–<br /><br />I tried like this<br />———————————–<br />SELECT sum(SAL)<br />FROM<br />emp<br />WHERE<br />deptid &lt;&gt; ’10′<br />AND deptid &lt;&gt;’20′<br />AND deptid &lt;&gt;’30′<br />AND deptid &lt;&gt;’40′<br />AND deptid &lt;&gt;’50′<br />AND deptid &lt;&gt;’60′<br />AND deptid &lt;&gt;’70′<br />————————————<br />but the performance is same. <br /><br />Please help me to avoid NOT IN clause. Thanks In Advance [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br />
The only way to improve performance here is to find out why you’re excluding these exact deptid values. I’m assuming you have a Dept table as well, so look at what makes the rows with deptid ’10’,’20’,’30’,’40’,’50’,’60’,’70’ different from all the other rows.<br /><br />If there is a column on the Dept table that has a certain value only for these specific deptid’s, and other values for all other deptid’s, then you can replace the NOT IN (&lt;value list&gt<img src=’/community/emoticons/emotion-5.gif’ alt=’;)‘ /> syntax with this<br /><br />NOT IN (SELECT deptId FROM Dept WHERE other_column = &lt;certain value&gt<img src=’/community/emoticons/emotion-5.gif’ alt=’;)‘ /><br /><br />This would work best if Emp has a significantly larger number of rows than Dept, and if both Emp.deptid and Dept.other_column are covered by indexes.<br /><br />By the way, if deptid is a column with a numeric data type, then drop the quotes arounsd the number criteria.
Hi Adiraan,<br />For my case, I should use exact deptid values. There is no dept table as you assumed.<br />Please don’t take it as employee and dept tables. I am giving simple example by putting employee table. actual in my case, i have a query similar to what I mentioned. <br />I should use exact deptid values.[<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br />
That sounds like the system has a pretty horizontal table structure. The idea behind the relational database model (RDBMS) is that it both improves performance, and makes querying easier, compared to the horizontal type of table that your working with. Just as a different approach, you could try WHERE RIGHT(deptid, 1) > ‘0’ AND deptid BETWEEN ’10’ AND ’70’ … but I doubt that you can improve performance that way.

Will Not exists make some improvement?
Declare @t table(deptid int)
Insert into @t
Select 10 union all
Select 20 union all
Select 30 union all
Select 40 union all
Select 50 union all
Select 60 union all
Select 70
SELECT
sum(sal)
FROM
emp e
WHERE
NOT exists
(select deptid from @t where deptid=e.deptid)
Madhivanan Failing to plan is Planning to fail
How many departments do you have? Any chance you can change it to a IN list instead of a not in? What is the current execution plan?

… or SELECT sum(sal) FROM emp e
LEFT JOIN
(Select 10 as deptid
union all Select 20
union all Select 30
union all Select 40
union all Select 50
union all Select 60
union all Select 70) T
ON e.deptid = T.deptid
WHERE T.deptid IS NULL … but I don’t think it will be an improvement over "NOT IN" or "deptid<>10 AND deptid<>20".
Hi Madhivanan,
quote:
Will Not exists make some improvement? Declare @t table(deptid int)Insert into @tSelect 10 union all Select 20 union all Select 30 union all Select 40 union all Select 50 union all Select 60 union all Select 70 SELECT sum(sal) FROM emp eWHERENOT exists(select deptid from @t where deptid=e.deptid)
Madhivanan Failing to plan is Planning to fail

I tried what you said but no improvement in the performance. Hi dtipton
quote:
How many departments do you have? Any chance you can change it to a IN list instead of a not in? What is the current execution plan?

I have so many departments and there is no chance to use IN. Actuavally it is not departments, I have given simple example only.It has proper indexes no problem with indexes. all are index seek only.My bother about this NOT IN clause. I tried with Adriaan advise, but still no use.No improvement in performance.

You say you want to get rid of NOT IN, because people say its bad, and should be avoided. But in your specific scenario, is it actually performing badly ?
There are rarely any sweeping statements you can make about SQL saying do this or dont do this.
It all depends on your exact situation.

I think Adriaan gave the most helpful recommendation. Either it is not possible to improve the query because not in fillters out only small percent of rows or apply the logic that produced that list of ids in the where clause.

Lesson #1 in database design: ask a real person who knows about the real-world situation. Q1: why are we excluding deptid’s 10, 20, 30 40, 50, 60 and 70 ?
Q2: is there any other column on the table that tells us the same ?
Q3: is the list of deptid’s that should be included perhaps shorter than the list of deptid’s that must be excluded ?
Q4: why would you not want a Dept detail table ?

]]>