Need query help – current version very slow | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Need query help – current version very slow

OK, a bit of background on the current design of the database: Subscribers
———–
SubscriberID int (primary key)
ListID int (foreign key) SubscriberFieldData
——————-
DataID int (primary key)
FieldID int (foreign key)
SubscriberID int (foreign key)
OptionID int (foreign key) Basically, the db stores subscribers, who are grouped together in a list. This list can have custom fields (stored in another table), and the data for these fields is stored in the above table. Here’s an example of what the data would look like: DataID, FieldID, SubscriberID, OptionID
5, 55, 610, 12
6, 67, 610, 44
7, 55, 611, 12
8, 67, 611, 44
9, 55, 612, 25
10, 67, 612, 44 I would like to be able to create a query that gives me all subcribers that have:
– an OptionID of 12 when the FieldID is 55
(and)
– an OptionID of 44 when the fieldID is 67 So the results from this should be subscriber 610 and 611. Subscriber 612 doesn’t match because they have an OptionID of 25 for FieldID 55, when 12 is required. I’ve been able to get this query working using a something like this: SELECT COUNT(s.SubscriberID) FROM
SU_ActiveSubscribers s
WHERE
s.ListID = 123 AND
(s.SubscriberID IN (SELECT sfd.SubscriberID FROM SU_SubscriberFieldData sfd WHERE FieldID = 55 AND OptionID = 12)) AND
(s.SubscriberID IN (SELECT sfd.SubscriberID FROM SU_SubscriberFieldData sfd WHERE FieldID = 67 AND OptionID = 44)) But this is obviously really slow once I start working with decent amounts of data. Any ideas for a better solution that’s going to play well with a large number of rows?
Instead of the IN I’m using an inner join which is much faster (thanks for the tip Srinika), but I’d still luck to improve the performance if there is a better way to get the same results. For the example above, I’m using a query like this: SELECT COUNT(DISTINCT s.SubscriberID) FROM
ActiveSubscribers s WITH (NOLOCK)
INNER JOIN (SELECT DISTINCT sfd.SubscriberID FROM SubscriberFieldData sfd WHERE FieldID = 55 AND OptionID = 12) s1 ON s.SubscriberID = s1.SubscriberID
INNER JOIN (SELECT DISTINCT sfd.SubscriberID FROM SubscriberFieldData sfd WHERE FieldID = 67 AND OptionID = 44) s2 ON s.SubscriberID = s2.SubscriberID
WHERE
s.ListID = 123
try this too: if it improves SELECT COUNT(DISTINCT s.SubscriberID) FROM
ActiveSubscribers s WITH (NOLOCK)
INNER JOIN (SELECT DISTINCT sfd.SubscriberID FROM SubscriberFieldData sfd WHERE (FieldID = 55 AND OptionID = 12) or (FieldID = 67 AND OptionID = 44)) s1 ON s.SubscriberID = s1.SubscriberID
WHERE
s.ListID = 123 finally check for ITW

also you can try SELECT COUNT(DISTINCT s.SubscriberID)
FROM
ActiveSubscribers s WITH (NOLOCK)
INNER JOIN SubscriberFieldData sfd
on s.SubscriberID = s1.SubscriberID
WHERE
s.ListID = 123
and FieldID in ( 55, 67 )
and OptionID in ( 12, 44 )
and ( (FieldID = 55 AND OptionID = 12)
or (FieldID = 67 AND OptionID = 44) )
index on sfd( fieldid, optionid, subscriberid) reverse fieldid and optionid to put the most selective one first
and one on s( listid, subscriberid ) or reversed depending on selectivity of listid NOTE the in statements are there to give SQL a better chance at optimising the query, you could also try without it and see how sql copes with the or condition by itself Cheers
Twan
Thanks guys – I’ll try those when I get back into the office tommorrow, but they both look much better than my solutions.
I’ve been looking over those 2 queries and I don’t think they are going to work because of the OR. Remember, the results I’m looking for are: – an OptionID of 12 when the FieldID is 55
(and)
– an OptionID of 44 when the fieldID is 67 So both of those have to match. With these 2 queries, I all the results when only 1 of the criteria matches. The index worked very well Twan, and it’s speed up my current query quite significantly, but I’m still hoping there is a way that I don’t have to do multiple inner joins for each extra rule like my current query?
Ah, the usual problem with the concept of "OR". A column on a single row can have only ONE value at a time (we are talking about a normalized table, right?), not two – so you have to use OR, not AND. Your instinct says that "I want OptionId = 12 and I want OptionId = 44" is the same as "where OptionId = 12 and OptionId = 44" – but you will not find a row where OptionId is 12 and 44 at the same time. That’s where the OR word comes in. Twan’s query really is the correct one for the criteria – well, except that you can drop the IN clauses, which are already covered by the last part of the criteria: WHERE s.ListID = 123
and ( (FieldID = 55 AND OptionID = 12)
or (FieldID = 67 AND OptionID = 44) )
Hi ya, yep Adriaan is quite right and the in part of the query that I suggest was only in case SQLServer won’t optimise the query properly for the OR condition on two fields Cheers
Twan
Thanks Adriaan, but I still think there’s a problem with the query. Having another look at this: WHERE s.ListID = 123
and ( (FieldID = 55 AND OptionID = 12)
or (FieldID = 67 AND OptionID = 44) ) Now, looking at my sample data again: DataID, FieldID, SubscriberID, OptionID
5, 55, 610, 12
6, 67, 610, 44
7, 55, 611, 12
8, 67, 611, 44
9, 55, 612, 25
10, 67, 612, 44 Won’t subscriber 612 be returned, because they have a fieldID of 55 and optionID of 12 (which they don’t) OR a fieldID of 67 and optionID of 44 (which they do)? So subscriber 612 is being returned when 1 of the critera matches, but we only want subscribers where both critera match.
Hi ya, in that case the query needs to be modified slightly, sorry I didn’t spot that originally in the query you gave.
Code:
SELECT COUNT(DISTINCT s.SubscriberID)
FROM   ActiveSubscribers s WITH (NOLOCK)
INNER JOIN SubscriberFieldData sf1      on s.SubscriberID = sf1.SubscriberID AND    sf1.FieldID = 55 AND sf1.OptionID = 12
INNER JOIN SubscriberFieldData sf2      on s.SubscriberID = sf2.SubscriberID
AND    sf2.FieldID = 67 AND sf2.OptionID = 44
[code]
you could play around with changing the query slightly for performance
[code]
SELECT COUNT(DISTINCT sf1.SubscriberID)
FROM   SubscriberFieldData sf1  WITH (NOLOCK)
INNER JOIN SubscriberFieldData sf2    on sf1.SubscriberID = sf2.SubscriberID
AND    sf2.FieldID = 67 AND sf2.OptionID = 44
INNER JOIN  ActiveSubscribers s       on sf1.SubscriberID = s.SubscriberID
WHERE  sf1.FieldID = 55 AND sf1.OptionID = 12
[code]
Cheers
Twan 

]]>