SQL Server Performance Forum – Threads Archive
Problem sorting out a complicated recordsetHello all!<br /><br />If anyone has a few seconds to spare, I would be grateful if you could read through my problem <img src=’/community/emoticons/emotion-1.gif’ alt=’‘ /><br /><br />I have thousands of cars in a table called CARS<br />This table contains:<br />- 50 Blue Cars,<br />- 300 Red Cars<br />- 150 Green Cars,<br />- 100 Purple Cars,<br />- 150 Yellow Cars<br />- 100 Gold Cars.<br /><br />What I need is to obtain a recordset with:<br />1 – The Golden cars first<br />2 – The Blue, Red and Green cars, at equal frequency (R, G, B, R, G, B, R, G, B, …) and respecting an ORDER BY clause<br />(once there are no Blue Cars left, R, G, R, G, R, G and once there are no Green Cars left, R, R, R, R….)<br />3 – The Purple and Yellow Cars at equal frequency.<br /><br />I have a column called IMPORTANCE, which is an integer [0-100], and up to now, all I did was a:<br />"SELECT TOP 200 * FROM CARS (WHERE CLAUSE) ORDER BY IMPORTANCE DESC" and cache 10 20-result pages.<br />But since Blue, Red and Green all had the same IMPORTANCE, the order in which they would be displayed was totally random. Now I have to treat equally all cars that have the same importance, which means I can’t TOP 200 anymore, I need to get all records and shape up the recordset so that I show each color once until I went though all colors and go aruond again.<br /><br />Of course, I can start by testing if there are at least 200 Gold Cars, in which case I don’t have to worry about the rest <img src=’/community/emoticons/emotion-1.gif’ alt=’‘ /><br /><br />So the question is, how much of all this can I do SQL Server side ?<br />I know I could do most of the processing in my component but that would mean retrieving every single record first, and that’s what I’m trying to avoid because this kind of query is called upon many times each minute.<br /><br />Is it possible to get most of this done SQL Server side?<br /><br />Thanks,<br /><br />Thomas.
Hi Thomas, I’m not sure it’s worth the effort, but hey …. You’ll need to build a routine that assigns an order number starting from 1 for Gold, increasing by 1, and that will store the order number in a temporary table along with the unique key for the record in your CARS table. Next you add the Blue, Red and Green cars. You’ll need to know how many Blue cars, Red cars and Green cars you have. The starting point for the second group is the highest order number for Gold. For Blue, add 1 to MAX(Gold) and increase by 3, but if the current record is number n for Blue and the total number for either Red or Green is smaller than n, increase by 2 instead of 3, and if both Red and Green are smaller than n then increase by one. For Red, add 2 to MAX(Gold) and increase by 3, with the same increment rules like for Blue, with regards to Blue and Green. For Green, add 3 to MAX(Gold) and increase by 3, with the same increment rules like for Blue, with regards to Blue and Red. You should be able to figure out the rest from there …
It would be possible to do but you would need to use a cursor which hurts performance. Why don’t you return multiple recordset for each color and if you are using ADO.Net return them all into one Dataset and work with it there parsing the DataTable Collection.
Not sure you can avoid using a cursor for this type of sorting. You could pull different datasets into the client application, and just add one record from each dataset in turn. The key factor is that you need to have an order numbering within each color, and then calculate from varying starting points. Let’s try again within T-SQL: You have @max_gold (same as count_gold) as the starting point, and you also need @count_blue, @count_red and @count_green upfront. Create a temp table for each color, with an identity field — IDENTITY(1,1) — and insert your car_id field for the CAR records with that color. SELECT CAR.car_id, <calculated expression>
FROM CAR INNER JOIN temp_blue ON CAR.car_id = temp_blue.car_id <calculated expression> should go like this for blue:
@max_gold + temp_blue.id + (CASE WHEN @count_red < temp_blue.id THEN -1 END) + (CASE WHEN @count_green < temp_blue.id THEN -1 END) Next, UNION SELECT joining on temp_red table with expression:
@max_gold + temp_blue.id + 1 + (CASE WHEN @count_blue < temp_red.id THEN -1 END) + (CASE WHEN @count_green < temp_red.id THEN -1 END) Next, UNION SELECT joining on temp_green table with expression:
@max_gold + temp_blue.id + 2 + (CASE WHEN @count_blue < temp_green.id THEN -1 END) + (CASE WHEN @count_blue < temp_green.id THEN -1 END) Use this whole SELECT … UNION SELECT … UNION SELECT … thing as a derived table, and SELECT from that, ordering by the calculated expression. Okay, the calculations may need some double-checking, but as long as you’re testing …
Or easier still: SELECT CAST(car_id AS INT) AS car_id, IDENTITY(int, 1,3) AS order_id
WHERE CARS.Color = ‘blue’ SELECT CAST(car_id AS INT) AS car_id, IDENTITY(int, 2,3) AS order_id
WHERE CARS.Color = ‘red’ SELECT CAST(car_id AS INT) AS car_id, IDENTITY(int, 3,3) AS order_id
WHERE CARS.Color = ‘green’ SELECT CARS.* FROM
(SELECT car_id, order_id FROM #temp_blue
UNION SELECT car_id, order_id FROM #temp_red
UNION SELECT car_id, order_id FROM #temp_green) AS TMP
INNER JOIN CARS ON TMP.car_id = CARS.car_id
ORDER BY TMP.order_id
Thank you all, I’ll start testing this right away and keep you posted. Have a nice day, Thomas.