SQL Server Performance Forum – Threads Archive

Group by & Distinct Function comparision

For the same output resultset if we fetch it through a distinct function and a group by function, performance wise which will be better.

Hi, Check the execution plans for both and see what you get. It depends on what you’re trying to do and why really.

quote:Originally posted by srinibash For the same output resultset if we fetch it through a distinct function and a group by function, performance wise which will be better.

Karl Grambow www.sqldbcontrol.com

It should be the same. Compare the execution plans: they should be identical.

But I doubt that for a table that has large number of records, group by may be better than Distinct as it unnecessarily order the Records by ASC Madhivanan Failing to plan is Planning to fail

quote:Originally posted by Madhivanan But I doubt that for a table that has large number of records, group by may be better than Distinct as it unnecessarily order the Records by ASC Madhivanan Failing to plan is Planning to fail

Incorrect, there is no automatic ordering for DISTINCT or GROUP BY, only if you add WITH CUBE or WITH ROLLUP to a GROUP BY clause.

Well. But DISTINCT always order the Records by ASC by default Madhivanan Failing to plan is Planning to fail

I think the optimiser semantically optimises both statement to an identical execution plan. Even if there is a sorting order without an ORDER BY, this is then only implementation defined and as such not guaranteed to work always this way. IIRC, MS changed their algorithm for GROUP BY between 6.5 and 7.0 While 6.5 returned an sorted resultset and many developers relied on this behaviour, the changed algorithm broke many existing code. As always, when you need an sorted resultset there’s no way around ORDER BY. —
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Heute schon gebloggt?http://www.insidesql.de/blogs

Madhivanan, I can confirm that I can run GROUP BY and DISTINCT queries for a single column from a single table where the results are not ordered. If the column that you are querying is not the first column of a clustered index, and not the first column of a composite primary key, then the results are just not ordered. This is valid for both GROUP BY and DISTINCT. CREATE TABLE #MyTable
(Ident INT IDENTITY(1, 1) PRIMARY KEY NONCLUSTERED, Col1 VARCHAR(1), Col2 VARCHAR(1))
CREATE UNIQUE NONCLUSTERED INDEX idx1 ON #MyTable (Col1, Col2) INSERT INTO #MyTable (Col1)
SELECT ‘a’ UNION SELECT ‘b’ UNION SELECT ‘c’ UNION SELECT ‘d’ UNION SELECT ‘e’ UNION SELECT ‘f’ UNION SELECT ‘g’ UPDATE #MyTable SET Col2 = ‘a’ INSERT INTO #MyTable (Col1, Col2)
SELECT T1.col1, T2.col2
FROM #MyTable T1
INNER JOIN
(SELECT ‘b’ AS Col2 UNION SELECT ‘c’ UNION SELECT ‘d’ UNION SELECT ‘e’ UNION SELECT ‘f’ UNION SELECT ‘g’) AS T2
ON T1.col1 = T2.Col2 INSERT INTO #MyTable (Col1, Col2)
SELECT DISTINCT T2.col2, T1.col1
FROM #MyTable T1
INNER JOIN
(SELECT ‘h’ AS Col2 UNION SELECT ‘i’ UNION SELECT ‘j’ UNION SELECT ‘k’ UNION SELECT ‘l’ UNION SELECT ‘m’) AS T2
ON T1.col1 <> T2.Col2 INSERT INTO #MyTable (Col1, Col2)
SELECT DISTINCT T2.col2, T1.col1
FROM #MyTable T1
INNER JOIN
(SELECT ‘n’ AS Col2 UNION SELECT ‘o’ UNION SELECT ‘p’ UNION SELECT ‘q’ UNION SELECT ‘r’ UNION SELECT ‘s’) AS T2
ON T1.col1 <> T2.Col2 INSERT INTO #MyTable (Col1, Col2)
SELECT DISTINCT T2.col2, T1.col1
FROM #MyTable T1
INNER JOIN
(SELECT ‘t’ AS Col2 UNION SELECT ‘u’ UNION SELECT ‘v’ UNION SELECT ‘w’ UNION SELECT ‘x’ UNION SELECT ‘y’ UNION SELECT ‘z’) AS T2
ON T1.col1 <> T2.Col2 INSERT INTO #MyTable (Col1, Col2)
SELECT DISTINCT T2.col2, T1.col1
FROM #MyTable T1
INNER JOIN
(SELECT ‘1’ AS Col2 UNION SELECT ‘2’ UNION SELECT ‘3’ UNION SELECT ‘4’ UNION SELECT ‘5’ UNION SELECT ‘6’ UNION SELECT ‘7’
UNION SELECT ‘8’ UNION SELECT ‘9’ UNION SELECT ‘0’) AS T2
ON T1.col1 <> T2.Col2 INSERT INTO #MyTable (Col1, Col2)
SELECT T1.col2, T1.col1
FROM #MyTable T1
WHERE NOT EXISTS (SELECT * FROM #MyTable T2 WHERE T2.Col1 = T1.Col2 AND T2.Col2 = T1.Col1) select Col1 from #MyTable group by col1
select Col2 from #MyTable group by col2 select distinct Col1 from #MyTable
select distinct Col2 from #MyTable drop table #MyTable

… and if the column IS the first column of a clustered index or a primary key, then SQL Server is reading data from the corresponding index, so there is no additional processing involved for the ordering.

Well Adriaan. I didnt aware on these points. Thanks [<img src=’/community/emoticons/emotion-1.gif’ alt=’

‘ />]<br /><br />Madhivanan<br /><br />Failing to plan is Planning to fail

]]>