Performance Impact of Using SELECT COUNT(*) In Queries

I am not sure how this statement came into common usage, however, it seems to be regularly used by developers. Count(*) will use a table scan or Clustered index scan and so will eventually impair performance.

In the case of a large table, a table scan or clustered index scan will impair your query performance, but let’s examine whether count(*) will always scan through the entire table.

Take for example the Sales.Customer table in the AdventureWorks2012 database.

It has four indexes :

sp_helpindex '[Sales].[Customer]' will give you the index properties of the table.

From running, DBCC IND, the following are the page counts for each index.

DBCC IND ('Adventureworks2012', '[Sales].[Customer]', 1);

Index Id

Index Name

Column

Data Type

Page Count

1

PK_Customer_CustomerID

CustomerID

Int

123

2

AK_Customer_rowguid

rowguid

uniqueidentifier

66

3

AK_Customer_AccountNumber

AccountNumber

Varchar(10)

61

5

IX_Customer_TerritoryID

TerritoryID

int

37

Since PK_Customer_CustomerID is the clustered index, it contains the all the pages for the table which has the highest count. Of the other three indexes, IX_Customer_TerritoryID is the smallest index in terms of the number of pages.

Therefore, it should be much faster to use the IX_Customer_TerritoryID index.  Let us see the query plan for COUNT(*) .

So as expected, Count(*) is using least costly index which is the IX_Customer_TerritoryID not the Clustered Index Scan.

There is another important issue to bear in mind. Let us say, by the time you query COUNT(*), all the pages of the table are loaded into the cache. So rather than using the non-clustered index, it may be better to use the Clustered Index since they are already in the cache. However, this is not the case for SQL Server. No matter what is in the cache,  SQL Server will still evaluate using the least costly mechanism for COUNT(*).

DBCC DROPCLEANBUFFERS -- Clean the Cache
SET STATISTICS IO ON -- to display query stats
SELECT * FROM [Sales].[Customer]
SELECT Count(*) FROM [Sales].[Customer]

Let us verify the query plan.

This has not changed despite the all pages are in the cache.Note the query stats :

You can see , it has used physical reads which is the non-clustered index. Thus, when using COUNT(*) verify whether correct indexes are being used and do not believe that COUNT(*) will use Clustered index.

]]>

Leave a comment

Your email address will not be published.