Speeding up the Performance of Table Counts in SQL Server 2005

The UDF Function Code
IF EXISTS (SELECT name FROM dbo.sysobjects WHERE id = Object_id(N'[dbo].[row_count]’) )
DROP FUNCTION [dbo].[row_count]
GO

 

 

 

 

 

CREATE FUNCTION dbo.row_count (@table_name sysname)
— @table_name we want to get count

RETURNS bigint
/*
——————————————————-
— Function Name: row_count
— Author: Mohamed Hassan
— Email: moh_hassan20@yahoo.com
— Development Date: 08/11/2008
— Version: 1.0
— Description: Return row count of the whole table, as a replacement for count(*) , give extra performance  at least 70% over , than count(*) for large tables with millions of rows
— SQL Server: SQL server 2005
— Usage Example: select dbo.row_count (‘Sales.SalesOrderDetail’)
— Copyright:
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of the
License, or any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
——————————————————-
*/
AS
BEGIN
DECLARE @nn bigint — number of rows

IF @table_name IS NOT NULL 
BEGIN
 SELECT  @nn = sum(  p.rows )  
 FROM sys.partitions p
 LEFT JOIN  sys.allocation_units a ON  p.partition_id = a.container_id
 WHERE
  p.index_id  in(0,1) — 0 heap table , 1 table with clustered index
       and p.rows is not null
  and a.type = 1  — row-data only , not LOB
  and p.object_id = object_id(@table_name)    
 END 
RETURN (@nn)
END
GO

The function usage
The function row_count is called by passing the full name of the table schema.table name

Example1:
Select dbo.row_count (schema.[table name]), as in the following example:
use AdventureWorks
go
select dbo.row_count (‘Sales.SalesOrderDetail’)

The query analyzer return  the result of that query with 121317 rows in the table, which is the same as count(*) but quicker and achieve the best performance than count(*).

Example 2:
SELECT top 5 TABLE_SCHEMA, TABLE_NAME, (TABLE_SCHEMA +’.’+TABLE_NAME) ‘Full table name’,
dbo.row_count(TABLE_SCHEMA +’.’+TABLE_NAME) rows
FROM INFORMATION_SCHEMA.TABLES
where TABLE_TYPE =’BASE TABLE’
ORDER BY rows desc

The result of the query is in the following table:

Schema

Table Name

Full Table Name

rows 

Sales SalesOrderDetail Sales.SalesOrderDetail  121317
Production  TransactionHistory  Production.TransactionHistory 113443
Production  TransactionHistory  Production.TransactionHistoryArchive 89253
Production WorkOrder Production.WorkOrder 72591
Production WorkOrderRouting Production.WorkOrderRouting 67131

The function is tested carefully by adding /deleting rows , truncating table , bulk insert /bulk delete , and return accurate results as given by count(*).

Performance Evaluation
A  batch is executed to compare the performance of count(*) and UDF row_count ,as in  the following code:

select dbo.row_count (‘Sales.SalesOrderDetail’)
go
select count (*) from Sales.SalesOrderDetail
go

When the execution plan is reviewed, we found the first query (row_count) cost relative to the batch is 7%  and estimated subtree cost is 0.03 as in the following figure:

 

The second  query (count(*)) cost relative to the batch is 93%  and estimated subtree cost is 0.37 as in the following figure:

 

The performance of the UDF function row_count () is the best, and consume low resources of the database. The cost ratio between count () and the new UDF “row_count is” is 93: 7, which means that performance of row_count is better ten times more than count (*).

These numbers may vary based on the rows in the table, but in general it gives an indication how the new UDF is better than Count (*).

The UDF function can be efficient in the case you need to get the count of the whole table.

Conclusion
The built-in function count (*) is very consuming time to get the count of the whole table, especially in tables with millions of row.

The new method  of the user defined function “row_count”  which depends on Object Catalog Views  give accurate results and quicker and achieve best performance than count(*).

This function can be used in SQL Server 2005, because these new views are available only on these versions.

]]>

Leave a comment

Your email address will not be published.