Speeding up the Performance of Table Counts in SQL Server 2005 – Page 2

The UDF Function Code
IF EXISTS (SELECT name FROM dbo.sysobjects WHERE id = Object_id(N'[dbo].[row_count]’) )
DROP FUNCTION [dbo].[row_count]
GO

CREATE FUNCTION dbo.row_count (@table_name sysname)
— @table_name we want to get count

RETURNS bigint
/*
——————————————————-
— Function Name: row_count
— Author: Mohamed Hassan
— Email: moh_hassan20@yahoo.com
— Development Date: 08/11/2008
— Version: 1.0
— Description: Return row count of the whole table, as a replacement for count(*) , give extra performance at least 70% over , than count(*) for large tables with millions of rows
— SQL Server: SQL server 2005
— Usage Example: select dbo.row_count (‘Sales.SalesOrderDetail’)
— Copyright:
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of the
License, or any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
——————————————————-
*/
AS
BEGIN
DECLARE @nn bigint — number of rows

IF @table_name IS NOT NULL
BEGIN
SELECT @nn = sum( p.rows )
FROM sys.partitions p
LEFT JOIN sys.allocation_units a ON p.partition_id = a.container_id
WHERE
p.index_id in(0,1) — 0 heap table , 1 table with clustered index
       and p.rows is not null
and a.type = 1 — row-data only , not LOB
and p.object_id = object_id(@table_name)
END
RETURN (@nn)
END
GO

The function usage
The function row_count is called by passing the full name of the table schema.table name

Example1:
Select dbo.row_count (schema.[table name]), as in the following example:
use AdventureWorks
go
select dbo.row_count (‘Sales.SalesOrderDetail’)

The query analyzer return the result of that query with 121317 rows in the table, which is the same as count(*) but quicker and achieve the best performance than count(*).

Example 2:
SELECT top 5 TABLE_SCHEMA, TABLE_NAME, (TABLE_SCHEMA +’.’+TABLE_NAME) ‘Full table name’,
dbo.row_count(TABLE_SCHEMA +’.’+TABLE_NAME) rows
FROM INFORMATION_SCHEMA.TABLES
where TABLE_TYPE =’BASE TABLE’
ORDER BY rows desc

The result of the query is in the following table:

Schema	Table Name	Full Table Name	rows
Sales	SalesOrderDetail	Sales.SalesOrderDetail	121317
Production	TransactionHistory	Production.TransactionHistory	113443
Production	TransactionHistory	Production.TransactionHistoryArchive	89253
Production	WorkOrder	Production.WorkOrder	72591
Production	WorkOrderRouting	Production.WorkOrderRouting	67131

The function is tested carefully by adding /deleting rows , truncating table , bulk insert /bulk delete , and return accurate results as given by count(*).

Performance Evaluation
A batch is executed to compare the performance of count(*) and UDF row_count ,as in the following code:

select dbo.row_count (‘Sales.SalesOrderDetail’)
go
select count (*) from Sales.SalesOrderDetail
go

When the execution plan is reviewed, we found the first query (row_count) cost relative to the batch is 7% and estimated subtree cost is 0.03 as in the following figure:

The second query (count(*)) cost relative to the batch is 93% and estimated subtree cost is 0.37 as in the following figure:

The performance of the UDF function row_count () is the best, and consume low resources of the database. The cost ratio between count () and the new UDF “row_count is” is 93: 7, which means that performance of row_count is better ten times more than count (*).

These numbers may vary based on the rows in the table, but in general it gives an indication how the new UDF is better than Count (*).

The UDF function can be efficient in the case you need to get the count of the whole table.

Conclusion
The built-in function count (*) is very consuming time to get the count of the whole table, especially in tables with millions of row.

The new method of the user defined function “row_count” which depends on Object Catalog Views give accurate results and quicker and achieve best performance than count(*).

This function can be used in SQL Server 2005, because these new views are available only on these versions.

]]>

Leave a comment Cancel reply