The UDF Function Code
IF EXISTS (SELECT name FROM dbo.sysobjects WHERE id = Object_id(N'[dbo].[row_count]’) )
DROP FUNCTION [dbo].[row_count]
GO
CREATE FUNCTION dbo.row_count (@table_name sysname)
— @table_name we want to get count
RETURNS bigint
/*
——————————————————-
— Function Name: row_count
— Author: Mohamed Hassan
— Email: moh_hassan20@yahoo.com
— Development Date: 08/11/2008
— Version: 1.0
— Description: Return row count of the whole table, as a replacement for count(*) , give extra performance at least 70% over , than count(*) for large tables with millions of rows
— SQL Server: SQL server 2005
— Usage Example: select dbo.row_count (‘Sales.SalesOrderDetail’)
— Copyright:
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of the
License, or any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
——————————————————-
*/
AS
BEGIN
DECLARE @nn bigint — number of rows
IF @table_name IS NOT NULL
BEGIN
SELECT @nn = sum( p.rows )
FROM sys.partitions p
LEFT JOIN sys.allocation_units a ON p.partition_id = a.container_id
WHERE
p.index_id in(0,1) — 0 heap table , 1 table with clustered index
and p.rows is not null
and a.type = 1 — row-data only , not LOB
and p.object_id = object_id(@table_name)
END
RETURN (@nn)
END
GO
The function usage
The function row_count is called by passing the full name of the table schema.table name
Example1:
Select dbo.row_count (schema.[table name]), as in the following example:
use AdventureWorks
go
select dbo.row_count (‘Sales.SalesOrderDetail’)
The query analyzer return the result of that query with 121317 rows in the table, which is the same as count(*) but quicker and achieve the best performance than count(*).
Example 2:
SELECT top 5 TABLE_SCHEMA, TABLE_NAME, (TABLE_SCHEMA +’.’+TABLE_NAME) ‘Full table name’,
dbo.row_count(TABLE_SCHEMA +’.’+TABLE_NAME) rows
FROM INFORMATION_SCHEMA.TABLES
where TABLE_TYPE =’BASE TABLE’
ORDER BY rows desc
The result of the query is in the following table:
Schema |
Table Name |
Full Table Name |
rows |
Sales | SalesOrderDetail | Sales.SalesOrderDetail | 121317 |
Production | TransactionHistory | Production.TransactionHistory | 113443 |
Production | TransactionHistory | Production.TransactionHistoryArchive | 89253 |
Production | WorkOrder | Production.WorkOrder | 72591 |
Production | WorkOrderRouting | Production.WorkOrderRouting | 67131 |
The function is tested carefully by adding /deleting rows , truncating table , bulk insert /bulk delete , and return accurate results as given by count(*).
Performance Evaluation
A batch is executed to compare the performance of count(*) and UDF row_count ,as in the following code:
select dbo.row_count (‘Sales.SalesOrderDetail’)
go
select count (*) from Sales.SalesOrderDetail
go
When the execution plan is reviewed, we found the first query (row_count) cost relative to the batch is 7% and estimated subtree cost is 0.03 as in the following figure:
The second query (count(*)) cost relative to the batch is 93% and estimated subtree cost is 0.37 as in the following figure:
The performance of the UDF function row_count () is the best, and consume low resources of the database. The cost ratio between count () and the new UDF “row_count is” is 93: 7, which means that performance of row_count is better ten times more than count (*).
These numbers may vary based on the rows in the table, but in general it gives an indication how the new UDF is better than Count (*).
The UDF function can be efficient in the case you need to get the count of the whole table.
Conclusion
The built-in function count (*) is very consuming time to get the count of the whole table, especially in tables with millions of row.
The new method of the user defined function “row_count” which depends on Object Catalog Views give accurate results and quicker and achieve best performance than count(*).
This function can be used in SQL Server 2005, because these new views are available only on these versions.
]]>