Site sponsored by: Idera Try Idera’s new SQL admin toolset
SQL Server Performance

  • Home
  • Articles
  • Forums
  • Tips
  • Quiz
  • FAQ's
  • Blogs
  • Software
  • Books
  • About Us
RSS Feeds
Sign in | Join


Article Topics

All Articles
Peformance Tuning
Audit
Business Intelligence
Clustering
Reporting Services
Developer
General DBA
ASP.NET / ADO.NET

SQL Server 2008 - Worth the Wait

SQL Server’s first significant upgrade in three years features a number of envelope-pushing enhancements and improvements. Which will have the greatest impact on SQL administration and development? More...
Latest Articles

Slowly Changing Dimensions in SQL Server 2005
Audit Data Modifications
SQL Server 2008’s Management Data Warehouse
Same Report but Different Methods in SQL Server Reporting Services ...

More     
 
Latest FAQ's

How to Integrate Performance Monitor and SQL Profiler
SSIS Lookups are Case Sensitive
Convert Number to Words in SSRS
After installing SP2 on SQL Server 2005 x64, when trying to ...

More     
   
Latest Software Reviews

SQL Server DBA Dashboard
SwisSQL DBChangeManager
SQLMesh - SQL Server Search Tool
SoftTreeTech SQL Assistant

More     

articles >> peformance tuning >> Are UDFs Harmful to SQL Server Performance? ...

Are UDFs Harmful to SQL Server Performance?

By : Matthew L. Wigdahl
Sep 07, 2005
Printer friendly

Well, not really. But it is vitally important to understand when to use them and when to avoid them, as well as how best to construct them so as not to shoot your foot off.

SQL Server 2000 introduced user-defined functions (UDFs), and they were immediately hailed as a great tool for encapsulating repetitive code, as well as allowing you to perform more complicated processing directly in an SQL expression. On its face, that claim is valid. You can certainly improve readability and maintainability with UDFs. But cleaner code will be cold comfort if your queries bog down and lock up your server.

As almost every article on UDFs will tell you, there are three different flavors of UDF in SQL Server 2000: scalar, tabular, and multistatement tabular. I will concentrate on the dangers of scalar UDFs, as they are the most likely to be used innocently in ways that can compromise performance.

A scalar UDF takes 0 to n parameters and returns a single scalar value. Often, this type of function is used to perform complicated mathematical transformations on input data. Other common uses include encapsulating a very complicated logical test so that a WHERE clause can be shortened and repeated uses can be consolidated into one place. There are various limitations on what types of operations you can perform in the function, but you have the ability to: call other internal functions, UDFs and external stored procedures; use the SELECT statement; and use looping and branching constructs. This is a lot of power, even restricted as it is.

Here’s an example you can try in a SQL Server 2000 database to illustrate how a scalar UDF is created and invoked:

CREATE FUNCTION fn_null(@id int)
RETURNS int
BEGIN
	RETURN @id
END
GO
	
CREATE TABLE #Tmp
(
	id int NOT NULL PRIMARY KEY,
	value int NOT NULL
)
GO
	
DECLARE @i int
SET @i = 1
	
WHILE (@i < 1000)
BEGIN
	INSERT INTO #Tmp (id, value) VALUES (@i, @i+1000)
	SET @i = @i + 1
END
GO
	
SELECT COUNT(*)
FROM #Tmp t1
	CROSS JOIN #Tmp t2
WHERE t1.id < t2.id
GO
	
SELECT COUNT(*)
FROM #Tmp t1
	CROSS JOIN #Tmp t2
WHERE dbo.fn_null(t1.id) < dbo.fn_null(t2.id)
GO
	
SELECT COUNT(*)
FROM #Tmp t1
	CROSS JOIN #Tmp t2
WHERE t1.id = 5
GO
	
SELECT COUNT(*)
FROM #Tmp t1
	CROSS JOIN #Tmp t2
WHERE t1.id = dbo.fn_null(5)
GO
	
SELECT COUNT(*)
FROM #Tmp t1
	CROSS JOIN #Tmp t2
WHERE dbo.fn_null(t1.id) = 5
GO
		    

This large mass of SQL will create a simple UDF, create a temporary table, populate the table, and invoke several sample queries against it. When run with the SQL Server Profiler active, the results are interesting:

Query CPU (ms) Reads Duration (ms)
1 235 3486 236
2 107265 5991103 111153
3 0 58 0
4 0 72 16
5 63 3076 63

The first query is a simple triangular join and serves as a benchmark. It took about a quarter of a second to execute, which doesn’t seem unreasonable. The second query is passed through two executions of the fn_null() UDF for every one of the million rows returned by the CROSS JOIN. This second query takes around 450 times more CPU power to execute than the first one does, for the exact same resultset. It really isn’t possible to make a UDF that is simpler than this, so it should be quite clear that UDFs that are invoked with nonconstant parameters are a potential source of trouble.

Looking at what happened in more detail: if the optimizer is not confused, you will execute a UDF that appears in the WHERE clause once for every potential record returned by the join that is not eliminated by the sargable predicates in the WHERE clause.

What this means is that if you are able to specify enough sargable predicates to substantially reduce the number of records before the UDFs are evaluated, the UDFs will degrade your performance less. If you cannot (and in the second example query there was no sargable predicate in the WHERE clause) you should be prepared for potentially extreme performance degradation.

Queries 3, 4, and 5 illustrate that the SQL Server optimizer is not completely hopeless, and that a UDF that causes problems when invoked with nonconstant parameters may perform acceptably when invoked with constant parameters. In this case, Query 3 is the benchmark. There is no UDF involved in the WHERE clause, and it performs near-instantaneously. Query 4 introduces the UDF, but with a constant parameter. Performance is the nearly the same as the version without the UDF. For comparison, Query 5 contains the same function, but the parameter is now nonconstant. As you can see, performance is significantly degraded compared to the original version.

Of course, this UDF is about as simple as it gets. If we complicate the UDF a bit such that the optimizer has a tougher time with it, we will see different results:

CREATE FUNCTION fn_null(@id int)
RETURNS int
BEGIN
	DECLARE @val int
	SELECT @val = COUNT(*) FROM Tmp WHERE id != @id
	RETURN @val + 7
END
GO
	
CREATE TABLE Tmp
(
	id int NOT NULL PRIMARY KEY,
	value int NOT NULL
)
GO
	
DECLARE @i int
SET @i = 1
	
WHILE (@i < 1000)
BEGIN
	INSERT INTO Tmp (id, value) VALUES (@i, @i+1000)
	SET @i = @i + 1
END
GO
	
SELECT COUNT(*)
FROM Tmp t1
	CROSS JOIN Tmp t2
WHERE t2.value != 1005
GO
	
SELECT COUNT(*)
FROM Tmp t1
	CROSS JOIN Tmp t2
WHERE t2.value != dbo.fn_null(1005)
GO
		    

This example is pretty contrived, and notice that I had to make the table a first-class table rather than a temporary table. This is because one of the restrictions on UDFs is that you cannot access temporary tables. The results of this query are:

Query CPU (ms) Reads Duration (ms)
1 0 62 0
2 1281 26089 1296

Note that it would be trivial in the case of Query 2 to declare a variable, populate it with the results of your UDF invoked with the constant parameter, and then use the variable in the comparison. This would achieve performance equivalent to Query 1.

Even though the fn_null() function includes a SELECT statement in this case, it is possible that even a function without a SELECT will be unoptimizable and will degrade your performance in a similar manner.

SQL Server user-defined functions are powerful tools, but as with all powerful tools they can be misused. Make sure you examine the performance of your queries closely with the Profiler and the Show Execution Plan option in Query Analyzer, maintain sargability and optimizability, and you will be able to reap the benefits of UDFs without suffering the drawbacks.



Comments:
Your Name  
Email    
(Emails will not be displayed on the site or used for promotional purposes)
Comment  


Type characters in the image
 
 (case sensitive)

 
 
 

        








Home | Peformance Articles | Audit Articles | Business Intelligence Articles | Clustering Articles | Developer Articles | Reporting Services Articles | DBA Articles | ASP.NET / ADO.NET Articles | DBA FAQ's | Developer Peformance FAQ's | DBA Peformance FAQ's | Developer FAQ's | Clustering FAQ's | Error Messages | Audit Tool Reviews | Backup Tool Reviews | Coding Tool Reviews | Compare Tool Reviews | Documentation Tool Reviews | Design Tool Reviews | Monitoring Tool Reviews | Log Tool Reviews | Reporting Tool Reviews | Clustering Tool Reviews | Security Tool Reviews | Change Management Tool Reviews | Remote Access Tool Reviews | Book Reviews | Security Tool Reviews | QDPMA Performance Tuning | ADO.NET / ASP.NET | Administration | Analysis/OLAP Services | Application Development | Configuration | Components | ETL | Hardware | High Availability | Hints | Index | Misc | Operating Systems | Performance Tuning | Replication | T-SQL | Views