SQL Server T-SQL Tuning -TVF and Scalar Functions
In the first of this series we looked at TSQL tuning for temp tables, table variables and the Union command , we now turn our attention to User Defined Functions – either Table-Valued Functions or Scalar Functions.
A UDF is very convenient for centralising business logic as we can specify a set of business logic in one UDF which references multiple stored procedures and ad-hoc queries. However, they can lead to significant performance degradation due to their demands on the CPU
Table-Valued Functions (TVF)
As a general concept, when using a TVF an inner join with an object without sufficient indexing will result in a TVF scan which is very similar to an Index Scan or a Table Scan.
As an alternative we could create temp tables with the appropriate clustered indexes or non-clustered indexes as below:
- Create the appropriate temp tables.
- Create sufficient clustered and non clustered according to your T-SQL Query.
- Insert all data coming from the TVF into temp tables.
- Replace each TVF by this temp table and all relevant columns as well.
- Delete the temp tables at the end of the query.
Note the performance advantage of temp tables over table variables in this scenario since (as noted in the previous article) table variables do not support indexes.
Examples : :
Create a TVF :
use [Workshops] go alter FUNCTION Dep_Salaries1 ( @empid int ) RETURNS @table table ( Department int, Salary_Max int, Salary_Min int ) AS BEGIN declare @Department int = (select S.deptid from Employees s where s.empid=@empid) insert into @table SELECT S.deptid , max (Salary) , MIN(Salary) FROM Employees s inner join Departments T ON S.deptid =T.deptid group by S.deptid having S.deptid =@Department RETURN END GO
Poor T-SQL Practice1 using a TVF :
alter procedure Unperformant_SP1 @empid int as begin select T.deptid as department_name , s.* from Dep_Salaries1 (@empid )S inner join Departments T ON S.Department =T.deptid end
Replacing TVF using Temp tables :
go alter procedure Performant_SP1 @empid int as begin create table #table ( Department int, Salary_Max int, Salary_Min int ) create clustered index #table_index1 on #table (Department) insert into #table select * from Dep_Salaries1 (@empid ) select T.deptid as department_name , s.* from #table S inner join Departments T ON S.Department =T.deptid end
You should still performance test both approaches with your specific queries and data to determine which the best for your case.
Scalar Functions Within Selected Columns
Scalar functions are very convenient for determining aggregative values ,cumulative values and differentiated values within stored procedures or ad-hoc queries, but they come with a performance hit especially when used with large volumes of data since a scalar function will be executed for each record.
Alternatives to Scalar Functions
Alternatives to Scalar Functions
Temp tables can be used, although the situation is a little different from the TVF scenario as we are looking to dispense entirely with the scalar function and instead use their internal T-SQL code directly.
Persisted deterministic computed columns
Persisted deterministic computed column values are not recomputed every time they are selected but are computed one time when they are created. Thus they can add a distinct performance advantage for T-SQL queries as they reduce the processing overhead.
This feature can be added by following the below steps
- Adding a new computed column to store the results of scalar function.
- Enabling the persisted feature of this computed column.
- Place the appropriate indexes on the column either as key column or include column.
But bear in your mind that persisted feature has some restrictive limitations such as:
- Computed columns should not to use any aggregate functions of other records.
- Computed columns should not use functions that call external system procedures.
- Computed columns should not use any functions of other fields of other tables.
- It is generally better to use System provided functions such as Convert , Cast ,Replace etc and not UDFs created by developers since UDFs often contradict the deterministic feature.
This only applies to the persisted feature , but to able to add an index on these computed columns , they should
be also Precise columns by making sure they are calculated of precise data types like Int ,Bigint and
Datetime and not imprecise columns such as decimal types. If the datatypes are not precise ,you can
just add these columns to the ‘include columns’ part of the index and not ‘keys column’ part.
Use Scheduled Update Jobs
If it is not possible to use persisted deterministic computed columns, you can create normal columns and create scheduled update jobs on them to update them with the output of scalar functions and then use these columns in your T-SQL query instead of a scalar function as below:
Create a Scalar Function :
use [Workshops] go create FUNCTION Salary_Tax ( @empid int ) RETURNS float AS BEGIN declare @salary int = (select (S.salary-100) from Employees s where s.empid=@empid) RETURN @salary END GO
--Poor T-SQL Practice using the Scalar function Select empid ,dbo.Salary_Tax (empid) as 'SalaryWithTax' from Employees
Replace a Scalar function with a Temp table :
Create Table #temp (Empid int primary key clustered , Salary_Tax float) Create nonclustered index #temp_Index1 on #temp (Empid ) include (Salary_Tax ) insert into #temp select Empid ,(Salary-100) as salary_Tax from Employees select * from #temp
Replace a Scalar function using persisted deterministic computed columns :
ALTER TABLE dbo.Employees ADD Salary_Tax AS Salary-100 PERSISTED Create nonclustered index Employees_Index1 on Employees (Empid, Salary_Tax ) select empid ,Salary_Tax from Employees
Replace a Scalar function using Scheduled jobs :
ALTER TABLE dbo.Employees ADD Salary_Tax1 float, update_flag bit ALTER TABLE dbo.Employees ADD CONSTRAINT DF_Employees_update_flag DEFAULT 0 FOR update_flag
Schedule the below DML update by an appropriate frequency according to your workload
Update Employees set Salary_Tax1=Salary-100 WHERE UPDATE_Flag=0
Then you can include the below select query within your stored procedure.
select empid , Salary_Tax1 from Employees
In the next article of this series we will look at optimizing Not In and Not exists in T-SQL Queries