Unicode Compression in SQL Server 2008 R2

Introduction

Data Compression was first introduced in SQL Server 2005 with Service Pack 2 when Microsoft introduced a new storage format called VARDECIMAL to store decimal and numeric data types as variable length columns. In SQL Server 2008, Microsoft enhanced Data Compression feature to compress fixed length data types such as INTEGER, CHAR and FLOAT. Microsoft further enhanced data compression feature in SQL Server 2008 R2 (Interim Release) to compress fixed length data types such as NCHAR(N) and NVARCHAR(N). If you are new to the Data Compression Feature of SQL Server then you can refer to my earlier article titled Data Compression in SQL Server 2008.

The main advantage of using Data Compression is that it reduces the storage cost and increases the query performance by reducing I/O. DBAs need to be very careful before using Database Compression as it is a very CPU intensive activity. If you are not facing a CPU crunch and you have available I/O then it is a good idea to implement data compression. The data compression feature is currently available in Enterprise, Data Center and Developer Editions of SQL Server 2008 R2.

What is Unicode Compression in SQL Server 2008 R2?

In SQL Server 2008 R2 Microsoft introduced a new algorithm known as Standard Compression Scheme for Unicode (SCSU) to compress Unicode data that is stored in Row or Page compressed objects. This algorithm reduces the amount of disk space used by Unicode characters. If you have a table which has NCHAR(N) or NVARCHAR(N) data types then once ROW or PAGE level compression is implemented, the SQL Server Database Engine automatically enables Unicode compression. Unicode provides support for data stored in English, German, Hindi, Turkish, Vietnamese and Japanese locales. SQL Server Database Engine internally stores Unicode data as 2 bytes regardless of the locale chosen by the user. The below table gives you an overview of the space saving that can be achieved for different languages.

Locale	Compression Percent
English	50%
German	50%
Hindi	50%
Turkish	48%
Vietnamese	39%
Japanese	15%

(Reference Books Online)

Estimating Storage Space Savings

There are two different ways by which database administrator can estimate the savings in storage space for tables and indexes.

1. Using sp_estimate_data_compression_savings inbuilt system stored procedure.
2. Using Data Compression Wizard.

Estimate the Savings in Storage Space Using Inbuilt System Stored Procedure

DBAs can estimate the storage space savings using the sp_estimate_data_compression_savings inbuilt system stored procedure. The below TSQL Query return the space saved when Row Level Compression is enabled for Person.Address table of the AdventureWorks2008 sample database.

Use AdventureWorks2008

EXEC sp_estimate_data_compression_savings

@schema_name = ‘Person’

,@object_name = ‘Address’

,@index_id = NULL

,@partition_number = NULL

,@data_compression = ‘ROW’

Estimate the Savings in Storage Space Using Data Compression Wizard

DBAs can leverage the Data Compression Wizard to estimate the storage space savings when Row or Page level compression is enabled for a particular table or index within a database.

1. Connect to SQL Server Database Engine using SQL Server Management Studio.
2. Expand Databases | AdventureWorks2008 | Tables | right click Person.Address | select Storage and then choose Manage Compression… option from the drop down list as shown in the snippet below.

3. In the Welcome to Data Compression wizard screen, click Next to continue.
4. In the Select Compression Type wizard screen, choose Row in the Compression Row drop down list and then click on Calculate to estimate the space savings.

5. If you are interested in implementing Row level compression, click the Next button to continue with the rest of the steps within the wizard to implement data compression.

Continues…

Leave a comment Cancel reply