SQL Server 2008 - Worth the Wait
What is meant here is the Microsoft Access feature to create unique indexes on columns that can contain two or more NULLs. SQL Server does not allow a duplicate entry in an index that is created as UNIQUE. Consider this:
CREATE TABLE Table1 (cID INT PRIMARY KEY , c1 INT NULL) CREATE UNIQUE NONCLUSTERED INDEX ix_test ON Table1(c1) INSERT INTO Table1 SELECT 1,1 UNION ALL SELECT 2,NULL UNION ALL SELECT 3,NULL DROP TABLE Table1
This yields:
Server: Msg 2601, Level 14, State 3, Line 5 Cannot insert duplicate key row in object 'Table1' with unique index 'ix_test'. The statement has been terminated.
Using a UNIQUE constraint is not even better.
CREATE TABLE Table1 (cID INT PRIMARY KEY , c1 INT NULL CONSTRAINT c_c1 UNIQUE) INSERT INTO Table1 SELECT 1,1 UNION ALL SELECT 2,NULL UNION ALL SELECT 3,NULL DROP TABLE Table1
Server: Msg 2627, Level 14, State 2, Line 4 Violation of UNIQUE KEY constraint 'c_c1'. Cannot insert duplicate key in object 'Table1'. The statement has been terminated.
Since we cannot use an already existing functionality, we have to develop our own here. Basically we have 4 different methods to choose from. Let's have a look at them.
Use an Indexed View
SET NOCOUNT ON CREATE TABLE Table1 (cID INT PRIMARY KEY , c1 INT NULL) GO CREATE VIEW Table1_Unique_Non_NULL WITH SCHEMABINDING AS SELECT c1 FROM dbo.Table1 WHERE c1 IS NOT NULL GO CREATE UNIQUE CLUSTERED INDEX uc1 ON Table1_Unique_Non_NULL(c1) INSERT INTO Table1 SELECT 1,1 UNION ALL SELECT 2,NULL UNION ALL SELECT 3,NULL SET NOCOUNT OFF SELECT * FROM Table1_Unique_Non_NULL DROP VIEW dbo.Table1_Unique_Non_NULL DROP TABLE Table1
c1 ----------- 1
(1 row(s) affected)
As you can see, the second NULL value is ignored and gets inserted. However, try to insert a row like:
INSERT INTO Table1 SELECT 4,1
Server: Msg 2601, Level 14, State 3, Line 1 Cannot insert duplicate key row in object 'Table1_Unique_Non_NULL' with unique index 'uc1'. The statement has been terminated.
Use a Constraint on a Computed Column
SET NOCOUNT ON CREATE TABLE Table1 (cID INTEGER PRIMARY KEY , c1 INTEGER NULL , foolme AS (CASE WHEN c1 IS NULL THEN cID END) , CONSTRAINT cc1 UNIQUE (c1,foolme) ) INSERT INTO Table1 SELECT 1, 1 UNION ALL SELECT 2, NULL UNION ALL SELECT 3, NULL SET NOCOUNT OFF SELECT * FROM Table1 DROP TABLE Table1
Again, this method lets you insert multiple NULL values into c1. However, trying to insert the same row as in our first method also yields a constraint violation error.
Server: Msg 2627, Level 14, State 2, Line 1 Violation of UNIQUE KEY constraint 'cc1'. Cannot insert duplicate key in object 'Table1'. The statement has been terminated.
Use a Trigger
SET NOCOUNT ON CREATE TABLE Table1 (cID INTEGER PRIMARY KEY , c1 INTEGER NULL ) GO CREATE TRIGGER IgnoreNulls ON Table1 FOR UPDATE, INSERT AS IF EXISTS (SELECT * FROM Inserted AS I JOIN Table1 AS t1 ON I.c1 = t1.c1 AND I.cID<>t1.cID) BEGIN ROLLBACK TRAN RAISERROR('Duplicates are not allowed!',16,1) END GO INSERT INTO Table1 SELECT 1, 1 UNION ALL SELECT 2, NULL UNION ALL SELECT 3, NULL SET NOCOUNT OFF SELECT * FROM Table1 DROP TABLE Table1
Again, multiple NULLs are allowed, but every non-NULL value needs to be UNIQUE. (You can test this out by yourself. It is not shown here.)
Split the Table
Every now and then, you might see someone suggesting that we split this table in two. To do this, we would move the column in question to the new table and make it PRIMARY KEY of that table. Then we would create a 1:1 relation between both tables. This approach is not discussed here. While it might be a feasible solution when implemented right from the beginning in your database and in your application, it is clearly a laborious task when you want to implement such a feature once your database and application have gone into production.
Which Method to Choose?
The first method (Use an Indexed View) is very fast and offers in addition the advantage of being reusable for other purposes. The second method (Use a Constraint on a Computed Column) will also offer very good performance as it enforces integrity at a very low level, thus uses highly optimized machine code. The third method (Use a Trigger) is likely to be the slowest. A decision between the first and the second method depends on your specific requirements. Now that you've seen what solutions you have at hand, do some testing in your own environment to determine which of these two methods to use.