Do not Allow Nulls for performance? | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Do not Allow Nulls for performance?

I have heard somewhere that you can improve query performance by not allowing nulls at the column level if it can be avoided. It involved something about how the data is searched if nulls exist vs. if they do not (heap, seeks, scans?). Anyone have the details on this?
i believe the key point is that there has to be a 1 byte value indicating whether a null is stored or not.
so more logic is involved in finding the location of nullable or variable length columns
in general, the primary key columns should be fixed length, not null.
also avoid variable lenght for short strings (8-15 char max)
i would just try to avoid nullable columns, i think Joe Celko said that a good database has no more than a dozen or so nullable columns over the entire database, which i interpret as only applying to key tables, (who cares about nullable columns in infrequently used tables)

This might turn out an almost religious discussion. [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br />The question whether there is a place for the NULL concept in databases has been raging since Codd invented the wheel. We are not going to settle this here. Nor do I think anyone can settle this at all. <br />Ignoring the logical implications for now, the impact of the NULL bitmap and calculating the column offset for a NULLable column is negligible. NULLable columns (VARCHARs) can save you a considerable amount of storage space as they only occupy the actual space needed, while CHARs always use their full space, not matter what or what not is stored there. Not using storage space means more rows fit on a single data page, thus reducing IO operations needed to retrieve the data. The SQL Server storage engine is well optimised to deal with NULL. <br />However, what surely can improve performance is the fact that in avoiding NULLs you can simplify your query logic greatly. When NULLs are involved you have to deal with the NULL ternary logic which can lead to unneccesary complex queries which might drag down performance. I think the general approach is to avoid NULLs whenever possible and to only use them when you *really* need this special NULL logic and *really* need to model some unknown data. <br /><br />I agree with Joe that key columns should be of fixed length. And from a strict performance point of view, they should be of a numeric data type as these are easier to sort and search than character data types.<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br /<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br />Ich unterstütze PASS Deutschland e.V. <a target="_blank" href=http://www.sqlpass.de>http://www.sqlpass.de</a>) <br />
Thanks for the information Joe and Frank! Matt
]]>