SQL Server Performance Forum – Threads Archive
Denormalizing birthdate to age. Good or bad?Hi again! I’m back with a question we are discussing at my office.
Lets say you have a large table (100.000 ++) with user information, one of the columns include the users birthdate as a datetime datatype. The users age are needed in multiple and complex select queries to find users within a given age-range. What do you think would be best?
A. To create a column in the user table to hold age, then running a procedure every night to update that column.
B. To create a udf to calculate age on the fly when needed. My guess is that it is best to do alternative A. Then I could even create index on age which would be very useful.
However my co-worker disagrees (he is a normalization freek), and says he think the night running query will be slow and that it is better to index the datetime, convert the input age-range to a datetime range, search for the given datetime-range and when showing the output use the udf to get age. What do you think? Is a night running age-update query going to take really long on a table this size?
why not use computed column in a view? you may want to use the indexed views. actually, 100K is not that much. However, if you have a lot of queries, consider using indexed views. May the Almighty God bless us all!
You can take advantage of index on dateTime column using dateAdd function. select *
from dbo.users u
where u.birthDate > dateAdd(year, -18, getDate()) –can drink However, for BirthDate I would use smallDateTime instead of dateTime. It takes half space dateTime takes, so queries using index on birthDate smallDateTime column would be faster.
More about smallDateTime, hopefully its range between 1900 and 2079 is acceptable for your needs.
Yes, that is not a bad idea. smalldatetime is acceptable for my needs. (at least until the Y2079 bug <img src=’/community/emoticons/emotion-1.gif’ alt=’‘ />)<br /><br />So in your opinions there is really nothing at all to gain by denormalizing the age in a column of its own?<br />indexed views and dateadd queries will preform just as fast?
IMO, from performance point of view, you just need an index on date column and sargable queries like one I posted.
hi ya sargable queries like one I posted that being the key bit… an indexed view would not help in this case since age is not deterministic (i.e. every time you query the table you could potentially get a different answer, even if the underlying data doesn’t change) Cheers
Yes, i agree with twan. getdate is nondeterministic and since indexed view uses actual storage by allocating clustered index, you might as well add a column containing age. create a trigger to update the age everytime the birthdate is changed. <br /><br />i was thinking of creating a getdate function that is deterministic (that is, it returns the date in mm/dd/yyyy format) but since getdate itself is nonderministic and is not allowed in the function, you might have to create a function calling a stored procedure just to get the date using nondetermistic function! what a drag <img src=’/community/emoticons/emotion-1.gif’ alt=’‘ />.<br /><br />May the Almighty God bless us all!<br />www.empoweredinformation.com
Deterministic or not, there is no need for either view or age column.
for operational reports, check what queries are running against the table. If they have a lot of common tables, the view might be a good solution. However, the UDF usually works if you don’t need to create a view. If you’re in a Data warehouse environment, you may need to create the age column based on certain requirements. May the Almighty God bless us all!