Denormalization in SQL Server for Fun and Profit
Future-proofing an application via denormalization can also involve the removal of tables, or rather reverse-decomposing multiple tables into one. Imagine a book store chain has individual stores, each of which carries specific genres of books, and handles only specific publishers. Due to contractual obligations, if a store carries a genre, it must do so for all publishers It deals with.
Fully normalizing this relationship requires three tables. With sample data, they might appear like this:
|
Table: StoreGenres |
|
|
Store |
Genre |
|
Books R US |
Mystery |
|
Books R US |
Science Fiction |
|
Downtown Books |
Textbooks |
|
Downtown Books |
Horror |
|
Book Nook |
Self Help |
|
Table: StorePublishers |
|
|
Store |
Publisher |
|
Books R US |
Acme Publishing |
|
Books R US |
TGA Distributors |
|
Downtown Books |
TGA Distributors |
|
Downtown Books |
Arkham House |
|
Book Nook |
Acme Publishing |
|
Book Nook |
TGA Distributors |
|
Table: PublisherGenres |
|
|
Publisher |
Genre |
|
Acme Publishing |
Mystery |
|
Acme Publishing |
Self Help |
|
TGA Distributors |
Textbooks |
|
TGA Distributors |
Mystery |
|
TGA Distributors |
Science Fiction |
|
TGA Distributors |
Self Help |
|
Arkham House |
Horror |
The list of genres a store carries from each publisher is generated with the following query:
SELECT sp.Store, sp.Publisher, pg.Genre
FROM StorePublishers sp
JOIN PublisherGenres pg ON sp.Publisher = pg.Publisher
JOIN StoreGenres sg ON sg.Store = sp.Store AND sg.Genre = pg.Genre
Planning Ahead. What happens when the sales contract expires, and stores are free to carry only specific genres from individual publishers? In that case, the above schema won’t work – if Downtown Books wants to carry mysteries from Acme, they’ll be forced into carrying Acme’s self-help books also.
To handle this contingency, you reverse-decompose the three tables into one:
|
StorePublisherGenres |
||
|
Books R US |
Acme Publishing |
Mystery |
|
Books R US |
TGA Distributors |
Mystery |
|
Books R US |
TGA Distributors |
Science Fiction |
|
Downtown Books |
TGA Distributors |
Textbooks |
|
Downtown Books |
Arkham House |
Horror |
|
Book Nook |
Acme Publishing |
Self Help |
|
Book Nook |
TGA Distributors |
Self Help |
Note: this table data is nothing more than the results of the above query. But by physically materializing it, you allow individual genres from each publisher to be inserted or deleted. You’ve future-proofed the application to allow each to vary independently. (You’ve also improved performance somewhat as well—you’ve recast a three-way JOIN into a single table query).
As in the prior examples, your denormalization has created a minor update anomaly. If Downtown Books adds Acme Publishing to their list of publishers, you have to insert two new records, not one – at least until that contract expires. Of course, you can always ensure consistency through a trigger on the table or an update stored procedure. When the contract expires, simply update the trigger, rather than replacing all three tables, and the queries that reference them.
In these simple examples, you could say we’re not really denormalizing as much as we’re simply “pre-normalizing”. In other words, while we’re technically violating normal form according to the current business requirements, once the anticipated changes take place, our schema will again be normalized. However, for more complex cases, this isn’t always true.
To see this, let’s return to our example of teachers and students, with the current business rule that each student is assigned to a single teacher and classroom, and the future possibility that backup teachers will be assigned to each class. What if there was another possibility- that the school might instead “split” each classroom into two, and assign each subclass its own teacher and students. Can we create a schema that allows for the possibility of both this and the teacher/backup teacher we already considered?
One possibility is to add a second foreign key into the students table:
|
Table: Students |
|
Student (PK) |
|
Teacher (Nullable FK) |
|
Classroom( Nullable FK) |
One of the FKs relates to teachers, the other to classrooms. In the case of students assigned to a class with multiple teachers, the second FK is used and the first is left null. If students are assigned to a teacher in a “subclass”, the reverse is true.



No comments yet... Be the first to leave a reply!