Denormalization in SQL Server for Fun and Profit

This is just a downright ugly approach and one that leads to
some unpleasant UNION queries to splice together both halves of the query
chain.   Another method I’ve often seen used is the “dynamic FK” approach:

Table: Students

Student (PK)

ParentKey (FK)

ParentKeyType

Column ParentKey can either be a Teacher Key or a
Classroom Key; the column ParentKeyType contains a code or other value
to indicate which type ParentKey really is.   This is an even uglier
approach that will eventually drive you to a dependence on illegal street drugs
and an early grave.  Let’s try for something better.

The key here is to more closely model the real situation. 
If the school may split classes into subclasses, they’re effectively creating a
new “virtual” sub-classroom.  So let’s create a table for that entity, and attach
all our other tables to it:

Table: Classrooms

Classroom  (PK)

Teacher  (FK)

Table: Subclasses

Subclass  (PK)

Classroom  (FK)

Table: Teachers

Teacher  (PK)

Subclass  (FK)

Table: Students

Student  (PK)

Subclass  (FK)

How well does this schema handle the current business rules,
as well as each of the two possible future variants? 

Current Rule:  Each teacher assigned to
single classroom; each student assigned to single teacher.
   In this case
we create a UNIQUE constraint enforcing one subclass per class, and a second
UNIQUE constraint enforcing one teacher per subclass.  To make the model a little
easier to use, we might also want a trigger on Classrooms to automatically
create and manage the matching subclass row for each class.    Even though
students aren’t directly assigned to teachers or classes, the constraints
create a 1:1 relationship that allow us to enforce that relationship

Variant #1: Each student assigned to a single
classroom; one or more teachers per class.
  In this case, we drop the
UNIQUE constraint on teachers, allowing multiple teachers to be assigned to
each subclass.

Variant #2: Each student assigned to a
single teacher; one or more teachers per class.
   Here, we leave the UNIQUE
constraint on teachers, but drop the constraint on subclasses, and the
classrooms trigger that synchronizes subclasses.    When multiple teachers are
assigned to a class, we insert a new subclass row and assign the second teacher
to it.  This allows us to enforce the one-teacher-per-student rule, but still
allow teachers to share classrooms.

So there you have it: a data model that flexibly handles
three entirely different sets of business logic, without being fully normalized
for any of them.    You can now rapidly address future requirement changes
without having to do more than drop a constraint.

Conclusion

I hope this article has shown you that thoughtful, targeted
denormalization can often be a significant benefit to future-proofing your
applications.    Developers often plan ahead  by increasing column lengths or
including additional ones, but  the changes these prevent are minor compared to
the significant restructuring that you can prevent through denormalization.  So
plan ahead, and think for the future.

However, the more common reason to denormalize is to
increase application performance, not future-proof.    In the second half of
this article, I’ll show the magnitude of performance gains you can achieve from
denormalization.  They’re larger than you might think.

]]>

Leave a comment

Your email address will not be published.