SQL Server Performance Forum – Threads Archive

Duplicating data in a detail table

Hello, I am designing a database, and I am having a disagreement with a collegue. We have a master and detail table (i.e. Invoice and Invoice_Details). He wants to have data from the master table duplicated in the detail table. This way we won’t have to join with the master table to do most reports on the detail table. I think that this just duplicates data unnessesarily, and almost removes the need for the master table (there are still a few columns that will just be in the master table. These aren’t in every report like the duplicated data would be). He thinks that the join to the master table will be a performance hit when selecting from the tables for a report. The primary key of the master table (one unique identifier – an int) will be the foreign key to the detail table (and combined with another unique identifier, also an int, to make the primary key on that table). Both fields will be indexes, so my argument is that the join won’t cause much of a performance hit, but he seems to think it will. Which scenario is better? Thanks,
Heather

First of all
what kind of relationship is your table ? one to one or one to many. why have two table in the first place. example one to many
invoice
id
1 invoice_details id otherdetails
1 a
1 b
1 c then joining table is the best since it is one of the NORMAL FORM we have normal form since we need to breakup the table into normal form and avoid duplicate and unnessessary data. in this case to have duplicate then your collegue is TOTALLY wrong and go against the design in the first place.
May the best cheaters win

In my oppinion, yours scenario is better.
Two mayor reason: Normalize and maitenance. And one question to your college: why no design all system with one table?
Luis Martin
Moderator
SQL-Server-Performance.com One of the symptoms of an approaching nervous breakdown is the belief that one’s work is terribly important
Bertrand Russell
All postings are provided â€œAS ISâ€ with no warranties for accuracy.

Hello, The relationship between the tables is one to many. The collegue is not an expert on database design, or SQL Server, but he is an expert on writing efficient code, so that is why he is concerned with the performance of a join that will be repeated a lot. Thanks for your replies, Heather

If you need to run aggregate queries for parent values, life does become a lot easier when you denormalize a bit. We have a system where a File can have multiple entries on Activity, each of which will have multiple entries on Action. Each table has both identity and natural key columns. If you normalize to the effect that only FileId is repeated on Activities, and only ActivityId on Action, then try to find the first upcoming Action for each File, and include it with Action details in a report query along with a TEXT column from the File table – it becomes a lot easier if you duplicate FileId on Action.

You can have your table design and you can introduced another view so that it will be much easiler to write code if your partner really needs that type of a table

quote:Originally posted by hkroening Hello, I am designing a database, and I am having a disagreement with a collegue. We have a master and detail table (i.e. Invoice and Invoice_Details). He wants to have data from the master table duplicated in the detail table. This way we won’t have to join with the master table to do most reports on the detail table. I think that this just duplicates data unnessesarily, and almost removes the need for the master table (there are still a few columns that will just be in the master table. These aren’t in every report like the duplicated data would be). He thinks that the join to the master table will be a performance hit when selecting from the tables for a report. The primary key of the master table (one unique identifier – an int) will be the foreign key to the detail table (and combined with another unique identifier, also an int, to make the primary key on that table). Both fields will be indexes, so my argument is that the join won’t cause much of a performance hit, but he seems to think it will. Which scenario is better? Thanks,
Heather

Hi,
I feel its better to normalize data as ull have a chance to add other
column in master table which u dont want to repeat in detail table.
as the master is related to invoice then in future if u want to add column for taxes or other expenses which u dont want in detail. Its like using crystal reports.
the column which repeates we keep them in detail section where as the header values we keep in group section.
Otherwise i think u can go for suggestion by Dinesh for a view.

if the cluster keys are set properly, the join between a master and detail table should be very efficient, depending on the type of report you want to run.
post the table def, indexes, and report query

I have faced many situations where people ask fors the de-normalized data specially programmes and my solution is introducing the view
that is why i have specified the view. And I agreed that Joining does not reduce the efficiency infact it is increasinf it

quote:Originally posted by joechang if the cluster keys are set properly, the join between a master and detail table should be very efficient, depending on the type of report you want to run.
post the table def, indexes, and report query

The final word would be if you dont’ care about redundancy, normalize or will adding more table later on then USE ONE TABLE no need for two. May the best cheaters win

i my opinion i would never denormalize a production DB for reporting purposes. i think you should have your production DB as efficent as possible. If you company can afford it, get a reporting database and then denormalize your tables in that DB for reporting purposes.

]]>