The graphs below show the number of reads and CPU (on the Y
axis) for all five of these queries for different numbers of rows in the
PriceHistories table (on the X axis). Note that both the axes in the graphs are
It appears that for small sets the string manipulation and
the rownumber methods are the most optimal, but as the sets grow larger the
cross apply and the subquery(top) perform much better.
Why is this? An explanation of the execution plans
The string manipulation and rownumber queries scan the
PriceHistories table, and then join back to Products. Because the
PriceHistories table is scanned, the expense of these queries grows more or
less linearly with the number of records in the PriceHistories table.
Additionally, the string functions in the string manipulation query perform a
significant amount of CPU as the amount of data grows larger.
The subquery(top) and cross apply both perform a scan on
Products and then a seek on PriceHistories for each product. While at first it
is more expensive to do 100 index seeks (200 reads) than do a single scan (6
reads), the expense for index seeks grows logarithmically with the number of
records in PriceHistories. As the number of PriceHistories records increases,
the performance of these two queries improves relative to the string
manipulation and rownumber queries.
The subquery(max) query is consistently one of the worst in
terms of reads. It scans the PriceHistories table, like the string manipulation
and rownumber queries. However, it then uses an inner loop join to combine the
results of the scan with the Products table, forcing 100 index seeks on
Products. This is an error on the part of the optimizer, since when a merge
join is forced, a single scan is performed on the Products table instead of the
index seeks and the number of reads drops by an order of magnitude.
The optimal method to use when retrieving data from a
history table depends on the number of historical records per row of current
data. In our case, this was the number of rows in the PriceHistories table per
row in the Products table. The subquery(max) method consistently performed
poorly. The rownumber and string manipulation queries performed best when there
were a smaller number of records in the PriceHistories table. The subquery(top)
and cross apply queries scaled much better than the other methods.
The subquery(top) and cross apply queries begin to perform
better approximately at the point at which performing a seek on the historical
table for each row of current data becomes less expensive than performing a
single scan on the historical table. Performing a scan on the historical table
requires NumHistRecs / RowsPerPage reads. Performing an index seek costs 1 +
log IndexRowsPerPage (NumHistRecs / IndexRowsPerPage ). So the
breaking point is approximately when NumHistRecs / RowsPerPage > NumCurRecs
* (1 + logIndexRowsPerPage(NumHistRecs/IndexRowsPerPage)). In the
above tests, RowsPerPage in PriceHistories is 8000/(4+4+5+8) = 380, NumCurRecs
= 100, IndexRowsPerPage = 4+8 = 12, so the performance of subquery(top) or
cross apply queries will be better approximately when NumHistRecs / 380 >
100 * (1 + log 12 (NumHistRecs / 12)). This works out to
approximately when NumHistRecs > 185,000, or when there is an average of
more than 1,850 historical records for each current record. Note, however, that
this is an optimized structure. In real world situations, the ratio of the
breaking point tends to be significantly lower.
These tests were run on SQL Server 2005 RTM. As always, you
should test this on your own systems, since the results may vary due to
variations in hardware and version of SQL Server. However, the basic principle
remains the same: methods that perform an index seek on the historical table
will outperform those that perform a scan as the number of historical records