Processor Performance, 2002

Processor Architecture

Figure 23 shows the performance for the Pentium III 1.40GHz processor on the ServerWorks HE-SL (SDRAM) chipset, the Xeon 2.0GHz on the ServerWorks GC-LE chipset (DDR) and the Pentium 4 2.0GHz on the Intel 850 chipset (RDRAM). All processors have 512K L2 cache and all results use the version 5.0.1 compiler. The Xeon and Pentium 4 processors have the same Pentium 4 core.

Figure 23. Pentium III 1.4GHz, Xeon 2.0GHz, and Pentium 4 2.0GHz.

The differences in results are probably due mostly to the differences in chipset and memory. The GC-LE uses dual DDR channels with 3.2GB/sec of bandwidth, but uses registered ECC DIMMs. The registered memory allows the motherboard to support four DIMMs on each memory channel, but adds latency. The net effect is that the Xeon with the GC-LE chipset has 5% lower overall performance than the same processor with the 850 chipset. PERLBMK and TWOLF perform better with the GC-LE than with the 850, but all the others perform better with the 850.

The 2.0GHz Xeon has 5% better overall performance than the 1.4GHz Pentium III processor. Both have 512K cache. The Xeon 2.0GHz outperforms the Pentium III 1.4GHz on six of the SPECint_base2000 benchmarks and the Pentium III is better on the other six. However, 1.4GHz will be the highest frequency for the Pentium III on the 0.13u manufacturing process, while the Xeon/Pentium 4 will probably scale past 3.0GHz on the same 0.13u manufacturing process.
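An even split of component wins is compatible with a lead in the overall score because SPECint_base2000 is a geometric mean of the component ratios, so the size of each lead matters as well as the count. A minimal sketch of that arithmetic follows. The per-benchmark ratios are hypothetical values chosen only for illustration; they are not the measured results behind Figure 23.

```python
import math

def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical per-benchmark ratios (processor A score / processor B score).
# These are NOT measured values; they only illustrate the arithmetic.
a_over_b = [
    1.30, 1.25, 1.20, 1.15, 1.10, 1.05,  # A ahead on six components
    0.98, 0.97, 0.96, 0.95, 0.94, 0.93,  # B ahead on the other six
]

wins_a = sum(1 for r in a_over_b if r > 1.0)
overall = geometric_mean(a_over_b)

print(f"A leads on {wins_a} of {len(a_over_b)} components")
print(f"overall ratio (A/B): {overall:.3f}")
```

In this made-up case A wins only half of the components, but its leads are larger than its deficits, so the overall ratio comes out above 1.0.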

Figure 24 compares the Intel Pentium 4 2.53GHz processor with the AMD Athlon 2600+ (2.133GHz). It is not unusual for one processor architecture to perform better on some types of applications and not as well on others compared to a second processor of a different architecture.

Figure 24. Intel Pentium 4 2.53GHz and AMD Athlon 2600+ performance.

AMD positions the 2.133GHz Athlon as being comparable to an Intel Pentium 4 at 2.6GHz in selected applications. Results for the Pentium 4 2.60GHz with a 400MHz FSB are not available, but the 2.53GHz with a 533MHz FSB should be comparable: its frequency is 3% lower, but the faster FSB was shown to provide a performance gain of approximately 4%.
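The substitution of the 2.53GHz part for the 2.60GHz part can be checked with a little arithmetic. The sketch below assumes that performance scales at most linearly with core frequency and that the FSB gain combines multiplicatively with it. Both are simplifying assumptions made for illustration, not measured results.

```python
# Clock frequencies from the text, in GHz.
p4_533fsb_ghz = 2.53
p4_400fsb_ghz = 2.60

# Approximate gain attributed to the faster 533MHz FSB in the text.
fsb_gain = 1.04

# Assumption: performance scales at most linearly with frequency,
# so the frequency ratio bounds the loss from the lower clock.
freq_ratio = p4_533fsb_ghz / p4_400fsb_ghz
freq_deficit_pct = (1.0 - freq_ratio) * 100.0

# Assumption: the frequency and FSB effects combine multiplicatively.
net_ratio = freq_ratio * fsb_gain

print(f"frequency deficit: {freq_deficit_pct:.1f}%")
print(f"estimated net ratio vs. 2.60GHz with 400MHz FSB: {net_ratio:.3f}")
```

Under these assumptions the two effects roughly cancel, which is why the 2.53GHz result is treated as a reasonable stand-in.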

The Pentium 4 2.53GHz is 11% faster overall than the Athlon 2600+ on the SPECint_base2000 benchmark. This is probably in line with expectations. The applications selected by AMD are popular productivity applications, and most of them are not built with the latest compilers. The Pentium 4 has a radically different processor architecture and benefits more from new compilers that take full advantage of its capabilities. The Athlon 2600+ might have comparable performance to a Pentium 4 2.60GHz on applications built with an older compiler.

Figure 25 compares the SPECint_base2000 performance between the Intel Pentium 4 at 2.8GHz, the Alpha 21264 at 1.25GHz, the Intel Itanium at 1.0GHz, the IBM Power4 at 1.3GHz and the Ultra SPARC III at 1.05GHz.

Figure 25. Recent high-end microprocessor performance.

The Pentium 4 has the highest overall integer performance and the highest component performance in eight of the twelve integer benchmarks. All of the other processors have much larger caches than the Pentium 4. The four components that the Pentium 4 does not lead (VPR, MCF, BZIP2 and TWOLF) are known to be sensitive to cache size, MCF especially so. The 21264, Itanium and Power4 are fairly close in performance. The Ultra SPARC III is seriously outclassed.

Processor Performance Summary

The SPEC CPU2000 benchmarks are useful in analyzing processor performance because results are available for so many processor and platform variations. Also of value are the individual component benchmark results, which show the range of variation that can be expected with changes in frequency, cache size and memory bandwidth.

As with the individual SPEC component programs, each type of SQL Server operation has a particular response to processor frequency, cache size and memory bandwidth. It has been observed that a large cache reduces the startup cost of several SQL Server operations. This benefits workloads with many small operations.

Workloads with high row count operations benefit from raw processor frequency. The Xeon/Pentium 4 platforms have been observed to have much better table scan performance than Pentium III platforms. It is not clear whether this is due only to the higher bus bandwidth of the Pentium 4 or also to other architecture enhancements in the Pentium 4.

Published with the express written permission of the author.
Copyright 2002 Joe Chang.
