SQL Server Processor Performance, 2006


Figure 9: SPEC CPU 2000 Integer for Pentium M on 90nm and Core 2 on 65nm.

Figure 10 shows the performance of Woodcrest relative to Dothan at nearly the same frequency. There is a 3% increase from 2.26 GHz and 2.33 GHz and a 34% increase in performance. Some can be attributed to cache, memory bandwidth, and compiler, but the bulk of the gain is from architectural improvements. Considering that processor core probably did not double the number of logic transistors, this is a very impressive performance gain.


Figure 10: Core 2 2.33 GHz performance relative to Pentium M 2.26 GHz.

Figure 11 compares the range of available Woodcrest processors (Xeon 51×0) to Dempsey (Xeon 5080) 3.73 GHz. Even the 1.86 GHz Woodcrest has better overall performance than the 3.73 GHz Dempsey core. Considering that Prescott performance was limited, the Woodcrest 3.0 GHz SPEC CPU 2000 integer base performance of 3,057 should not be unexpected.


Figure 11a: Woodcrest performance compared to Dempsey.

There is not normally such a large performance jump from the top frequency of one processor line to the top launch frequency of the next line. Historically, Intel has achieved a 30% increase in frequency from launch to process maturity when thermal limitations were not present. Figure 11b compares the SPEC CPU integer component performance for Woodcrest at 1.86 GHz and 2.0 GHz to Xeon 5080 3.73 GHz. There is a very large gain in mcf, contributing to the high overall scores for Woodcrest. A few components of the 1.86 GHz Woodcrest are below the 3.73 GHz Dempsey. All components of the 2.0 GHz Woodcrest are better or essentially equal to the 3.73 GHz Dempsey.


Figure 11b: Woodcrest component performance compared to Dempsey.

In the SPEC CPU 2000 Integer benchmark, the performance progression goal of Moore’s Law has been successfully achieved up to the current generation Woodcrest. Going forward, it may be difficult to continue this on a single-threaded benchmark. Look for developments to enable employing multi-cores on current single threaded applications. Technically, there were only two architecture generation transitions between Pentium III and Core 2 via Pentium M. The performance gains however are comparable to three generations.

SQL Server 2000 and 2005 TPC-C Performance

As useful as the SPEC CPU integer benchmarks are, database benchmarks are still required for a valid multi-threaded and multi-processor performance assessment. The most well known are TPC-C and TPC-H (http://www.tpc.org/) for transaction processing and decision support, respectively. The TPC benchmarks are applications running on top of the SQL Server or other database engines. The configurations are often at maximum memory and disk I/O so deficiencies in the system architecture can be exposed. For multi-processor systems, this is a demonstration of the ability to scale up. Both TPC-C and TPC-H are useful, but there are more results available on the TPC-C transaction processing benchmark for meaningful comparisons.

Some characteristics of the TPC-C benchmark are as follows. The rules for a valid result are that there can be a maximum of 12.5 tpm-C per warehouse. Each warehouse constitutes approximately 84 MB of data. A 125 KB tpm-C result requires a minimum of 10 KB warehouses or 840 GB data. A smaller data size will have lower disk I/O. The objective is to employ the smallest number of warehouses to meet the reporting requirement for transaction performance. Comparing TPC-C results is a tricky matter because the benchmark is I/O intensive. The memory to data size ratio can significantly influence disk I/O activity and in turn performance. The comparison is most clear when the tpm-C performance to memory ratio and the tpm-C per disk are reasonably close. The most common objective in the TPC-C benchmark is to achieve the highest score for each system. The price/performance must also be reported with each result. Some results are oriented to this aspect. The total system cost can be impacted by the choice of components, especially storage, that have no bearing on performance. So rather than comparing price/performance, the tpm-C per disk is noted below.

Table 3 shows selected TPC-C results for two socket systems. Table 4 shows selected TPC-C results for four socket systems. The results span both SQL Server 2000 and SQL Server 2005. SQL Server 2000 was already very efficient at transaction processing. It is unclear whether SQL Server 2005 made any gains for two or four socket systems. It is possible the SQL Server 2005 64-bit results benefit from more efficient use of memory, not having to use AWE to access more than 4 GB memory. SQL Server 2005 does have NUMA aware features that contribute to transaction processing performance most noticeable in systems with 16 or more processors.

Processor

Freq. GHz

Cache

Mem. GB

No. of Disks

tpm-C

tpm-C /GB

tpm-C /Disk

Report Date

Pentium III

1.0

256 KB

4

120

17,336

4,334

144

09/26/01

Pentium III

1.26

512 KB

4

104

22,007

5,502

212

10/25/01

Xeon

2.2

512 KB

8

143

33,768

4,221

236

02/25/02

Xeon

3.06

512 KB

12

214

44,942

3,745

210

05/29/03

Xeon

3.06

1 MB L3

12

214

52,468

4,372

245

07/15/03

Xeon

3.2

1 MB L3

12

256

54,096

4,508

211

10/13/03

Xeon

3.2

2 MB L3

12

290

60,364

5,030

208

03/02/04

Xeon

3.6

1 MB L2

16

104

68,010

4,251

654

11/01/04

Xeon

3.6

2 MB L3

16

203

74,298

4,644

366

02/11/05

Opteron

2.6

1 MB

16

174

71,413

4,463

410

02/14/05

Opteron

2.8

1 MB

32

296

76,214

2,382

257

09/30/05

Opteron DC

2.4

2×1 MB

32

296

109,633

3,426

370

11/08/05

Opteron DC

2.6

2×1 MB

32

380

113,628

3,551

299

03/07/06

Xeon

3.73

2×2 MB

32

518

125,954

3,936

243

05/01/06

5160 DC

3.0

1×4 MB

32

520

140,264

4,383

270

06/26/06

5160 DC

3.0

1×4 MB

64

590

169,360

2,646

285

05/22/06

Table 3: Selected TPC-C results for two socket systems.

Continues…

Leave a comment

Your email address will not be published.