SQL Server Performance


High Call Volume SQL Server Applications on NUMA Systems

By : Joe Chang
Oct 07, 2005

Page 3 / 4

The Microsoft KB article (support.microsoft.com/default.aspx?scid=KB;EN-US;Q252867) on the Interrupt Affinity tool describes Windows 2000 as assigning interrupts to any available processor, and states that performance may improve when each network adapter is assigned to a specific processor. It is possible that Windows Server 2003 changed the default behavior along these lines and now assigns each interrupt to a specific processor. Excerpts from this KB article are in Appendix B.

Figure 4 shows the individual CPU utilization from Windows Task Manager on a Unisys 16-way Xeon MP system running Windows Server 2003 while sustaining 17K calls per second. Note that CPU 10 (counting up from 0) is at nearly 100% utilization. It is suspected that this is the processor handling the network interrupt, but the steps necessary to prove this were not carried out. There was no disk activity in this test, no other processes were running, and nothing else was generating network activity. If this interpretation is correct, then the call handling capability of the 16-way system is saturated even though the other processors are not even close to fully loaded. An actual production server (16-way Itanium 2) running SAP exhibited essentially the same characteristics shown in Figure 4. Applying any additional network traffic to the connection handling SQL Server calls degraded call volume performance, but generating traffic on a different network connection not used by the active SQL Server clients did not.

Figure 4 16 x 3.0GHz Xeon MP system at 17K SQL Server RPC calls/sec

It is possible that distributing the network interrupt over more processors could improve call volume performance. It could also be speculated that excluding one or more processors from the SQL Server process affinity and binding the network interrupt to the excluded processor(s) might help, but the net gain is not clear. Another point to note is that the CPU cost per call on the 16-way system is much higher than that of the 4-way system. So even if the CPU load could be evenly distributed, the performance with all 16 processors saturated may be no better than the 4-way call volume performance. It could be that there is substantial cost in having one processor handle the interrupt and then hand off the call to a SQL Server thread running on a processor in a different node.
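To make the idea of excluding a processor from the SQL Server process affinity concrete, the sketch below computes the integer bitmask that leaves out a chosen CPU. It assumes the conventional encoding in which bit n stands for CPU n, which is the form taken by SQL Server's `affinity mask` configuration option. The helper name `affinity_mask` is hypothetical, and CPU 10 is used only because it is the suspected interrupt processor in Figure 4. This was not tested in the article.

```python
def affinity_mask(total_cpus, excluded):
    """Return an integer bitmask with one bit per CPU (bit n = CPU n),
    with the bits for the excluded CPUs cleared."""
    if not all(0 <= cpu < total_cpus for cpu in excluded):
        raise ValueError("excluded CPU index out of range")
    mask = (1 << total_cpus) - 1      # start with every CPU enabled
    for cpu in excluded:
        mask &= ~(1 << cpu)           # clear the bit for each excluded CPU
    return mask


# Hypothetical case: 16-way system, keep SQL Server off CPU 10.
mask = affinity_mask(16, [10])
print(mask, hex(mask))                # 64511 0xfbff
```

The resulting value would be the candidate setting for the process affinity, with the network interrupt then bound separately to the excluded processor. Whether that yields a net gain remains an open question, as noted above.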

Figure 5 shows the call volume scaling characteristics on an 8-way Itanium 2 system (HP rx8620, 1.5GHz processors). There are 4 processors in each of 2 cells. The call volume test was conducted with the system booted to 1, 2, 4, and 8 processors using the NUMPROC option in the EFI OS loader (equivalent to the boot.ini file in 32-bit systems).

Figure 5 Call volume performance for HP rx8620 booted to 1, 2, 4 and 8 processors.

It was not determined in the 2 and 4 CPU tests whether all processors were in a common cell. Call volume scaling shows only marginal improvement from 1 to 2 processors (13.5K to 16.5K), no gain from 2 to 4 processors, and some degradation from 4 to 8 processors. It is possible that one or both of the 2 and 4 processor tests used processors from both cells.
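The 1 to 2 processor result can be put in terms of speedup and scaling efficiency. The short sketch below uses only the two figures quoted in the text (13.5K and 16.5K calls per second). The function name `scaling` is an illustrative choice, and efficiency is defined here as speedup divided by the increase in processor count.

```python
def scaling(base_cpus, base_rate, cpus, rate):
    """Return (speedup, efficiency) relative to a baseline configuration.

    speedup    = rate / base_rate
    efficiency = speedup / (cpus / base_cpus), where 1.0 is linear scaling
    """
    speedup = rate / base_rate
    efficiency = speedup / (cpus / base_cpus)
    return speedup, efficiency


# Figures from the text: 13.5K calls/sec on 1 CPU, 16.5K calls/sec on 2 CPUs.
speedup, eff = scaling(1, 13_500, 2, 16_500)
print(f"speedup {speedup:.2f}x, efficiency {eff:.0%}")  # speedup 1.22x, efficiency 61%
```

Doubling the processor count therefore raised call volume by only about 22 percent. With no gain reported from 2 to 4 processors, the efficiency by the same measure would fall further at 4 processors.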




              © 1999-2008 by T10 Media. All rights reserved