Performance Question | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Performance Question

We have two servers, production and test. Production is a more robust server setup than test. We noticed a performance problem in production where a query was running much much slower in production than in test. We subsequently restored a copy of the test database onto the production server and still see the hangup and the query has different execution plans. We ran update statistics on the tables and recompiled the procedure in question on both servers. The queries still get different execution plans between the test and production servers. So identical databases, identical data, identical objects, identical statistics different execution plans and different execution speeds. If I force an execution plan on the query with (OPTION HASH JOIN) on the slower running but more powerful production server it runs in 3 seconds. If I leave it to devise it’s own execution plan it runs in 3:30 seconds. We went through and checked the memory and query settings on both servers they are identical. Both are set to use all CPUs. Are there any settings or physical configurations that would cause the server to come up with a different execution plan on two servers?
What about reindex?
Luis Martin
Moderator
SQL-Server-Performance.com All in Love is Fair
Stevie Wonder
All postings are provided “AS IS” with no warranties for accuracy.
We tried reindex. We also put the database on a third server. The third server has the slow execution plan also. I doubt it’s a physical fragmentation problem. Any other ideas? Dan
In both servers, do you have set paralellism with all processors? Luis Martin
Moderator
SQL-Server-Performance.com All in Love is Fair
Stevie Wonder
All postings are provided “AS IS” with no warranties for accuracy.
Yep.
Does the test server have a smaller number of CPUs than the production server? You might be experiencing the problem where too many CPUs can get involved with simple queries. You can test this by adding this at the very end of your SELECT query (it would come after the ORDER BY clause): OPTION (MAXDOP 1) This problem was supposed to be solved in a Service Pack for SQL 2000, but I believe it still exists even in SQL 2005.
Agree Adriaan.
Luis Martin
Moderator
SQL-Server-Performance.com All in Love is Fair
Stevie Wonder
All postings are provided “AS IS” with no warranties for accuracy.
I tried with OPTION (MAXDOP 1) on both servers. In the test server it runs in 7 seconds. After 1 minute I gave up on the production server. The only option that gives similar results is OPTION (HASH JOIN).
Could you post both hardware servers?
Luis Martin
Moderator
SQL-Server-Performance.com All in Love is Fair
Stevie Wonder
All postings are provided “AS IS” with no warranties for accuracy.
I’ve been doing most of my testing on the third server(Staging) as our production server is off bounds. Hardware fast server(TEST): VDATA: Dell PowerEdge 6650
Raid 1 33.87 GB
Raid 5 273 GB
15.7 GB RAM
4 Xeon 3.0 Ghz Processors with HyperThreading Hardware slow server(STAGING): Dell Power Edge 2850
Raid 1 33.87 GB
Raid 5 273.24 GB
8 GB RAM
4 Xeon 3.2 Ghz Processors I didn’t get the hardware for the production server. they didn’t get back to me yet.
your staging server can only have 2 physical processors, try examining the plans on the production server with
OPTION (MAXDOP 4)
OPTION (MAXDOP 2) also look in the execution plan details for the Hash Join, does it show an IO cost component or just a CPU component
do this for both the test, and prod (with forced hash join)
Also see if additional index (or column in existing index) can help.
We’ve actually narrowed it down further. We discovered a difference in service pack levels between our Test box and Staging box. We applied the service pack to the staging server and it started working with the correct execution plan. We tried it on another box and again with the service pack the query speed was repaired. The Production server already had the service pack, but we reapplied it and it didn’t solve the problem. We’ve opened a ticket with Microsoft as it seems a problem with their service-pack not working on the Production server’s specific server configuration, or Microsoft having buggy software. They’re sifting through installation data and dll info as we speak. One of our theories is that the Production server is our only clustered server and perhaps there’s some issue with clustering and this specific service pack.

]]>