I have been battling what I believe is a memory issue and I have been trying to figure out why the page life expectancy on this server is all over the place. We are running sql 2000/sp4 on windows 2003 sp2. The server HP DL585 with 4 dual core cpu's, 48GB RAM up from 32GB a week agao (currently maxserver mem = 44GB) mirrored OS, RAID 10 (6 drives) for log files, EMC SAN LUN for data files (RAID 5 w/5drives), LUN for tempdb and separate LUN for paging file. AWE is enabled and the patch for ...not all memory available...has been applied. We also use Full-text search extensively and use 15 databases that drive our websites. There are a few others but they are not used much. On any given day pagelife expectancy can vary from a low of around 150 to a max of around 6000. The Buffer Cache Hit Ratio is a rock solid 99.5+ SQLServer:Cache Manager(_Total)Cache Hit Ratio is 85+ AVG. PhysicalDisk(SAN DATA Files)% Idle Time over a 24hr period is around 80% Total Server Memory 46145440 Target Server Memory 46145456 I've run profile traces looking for long running queries but nothing really sticks out. We also get occassional CPU spikes in the 80% - 95% range that may last longer than 5min. However there is no real pattern to when the spikes occur. We run reindexing on the tables at least once a week and one of the databases that gets written heavily to is done nightly. Any suggestions on where to start looking or what to try next. As always thanks.
Do you have any databases options set such as AUTOCLOSE or chaing the recovery model on the user databass?
Hi, It is quite common for Pagelife expectancy to be all over the place - even dipping to zero occassionally - even if there is no memory issue. You will have to look at few other counters as well and correlate to see if they are all pointing in the same direction. Not all memory consumption will reflect in Buffer Cache Hit Ratio because not all, although most, memory is about data and index pages; you have things such as workspace memory or memory consumed by the optimizer for instance. So it is possible to have a memory issue and still have a high cache hit ratio (with this I am not saying that you are having a memory issue). What other symptoms do you have? Have a look at the following counters as well: SQL Server Buffer Manager: Lazywrites/sec Checkpoint/sec Free List stalls/sec Free Pages/sec Pagelife/sec Page Writes/sec Page Reads/sec SQL Server Memory Manager: Memory Grants Outstanding Memory Grants Pending Total Server Memory SQL Server Statistics: Autoparam attempts/sec Batches/Sec Compilations/sec Recompilations/sec SQL Server General Statistics: Transactions /sec Memory: Available MB Process: Private Bytes sqlservr.exe Working Set sqlservr.exe Logical Disk: Disk Time% Current Queue Length Avg secs /read Avg secs /write When you are profiling, bring in the Errors and Warnings: SORT warning and HASH warning events. The reason they are interesting is that they can consume large workspace memory (if done on large datasets)and they are CPU intensive operations. You might want to consult the Wait statistics that SQL Server accumulates as well: DBCC SQLPERF (WAITSTATS) It accumulates since server restart so if the SQL Server service has been up and running for a long time, you might want to clear it and then wait for a while (until enough statistics is gathered to reflect your workload): DBCC SQLPERF (WAITSTATS, 'clear') You can download this script that will run the DBCC SQLPERF command and also pretty up the output for you http://sqldev.net/download/misc/sp_waitstats.sql. /Elisabeth