I am posting a part of the log that describes our error. The configuration is 8 processors, 8GB memory, AWE enabled. The server crashes (hangs) after this error is reported in the log. So far we have tried increasing the pagefile size. We have also checked for tables pinned into memory (no such). The technet articles mention a similar problem but it is only for a database with more than 100,000 tables, and we have 275. Has anyone experienced this problem, and does anyone have any ideas? Thank you in advance. Stanley, Mobiltel EAD 2002-12-08 14:46:24.40 server Error: 17803, Severity: 20, State: 14 2002-12-08 14:46:24.40 server Insufficient memory available.. 2002-12-08 14:46:26.62 spid4 LazyWriter: warning, no free buffers found. 2002-12-08 14:46:26.67 spid4 Buffer Distribution: Stolen=19263 Free=0 Procedures=124 Inram=368556 Dirty=62 Kept=0 I/O=0, Latched=219, Other=0 2002-12-08 14:46:26.67 spid4 Buffer Counts: Commited=388224 Target=388224 Hashed=368837 InternalReservation=541 ExternalReservation=7520 Min Free=512 2002-12-08 14:46:26.67 spid4 Procedure Cache: TotalProcs=13 TotalPages=124 InUsePages=124 2002-12-08 14:46:26.67 spid4 Dynamic Memory Manager: Stolen=11549 OS Reserved=1768 OS Committed=1729 OS In Use=1651 Query Plan=217 Optimizer=7 General=11172 Utilities=1302 Connection=502 2002-12-08 14:46:26.67 spid4 Global Memory Objects: Resource=2095 Locks=6224 SQLCache=73 Replication=11 LockBytes=2 ServerGlobal=46 Xact=33 2002-12-08 14:46:26.67 spid4 Query Memory Manager: Grants=2 Waiting=0 Maximum=143560 Available=128216 2002-12-08 14:46:46.62 spid4 LazyWriter: warning, no free buffers found. 2002-12-08 14:46:46.67 spid4 Buffer Distribution: Stolen=19252 Free=0 Procedures=128 Inram=368556 Dirty=61 Kept=0 I/O=0, Latched=227, Other=0 2002-12-08 14:46:46.67 spid4 Buffer Counts: Commited=388224 Target=388224 Hashed=368844 InternalReservation=561 ExternalReservation=6424 Min Free=512 2002-12-08 14:46:46.67 spid4 Procedure Cache: TotalProcs=14 TotalPages=128 InUsePages=128 2002-12-08 14:46:46.67 spid4 Dynamic Memory Manager: Stolen=10431 OS Reserved=1768 OS Committed=1729 OS In Use=1651 Query Plan=229 Optimizer=0 General=10042 Utilities=1302 Connection=509
Is this a dedicated SQL server ? What other processes if any, are running, and do they appear to consume much mem in task mananger? Is sql configured to have free reign over all accessible RAM? Also what version of the server are you running, and what SP level? Try using the DBCC MEMORYSTATUS command to see how memory is allocated within the various subsystems of sql server, perhaps this will give some ideas. Is this reproducible, and if so try tracing and a memory perfmon, to see if any sp's or jobs are eating away at the memory. Also it might be worth looking at the -g startup parameter mentioned in SQL7 SP4 readme ..http://support.microsoft.com/defaul...om/support/servicepacks/SQL/7.0/sp4ReadMe.asp
10x for the help Chappy. Yes, it is a dedicated SQL server. It is also a distributor and other processes that take memory are the distrib jobs (26X6.5MB = appx. 150MB), so nothing special. The server is under a constant heavy load. The proc cashe is appx. 170MB (22,000 pages) and the data pages in buffer are 3.6GB. This is not unusual, as I have enabled the AWE and it means that the server holds in memory as many data as it could. The max server memory setting is 4GB at this stage. We used to run this server with only 4GB of phys RAM (no AWE and max server mem 2GB) for quite a long time and then this error occured. Now I have replaced the memory with 8GB and the error again occured after a few days (but the max memory is still 4GB - there is an argument here whether to increase it or not). No cursors are in memory, no pinned tables but just a very heavy load from stored procedures and applications. I'm sending the DBCC MEMORYSTATUS results. Another weird thing is that in DBCC SQLPERF(lrustats) the number of free buffers varies widely from 50000 to 5 in a short period of time. Buffer Distribution Buffers ------------------------------ ----------- Stolen 47707 Free 1055 Procedures 73366 Inram 0 Dirty 14217 Kept 0 I/O 0 Latched 190 Other 375465 (9 row(s) affected) Buffer Counts Buffers ------------------------------ ----------- Commited 512000 Target 512000 Hashed 389872 InternalReservation 699 ExternalReservation 28544 Min Free 511 (6 row(s) affected) Procedure Cache Value ------------------------------ ----------- TotalProcs 20643 TotalPages 73366 InUsePages 35106 (3 row(s) affected) Dynamic Memory Manager Buffers ------------------------------ ----------- Stolen 100081 OS Reserved 2560 OS Committed -201 OS In Use 2435 General 17468 QueryPlan 78617 Optimizer 0 Utilities 5306 Connection 1125 (9 row(s) affected) Global Memory Objects Buffers ------------------------------ ----------- Resource 4627 Locks 248 XDES 40 SQLCache 730 Replication 12 LockBytes 2 ServerGlobal 49 (7 row(s) affected) Query Memory Objects Value ------------------------------ ----------- Grants 1 Waiting 0 Available (Buffers) 154040 Maximum (Buffers) 203576 (4 row(s) affected) Optimization Queue Value ------------------------------ ----------- Optimizing 0 Waiting 0 Available 32 Maximum 32 (4 row(s) affected)
What are your memory settings in the boot.ini files, and can you send the output of sp_configure for this server? ------------------ Brad M. McGehee Webmaster SQL-Server-Performance.Com
/PAE /3GB are in the boot.ini sp_configure returns: affinity mask-2147483648214748364700 allow updates0100 awe enabled0111 c2 audit mode0100 cost threshold for parallelism03276700 cursor threshold-12147483647-1-1 default full-text language0214748364710331033 default language0999900 fill factor (%)010000 index create memory (KB)704214748364700 lightweight pooling0100 locks5000214748364700 max degree of parallelism03200 max server memory (MB)4214748364740004000 max text repl size (B)021474836476553665536 max worker threads32327670255 media retention036500 min memory per query (KB)5122147483647512512 min server memory (MB)02147483647512512 nested triggers0100 network packet size (B)5126553640964096 open objects0214748364700 priority boost0100 query governor cost limit0214748364700 query wait (s)-12147483647-1-1 recovery interval (min)03276700 remote access0111 remote login timeout (s)021474836472020 remote proc trans0100 remote query timeout (s)0214748364700 scan for startup procs0111 set working set size0100 show advanced options0111 two digit year cutoff1753999920492049 user connections03276700 user options03276711921192
I am curious why you have the max server memory set to 4000, when you have 8GB of RAM in your server. You have said that there is an argument about this setting? If I was you, I would allocate much more RAM to SQL Server and see if that helps. Can you perhaps outline the pros and cons of this arguement? Also, the "cost threshold for parallelism 0 32767 0 0" is currently set to a value of 0, the default value is 5. Why have you chosen a setting of 0? Below is a tip I have for this setting on my website that you might find useful: "Using parallelism to execute a SQL Server query has its costs. This is because it takes a little additional overhead to run a query in parallel than to run it serially. But if the benefits of running a query using parallelism is higher than the costs, then using parallelism is a good thing. As a rule of thumb, if a query can run serially very fast, there is no point in even considering parallelism for the query, as the extra time required to evaluate it for possible parallelism might be longer than the time it takes to run the query serially. By default, if the Query Optimizer determines that a query will take less than 5 seconds to execute, parallelism is not considered by SQL Server. This 5 second figure can be modified using the "cost threshold for parallelism" SQL Server option. You can change this value anywhere from 0 to 32767 seconds. So if you set this value to 10, this means that the Query Optimizer won't consider parallelism for any query that it thinks will take less than 10 seconds to run. In most cases, you should not change this setting. But if you find that your SQL Server runs many queries with parallelism, and if the CPU rate is very high, raising this setting from 5 to a higher figure (you will have to experiment to find the ideal figure for your situation), will reduce the number of queries using parallelism, also reducing the overall usage of your server's CPUs, which may help the overall performance of your server. Another option to consider is to reduce the value from 5 seconds to a smaller number, although this could hurt, rather than help performance in many cases. One area where a smaller value might be useful is in cases where SQL Server is acting as a data warehouse and many very complex queries are being run. A lower value will allow the Query Optimizer to use parallelism more often, which can help in some situations. You will want to test changes to the default value thoroughly before implementing it on your production servers. If SQL Server only has access to a single CPU (either because there is only one CPU in the server, or because of an "affinity mask" setting, parallelism is not considered for a query. If you find in your audit that the cost threshold for parallelism is being used, find out why. If you can't get an answer, move it back to the default value." It is possible that a setting of 0 is producing unnecessary overhead on your server, contributing to some performance issues on your server. Also, I noticed that your min memory per query (KB) is set to 512K, instead of the default 1024K. I am curious as to why this setting has been made. Generally speaking, this setting is increased, not decreased, to boost performance. Also, if you have not done so, read this article at:http://www.sql-server-performance.com/awe_memory.asp ------------------ Brad M. McGehee Webmaster SQL-Server-Performance.Com
Hi Brad, apparently you're the only one taking this problem seriously. I really appreciate your help. The hints you've given me so far are good and helpful but they idicate that I still haven't managed to describe the real problem. Yes, I just noticed the parallelism and max memory per query settings and tuned them to more reasonable values (I have no access to the DBA who has set this server up, so I have no idea what he's meant by these settings). However, the problem is that at a certain point I have no free buffers available. The AWE is enabled, I have ~150MB of proc cache in memory and ~3.5GB of datapages. If I execute DBCC SQLPERF(lrustats) repetitively, I notice that the number of free buffers varies from 500,000 to 1. At a given moment it becomes 0 and the server hangs reporting an error (see my first posted message). The argument about the memory size is that for a very long period of time this server had been working with 4GB RAM, AWE not enabled (basically 2GB then) and at a given time it just crashed with this error (at that point I got involved into this problem). Another argument is that each processor needs at least 512MB for proper operation and 8*512 = 4GB (I still have problems believing this!?!?). Anyway, my problem is: why SQL Server doesn't free buffers when such are needed (I have no tables pinned in memory) and what might have caused this increased hunger for buffers. We are still investigating if it is a HW failure. So far we have eliminated the RAM but it might be also the controllers. Do you have any other ideas? 10x Stanley
I am not familiar with the "argument is that each processor needs at least 512MB for proper operation". This is new too me, and also hard to believe. If this were true, how could it have run when you only at 4GB on the server. The logic doesn't seem right. If this were my server, I would bump up the maximum memory to at least 6GB for SQL Server and see what happens. Of course, depending on the cause of your problem, this may or may not help. If it is solely a memory problem (not enough of it), then it will fix it. But if it is bug-related, it won't. I have been doing some research on how SQL Server frees up buffers, and it can occur two different ways. It can be done by the Lazywriter process or by a user thread. In either case, SQL Server is supposed to dynamically allocate free space in the buffers. According to Microsoft, the minimum amount of free space in the buffer cache is supposed to range from a minimum of 128KB to a maximum of 4GB. Based, on this, your system should never have less than 128K of free space. Once the buffer cache below the currently established minimum, SQL Server automatically goes out and frees pages to make more room. There are no settings for this option, it happens automatically. That is why I am suggesting that you might have run into a bug, as what you have described appears to be different that how SQL Server is supposed to behave. I don't know if it would be worth your while, but at this URL: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/adminsql/ad_perfmon_0ugk.asp is a list of counters you can use to more closely watch the AWE memory behavior. It might give you a clue as to what is happening. Also, it is remotely possible, as you have said, that it might be hardware or driver related, and you should consider this possibility. But if after you try bumping up the maximum memory, and check out these counters, and you can't still find the answer, this is the time I (personally) would call Microsoft Support. I always try everything I can before I call, but I end up calling Microsoft Support about 4-6 times a year, and it almost every case, it is due to a bug-related issue. ------------------ Brad M. McGehee Webmaster SQL-Server-Performance.Com
Did you set the MaxMemory for each SQLServer instance ? afaik sqlserver attempts to take all when awe is enabled.
To the previous post:I have just one SQL instance installed. By this moment I haven't reproduced the error with the new settings applied and I hope I have solved the problem. Thanks to everyone for the help! Cheers, Stanley