SQL Server Performance

Slow-running / sleeping process

Discussion in 'Performance Tuning for DBAs' started by richmondata, Nov 7, 2003.

  1. richmondata New Member

    Occasionally, we run into problems with a long-running process turning into an extremely long-running process because SQL Server puts the process to sleep for long periods. Every so often, the process with wake up and get a little bit more CPU time, then it goes back to sleep.

    This seems to happen mostly with some of our cursor-driven processes when the number of rows is pretty large. It also seems to be somewhat load-dependent.

    I think I saw something about this once somewhere in the knowledge base, but I would like to know more about what SQL Server is doing and how to make it leave the process alone. Frequently, the process would complete in 10-15 minutes if left to run, but the periodic hibernation causes the job to run for a couple hours.

    As a note, with smaller processes, I have gotten SQL Server to leave processes like this alone by switching from a cursor to a pseudo-cursor, e.g. a while loop with select top 1, etc.

    Thanks.
  2. ykchakri New Member

    What is the status of 'Wait Type' while the process is sleeping ? Normally, you see a process as sleeping when the process completed it's assigned work and waiting for the user to assign some other work, or if it is waiting on another resource such as a lock on an object or for disk I/O etc.
  3. richmondata New Member

    This hasn't happened in a while, and I don't recall the waittype. I do know that it wasn't blocked. It's also fairly consistent in the hibernate times. I've had this happen when the server was busy but also when the process in question was the only one running. It does seem like I read something once about SQL Server estimating the total amount of work a query will require and if it exceeds some threshold (undocumented), it processes it in something equivalent to a "background" mode. I've observed this on SQL Server 6.5, 7.0, and 2000.
  4. satya Moderator

    Are the cursors belong to server-side or client-side?
    Also how about indexes and maintenance on these columns?

    _________
    Satya SKJ
    Moderator
    SQL-Server-Performance.Com
  5. richmondata New Member

    The cursors are server-side, SQL cursors declare csr_getsomerows cursor for...., etc.
    There are good supporting indexes for the relevant columns. When the server is under a very light load, the process can run in less than half an hour, but if the load picks up on the server, even with unrelated activity (no blocks, etc), then SQL Server seems to put the process on the back-burner, so that it sleeps for a while, wakes up for 1 or 2 cpu seconds, then goes back to sleep again. It's still making headway and eventually runs, it just takes several hours instead of 20 minutes.
  6. Luis Martin Moderator

    No locks when is sleeping?


    Luis Martin

    ...Thus mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true.
    Bertrand Russell
  7. satya Moderator

    Good point, check the current activity during this peak and off-peak period and assess.

    _________
    Satya SKJ
    Moderator
    SQL-Server-Performance.Com
  8. peggyr New Member

    We have seen a similar problem with a store procedure that is pulling from one table, performing some transformation and inserting into another. Some items that have been noted: -IO drops to virtually nothing, processor moves up to 25% and hangs there, memory use remains steady, no locks, no blocks, no deadlocks. This is the only process running on the box. 2 CPU with hyperthreading enabled. 2GB memory available. "sleep" not related to checkpoints. Query performance in the process is great until the "sleep" period in which the query performance degrades from 15ms to as much as 375+ms - same queries. Query plans have been reviewed and are acceptable. Much analysis just confirms the symptoms, doesn't highlight any clues to cause. Have checked other activity during both active and "stall" time - nothing correlates. Looks like SQLServer just decides to rest for a while.
  9. bradmcgehee New Member

    The idea of SQL Server "sleeping" doesn't make much sense to me. In other words, if SQL Server has the CPU and other necessary resources available, SQL Server will perform the task at hand. Sure, there is a "sleep" state, but this is not a choice SQL Server makes, it is because of some other outside source.

    There are two ways that I know of that may make you think SQL Server is "sleeping", and that is when it is waiting for a client to return data (I see this a lot), or when all the resources on the server, such as CPU and memory, or virtually all used up (and in this case, SQL Server is just waiting for available resources).


    richmondata, I think you issue could be some combination of the above, and getting rid of cursor is a good first step.


    -----------------------------
    Brad M. McGehee, MVP
    Webmaster
    SQL-Server-Performance.Com
  10. peggyr New Member

    Completely agree that the concept of SQLServer "sleeping" doesn't make much sense. Technically, it is actually processing rows. You can see inserts being done into the target table - just VERY slowly. I have tried to identify something else (ie that "outside source") that is taking up the resources, but cannot - I just find symptoms. Originally, we had no problems with this process. It would handle an average of 80K inserts within about 6 minutes. Sometimes it still will. Why would the cursor potentially free up resources when at times the same process (with the cursor) shows no issues? Very baffled and trying not to obsess...
  11. bradmcgehee New Member

    One way to identify "slow outside sources" is to use Profiler, and compare the CPU time to the duration time. If they are way off, this generally means that SQL Server is having to wait for something, and by looking at the query, you can often figure out what.

    -----------------------------
    Brad M. McGehee, MVP
    Webmaster
    SQL-Server-Performance.Com
  12. richmondata New Member

    http://support.microsoft.com/default.aspx?scid=kb;en-us;168551&Product=sql

    I found an article (link above) on the MS Knowledge Base that may point to the problem. The article refers to known performance problems with dynamic cursors under SQL Server 6.5 (the release where I first encountered the periodic extraordinarily brief bursts of activity phenomenon). There are no updates in the KB relating to the problem, so it wouldn't surprise me if it's still an issue. I do know that their suggested workaround is what has worked for me before, i.e. going from a cursor to a "virtual cursor" (select top 1 @key = key where key > @last_key order by key, etc). I am also pretty sure that the KB had at least one article specific to this problem back in the 6.5 days.
  13. AngeloJS New Member

    I have recently joined a team that is having all kinds of database blocking issues with their application. We are using a similar configuration as described by peggyr, and I have been trying to get at the root of the problem but with no success. Our scenario shows a blocking situation that does not resolve itself without manual intervention. The lead blocker is sleeping with transaction level > 0 -- sometimes 2 indicating perhaps its executing inside a trigger. I say that because for the most part transactions are discrete, that is, they are issued in auto-commit mode. The lead blocker is not blocked or awaiting any resource, and all the blocked processes are sleeping as well. Although I have been designing and troubleshooting databases for 20 years, I am a relative newcomer to SQL Server so all opinions would be appreciated. From what I understand, sleeping indicates that the process believes that its work is completed and is awaiting a new command/batch, although the transaction appears to be doing a massive update or delete because it is showing many U locks. Since we are accessing SQL Server through ODBC, I thought perhaps the statement timed out, and the error code on the client was not executing an explicit rollback. Results on this are pending. I also noticed a couple of connections were opend in SERIALIZABLE mode although the only evidence I have seen is the lead blocker "sometimes" holds locks with a mode of 14 (technically a RangeS-S lock) although Enterprise Manager never indicates this, the syslockinfo shows this. This application is not well tuned, and some indexes are missing, so I am hoping that we can perform our way out of this situation, but it seems strange how it is happening. This is not a deadlock situation, but the results of the blocking are deadly. Any help would be appreciated.

    Scott Angelo
    Consultant
  14. Luis Martin Moderator

    Scott:

    Have you Monitored SQL counters?. Like Physical Disk Average Queue Length, Proccesor Queue Length, Full Scan, Cache Hit Ratio.

    This counters give you good information about performance.

    Run also Profiler to find lower SQL or SP.

    One example: if Physical Disk Average Queue Length is > 2 for 10 minuts or more, you have problem with I/O, and this may be cause of bloking.



    Luis Martin

    ...Thus mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true.
    Bertrand Russell
  15. AngeloJS New Member

    Thank you for your response Luis. Yes I have been doing some performance monitoring, and the I/O looked very good actually. Typically over a 4 hour period, capturing data every 15 seconds, the median physical disk busy time was 6% with a queue length that was always 0. The median CPU % time with dual processors with hyperthreading was 13% and the queue length was also zero. I also ran the same tests and measurements with a single CPU and the numbers were horrible: median CPU % time was 40% and median queue length was 6 with a spike to 128. Like I said previously, they have not tuned their application at all and there are some queries and SPs that don't run well, but they do not seem to be specifically involved with the blocking incidents. I suppose what I have not been able to adequately answer is 1) What does it really mean when I see a process with a status of sleeping, and 2) How or when does a process get into a status of sleeping. ALL of the processes involved in the blocking are marked as sleeping and all except the ones that are waiting on Share locks have a transaction level > 0. Somehow, I am convinced that the fact that they are sleeping is the root cause of the problem. The path I am now taking is to tune the application as much as possible in hopes that the contention and blocking are minimized to the point where the conditions in which this can happen goes away. One more point to mention and I'm wondering if anyone has any experience with this is that there is one table involved in all the blocks that is very small -- 1 database page. This table unfortunately is not normalized and has a very high volume of selects and updates -- which means its locked almost all the time. Can this be the root cause and is normalization where we move the updates to another table as inserts and deletes the solution? I have considered this, but given the application, this would not be easy.


    Scott Angelo
    Consultant
  16. Luis Martin Moderator

    Scott:
    1) Process sleeping without block means nothing. Some user wake up the application, log in, and do nothing.
    I think the problem is the table itself. Tables no normalizer cause a lot of problem.
    If all counters are ok, I think the only way is normalize the table.
    Because is saturday, I suggest to wait week end to get more post from others members oppinions.



    Luis Martin

    ...Thus mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true.
    Bertrand Russell
  17. satya Moderator

  18. peggyr New Member

    We have solved our problem regarding this issue. After analyzing the right pieces of data, this came down to a specific query. It did not affect the entire process because of the order in which the records are processed and not all records process this specific quuery. There was also a significant increase in data that would have affected the query since the previous performant runs of the process. I am reminded of two VERY important database processing "rules". (1) miliseconds DO matter. Even though when you run a query and it appears to run quickly, 300 miliseconds is significantly slower than 5 miliseconds. (2) key lookups should be against integers whenever possible. This query was looking at a field that is a character, but could logically be an integer. Upon adding a duplicate field of type integer (and even before creating an index), the performance on the query moved from 300 to 500 miliseconds to less than 5 miliseconds.

    Thank everyone for your input.
  19. Singaravelan New Member

    We are having similar problems after migrating the WinNt server to Win2K3 Server. But the SQL2K remains. We could see that the systems has slowed down a lot and the "Sleeping" processes count has increased much.

    Could the operating system change be a problem? Feedback is really appreciated.

    Thanks
    Singaravelan
  20. satya Moderator

    On the operating system part its not advisable to upgrade rather install a fresh Win2K3 and upgrade SQL from WinNT machine.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.

Share This Page