SQL Server Performance

SQL Server Agent Stops Unexpectedly

Discussion in 'SQL Server 2005 Clustering' started by DBADave, Dec 22, 2008.

  1. DBADave New Member

    We've had problems recently where our SQL Server Agent service has stopped unexpectedly in both our 32-bit and 64-bit clusters. The following messages appear in the Event Viewer logs.
    The description for Event ID ( 17052 ) in Source ( MSSQLSERVER ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: SQLServerAgent Monitor: SQLServerAgent has terminated unexpectedly..
    The description for Event ID ( 17052 ) in Source ( MSSQLSERVER ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: SQLServerAgent Monitor failed to restart SQLServerAgent after SQLServerAgent terminated unexpectedly (reason: SQLSCMControl() returned error 5040, 'The cluster node already exists.')..
    [sqagtres] CheckServiceAlive: Service is dead
    [sqagtres] CheckServiceAlive: Service is dead
    [sqagtres] CheckServiceAlive: Service is dead
    SQLServerAgent service successfully started.
    Any ideas?
    Thanks, Dave
  2. satya Moderator

    It looks like you may have removed the local admin group from the server. If that is the case, your cluster will always fail, because this group is what the cluster service account uses to log on and ensure that SQL is up and running.
    You need to add the cluster service account back into SQL. It does not need to be in the sa role; it just needs to be a regular user. I give it read/write access to tempdb and set its default database to tempdb as well.

  3. DBADave New Member

    Hi Satya,
    The local admin group still exists on both nodes. I didn't even know it could be removed. In our case the cluster errors have only happened twice: once on a 32-bit cluster and once on a 64-bit cluster. If the local admin group were removed, wouldn't the cluster service errors be continuous? And wouldn't the same also be true if the cluster service account had incorrect permissions? I'll verify permissions tomorrow to make sure everything is ok.
    Thanks, Dave
  4. satya Moderator

    The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer
    This message occurs if the service account is unable to get the relevant permissions, such as ADMIN, on the system. Check it out.
  5. DBADave New Member

    I asked one of our server admins to see why I can't view the application log message using either my id or the SQL Server service account. He is investigating the issue, but it is only error 17052 that fails to show all of the details. All other errors and warnings I can see without a problem. I don't see a connection between the Event Viewer problem and why the SQL Server Agent restarted on its own. This has happened only on two of our three SQL clusters so I am wondering if this is a cluster issue. Have you ever heard of this problem before?
  6. satya Moderator

    Dave
    I remember seeing such issues when one of the accounts used for SQL Server Agent had been disabled (mistakenly) by the security team, and also once due to a password change. If you have ruled out those two options, then I feel you should monitor all the logs.
  7. DBADave New Member

    It happened again. We are going to call Microsoft in the morning, but do you see anything in the cluster log?
    00000424.00001428::2009/01/23-00:41:07.222 INFO [Qfs] GetDiskFreeSpaceEx Q:MSCS, status 0
    00001570.000015bc::2009/01/23-00:41:23.519 ERR SQL Server Agent <SQL Server Agent>: [sqagtres] CheckServiceAlive: Service is dead
    00001570.000015bc::2009/01/23-00:41:23.519 ERR SQL Server Agent <SQL Server Agent>: [sqagtres] CheckServiceAlive: Service is dead
    00000424.000015cc::2009/01/23-00:41:23.519 INFO [FM] NotifyCallBackRoutine: enqueuing event
    00000424.000015cc::2009/01/23-00:41:23.519 INFO [FM] Calling RmNotifyChanges in monitor 1570.
    00000424.00001458::2009/01/23-00:41:23.519 INFO [CP] CppResourceNotify for resource SQL Server Agent
    00000424.00001424::2009/01/23-00:41:23.519 WARN [FM] FmpHandleResourceTransition: Resource Name = 32f19e97-c3a5-4d91-808c-475bd0913467 [SQL Server Agent] old state=2 new state=4
    00000424.00001424::2009/01/23-00:41:23.519 INFO [GUM] GumSendUpdate: Locker waiting type 0 context 8
    00000424.00001424::2009/01/23-00:41:23.519 INFO [GUM] Thread 0x1424 UpdateLock wait on Type 0
    00000424.00001424::2009/01/23-00:41:23.519 INFO [GUM] GumpDoLockingUpdate: lock was free, granted to 1
    00000424.00001424::2009/01/23-00:41:23.519 INFO [GUM] GumpDoLockingUpdate successful, Sequence=20800 Generation=0
    00000424.00001424::2009/01/23-00:41:23.519 INFO [GUM] GumSendUpdate: Locker dispatching seq 20800 type 0 context 8
    00000424.00001424::2009/01/23-00:41:23.519 INFO [GUM] GumSendUpdate: Dispatching seq 20800 type 0 context 8 to node 2
    00000424.00001424::2009/01/23-00:41:23.519 INFO [GUM] GumSendUpdate: Locker updating seq 20800 type 0 context 8
    00000424.00001424::2009/01/23-00:41:23.519 INFO [GUM] GumpDoUnlockingUpdate releasing lock ownership
    00000424.00001424::2009/01/23-00:41:23.519 INFO [GUM] GumSendUpdate: completed update seq 20800 type 0 context 8
    00000424.00001424::2009/01/23-00:41:23.519 INFO [FM] FmpPropagateResourceState: resource 32f19e97-c3a5-4d91-808c-475bd0913467 failed event.
    00000424.00001424::2009/01/23-00:41:23.519 INFO [FM] FmpHandleResourceFailure: taking resource 32f19e97-c3a5-4d91-808c-475bd0913467 and dependents offline
    00001570.000015d8::2009/01/23-00:41:23.519 ERR SQL Server Agent <SQL Server Agent>: [sqagtres] CheckServiceAlive: Service is dead
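    For anyone scanning a similar cluster log, here is a minimal Python sketch that pulls out only the WARN and ERR entries. The line layout (process.thread IDs, a `::` separator, a timestamp, a level, then the message) is inferred from the excerpt above, not from any official format description, and the names `parse_line`, `notable`, and `Entry` are made up for illustration.

```python
import re
from datetime import datetime
from typing import Iterable, List, NamedTuple, Optional

# Layout assumed from the excerpt above:
#   <pid>.<tid>::<yyyy/mm/dd-hh:mm:ss.mmm> <LEVEL> <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>INFO|WARN|ERR)\s+(?P<msg>.*)$"
)


class Entry(NamedTuple):
    pid: str        # process ID (hex), as written in the log
    tid: str        # thread ID (hex), as written in the log
    when: datetime  # parsed timestamp
    level: str      # INFO, WARN or ERR
    msg: str        # rest of the line


def parse_line(line: str) -> Optional[Entry]:
    """Parse one log line; return None if it does not match the assumed layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(m.group("ts"), "%Y/%m/%d-%H:%M:%S.%f")
    return Entry(m.group("pid"), m.group("tid"), when, m.group("level"), m.group("msg"))


def notable(lines: Iterable[str]) -> List[Entry]:
    """Keep only entries whose level is not INFO."""
    parsed = (parse_line(line) for line in lines)
    return [e for e in parsed if e is not None and e.level != "INFO"]


# Three lines copied from the excerpt above.
SAMPLE = [
    "00000424.00001428::2009/01/23-00:41:07.222 INFO [Qfs] GetDiskFreeSpaceEx Q:MSCS, status 0",
    "00001570.000015bc::2009/01/23-00:41:23.519 ERR SQL Server Agent <SQL Server Agent>: "
    "[sqagtres] CheckServiceAlive: Service is dead",
    "00000424.00001424::2009/01/23-00:41:23.519 WARN [FM] FmpHandleResourceTransition: "
    "Resource Name = 32f19e97-c3a5-4d91-808c-475bd0913467 [SQL Server Agent] old state=2 new state=4",
]

if __name__ == "__main__":
    for entry in notable(SAMPLE):
        print(entry.when.isoformat(timespec="milliseconds"), entry.level, entry.msg)
```

    To run it against a real file, pass the open file object to `notable()` instead of `SAMPLE`. Lines that do not match the assumed layout are skipped silently, so a low match count is a hint that the pattern needs adjusting for that log.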
  8. satya Moderator

  9. Saurabh Srivastava New Member

    Dave, did you find a solution for this issue?
  10. DBADave New Member

    Here's an update. We opened a case with Microsoft in late January. They still have not been able to determine the cause of the problem. I did notice that every time the SQL Server Agent restarts automatically we are experiencing a large number of deadlocks. We got the WMI team involved, but they feel WMI is ok. I'm still not sure. Yesterday during WMI testing our server locked up while running WMIDiag and UserDump 8.1 at the same time. I'm not sure which, if any, caused the server to become unresponsive, but my guess is it was UserDump 8.1 in conjunction with some other factor. But that's a discussion for another day. Anyhow, we are trying to get SQL Server Agent to produce a dump the next time it restarts due to this bug. We may install ADPlus and hope to produce a dump with that utility. I'll update everyone when I know more, but I'm curious if anyone else experiencing this problem has also noticed deadlocks occurring at the same time.
    Dave
  11. satya Moderator

    Do you have any third-party applications that link up with the scheduler or the Agent?
  12. charlii New Member

    SQL Server can stop when you change your password or delete your account by mistake. Check whether either of these happened, and if so, resolve it by managing your accounts again.
  13. FrankKalis Moderator

    Thanks for your contribution, but please have a look at the date of the threads you are contributing to. This one is almost 1 year old. Chances are that the original questioner has already found a solution, or doesn't follow the thread any longer... :)
