SQL Server Performance

Bad day at the office!

Discussion in 'General DBA Questions' started by thomas, Dec 17, 2004.

  1. thomas New Member

    My Data Warehouse load started taking 2+ hours longer than usual the last couple of days. Normally when this happens I just reboot it and it's fine for another 4 or 5 weeks or so. It's a quick, easy solution and saves me spending hours and hours troubleshooting it.

    So this morning I rebooted it (after warning users). Ever since then it's been SICK. Can't log on to the desktop using domain credentials, and there are weird error messages all over the place when trying to do network-related things (e.g. changing the NetBIOS name). A colleague and I tried removing it from the domain, but that wouldn't work (another weird error, which I don't have to hand as I'm home now). You can rename it ON the domain but can't take it out into a workgroup and then re-add it.

    Maybe a WINS database corruption? We applied those new WINS patches to our domain controllers today [still on NT 4.0 DCs :(]. Anyway, I had to come home and leave it; my boss was having a look remotely.

    Don't know if anyone's got any ideas? I have checked for viruses - clean. I'm going to have to try to forget it over the whole weekend and face it on Monday (not pleasant).

    It's Windows 2000 Standard Edition SP4 + patches and SQL Server 2000 Standard Edition SP3a + hotfix.

    Tom Pullen
    DBA, Oxfam GB
  2. Chappy New Member

    Certainly sounds like it could be an issue with WINS, consider giving MS a call.

    In the meantime, check your network card properties to make sure there are no bad/dropped TCP/IP packets. Maybe replace the network card too if you can.

    Good luck!
  3. thomas New Member

    Yeah, I was ready to phone MS, but in our sector you have to check and double-check before getting the corporate credit card out, if you get my drift... If it's still not playing on Monday, I will get straight on the phone. Thanks for the reply.

    Tom Pullen
    DBA, Oxfam GB
  4. Luis Martin Moderator

    Sounds like a network card issue to me.


    Luis Martin
    Moderator
    SQL-Server-Performance.com

    All postings are provided “AS IS” with no warranties for accuracy.

  5. thomas New Member

    yes but the network card is working fine - you can ping, you can net view, you can map drives using domain credentials, it's just certain things don't work. it's quite perplexing..

    it actually has 3 NICs, 1 for public network, 1 for backup LAN, 1 for private switched LAN to source servers for the load, etc. it seems more software-related to me than anything.


    Tom Pullen
    DBA, Oxfam GB
  6. Luis Martin Moderator

    How about the latest drivers for the NICs?

    Luis Martin
    Moderator
    SQL-Server-Performance.com

    All postings are provided “AS IS” with no warranties for accuracy.

  7. thomas New Member

    yeah they're up-to-date, thanks for the suggestion. One idea is to switch the backup LAN card to be the public one and vice versa, but as I've said, I think it's OS/domain related. I think the NICs are fine, as some basic network things work.

    Tom Pullen
    DBA, Oxfam GB
  8. Twan New Member

    Hi Tom,

    I'd suggest that you check your DCs' error logs for any messages, and enable additional auditing if you need to see more info. What message do you get when trying to log onto the desktop with a domain account?

    Cheers
    Twan
  9. thomas New Member

    Twan, thanks - I am at home so don't have all the error messages to hand, it's a series of seemingly spurious errors (e.g. computer's domain account is missing, etc.)

    My manager has been fighting with it from home most of the day and his diagnosis is a W2K re-install, with which I agree - a clean start. We can leave the data partitions intact, so recovery will be relatively painless.

    At one stage he phoned me and he thought it was fixed, while he was on the phone, he tried to restart SQL Server (running under a domain account) and it failed with login failure. It's just a series of these kind of errors... The DCs look fine and there are no issues with any other servers on the domain so, we kind of conclude, it's a local OS-related issue, reasonable, no?

    Thanks for all of your help and suggestions, and I will update you all with news before we go away for 4 days' rest next weekend (for me, it can't come too soon...)

    Tom Pullen
    DBA, Oxfam GB
  10. derrickleggett New Member

    Personally, I've never seen local OS cause this. If they just changed the WINS and you're having trouble with that, that would be the logical thing to troubleshoot. Have you tried dropping the computer completely from all WINS and DNS servers, then reapplying? That would be the logical path in this instance.

    Also, why are you still using WINS? It's never been reliable. I don't understand why people still use it.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  11. thomas New Member

    Derrick, good questions.. and the short answer is, I don't know. Our network guys are a law unto themselves...

    Tom Pullen
    DBA, Oxfam GB
  12. derrickleggett New Member

    <RANT>

    :) That just bugs the fire out of me. The first response of network guys is to always say "it's not us". You then find out it is, or work around the issue until you have things working. They should never be a "law unto themselves". If they are, someone should fire your IT management.

    </RANT>

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  13. thomas New Member

    Derrick, I feel your pain. But I am a minion, not a fromage grande...

    Tom Pullen
    DBA, Oxfam GB
  14. Twan New Member

    Hi ya,

    I must concur with Derrick that this doesn't feel like a local OS problem... Have you tried:
    - reset the domain machine account
    - remove the sql server from the domain, delete the machine account and then re-add it
    - check that, when logged into the sql server, you can ping the dc and resolve its name to an ip

    have you turned on the success/failure auditing of account logons on the DCs and then checked to see what messages are shown?
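The name-resolution check in the list above could be scripted roughly as follows. This is only a sketch in Python and not something anyone in the thread ran: the function name resolve_host is made up for illustration, and the name you pass in would be whatever your DC is called. It only confirms that the system resolver returns an IPv4 address for the name. It does not tell you which mechanism (WINS, DNS, hosts file) answered, and it says nothing about the health of the machine account.

```python
import socket


def resolve_host(name):
    """Return the sorted IPv4 addresses that `name` resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                                   type=socket.SOCK_STREAM)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when the lookup fails.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_host with the DC's name from the SQL Server box and getting an empty list back would point at name resolution; a non-empty list means the problem is probably elsewhere.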

    Cheers
    Twan
  15. thomas New Member

    Twan, no - I haven't looked at the DC security logs yet. I had to leave on Friday and haven't been back yet.

    - you can't remove it from the domain, only rename it ON the domain, and that throws errors. We tried manually deleting the computer account from the Domain (on a DC), resynched the domain, then re-added it manually (from the DC, not the sql server), still same problems
    - my manager tried some hack he found on the internet to reset the SID for the machine account, this didn't work either
    - you CAN ping the PDC by name, so name resolution from the sqlserver doesn't seem to be an issue

    The symptoms don't seem like a local problem but if it isn't, why isn't any other server having the same problem, and why do the spurious errors only get thrown up locally and not anywhere else?

    Thanks for ideas..


    Tom Pullen
    DBA, Oxfam GB
  16. derrickleggett New Member

    WINS database is corrupt. I would almost bet money on it.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  17. satya Moderator

    Tom

    If you get a chance post the error messages and hope this festive week will bring the joy back.... for your OLAP users [:D]

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  18. thomas New Member

    Example error message from Friday (when trying to remove the server from the domain)

    Network Identification

    The following error occurred attempting to unjoin the domain "xxxxx".

    The specified service does not exist.

    No Error Number, etc.

    Still struggling with it!

    Tom Pullen
    DBA, Oxfam GB
  19. satya Moderator

    Sounds like a group policy is blocking this. Check your local security policies.
    Are you using Active Directory? If so, then check:
    Settings,CN=<servername>,CN=Servers,CN=<sitename>,CN=Sites,CN=Configuration,DC=<domain>...

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  20. thomas New Member

    No, it's an NT 4.0 domain.

    It is now fixed. repair of OS booting from W2K CD fixed it. All data intact and no need to reinstall SQL Server. Almightily relieved DBA in the house!

    Tom Pullen
    DBA, Oxfam GB
  21. Twan New Member

    Hi Tom,

    are you able to send a list of the disabled services on both the SQL box and the DC?

    Cheers
    Twan
  22. Joozh New Member

    quote:Originally posted by thomas

    ...
    It is now fixed. repair of OS booting from W2K CD fixed it. All data intact and no need to reinstall SQL Server. Almightily relieved DBA in the house!

    Tom Pullen
    DBA, Oxfam GB

    Congrats Tom. Glad to hear that it's finally solved.

    Regards.
  23. thomas New Member

    quote:Originally posted by Twan

    Hi Tom,

    are you able to send a list of the disabled services on both the SQL box and the DC?

    Cheers
    Twan

    yes twan will do when I get a second.

    got downtime on live financial OLTP server 3pm - 7pm to move a couple of large tables to their own partition, do some memory tuning and try to get missing SQL Server perfmon counters back, so in all i'm a bit busy, but damn glad my DW is working again!

    Tom Pullen
    DBA, Oxfam GB
  24. Twan New Member

    Hi Tom,

    if your server is working now, then don't worry about sending the list of disabled services... presumably a repair would have reset those back to defaults anyway...

    Cheers
    Twan
  25. thomas New Member

    yes indeed. had more fun & games with this server today, the network connection (public LAN) kept disconnecting and interrupting user queries. arrghhhh.. why me? I got it re-patched into a different port on the switch and it seems to have fixed it.

    do any of you ever get the feeling that it never rains but it pours?

    Tom Pullen
    DBA, Oxfam GB
  26. Adriaan New Member

    Hm, a bad network card perhaps? Or a bad one that is not actually in use but is still disturbing the system?
  27. thomas New Member

    i have had those thoughts myself but ... the intermittent nature of it makes me suspect it's not the card. my belief (maybe wrong, but..) is that network cards tend to be like processors, i.e. they either work or they don't, there's not much "intermittency" likely to happen. that's my boss's opinion, and he's pretty hardware-savvy.


    Tom Pullen
    DBA, Oxfam GB
  28. Twan New Member

    speed and duplex settings on the server and switch port can sometimes cause intermittent problems... check that they're either both set to auto or both hard fixed to 100Mb/full or 1000Mb/full
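The rule above (both ends auto, or both ends hard-set to the same speed and duplex) can be written down as a small check. This is a hedged sketch in Python: check_link_settings and the way settings are represented ("auto" or a (speed_mbps, duplex) tuple) are assumptions made for illustration, and the values have to be read off the NIC properties and the switch port config by hand.

```python
def check_link_settings(server, switch):
    """Compare the NIC setting with the switch port setting.

    Each argument is either the string "auto" or a (speed_mbps, duplex)
    tuple such as (100, "full").
    """
    if server == "auto" and switch == "auto":
        return "ok: both auto"
    if server == "auto" or switch == "auto":
        # One end negotiates while the other is hard-set.
        return "mismatch: one side auto, the other fixed"
    if server != switch:
        return "mismatch: fixed settings differ"
    return "ok: both fixed to %d/%s" % server
```

For example, check_link_settings((100, "full"), "auto") reports a mismatch, which is the case worth ruling out when a link drops intermittently.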

    Cheers
    Twan
  29. thomas New Member

    yes the public one is set to 100/full duplex and the private ones are auto-detect as directed by our network men.

    they say that these issues (which have also intermittently affected other servers) may indicate a requirement for the edge switches to be rebooted to clear the arp tables, etc. Sounds like a good idea although i must admit when it comes to networks most of it sounds like a load of old 'arp to me anyway. Boom boom!

    Tom Pullen
    DBA, Oxfam GB
  30. Twan New Member

    forgive me but arp tables are a layer 3 feature and therefore wouldn't be on a switch, but would be on routers? switches have cam tables to map ports to mac addresses, but these tend to be very dynamic and tend to not get out of date (in my experience)
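To illustrate why a CAM table rarely goes stale, here is a toy model in Python of how a switch learns MAC-to-port mappings. It is a sketch only: the CamTable class, its method names and the 300-second aging value are assumptions for illustration, not the behaviour of any particular switch.

```python
class CamTable:
    """Toy MAC address table: maps a MAC address to the port it was last seen on."""

    def __init__(self, max_age_seconds=300.0):
        # Aging time is an assumed value; real switches make this configurable.
        self.max_age = max_age_seconds
        self._entries = {}  # mac -> (port, last_seen)

    def learn(self, mac, port, now):
        """Record (or refresh) the port for the source MAC of a received frame."""
        self._entries[mac.lower()] = (port, now)

    def lookup(self, mac, now):
        """Return the port for `mac`, or None if unknown or aged out."""
        key = mac.lower()
        entry = self._entries.get(key)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen > self.max_age:
            # Entry is too old: forget it, so the switch would flood instead.
            del self._entries[key]
            return None
        return port
```

Because every frame a host sends refreshes or overwrites its entry, moving a cable to another port corrects the table as soon as the host transmits again.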

    doesn't sound quite right to me...

    cheers
    Twan
  31. thomas New Member

    these things are "passports" which operate as both switches AND routers, allegedly.

    Tom Pullen
    DBA, Oxfam GB
  32. hoo-t New Member

    Thomas,
    I'm not an expert in network issues, but if it were me, I would really be pushing for trying a different network card. I do know of at least one case in our shop where a flaky network card had intermittent problems. It's certainly not unheard of for a circuit board to have heat-related problems, or a broken solder joint.

    Steve
  33. thomas New Member

    thanks, steve. if these issues persist, i shall most def be doing just that.

    Tom Pullen
    DBA, Oxfam GB
