SQL Server Performance

Event ID 1146 and 1069 - strange failover scenario

Discussion in 'SQL Server Clustering' started by franco, Aug 4, 2005.

  1. franco New Member

    Hi all,
    I would like to share with you a strange situation that we noticed on a
    2-node cluster running Windows Server 2003 with no service pack.
    The cluster is configured as follows:
    Node A hosts SQL Server 2K sp3a and file system
    Node B hosts Oracle 9i and Lotus Domino Server 6.5.3.
    Sometimes, not always, when we move one group to the other node we end up in
    very strange situations, like Event id 1146 and Event id 1069.
    The description of event id 1146 is:
    The cluster resource monitor died unexpectedly, an attempt will be made to
    restart it.
    The description of event id 1069 is:
    Cluster resource 'Network name' in Resource Group 'Group name' failed.

    The problem is that when this happens, every group in the cluster is
    affected.
    I mean, for example, when we move SQL Server group to the other node and
    this problem appears, all the other groups, like Oracle group or Lotus group
    are affected by Event id 1146 and event id 1069.
    I know there is a fix provided by Microsoft here:
    The KB article says:
    This problem occurs if the following conditions are true:
    • You have a network name resource with Kerberos enabled.
    • The name of the computer object that corresponds to the Kerberos-enabled
    network name resource of the cluster is located in an organizational unit
    that has a forward slash character in its name.

    None of these conditions apply to us, but here comes my guess:
    When Lotus Domino was first installed on the cluster, one share was also
    I don't know if this share was created following the rules posted in this MS
    I know for sure that when you create a cluster share the following registry
    key is populated:
    I also notice that when you take offline that share, also the registry key
    The problem is that under that key we found the name of a share that was
    first created for the Lotus Domino group and then deleted, but again I
    don't know how this was done.
    Here comes the question:
    Is it possible that this dirty registry key is the cause of the
    strange failover scenario described above?
    Sorry for the long post and for my bad English, but every suggestion is
    really appreciated.

  2. Argyle New Member

    1069 is "Cluster resource 'Network Name' in Resource Group 'GroupName' failed."

    Check your DNS settings to make sure there is no name collision when the failover occurs. A common issue is that when the group goes offline, the name is not correctly unregistered from DNS. Then, when it tries to come back online on the other node, it can't because the name already exists on the network.
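    A minimal sketch of such a check, assuming you know which IP address the virtual network name is supposed to resolve to. The function name, the example name and the address are hypothetical; the resolver is passed in so the logic can be tried without touching the network.

```python
import socket


def check_network_name(name, expected_ip, resolver=socket.gethostbyname_ex):
    """Resolve a cluster network name and compare it with the expected IP.

    Returns a (status, addresses) tuple where status is one of
    'ok', 'mismatch' or 'unresolved'.
    """
    try:
        # gethostbyname_ex returns (hostname, aliases, ip_addresses)
        _host, _aliases, addresses = resolver(name)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError
        return "unresolved", []
    if addresses == [expected_ip]:
        return "ok", addresses
    # Extra or different addresses may point to a stale or colliding record
    return "mismatch", addresses
```

    For example, calling check_network_name("SQLVIRTUAL01", "10.0.0.50") (both values made up) before and after moving the group would show whether the name still resolves to an unexpected address. This only looks at what DNS returns; it does not prove or rule out a NetBIOS name conflict.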
  3. franco New Member

    Thank you for the suggestion.

  4. franco New Member

    I don't have "DNS Registration Must Succeed" enabled on the network name cluster resource.
    Is your suggestion still valid?
    Please advise.

  5. Argyle New Member

    Any other error messages besides those above?
  6. franco New Member

    No other errors.

  7. Argyle New Member

    Which network name in which group is the 1069 error for? The cluster group, sql group, oracle group or domino group?
  8. franco New Member

    Typically event id 1069 affects 2 groups at a time, but sometimes even more.
    Yesterday was SQL Server and File system and one week ago was Lotus Domino and Oracle.

  9. Argyle New Member

    Hard to say what it is. It could be a conflict with some of the resource DLLs. I would check %windir%\cluster\cluster.log and, if no additional info is found there, open a case with Microsoft.
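    A rough sketch for pulling the interesting lines out of that log, under the assumption that the log marks severity with tokens such as ERR and WARN surrounded by spaces. The function names and the marker list are illustrative, so adjust them to what your log actually contains.

```python
import os


def find_problem_lines(lines, markers=(" ERR ", " WARN ")):
    """Return the lines that contain any of the given severity markers."""
    return [line.rstrip("\n") for line in lines
            if any(marker in line for marker in markers)]


def scan_cluster_log(path=None):
    """Read cluster.log and return only the lines flagged by the markers.

    When no path is given, fall back to %windir%\\cluster\\cluster.log,
    using C:\\WINDOWS if the windir variable is not set (an assumption).
    """
    if path is None:
        windir = os.environ.get("windir", r"C:\WINDOWS")
        path = os.path.join(windir, "cluster", "cluster.log")
    # errors="replace" keeps the scan going if the log has odd bytes
    with open(path, errors="replace") as handle:
        return find_problem_lines(handle)
```

    Running scan_cluster_log() on each node right after a failed move, and comparing the timestamps with Event ID 1146 and 1069, should narrow down which resource was being handled when the resource monitor died.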
  10. franco New Member

    I think the same.
    Thanks for all your time and support.

