Strange server hangs

From: Victor Kirhenshtein <victor_at_DOMAIN_REMOVED>
Date: Wed, 16 Nov 2005 19:17:11 +0200

Sometimes a server thread hangs somewhere inside this code (from node.cpp
/ Node::StatusPoll). I can't see any deadlocks here or in the underlying
functions. The only scenario I can imagine is that shutdown(), called
on the agent's socket in pAgentConn->Disconnect(), does not cause recv()
in the other thread to exit with an error. Also, I have seen this problem
only on Windows. Any ideas?

   if (m_dwFlags & NF_IS_NATIVE_AGENT)
   {
      AgentConnection *pAgentConn;

      SetPollerInfo(nPoller, "check agent");
      SendPollerMsg(dwRqId, "Checking NetXMS agent connectivity\r\n");
      pAgentConn = new AgentConnection(htonl(m_dwIpAddr), m_wAgentPort,
                                       m_wAuthMethod, m_szSharedSecret);
      SetAgentProxy(pAgentConn);
      if (pAgentConn->Connect(g_pServerKey))
      {
         if (m_dwDynamicFlags & NDF_AGENT_UNREACHEABLE)
         {
            m_dwDynamicFlags &= ~NDF_AGENT_UNREACHEABLE;
            PostEventEx(pQueue, EVENT_AGENT_OK, m_dwId, NULL);
            SendPollerMsg(dwRqId, "Connectivity with NetXMS agent restored\r\n");
         }
         pAgentConn->Disconnect();
      }
      else
      {
         if (!(m_dwDynamicFlags & NDF_AGENT_UNREACHEABLE))
         {
            m_dwDynamicFlags |= NDF_AGENT_UNREACHEABLE;
            PostEventEx(pQueue, EVENT_AGENT_FAIL, m_dwId, NULL);
            SendPollerMsg(dwRqId, "NetXMS agent unreachable\r\n");
         }
      }
      delete pAgentConn;
   }
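
If the suspicion about shutdown() not waking up recv() is right, one way to
rule it out would be to make the receiver wait with a timeout instead of
blocking in recv() indefinitely. Below is a minimal sketch of that idea only.
RecvWithTimeout is a hypothetical helper, not an existing NetXMS function,
and it assumes the receiver thread reads from a plain blocking socket. It is
written against POSIX headers; on Windows the same select()/recv() calls
would come from <winsock2.h> and the socket type would be SOCKET.

```cpp
// Sketch only: bounded wait before recv(), so the receiver never depends on
// shutdown() in another thread to unblock it.
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <cstddef>

// Hypothetical helper (not part of NetXMS).
// Returns: >0  number of bytes received
//           0  peer closed the connection or the socket was shut down
//          -1  socket error (select() or recv() failed)
//          -2  nothing arrived within timeoutMs milliseconds
static int RecvWithTimeout(int sock, void *buffer, size_t size,
                           unsigned int timeoutMs)
{
   fd_set rdfs;
   FD_ZERO(&rdfs);
   FD_SET(sock, &rdfs);

   struct timeval tv;
   tv.tv_sec = timeoutMs / 1000;
   tv.tv_usec = (timeoutMs % 1000) * 1000;

   // Wait until the socket becomes readable or the timeout expires
   int rc = select(sock + 1, &rdfs, NULL, NULL, &tv);
   if (rc < 0)
      return -1;   // select() failed
   if (rc == 0)
      return -2;   // timeout, caller decides whether to retry

   // Socket is readable: recv() returns data, 0 on close, or -1 on error
   return (int)recv(sock, buffer, size, 0);
}
```

A receiver loop built on this could check a "stop requested" flag each time
it gets -2, so Disconnect() would only need to set the flag and join the
thread, whatever shutdown() does on a given platform.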

Best regards,
Victor
Received on Wed Nov 16 2005 - 19:17:11 EET
