issue with node capabilities after upgrade (Agent and SNMP on windows)

Started by AP_ops, October 26, 2012, 05:20:13 PM

AP_ops

Hi

we are experiencing a strange issue since we upgraded from 1.0.11 to 1.2.3. when the configuration poll checks the capabilities of a system it seems to be unable to find the NetXMS agent AND the snmp agent together.
if NetXMS Agent Service is running, it recognizes the Agent but not SNMP. if you stop the service, it recognizes SNMP again.
this means, when we run the NetXMS Agent we are unable to check SNMP values on our systems!

we experienced this on several Windows 2008 R2 systems, an issue we didn't have before.
most of the systems still work, but on every one where i run a configuration poll, SNMP stops working because it no longer shows up in the capabilities.

edit: it seems to be only an issue with manual configuration polls. we upgraded yesterday and we have not yet been flooded by errors ;)

any idea on this?
thanks!

bdefloo

I can confirm this issue.

We have been reading temperature via SNMP on the NetXMS server itself since v1.0.x. After a number of upgrades up to 1.2.3, this still worked, until I added some unrelated external parameters to the agent configuration and did a configuration poll.

After a day or so it may get back on its feet, though, so it could be just manual polls.

AP_ops

i agree, after a day or two it sorted itself out again. still, it's an issue to be taken care of.

Victor Kirhenshtein

Hi all!

Can you please post the number of nodes/DCIs being monitored and the server configuration settings (I'm interested in poller counts and timeouts)? Also, did you check the internal server queues?

Best regards,
Victor

bdefloo

Hi,

Total number of objects: 14715
Number of monitored nodes: 848
Number of collectable DCIs: 7182

Config variables are attached.
Note that I did play around with the poller counts recently, but I can still reproduce the problem.

First, I did a walk on the desired OID in the MIB browser; this showed results.

I then set debug to 9 and did a configuration poll:
[30-Oct-2012 10:14:53] [CLSN-0] Received message CMD_POLL_NODE
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] Starting configuration poll for node DCXAS041 (ID: 1225)
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] ConfPoll(DCXAS041): checking for NetXMS agent Flags={00000010} DynamicFlags={00000400}
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] ConfPoll(DCXAS041): checking for NetXMS agent - connecting
[30-Oct-2012 10:14:53] ConfPoll(DCXAS041): checking for NetXMS agent - connected
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] AgentConnection::getSupportedParameters(): RCC=0
[30-Oct-2012 10:14:53] AgentConnection::getSupportedParameters(): 151 parameters received from agent
[30-Oct-2012 10:14:53] AgentConnection::getSupportedParameters(): 2 tables received from agent
[30-Oct-2012 10:14:53] ConfPoll(DCXAS041): checking for NetXMS agent - finished
[30-Oct-2012 10:14:53] ConfPoll(DCXAS041): checking for CheckPoint SNMP on port 260
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] Event::expandText(event=244C0F20 sourceObject=1225 template='Node capabilities changed (Old: %1; New: %2)' alarmMsg='(null)')
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] EVENT 13 (ID:6396083 F:0x0000 S:0 TAG:"") FROM DCXAS041: Node capabilities changed (Old: 0x00002013; New: 0x00000012)
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_EVENTLOG_RECORDS
[30-Oct-2012 10:14:53] EPP: processing event 6396083
[30-Oct-2012 10:14:53] Node::updateInterfaceConfiguration(DCXAS041 [1225]): got 1 interfaces
[30-Oct-2012 10:14:53] Checking subnet bindings for node DCXAS041 [1225]
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] Node::executeHookScript(DCXAS041 [1225]): hook script "Hook::ConfigurationPoll" not found
[30-Oct-2012 10:14:53] Finished configuration poll for node DCXAS041 (ID: 1225)
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_POLLING_INFO
[30-Oct-2012 10:14:53] [CLSN-0] Sending message CMD_OBJECT_UPDATE

Checked again with MIB browser, no results:
[30-Oct-2012 10:14:57] Node(DCXAS041)->GetItemFromSNMP(.1.3.6.1.4.1.232.6.2.6.8.1.4.0.2): dwResult=4
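
To see exactly which capability bits the poll dropped, the "Node capabilities changed" event in the log above can be decoded with a few lines of Python. This is only a rough sketch: the helper names are made up for illustration, and the thread does not say which flag each bit stands for, so the code reports raw bit masks only. That one of the cleared bits is the SNMP capability is an inference from the symptoms, not something the log states.

```python
import re

# Matches the event text as it appears in the server log above.
CAPS_RE = re.compile(
    r"Node capabilities changed \(Old: (0x[0-9A-Fa-f]+); New: (0x[0-9A-Fa-f]+)\)"
)


def split_bits(mask):
    """Break a bit mask into a list of single-bit masks (as hex strings)."""
    return [hex(1 << i) for i in range(mask.bit_length()) if mask & (1 << i)]


def capability_diff(line):
    """Return (cleared_bits, added_bits), or None if the line is not a capability event."""
    m = CAPS_RE.search(line)
    if m is None:
        return None
    old, new = int(m.group(1), 16), int(m.group(2), 16)
    return split_bits(old & ~new), split_bits(new & ~old)


line = (
    '[30-Oct-2012 10:14:53] EVENT 13 (ID:6396083 F:0x0000 S:0 TAG:"") FROM DCXAS041: '
    "Node capabilities changed (Old: 0x00002013; New: 0x00000012)"
)
print(capability_diff(line))  # (['0x1', '0x2000'], [])
```

So the poll cleared bits 0x1 and 0x2000 and set nothing new, which fits SNMP disappearing from the capabilities while the agent stays detected.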

AP_ops

Hi

we have 99 nodes with 2815 DCIs. configs are attached.
i did not check internal server queues, how can i do that? what should we look for?

regards

Victor Kirhenshtein

Quote from: bdefloo on October 30, 2012, 11:27:45 AM

The log was very helpful; I've found the problem. 1.2.4 should work as expected.

Best regards,
Victor

Victor Kirhenshtein

Quote from: AP_ops on October 30, 2012, 04:42:15 PM
i did not check internal server queues, how can i do that? what should we look for?

You can open the server's debug console (either using the nxadm command line tool or by selecting Tools -> Server Console in the Java management console) and enter the show queue command. Under normal circumstances the size of every queue should be zero or near zero. A high value for any queue, especially if it stays high for a long time, usually indicates a problem. There are also predefined DCIs on the NetXMS server's node which collect some of the server's internal metrics.
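
If you would rather check the queues from a script than by eye, something along these lines could work as a starting point. It is a minimal sketch under assumptions: it expects the console output (however you capture it) to contain lines shaped like "<queue name> : <size>", which may not match what your server version prints, and the queue names in the sample are invented for illustration.

```python
import re

# Assumed line format: "<queue name> : <size>". The real console output may differ.
QUEUE_RE = re.compile(r"^\s*(.+?)\s*:\s*(\d+)\s*$")


def busy_queues(console_output, threshold=0):
    """Return {queue name: size} for every queue whose size exceeds threshold."""
    result = {}
    for raw in console_output.splitlines():
        m = QUEUE_RE.match(raw)
        if m and int(m.group(2)) > threshold:
            result[m.group(1)] = int(m.group(2))
    return result


# Hypothetical sample text, not real server output.
sample = """\
Configuration poller : 0
Status poller        : 3
Data collector       : 250
"""
print(busy_queues(sample, threshold=10))  # {'Data collector': 250}
```

Running it repeatedly and comparing results would show whether a queue stays high over time, which is the case worth worrying about.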

Best regards,
Victor

AP_ops

good news!
will 1.2.4 be released soon? or will there also be an update for 1.2.3 addressing this problem?

thanks for clarification, didn't know of these queues.

regards
