[SOLVED] SNMP UNSUPPORTED

Started by lweidig, July 13, 2021, 03:40:57 PM


lweidig

The dreaded "Status of DCI nnnnn (SNMP oid) changed to UNSUPPORTED" message is something we have always struggled with on this excellent product, and it has come up once again.  I really want to get to the root of this so we can monitor and move forward.  The specific issue this time is monitoring Cambium ePMP LAN / WAN TX / RX byte counters.  We are monitoring hundreds of these devices, and the issue only shows up on about 15-20 of them.

To help narrow this down, I have picked a single radio where these are not working, located at a site that also has one working perfectly.  I have compared the configurations of these two devices: they are the same hardware model, run the same firmware release, and are identical apart from IP address information.  None of the four counters work on this device, and we get messages like:

Status of DCI 28941 (SNMP: .1.3.6.1.4.1.17713.21.2.1.65.0) changed to UNSUPPORTED

If I use the built-in MIB explorer on the working device and WALK starting at .1.3.6.1.4.1.17713.21.2.1, this value is part of the list retrieved.  On the "broken" device it is not displayed in the MIB explorer.  HOWEVER, if I simply go to the command line of the server running NetXMS and run some SNMP commands against the failing device, it responds just fine:

# snmpwalk -v 2c -c NOTIT 10.0.XXX.XXX .1.3.6.1.4.1.17713.21.2.1.65.0
iso.3.6.1.4.1.17713.21.2.1.65.0 = Counter64: 106460802368
# snmpget -v 2c -c NOTIT 10.0.XXX.XXX .1.3.6.1.4.1.17713.21.2.1.65.0
iso.3.6.1.4.1.17713.21.2.1.65.0 = Counter64: 106460803992


Hoping to finally be able to resolve this "random" issue that we have always faced.


Filipp Sudanov

What exact version of NetXMS are you using? Are you collecting directly, or via a NetXMS proxy?

Can you try setting debug level 6 for a moment and doing a manual poll of that DCI? On the server you can change the debug level on the fly in Tools -> Server console by issuing e.g.
debug 6

lweidig

From the server console it is reporting V3.8-405-gdf0e338d8a.  We are collecting directly.

I tried to do what you were asking but in just the few seconds from the time I set the debug until after running the DCI poll this generated about 7K lines of debug!  Not even sure what in there I would be looking for and cannot post the log due to the need to scrub and the difficulty that would cause.  Is there some other way to get something and what specifically might we be looking for?


Filipp Sudanov

I checked what we have in the log. The line that you would be looking for looks like this:
2021.07.19 11:32:30.489 *D* [                   ] Node(OpenWrt.lan)->getMetricFromSNMP(.1.3.6.1.2.1.31.1.1.1.6.3): snmpResult=0
but it's actually debug level 7, not 6 (so it would produce even more lines of log).
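
Since the debug log is large, a filter can pull out just the relevant lines. The following is a minimal sketch, assuming the log line format shown above and assuming that snmpResult=0 means success (as the sample from a working node suggests). filter_snmp_failures is a hypothetical helper name, and the log path in the usage line is a placeholder to replace with your own.

```shell
# Print only getMetricFromSNMP lines whose snmpResult is non-zero.
# $1 = path to the server log file
# $2 = optional OID; when given, only lines for exactly that OID are kept
filter_snmp_failures() {
    log_file="$1"
    pattern='getMetricFromSNMP('
    if [ -n "${2:-}" ]; then
        pattern="getMetricFromSNMP($2)"
    fi
    # First grep: fixed-string match on the function name (and OID, if any).
    # Second grep: drop the lines that report snmpResult=0.
    grep -F -- "$pattern" "$log_file" | grep -Ev 'snmpResult=0[[:space:]]*$'
}

# Usage (placeholder log path):
# filter_snmp_failures /path/to/netxmsd.log .1.3.6.1.4.1.17713.21.2.1.65.0
```

Running it without the second argument lists failures for every OID, which helps when several DCIs on one node go UNSUPPORTED together.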

A better approach would probably be to collect a tcpdump capture on your NetXMS server; you can filter by a particular IP address so the dump won't be too big. It would be good to capture three cases: NetXMS getting data normally, the DCI going to UNSUPPORTED, and a manual snmpget.
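
A capture limited to one device could be set up roughly as follows. This is a sketch, assuming the device answers SNMP on the default UDP port 161. snmp_capture_filter is a hypothetical helper name, and the interface, output file name, and address in the commented example are placeholders.

```shell
# Build a capture filter that keeps only SNMP traffic to or from one device.
# $1 = IP address of the device
snmp_capture_filter() {
    printf 'host %s and udp port 161' "$1"
}

# Example (usually needs root; "-i any" is Linux-specific):
# tcpdump -i any -n -w snmp-device.pcap "$(snmp_capture_filter 10.0.XXX.XXX)"
```

Writing to a file with -w keeps the full packets, so the SNMP version and the error status in each response can be inspected afterwards in a protocol analyzer.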

lweidig

So, collected some data and it returns:

2021.07.23 08:08:19.842 *D* [                   ] Node(MyNode)->getMetricFromSNMP(.1.3.6.1.4.1.17713.21.2.1.65.0): snmpResult=6
2021.07.23 08:08:19.843 *D* [event.corr         ] CorrelateEvent: event SYS_DCI_UNSUPPORTED id 15599447 source MyNode [23754]
2021.07.23 08:08:19.843 *D* [event.corr         ] CorrelateEvent: finished, rootId=0
2021.07.23 08:08:19.843 *D* [client.session.0   ] Sending message CMD_SET_DCI_STATUS (80 bytes)
2021.07.23 08:08:19.843 *D* [event.proc         ] EVENT SYS_DCI_UNSUPPORTED [53] at {0} (ID:15599447 F:0x0001 S:2 TAGS:"") FROM MyNode: Status of DCI 33334 (SNMP: .1.3.6.1.4.1.17713.21.2.1.65.0) changed to UNSUPPORTED

lweidig

Figured out what was happening!  One node (working) was discovered with SNMP v2c and the other node (not working) with v1.  Changed it to v2c and immediately they all worked!
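
A plausible explanation for the fix, not confirmed in the thread: the byte counters are reported as Counter64 (see the snmpget output in the first post), and Counter64 is an SNMPv2 data type that SNMPv1 cannot carry. The command-line tests in the first post used -v 2c, which would explain why they succeeded while the node set to v1 in NetXMS reported the DCI as UNSUPPORTED. Below is a minimal sketch for checking this from the server shell, assuming Net-SNMP's snmpget is installed. compare_snmp_versions is a hypothetical helper name, and the community string and address are placeholders.

```shell
# Query the same OID with SNMP v1 and v2c and print both results side by side.
# $1 = device address, $2 = community string, $3 = OID
compare_snmp_versions() {
    host="$1"
    community="$2"
    oid="$3"
    for version in 1 2c; do
        printf 'SNMP v%s: ' "$version"
        # Keep going if one version fails; the error text is part of the answer.
        snmpget -v "$version" -c "$community" "$host" "$oid" 2>&1 || true
    done
}

# Usage (placeholder values):
# compare_snmp_versions 10.0.XXX.XXX NOTIT .1.3.6.1.4.1.17713.21.2.1.65.0
```

If the v1 query returns an error while the v2c query returns a Counter64 value, the SNMP version configured on the node is the likely cause.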