Make node go critical only when it's ICMP unreachable?

Started by zshnet, May 27, 2016, 07:19:23 PM


zshnet

Hi all,

I have a set of switches that are always reported in the "UNKNOWN" state, due to some issues with SNMP (they report ifOperStatus, not ifAdminStatus). Also, on random nights between 11pm and 6am, the switches stop responding to SNMP. However, they keep functioning and continue to pass ICMP ping checks.

Is there any way to force a node to only care about its ping status, not interface SNMP status? Alternatively, can I force interfaces to poll ifOperStatus, not ifAdminStatus?


Thanks,
Zach

Victor Kirhenshtein

Hi,

so the switches always return "unknown" for admin state? Or don't they return it at all? Could you please provide the result of an SNMP walk on .1.3.6.1.2.1.2.2.1 for such a switch?
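For anyone checking the same thing on their own devices: .1.3.6.1.2.1.2.2.1 is the IF-MIB interface table entry, where column 7 is ifAdminStatus and column 8 is ifOperStatus. The snippet below is only a rough sketch, assuming numeric net-snmp style walk output (for example from snmpwalk -On); the function names are made up for illustration. It lists the interfaces for which the device returned no ifAdminStatus.

```python
import re

# Columns of the IF-MIB ifEntry table (.1.3.6.1.2.1.2.2.1) we care about.
COLUMNS = {7: "ifAdminStatus", 8: "ifOperStatus"}

# Matches lines such as:
#   .1.3.6.1.2.1.2.2.1.8.3 = INTEGER: 1
#   .1.3.6.1.2.1.2.2.1.8.3 = INTEGER: up(1)
LINE_RE = re.compile(
    r"^\.?1\.3\.6\.1\.2\.1\.2\.2\.1\.(\d+)\.(\d+)\s*=\s*INTEGER:\s*(?:\w+\()?(\d+)\)?\s*$"
)


def parse_if_status(walk_text):
    """Return {ifIndex: {column_name: value}} for the two status columns."""
    result = {}
    for line in walk_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not an INTEGER row of ifEntry
        column, if_index, value = (int(g) for g in match.groups())
        name = COLUMNS.get(column)
        if name is None:
            continue  # some other ifEntry column (ifIndex, ifMtu, ...)
        result.setdefault(if_index, {})[name] = value
    return result


def missing_admin_status(statuses):
    """Return sorted ifIndex values that have no ifAdminStatus row."""
    return sorted(i for i, cols in statuses.items() if "ifAdminStatus" not in cols)
```

Feeding the saved walk output to parse_if_status() and then missing_admin_status() shows quickly whether the admin state is absent for every port or only for some.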

Because switch interfaces do not have IP addresses, they are stuck in the unknown state: you can only ping the management address of a switch. Unknown statuses are effectively ignored, so you will still get a node down event when, for example, the switch stops responding to ICMP ping. You can also set the expected state for interfaces to IGNORE; that way interface objects will always have NORMAL status.

Best regards,
Victor

zshnet

Hello!
Thanks for the quick response.

It's odd. The last time I walked it the switches appeared to return nothing at all, not "unknown."  I have set the interfaces to IGNORE. However, when it does not respond to SNMP, the node goes down. For example, if I check the option "stop using SNMP for all polls" in the properties tab of this node, it goes critical because it can't ping any interfaces. However, we can still ping the node from the server. We set up a ping Object Tool with the command ping -D -O -c5 -W1.2 -i1.5 %a.

You can see from my attachments what is returned by my SNMP walk, the result of a status poll, and the result of the above ping command. Now that you mention it, it seems the ICMP poll is working differently from my ping command. The node is only down because I asked it to not use SNMP on any poll. If I uncheck that option it will go back to the unknown state. Any thoughts?

Thanks,
Zach

Victor Kirhenshtein

Do you by any chance run netxmsd as a non-root user? If yes, it has no access to raw sockets and so cannot use ICMP. You should either run netxmsd as root or give it access to raw sockets by following these instructions: https://wiki.netxms.org/wiki/How_to_enable_ICMP_ping_for_NetXMS_server_running_under_non-root_account.
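A quick way to check whether a given account can open a raw ICMP socket at all is sketched below. This is only a generic test, not how netxmsd itself does it; it assumes Python 3 on Linux and has to be run as the same user the server runs under. The function name is made up for illustration.

```python
import socket


def can_open_raw_icmp():
    """Return True if the current user can create a raw ICMP socket."""
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    except OSError:
        # Typically PermissionError when the process is not root and has
        # no raw socket capability; any other OS error also counts as "no".
        return False
    sock.close()
    return True


if __name__ == "__main__":
    print("raw ICMP sockets available:", can_open_raw_icmp())
```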

Best regards,
Victor

zshnet

Hi Victor,

It appears that I am running netxmsd as root. When I run ps -aux | grep netxmsd, I get the following:

root      4001 21.9  6.0 3177548 492060 ?      Ssl  May27 1388:22 /usr/bin/netxmsd -d

That would suggest to me that it's running as root. Is there something else I should check? It's possible we have an odd firewall rule that interferes with it; I'll double check that with the other engineer working on NetXMS.

Thanks again for all the help, it's much appreciated.
Zach

zshnet

Problem solved, for those curious:

The other NetXMS engineer figured it out. Apparently, NetXMS grabs the IP address from SNMP. If there isn't one, it walks the interfaces to find their IPs. However, many of our devices (Ubiquiti AirOS radios, Netonix switches) do not report interface IP addresses the way NetXMS expects. When it can't find any, it uses SNMP availability to determine status. We only noticed on Netonix switches since they often (~1/day) stop responding to SNMP.

The solution that he came up with was to write an nxshell script that loops over devices, determines if we can find an IP address on it, and if not, builds a dummy interface with the main IP address.
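The logic is roughly the following. This is not the actual nxshell script and does not use the NetXMS API; it is a plain Python sketch with stand-in Node and Interface classes and a hypothetical dummy interface name, just to show the loop and the check.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    name: str
    ip_addresses: list = field(default_factory=list)


@dataclass
class Node:
    name: str
    primary_ip: str
    interfaces: list = field(default_factory=list)


def has_ip_interface(node):
    """True if at least one interface on the node carries an IP address."""
    return any(iface.ip_addresses for iface in node.interfaces)


def add_dummy_interfaces(nodes, dummy_name="mgmt0"):
    """Give every node without an IP-bearing interface a dummy interface
    holding the node's primary IP. Returns the names of patched nodes."""
    patched = []
    for node in nodes:
        if has_ip_interface(node):
            continue  # discovery already found an address, leave it alone
        node.interfaces.append(Interface(dummy_name, [node.primary_ip]))
        patched.append(node.name)
    return patched
```

Running it a second time changes nothing, because the dummy interface itself now satisfies the check. In a real nxshell script the two stand-in classes would be replaced by the objects and calls the NetXMS client library provides.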

Hope this helps someone else!

Staj

This behaviour is annoying if you're dealing with devices with fixed IP addresses (as defined in the node) that have flaky SNMP. It would be great if this were toggleable somewhere.

Victor Kirhenshtein

Hi,

if IP addresses are readable from the vendor MIB on Netonix and Ubiquiti AirOS devices, we can add a driver for them to create management interfaces correctly.

Best regards,
Victor

Kevo

What is involved in getting a driver for Netonix switches? I see that there is a UBNT driver of some sort, as our Ubiquiti AirOS and AirFiber radios show up as UBNT. Our Netonix switches show up as Generic.


Victor Kirhenshtein

Hi,

we need the MIBs for the switch and ideally also access to a test device to build a driver.

Best regards,
Victor

Kevo

I can supply both of those things. How should I send over the info? I will most likely need to set up a VPN or IP rule on our firewall and give you the switch IP.

Victor Kirhenshtein

Hi,

you can use private messages on the forum to send me the access info.

Best regards,
Victor