Set node down when ping fails, regardless of whether other DCIs are up

Started by Egert143, November 17, 2020, 12:12:06 PM

Egert143

Hello

How can I set up node monitoring so that a node is shown as down whenever ping fails, no matter what other DCIs report?

Egert

Filipp Sudanov

Please give some more details. Is there a NetXMS agent or SNMP on that node?
If not, this is the default behavior of NetXMS: it pings the node and shows it as down when the pings fail.
If yes, it would be interesting to know the use case for that configuration.

Egert143

The device is a Mikrotik router, monitored by ICMP and SNMP. When the device is down, some SNMP DCIs remain "up", for example ICMP.PacketLoss and ICMP.ResponseTime.Last. They keep reporting their last values, and as a result the device is never shown as down.

Filipp Sudanov

The DCIs you mention are not collected by SNMP. These are generated by NetXMS server while it's pinging nodes (see https://www.netxms.org/documentation/adminguide/icmp-ping.html#icmp-response-statistic-collection).

There must be some other reason why this router is not shown as down. It could be the correlation mechanism.
1) Show the output of a status poll for this node.
2) Try temporarily increasing the server debug level to 6 or 7 around the moment when this node goes from up to down, and show the server log.

Egert143

There are some SNMP DCIs as well; they do show an error when the node is down.

I ran poll->status while the node was not responding to ping; for some reason the node is still "Up":

[19.11.2020 08:41:09] **** Poll request sent to server ****
[19.11.2020 08:41:09] Poll request accepted
[19.11.2020 08:41:09] Starting status poll for node TestNode
[19.11.2020 08:41:09] Checking SNMP agent connectivity
[19.11.2020 08:41:09] SNMP agent unreachable
[19.11.2020 08:41:09]    Starting status poll on access point TestNode
[19.11.2020 08:41:09]       Current access point status is NORMAL
[19.11.2020 08:41:09]       Access point status after poll is NORMAL
[19.11.2020 08:41:09]    Finished status poll on access point TestNode
[19.11.2020 08:41:09]    Starting status poll on interface lte1
[19.11.2020 08:41:09]       Current interface status is NORMAL
[19.11.2020 08:41:09]       Starting ICMP ping
[19.11.2020 08:41:09]       Interface is NORMAL for 3763 polls (1 poll required for status change)
[19.11.2020 08:41:09]       Interface status after poll is NORMAL
[19.11.2020 08:41:09]    Finished status poll on interface lte1
[19.11.2020 08:41:09]    Starting status poll on interface wlan1
[19.11.2020 08:41:09]       Current interface status is UNKNOWN
[19.11.2020 08:41:09]       Interface status cannot be determined
[19.11.2020 08:41:09]       Interface is UNKNOWN for 365 polls (1 poll required for status change)
[19.11.2020 08:41:09]       Interface status after poll is UNKNOWN
[19.11.2020 08:41:09]    Finished status poll on interface wlan1
[19.11.2020 08:41:09]    Starting status poll on interface ether1
[19.11.2020 08:41:09]       Current interface status is UNKNOWN
[19.11.2020 08:41:09]       Interface status cannot be determined
[19.11.2020 08:41:09]       Interface is UNKNOWN for 365 polls (1 poll required for status change)
[19.11.2020 08:41:09]       Interface status after poll is UNKNOWN
[19.11.2020 08:41:09]    Finished status poll on interface ether1
[19.11.2020 08:41:09]    Starting status poll on interface Bridge
[19.11.2020 08:41:09]       Current interface status is NORMAL
[19.11.2020 08:41:09]       Starting ICMP ping
[19.11.2020 08:41:09]       Interface is NORMAL for 3763 polls (1 poll required for status change)
[19.11.2020 08:41:09]       Interface status after poll is NORMAL
[19.11.2020 08:41:09]    Finished status poll on interface Bridge
[19.11.2020 08:41:10] Node is connected
[19.11.2020 08:41:10] Finished status poll for node TestNode
[19.11.2020 08:41:10] Node status after poll is NORMAL
[19.11.2020 08:41:10] **** Poll completed successfully ****

Filipp Sudanov

At first glance, it looks like the server is pinging the interface and getting a reply:
[19.11.2020 08:41:09]    Starting status poll on interface lte1
[19.11.2020 08:41:09]       Current interface status is NORMAL
[19.11.2020 08:41:09]       Starting ICMP ping
[19.11.2020 08:41:09]       Interface is NORMAL for 3763 polls (1 poll required for status change)

If you open Object Details for the lte1 interface, does it have the correct IP address? Could it be that the server can ping that address via some other route?
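
To see quickly which interfaces the server actually pinged during a status poll, the poll output can be parsed with a few lines of Python. This is only a sketch: the function name `pinged_interfaces` is made up for illustration, and it assumes the line wording stays exactly as in the poll output pasted in this thread.

```python
import re

# Short excerpt in the same format as the status poll output in this thread.
SAMPLE = """\
[19.11.2020 08:41:09]    Starting status poll on interface lte1
[19.11.2020 08:41:09]       Current interface status is NORMAL
[19.11.2020 08:41:09]       Starting ICMP ping
[19.11.2020 08:41:09]       Interface status after poll is NORMAL
[19.11.2020 08:41:09]    Finished status poll on interface lte1
[19.11.2020 08:41:09]    Starting status poll on interface wlan1
[19.11.2020 08:41:09]       Current interface status is UNKNOWN
[19.11.2020 08:41:09]       Interface status cannot be determined
[19.11.2020 08:41:09]       Interface status after poll is UNKNOWN
[19.11.2020 08:41:09]    Finished status poll on interface wlan1
"""

def pinged_interfaces(poll_output):
    """Map each ICMP-pinged interface to its status after the poll."""
    results = {}
    current = None   # interface whose poll section we are inside
    pinged = False   # whether "Starting ICMP ping" was seen in that section
    for line in poll_output.splitlines():
        # Drop indentation and the leading "[date time]" stamp.
        text = re.sub(r"^\[[^\]]*\]\s*", "", line.strip())
        start = re.match(r"Starting status poll on interface (.+)$", text)
        if start:
            current, pinged = start.group(1), False
            continue
        if current and text == "Starting ICMP ping":
            pinged = True
            continue
        end = re.match(r"Interface status after poll is (\w+)$", text)
        if end and current:
            if pinged:
                results[current] = end.group(1)
            current = None
    return results

print(pinged_interfaces(SAMPLE))
```

Run against the full poll output above, this would list lte1 and Bridge, which are the interfaces whose IP addresses are worth checking first.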

Try collecting the server log at debug level 7 for the duration of the status poll.
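
Debug level 7 produces a lot of output, so it may help to cut the log down to lines that mention the node before posting it. A minimal sketch in Python: `filter_log` is a hypothetical helper, and the idea that the relevant lines contain the node name or the word "ICMP" is an assumption that may not hold for every message.

```python
def filter_log(lines, keywords):
    """Keep only lines containing at least one keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [ln for ln in lines if any(k in ln.lower() for k in lowered)]

# Example with made-up lines; real input would be the server log read
# line by line for the few seconds the status poll was running.
example = [
    "status poll started for TestNode",
    "housekeeper: unrelated message",
    "icmp ping sent",
]
for line in filter_log(example, ["TestNode", "ICMP"]):
    print(line)
```

Substring matching is crude, but it is easy to extend the keyword list (for example with the interface IP address) if too much or too little gets through.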

Egert143

Debugging is a bit hard: in the time it takes to run a poll (about 5-10 sec), some 45000 lines are generated. :D

Egert