NetXMS Support Forum

English Support => General Support => Topic started by: blingblouw on February 10, 2021, 09:02:02 AM

Title: node status
Post by: blingblouw on February 10, 2021, 09:02:02 AM
Hi.

NetXMS is monitoring nodes over a VPN. Every so often, the VPN dies. When this happens, NetXMS (rightly) marks the nodes as down, but when the VPN comes back, the nodes are never marked as up again.

If I manually run a status poll, it reports the following:

[10.02.2021 09:00:09] **** Poll request sent to server ****
[10.02.2021 09:00:09] Poll request accepted
[10.02.2021 09:00:14] Starting status poll for node Remote-RouterOS RB960PGS-6.40.9
[10.02.2021 09:00:14] Checking SNMP agent connectivity
[10.02.2021 09:00:14]    Starting status poll on interface ether1
[10.02.2021 09:00:14]       Current interface status is NORMAL
[10.02.2021 09:00:14]       Retrieving interface status from SNMP agent
[10.02.2021 09:00:14]       Interface status retrieved from SNMP agent
[10.02.2021 09:00:14]       Interface is NORMAL for 357 polls (1 poll required for status change)
[10.02.2021 09:00:14]       Interface status after poll is NORMAL
[10.02.2021 09:00:14]    Finished status poll on interface ether1
[10.02.2021 09:00:14]    Starting status poll on interface ether2
[10.02.2021 09:00:14]       Current interface status is NORMAL
[10.02.2021 09:00:14]       Retrieving interface status from SNMP agent
[10.02.2021 09:00:14]       Interface status retrieved from SNMP agent
[10.02.2021 09:00:14]       Interface is NORMAL for 357 polls (1 poll required for status change)
[10.02.2021 09:00:14]       Interface status after poll is NORMAL
[10.02.2021 09:00:14]    Finished status poll on interface ether2
[10.02.2021 09:00:14]    Starting status poll on interface ether3
[10.02.2021 09:00:14]       Current interface status is NORMAL
[10.02.2021 09:00:14]       Retrieving interface status from SNMP agent
[10.02.2021 09:00:14]       Interface status retrieved from SNMP agent
[10.02.2021 09:00:14]       Interface is NORMAL for 357 polls (1 poll required for status change)
[10.02.2021 09:00:14]       Interface status after poll is NORMAL
[10.02.2021 09:00:14]    Finished status poll on interface ether3
[10.02.2021 09:00:14]    Starting status poll on interface ether4
[10.02.2021 09:00:14]       Current interface status is NORMAL
[10.02.2021 09:00:14]       Retrieving interface status from SNMP agent
[10.02.2021 09:00:14]       Interface status retrieved from SNMP agent
[10.02.2021 09:00:14]       Interface is NORMAL for 357 polls (1 poll required for status change)
[10.02.2021 09:00:14]       Interface status after poll is NORMAL
[10.02.2021 09:00:14]    Finished status poll on interface ether4
[10.02.2021 09:00:14]    Starting status poll on interface ether5
[10.02.2021 09:00:14]       Current interface status is NORMAL
[10.02.2021 09:00:14]       Retrieving interface status from SNMP agent
[10.02.2021 09:00:14]       Interface status retrieved from SNMP agent
[10.02.2021 09:00:14]       Interface is NORMAL for 357 polls (1 poll required for status change)
[10.02.2021 09:00:14]       Interface status after poll is NORMAL
[10.02.2021 09:00:14]    Finished status poll on interface ether5
[10.02.2021 09:00:14]    Starting status poll on interface sfp1
[10.02.2021 09:00:14]       Current interface status is NORMAL
[10.02.2021 09:00:14]       Retrieving interface status from SNMP agent
[10.02.2021 09:00:14]       Interface status retrieved from SNMP agent
[10.02.2021 09:00:14]       Interface is NORMAL for 357 polls (1 poll required for status change)
[10.02.2021 09:00:14]       Interface status after poll is NORMAL
[10.02.2021 09:00:14]    Finished status poll on interface sfp1
[10.02.2021 09:00:14]    Starting status poll on interface bridge1
[10.02.2021 09:00:14]       Current interface status is NORMAL
[10.02.2021 09:00:14]       Retrieving interface status from SNMP agent
[10.02.2021 09:00:14]       Interface status retrieved from SNMP agent
[10.02.2021 09:00:14]       Interface is NORMAL for 357 polls (1 poll required for status change)
[10.02.2021 09:00:14]       Interface status after poll is NORMAL
[10.02.2021 09:00:14]    Finished status poll on interface bridge1
[10.02.2021 09:00:14] Node is connected
[10.02.2021 09:00:14] Finished status poll for node Remote-RouterOS RB960PGS-6.40.9
[10.02.2021 09:00:14] Node status after poll is CRITICAL
[10.02.2021 09:00:14] **** Poll completed successfully ****

Why would the poll status remain CRITICAL and what can I do to reset it?
Title: Re: node status
Post by: Filipp Sudanov on February 10, 2021, 04:18:52 PM
What's currently in alarms for this node (right click on node -> alarms)?
Title: Re: node status
Post by: blingblouw on February 10, 2021, 04:37:54 PM
It says "node down", which is weird. In the overview I can see the ICMP average response time, so the node is answering pings.
Title: Re: node status
Post by: Filipp Sudanov on February 10, 2021, 05:18:42 PM
In NetXMS, alarms that are present on a node affect the node's status.

It looks like your EPP (event processing policy) does not have a rule to automatically terminate the node down alarm when the node comes back up (see the screenshot of the default EPP configuration for that).
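To illustrate the point above, here is a minimal Python sketch of the idea. It is not NetXMS code. It assumes that a node's status is the most severe of its interface statuses and its outstanding alarm severities, and that an alarm is terminated by matching its key. The class, the function names, the severity table, and the key format `NODE_DOWN_<id>` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Severity order assumed for this sketch; a higher number means worse.
SEVERITY = {"NORMAL": 0, "WARNING": 1, "MINOR": 2, "MAJOR": 3, "CRITICAL": 4}


@dataclass
class Node:
    node_id: int
    interfaces: dict = field(default_factory=dict)  # interface name -> status
    alarms: dict = field(default_factory=dict)      # alarm key -> severity

    def status(self) -> str:
        """Return the worst status among interfaces and outstanding alarms."""
        candidates = list(self.interfaces.values()) + list(self.alarms.values())
        return max(candidates, key=SEVERITY.__getitem__, default="NORMAL")


def on_node_down(node: Node) -> None:
    # Hypothetical rule: raise a CRITICAL alarm under a per-node key.
    node.alarms[f"NODE_DOWN_{node.node_id}"] = "CRITICAL"


def on_node_up(node: Node, terminate_rule_present: bool) -> None:
    # Without a matching "terminate" rule the alarm stays outstanding.
    if terminate_rule_present:
        node.alarms.pop(f"NODE_DOWN_{node.node_id}", None)


node = Node(1, interfaces={"ether1": "NORMAL", "ether2": "NORMAL"})
on_node_down(node)
on_node_up(node, terminate_rule_present=False)
print(node.status())  # CRITICAL: interfaces are fine, but the alarm remains

on_node_up(node, terminate_rule_present=True)
print(node.status())  # NORMAL once the alarm is terminated
```

In this model, a poll can report every interface as NORMAL and the node as connected while the node itself stays CRITICAL, which matches the poll output posted earlier in the thread.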
Title: Re: node status
Post by: blingblouw on February 11, 2021, 08:27:54 AM
Thanks for your reply. It does seem that something there is causing the issue (though I've not touched the EPP, AFAIK): as soon as I remove one of these alarms, the node does show as up.
Title: Re: node status
Post by: Filipp Sudanov on February 12, 2021, 04:26:30 PM
Try checking the alarm log (View -> Alarm log) to see what was happening: which event triggered the alarm and what the alarm key is.
Title: Re: node status
Post by: Abraxas on December 08, 2021, 11:56:07 AM
I have the same problem with nodes remaining down (no VPN involved though).
Node down/up works fine for all hosts monitored on the internal IP subnet, but the ones that use public IPs (still with the agent) remain down.
For example, this node is reachable and its status poll looks OK:
[08.12.2021 11:46:38] **** Poll request sent to server ****
[08.12.2021 11:46:38] Poll request accepted
[08.12.2021 11:46:38] Starting status poll for node Dolion
[08.12.2021 11:46:38] Checking NetXMS agent connectivity
[08.12.2021 11:46:38]    Starting status poll on interface lo
[08.12.2021 11:46:38]       Current interface status is NORMAL
[08.12.2021 11:46:38]       Retrieving interface status from NetXMS agent
[08.12.2021 11:46:39]       Interface status retrieved from NetXMS agent
[08.12.2021 11:46:39]       Interface is NORMAL for 790 polls (1 poll required for status change)
[08.12.2021 11:46:39]       Interface status after poll is NORMAL
[08.12.2021 11:46:39]    Finished status poll on interface lo
[08.12.2021 11:46:39]    Starting status poll on interface eno1
[08.12.2021 11:46:39]       Current interface status is NORMAL
[08.12.2021 11:46:39]       Retrieving interface status from NetXMS agent
[08.12.2021 11:46:39]       Interface status retrieved from NetXMS agent
[08.12.2021 11:46:39]       Interface is NORMAL for 790 polls (1 poll required for status change)
[08.12.2021 11:46:39]       Interface status after poll is NORMAL
[08.12.2021 11:46:39]    Finished status poll on interface eno1
[08.12.2021 11:46:39]    Starting status poll on interface eno2
[08.12.2021 11:46:39]       Current interface status is DISABLED
[08.12.2021 11:46:39]       Retrieving interface status from NetXMS agent
[08.12.2021 11:46:39]       Interface status retrieved from NetXMS agent
[08.12.2021 11:46:39]       Interface is DISABLED for 754 polls (1 poll required for status change)
[08.12.2021 11:46:39]       Interface status after poll is DISABLED
[08.12.2021 11:46:39]    Finished status poll on interface eno2
[08.12.2021 11:46:39]    Starting status poll on interface enp0s20f0u8u3c2
[08.12.2021 11:46:40]       Current interface status is DISABLED
[08.12.2021 11:46:40]       Retrieving interface status from NetXMS agent
[08.12.2021 11:46:40]       Interface status retrieved from NetXMS agent
[08.12.2021 11:46:40]       Interface is DISABLED for 754 polls (1 poll required for status change)
[08.12.2021 11:46:40]       Interface status after poll is DISABLED
[08.12.2021 11:46:40]    Finished status poll on interface enp0s20f0u8u3c2
[08.12.2021 11:46:40] Node is connected
[08.12.2021 11:46:41] Finished status poll for node Dolion
[08.12.2021 11:46:41] Node status after poll is CRITICAL
[08.12.2021 11:46:41] **** Poll completed successfully ****

Any idea how to fix this?
Title: Re: node status
Post by: Filipp Sudanov on December 08, 2021, 02:07:38 PM
For this node:
- does it have any alarms?
- does it have any interfaces in critical state?
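The second question can be partly answered from the status poll output itself. The following Python sketch is written against the log format pasted in this thread; the function name and the returned structure are made up for illustration. It extracts each interface's status after the poll and the final node status. A node that ends up CRITICAL with no CRITICAL interface hints that something other than interface status, such as an outstanding alarm, may be driving it.

```python
import re

# Patterns taken from the poll output lines shown in this thread.
IFACE_START = re.compile(r"Starting status poll on interface (\S+)")
IFACE_AFTER = re.compile(r"Interface status after poll is (\w+)")
NODE_AFTER = re.compile(r"Node status after poll is (\w+)")


def summarize_poll(log_text):
    """Return (interface -> status after poll, final node status or None)."""
    interfaces = {}
    node_status = None
    current = None  # name of the interface whose block we are inside
    for line in log_text.splitlines():
        if m := IFACE_START.search(line):
            current = m.group(1)
        elif (m := IFACE_AFTER.search(line)) and current is not None:
            interfaces[current] = m.group(1)
        elif m := NODE_AFTER.search(line):
            node_status = m.group(1)
    return interfaces, node_status


sample = """\
[08.12.2021 11:46:39]    Starting status poll on interface eno1
[08.12.2021 11:46:39]       Interface status after poll is NORMAL
[08.12.2021 11:46:39]    Finished status poll on interface eno1
[08.12.2021 11:46:39]    Starting status poll on interface eno2
[08.12.2021 11:46:39]       Interface status after poll is DISABLED
[08.12.2021 11:46:39]    Finished status poll on interface eno2
[08.12.2021 11:46:41] Node status after poll is CRITICAL
"""

interfaces, node_status = summarize_poll(sample)
print(interfaces)   # {'eno1': 'NORMAL', 'eno2': 'DISABLED'}
print(node_status)  # CRITICAL
```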
Title: Re: node status
Post by: Abraxas on December 08, 2021, 02:39:19 PM
Thank you for the quick reply!

It only has the node down alarm, as in the attached screenshot.
It has 2 interfaces down, but those have been down all along. I tried to put them on Ignore, but no luck.
Title: Re: node status
Post by: Filipp Sudanov on December 09, 2021, 11:25:44 AM
So I believe the interfaces are keeping the node in critical. What happened when you tried to set them to "Ignore"? Was there an error message? Any SQL error messages in the server log? What if you delete these interfaces and do a configuration poll: is it possible to edit them then?
Title: Re: node status
Post by: Abraxas on December 09, 2021, 04:38:42 PM
They were already on Ignore. I tried deleting them and running a configuration poll, then set them to Ignore again, but nothing changed.
There are no errors in the server log.
Title: Re: node status
Post by: Abraxas on December 09, 2021, 05:11:24 PM
I have manually terminated the alarm, and things look OK now. I did this to make sure I get alarms in case something happens to that box, but the underlying issue is still there :(
The status poll shows the node as NORMAL now:
[09.12.2021 17:11:14] Node is connected
[09.12.2021 17:11:15] Finished status poll for node Dolion
[09.12.2021 17:11:15] Node status after poll is NORMAL
[09.12.2021 17:11:15] **** Poll completed successfully ****