Will nodes be rediscovered by auto discovery?

Started by lweidig, July 13, 2017, 05:29:01 AM

lweidig

I had significantly changed the templates for a number of nodes and thought it would be a great idea to delete the nodes and let them get discovered again.  It has been past an ActiveDiscoveryInterval (900s) and they are not coming back.  Wondering if there is something I need to do to get them to come back, or if I will just need to add them all back manually.  This is on a 2.1 server.

Tursiops

Hi,

I'm seeing similar behaviour. As we are using a lot of proxies and zones, we have to rely on passive rather than active discovery.
It looks like discovery only runs once and then never again (server restarts do not have any effect).

If I remove something that was discovered, it does not come back later.
If I add new devices to the network or install an agent on a workstation, they are not discovered, even though they show up in the ARP tables of switches, have SNMP and/or NetXMS agents installed, and match the filter condition.

Cheers

Victor Kirhenshtein

Hi,

deleted nodes should be rediscovered unless something is stuck in the system. Could you try logging in as user "system" and check whether these nodes are still present somewhere, or use nxadm -c "show objects" to list all objects in the system and check?

Best regards,
Victor

lweidig

There is nothing for these nodes when logged in as system or in the output of nxadm -c "show objects".  Like the other poster, I have also tried restarting the server to see if they would get automatically discovered again, with no luck.

Tursiops

Hi,

I wiped a number of discovered nodes from the system last night, ran hkrun, and checked show objects before and after.
The nodes were definitely no longer in the list, but they have not been rediscovered yet (~10 hours later, with passive discovery meant to run every 15 minutes).

Discovery also doesn't seem to pick up network changes. It really behaves as if it only runs once. Is there some flag in the database that might be stuck?

Cheers

Victor Kirhenshtein

Hi,

very weird. Please check the server queues, and try running the server at debug level 6 for some time and check messages with the prefix DiscoveryPoller.

Best regards,
Victor

Tursiops

Hi,

The logs show a lot of "potential node x.x.x.x rejected (IP address already queued for polling)".
When I check for existing objects, I either can't find them or they are in a different zone from the node that's used for discovery.

Since the message mentions "queued for polling", I had a look at the queues, and Node Poller is at 40k+. Looks like that's our problem.
The value rarely decreases, and otherwise just keeps increasing. Quite possibly 40k is simply the number of IPs across our networks which NetXMS has discovered and wants to check.
I am not quite sure which poller setting to increase for this in the server config. Status? Discovery? Is there a NumberOfNodePollers configuration item?

Cheers

lweidig

For us all of the queues are empty:

netxmsd: show queues
Data collector                   : 0
DCI cache loader                 : 0
Database writer                  : 0
Database writer (IData)          : 0
Database writer (raw DCI values) : 0
Event processor                  : 0
Node poller                      : 0
Syslog processing                : 0
Syslog writer                    : 0


Here is all of the debug output for one of the nodes not being rediscovered:


[25-Jul-2017 07:43:35.010] [DEBUG] DiscoveryPoller(): checking potential node 10.0.140.1 at shf00-pdu-00:1
[25-Jul-2017 07:43:35.011] [DEBUG] DiscoveryPoller(): new node queued: 10.0.140.1/22
[25-Jul-2017 07:43:35.011] [DEBUG] NodePoller: processing node 10.0.140.1/22 in zone 0
[25-Jul-2017 07:43:35.011] [DEBUG] GetOldNodeWithNewIP: ip=10.0.140.1 mac=E4:8D:8C:25:49:78
[25-Jul-2017 07:43:35.012] [DEBUG] AcceptNewNode(10.0.140.1): auto filter, flags=0004
[25-Jul-2017 07:43:35.012] [DEBUG] AcceptNewNode(10.0.140.1): auto filter - checking range
[25-Jul-2017 07:43:35.012] [DEBUG] AcceptNewNode(10.0.140.1): auto filter - range check result is 0
[25-Jul-2017 07:43:35.124] [DEBUG] DiscoveryPoller(): checking potential node 10.0.140.1 at shf00-pdu-00:1
[25-Jul-2017 07:43:35.128] [DEBUG] DiscoveryPoller(): new node queued: 10.0.140.1/22
[25-Jul-2017 07:43:35.128] [DEBUG] NodePoller: processing node 10.0.140.1/22 in zone 0
[25-Jul-2017 07:43:35.128] [DEBUG] GetOldNodeWithNewIP: ip=10.0.140.1 mac=00:00:00:00:00:00
[25-Jul-2017 07:43:35.129] [DEBUG] AcceptNewNode(10.0.140.1): auto filter, flags=0004
[25-Jul-2017 07:43:35.129] [DEBUG] AcceptNewNode(10.0.140.1): auto filter - checking range
[25-Jul-2017 07:43:35.132] [DEBUG] AcceptNewNode(10.0.140.1): auto filter - range check result is 0


But the node still never gets added.  Also, both the Active Discovery Targets and the Address Filters sections of the Network Discovery configuration contain the address range for this IP.

Victor Kirhenshtein

Hi,

from the log it seems that the address range filter is not passed. Could you show how the address filter is configured?

Best regards,
Victor

lweidig

It is an address range of 10.0.140.1 - 10.0.140.30.  Maybe it is not catching the end IPs properly?

Actually, that is exactly it.  I changed my address range to 10.0.140.0 - 10.0.140.30 and it detected the router.  Either this needs fixing or we need to adjust all of our ranges.  Can you also tell whether the upper end of the range has the same issue?

Victor Kirhenshtein

It's a bug in the server - the first address of a range is always ignored. I just fixed it in the development branch (the fix will be included in the 2.1.1 patch release).
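The effect can be illustrated with a minimal sketch (hypothetical code, not the actual NetXMS implementation): an exclusive lower bound in the range comparison rejects exactly the first address of the configured range, which is why 10.0.140.1 failed the filter until the range was extended downward.

```python
# Minimal sketch of an off-by-one in an address range filter.
# Hypothetical function names; this is not the NetXMS source.
from ipaddress import IPv4Address

RANGE_START = IPv4Address("10.0.140.1")
RANGE_END = IPv4Address("10.0.140.30")

def range_check_buggy(addr: str) -> bool:
    # Exclusive lower bound: the first address of the range never passes.
    return RANGE_START < IPv4Address(addr) <= RANGE_END

def range_check_fixed(addr: str) -> bool:
    # Inclusive on both ends: boundary addresses pass the filter.
    return RANGE_START <= IPv4Address(addr) <= RANGE_END

assert range_check_buggy("10.0.140.1") is False  # node never discovered
assert range_check_fixed("10.0.140.1") is True
assert range_check_buggy("10.0.140.30") is True  # upper bound unaffected
```

In this sketch only the lower bound is exclusive, which would match the reported behaviour (the upper end of the range worked).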

Best regards,
Victor

Tursiops

Hi,

I believe I found the source of my problem as well.
A switch was sending syslog data to a proxy node that was incorrectly configured with Zone ID 0 (the default).
For every syslog message received, the logs showed NetXMS adding the same IP to the poller queue over and over again (I am not sure why it failed to detect that this IP was already in the queue in this particular instance).
That node also happened to generate a syslog message every few seconds, because another monitoring tool, set up and controlled by a third party, kept trying to connect to the switch using "public" (and failing, thus generating a log entry).

I fixed the Zone ID, and now NetXMS can properly link the incoming messages to the node; our queue is in single digits now. :)

That still leaves the question of how the IP could be added to the poller queue over and over.

Cheers

Victor Kirhenshtein

Hi,

there was a bug in the server - the check for an already queued address was not performed for addresses discovered from syslog messages or SNMP traps. I just fixed it in the development branch; the fix will be included in 2.1.1.
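A small sketch of that failure mode (hypothetical code, not the actual NetXMS queue implementation): if the "already queued" check is skipped for one discovery source, a single chatty device can re-enqueue the same address on every message and inflate the queue without bound.

```python
# Sketch of a poller queue with an "already queued" check that,
# pre-fix, was skipped for syslog/trap-sourced addresses.
# All names here are hypothetical illustrations.
from collections import deque

class NodePollerQueue:
    def __init__(self, check_syslog_sources: bool):
        self._queue = deque()
        self._queued = set()  # addresses currently awaiting a poll
        self._check_syslog = check_syslog_sources

    def enqueue(self, ip: str, source: str = "arp") -> bool:
        # The duplicate check runs for ARP-discovered addresses always,
        # but for syslog/trap sources only when the fix is enabled.
        if source not in ("syslog", "trap") or self._check_syslog:
            if ip in self._queued:
                return False  # "IP address already queued for polling"
        self._queue.append(ip)
        self._queued.add(ip)
        return True

# Pre-fix behaviour: every syslog message re-queues the same address.
buggy = NodePollerQueue(check_syslog_sources=False)
for _ in range(3):
    buggy.enqueue("10.0.1.5", source="syslog")
assert len(buggy._queue) == 3  # queue grows with every message

# Post-fix behaviour: the duplicate is suppressed.
fixed = NodePollerQueue(check_syslog_sources=True)
for _ in range(3):
    fixed.enqueue("10.0.1.5", source="syslog")
assert len(fixed._queue) == 1
```

This also matches the earlier observation of a 40k+ Node Poller queue fed by a device logging every few seconds.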

Best regards,
Victor