Problems with PING subagent after update to V2.2.1

Started by Dani@M3T, December 08, 2017, 05:57:43 PM


Dani@M3T

Hi

We updated from NetXMS 2.1.2 to 2.2.1, and since then the PING subagent has been causing problems. We have nodes with NetXMS agents (Linux and Windows) that ping a number of targets. Other nodes get the results from these 'pinging' agent nodes. Everything worked fine with V2.1.2.

Now with V2.2.1, the debug log (level 9) of the pinging agents shows error 404 (UNKNOWN_PARAMETER), but not for all targets.

edit1: it's for all targets, I checked the DCIs.



2017.12.08 16:27:47.063 *D* [VCS-3996] Requesting parameter "Icmp.PacketLoss(gw2.domain.intern)"
2017.12.08 16:27:47.063 *D* [VCS-3996] GetParameterValue(): result is 404 (UNKNOWN_PARAMETER)
2017.12.08 16:27:47.063 *D* [VCS-3997] Requesting parameter "Icmp.AvgPingTime(gw2.domain.intern)"
2017.12.08 16:27:47.063 *D* [VCS-3997] GetParameterValue(): result is 404 (UNKNOWN_PARAMETER)
2017.12.08 16:27:47.063 *D* [VCS-3998] Requesting parameter "Icmp.PacketLoss(gw1.domain2.intern)"
2017.12.08 16:27:47.063 *D* [VCS-3998] GetParameterValue(): result is 404 (UNKNOWN_PARAMETER)
2017.12.08 16:27:47.063 *D* [VCS-3999] Requesting parameter "Icmp.AvgPingTime(gw1.domain2.intern)"
2017.12.08 16:27:47.063 *D* [VCS-3994] GetParameterValue(): result is 0 (SUCCESS)


The ping section of the agent config shows no difference between working and non-working targets:

[ping]
DefaultPacketSize = 46
Timeout = 2000
PacketRate = 24
Target = 172.16.10.1:gw1.domain.intern
Target = 172.16.50.1:gw2.domain.intern
Target = 172.16.20.1:gw1.domain2.intern
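
For anyone comparing DCI names against the config, here is a minimal Python sketch that lists the parameter names one would expect from a [ping] section like the one above. It assumes the `address:name` form shown in this post and matches the section name case-insensitively; both are assumptions about the config, not confirmed agent behaviour, and `ping_targets` is a hypothetical helper, not part of NetXMS.

```python
def ping_targets(config_text):
    """Return {name: address} for Target lines inside the [ping] section."""
    targets = {}
    in_ping = False
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            # Assumption: section name is compared case-insensitively.
            in_ping = line[1:-1].strip().lower() == "ping"
            continue
        if in_ping and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key.lower() == "target":
                # Assumption: "address:name"; any further fields are ignored.
                fields = value.split(":")
                address = fields[0]
                name = fields[1] if len(fields) > 1 and fields[1] else address
                targets[name] = address
    return targets


SAMPLE = """\
[ping]
DefaultPacketSize = 46
Timeout = 2000
PacketRate = 24
Target = 172.16.10.1:gw1.domain.intern
Target = 172.16.50.1:gw2.domain.intern
Target = 172.16.20.1:gw1.domain2.intern
"""

for name in ping_targets(SAMPLE):
    print(f"Icmp.PacketLoss({name})", f"Icmp.AvgPingTime({name})")
```

The printed names can then be compared by hand with the DCI parameter strings configured on the server.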


I didn't change anything in the config after the update.

thanks
Dani

edit2: the server runs on Linux and is compiled from source; the agents on Windows are binary packages.

Tursiops

Hi,

Same issue here.
We don't use a lot of Ping checks and the few we have are non-critical, but they do show the same problem as yours.

Cheers

Dani@M3T

We use it a lot for VPN connection checks (latency, packet loss).

Ēriks Jenkēvics

Hi!

Could you explain in detail how you are trying to retrieve the values? Are you using a different source node for the DCIs, and are you using any scripts?
I have tested on Linux with the same configuration and everything seems to be working...

Regards,
Eriks

Dani@M3T

Hi Eriks

We have a few nodes with NetXMS agents installed which are 'pinging' some targets. You can see an example of the agent config file of such a node in my first post.
Other nodes get the results through DCIs like 'Icmp.PacketLoss(gw1.domain.intern)' with source node = 'pinging node'. No scripts at all.
We have used this configuration for a long time; since V2.2.1 it doesn't work anymore.

If you need any other information, I will check.

Thanks
Dani

Ēriks Jenkēvics

Is the pinging node connected via a tunnel or directly?

Dani@M3T

Do you mean an agent tunnel or a VPN tunnel? One of the pinging nodes is the NetXMS server itself. The others are connected by VPN tunnels. No agent tunnels are in use.

Ēriks Jenkēvics

Could you please verify that the PING subagent is in the list opened by right-clicking the pinging node -> Tools -> Info -> Agent -> Subagent list, and that you get this entry in your agent log once it is started:
*I* Subagent "PING" (ping.nsm) loaded successfully (version 2.2.1-14-g375721606)

Dani@M3T

I checked that before. Yes, the subagent is in the subagent list (version 2.2.1 and the value '0x0000000002320A90' in the fourth column).
The agent log file contains the line:

2017.12.08 17:02:18.457 *I* Subagent "PING" (ping.nsm) loaded successfully (version 2.2.1)


So the subagent is loaded.

Ēriks Jenkēvics

OK, and is there anything in the log similar to this line: "Unable to add ICMP ping target from configuration file."?

Dani@M3T

No, nothing like this in the agent log, neither in the normal log nor in the debug log.
In the debug log I see lines like this:

2017.12.08 16:09:47.364 *D* [VCS-1915] Requesting parameter "Icmp.PacketLoss(gw1.domain1.intern)"
2017.12.08 16:09:47.364 *D* [VCS-1916] Requesting parameter "Icmp.AvgPingTime(gw1.domain1.intern)"
2017.12.08 16:09:47.364 *D* [VCS-1906] Requesting parameter "Icmp.AvgPingTime(gw1.domain2.intern)"
2017.12.08 16:09:47.364 *D* [VCS-1901] GetParameterValue(): result is 0 (SUCCESS)
2017.12.08 16:09:47.364 *D* [VCS-1915] GetParameterValue(): result is 404 (UNKNOWN_PARAMETER)
2017.12.08 16:09:47.365 *D* DataCollector: polling DCI 1779 "Icmp.PacketLoss(gw1.domain3.intern)"
2017.12.08 16:09:47.365 *D* DataCollector: polling DCI 1778 "Icmp.AvgPingTime(gw1.domain3.intern)"
2017.12.08 16:09:47.365 *D* [VCS-1913] GetParameterValue(): result is 0 (SUCCESS)
2017.12.08 16:09:47.365 *D* DataCollector: polling DCI 6386 "Icmp.AvgPingTime(gw1.domain4.intern)"
2017.12.08 16:09:47.365 *D* [VCS-1918] Requesting parameter "Icmp.PacketLoss(gw1.domain3.intern)"
2017.12.08 16:09:47.365 *D* DataCollector: polling DCI 6385 "Icmp.PacketLoss(gw1.domain4.intern)"
2017.12.08 16:09:47.365 *D* [VCS-1906] GetParameterValue(): result is 404 (UNKNOWN_PARAMETER)
2017.12.08 16:09:47.365 *D* [VCS-1919] Requesting parameter "Icmp.AvgPingTime(gw1.domain3.intern)"
2017.12.08 16:09:47.365 *D* [VCS-1921] Requesting parameter "Icmp.PacketLoss(gw1.domain4.intern)"
2017.12.08 16:09:47.365 *D* [VCS-1918] GetParameterValue(): result is 404 (UNKNOWN_PARAMETER)
2017.12.08 16:09:47.365 *D* [VCS-1919] GetParameterValue(): result is 404 (UNKNOWN_PARAMETER)
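
Because the request and result lines are interleaved, it can help to pair them up before reading the log. The Python sketch below does that under one assumption: that the bracketed tag (for example `[VCS-1915]`) identifies the same request on both the "Requesting parameter" line and the matching "GetParameterValue()" line. That reading of the log format is inferred from the excerpt, not from agent documentation, and `pair_results` is a hypothetical helper.

```python
import re
from collections import defaultdict

REQUEST = re.compile(r'\[(VCS-\d+)\] Requesting parameter "([^"]+)"')
RESULT = re.compile(r'\[(VCS-\d+)\] GetParameterValue\(\): result is (\d+) \((\w+)\)')


def pair_results(log_lines):
    """Map each requested parameter to the list of result names seen for it."""
    pending = {}                  # tag -> parameter awaiting a result
    outcomes = defaultdict(list)  # parameter -> ["SUCCESS", "UNKNOWN_PARAMETER", ...]
    for line in log_lines:
        match = REQUEST.search(line)
        if match:
            pending[match.group(1)] = match.group(2)
            continue
        match = RESULT.search(line)
        if match:
            # Results whose request is outside the excerpt are ignored.
            parameter = pending.pop(match.group(1), None)
            if parameter is not None:
                outcomes[parameter].append(match.group(3))
    return dict(outcomes)


SAMPLE_LOG = """\
2017.12.08 16:09:47.364 *D* [VCS-1915] Requesting parameter "Icmp.PacketLoss(gw1.domain1.intern)"
2017.12.08 16:09:47.364 *D* [VCS-1906] Requesting parameter "Icmp.AvgPingTime(gw1.domain2.intern)"
2017.12.08 16:09:47.364 *D* [VCS-1901] GetParameterValue(): result is 0 (SUCCESS)
2017.12.08 16:09:47.364 *D* [VCS-1915] GetParameterValue(): result is 404 (UNKNOWN_PARAMETER)
2017.12.08 16:09:47.365 *D* [VCS-1906] GetParameterValue(): result is 404 (UNKNOWN_PARAMETER)
"""

for parameter, results in pair_results(SAMPLE_LOG.splitlines()).items():
    print(parameter, results)
```

Run against a full debug log, this gives one line per parameter, which makes it easier to see whether any ping target ever returns SUCCESS.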


I can send you a debug log of the pinging agent if that helps, but I can't share it in the forum.

Ēriks Jenkēvics

Could you please provide the output of "nxagentd -C"? Don't forget to edit out any sensitive data; I am interested specifically in the PING subtree.

Dani@M3T

Here is the output of the agent on the NetXMS server itself (Linux). I had to change some sensitive data and removed some of the ping targets...
If you need the unchanged output, please give me another communication channel.


config
+- CORE
|   +- DebugLevel
|   |    value: 0
|   +- MasterServers
|   |    value: 127.0.0.1
|   |    value: 172.16.10.45
|   |    value: nms.domain.intern
|   +- ListenPort
|   |    value: 4700
|   +- LogFile
|   |    value: /var/log/nxagentd.log
|   +- RequireAuthentication
|   |    value: yes
|   +- RequireEncryption
|   |    value: yes
|   +- EnabledCiphers
|   |    value: 1
|   +- SharedSecret
|   |    value: replaced
|   +- MaxSessions
|   |    value: 64
|   +- SessionIdleTimeout
|   |    value: 60
|   +- FileStore
|   |    value: /opt/netxms/var/nxagentd
|   +- StartupDelay
|   |    value: 5
|   +- EnableSubagentAutoload
|   |    value: yes
|   +- SubAgent
|   |    value: portcheck.nsm
|   |    value: ping.nsm
|   |    value: logwatch.nsm
|   |    value: ecs.nsm
|   |    value: netsvc.nsm
|   |    value: filemgr.nsm
|   |    value: ssh.nsm
|   |    value: pgsql.nsm
|   +- ExternalParameter
|   |    value: ServiceCheck.DNS(*):/usr/bin/dig +noall +short @$1 $2 A | grep -c $3 | sed  's/0/x/g;s/[^x]/0/g;s/x/1/g'
|   +- ExecTimeout
|   |    value: 10000
|   +- Action
|        value: top: /usr/bin/top -s -b -n 1
|        value: netstat: /bin/netstat -p --inet
|        value: IPconfig: /sbin/ifconfig
|        value: ListUpdates: /usr/bin/zypper list-updates
+- filemgr
|   +- RootFolder
|        value: /root
|        value: /opt
|        value: /etc
|        value: /usr/src
|        value: /var/log
+- ping
     +- DefaultPacketSize
     |    value: 46
     +- Timeout
     |    value: 2000
     +- PacketRate
     |    value: 24
     +- Target
          value: 172.16.10.1:gw1.domain1.intern
          value: 172.16.50.1:gw2.domain1.intern
          value: 172.16.41.1:gw4.domain1.intern
          value: 172.16.101.10:gw10.domain1.intern
          value: 172.16.90.1:router1.domain1.intern
          value: 172.16.20.1:gw1.domain5.intern
          value: 172.16.70.1:gw1.domain4.intern
          value: 192.168.30.1:gw1.domain3.intern
          value: 8.8.4.4:dns2.google.com
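
To check which targets the agent actually parsed, the values under a given subtree can be pulled out of a dump like the one above. This Python sketch only assumes the text layout visible in this post (top-level nodes start with `+- ` in the first column, nested keys are indented, values follow `value:`); `tree_values` is a hypothetical helper and the section lookup is deliberately case-insensitive.

```python
import re

NODE = re.compile(r'^([|\s]*)\+- (.+)$')   # "+- name", optionally indented
VALUE = re.compile(r'^[|\s]*value: (.*)$')  # "value: ..." under a key


def tree_values(dump, section, key):
    """Return all values listed under section/key in an 'nxagentd -C' style dump."""
    values = []
    current_section = current_key = None
    for line in dump.splitlines():
        node = NODE.match(line)
        if node:
            if node.group(1) == "":
                # No indentation: a top-level section such as CORE or ping.
                current_section, current_key = node.group(2).strip(), None
            else:
                current_key = node.group(2).strip()
            continue
        value = VALUE.match(line)
        if value and current_section and current_key:
            if (current_section.lower() == section.lower()
                    and current_key.lower() == key.lower()):
                values.append(value.group(1).strip())
    return values


SAMPLE_DUMP = """\
config
+- CORE
|   +- DebugLevel
|   |    value: 0
+- ping
     +- Timeout
     |    value: 2000
     +- Target
          value: 172.16.10.1:gw1.domain1.intern
          value: 8.8.4.4:dns2.google.com
"""

for target in tree_values(SAMPLE_DUMP, "PING", "Target"):
    print(target)
```

The resulting list can be compared with the names used in the failing DCIs.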

Ēriks Jenkēvics

Could you try changing the PING subagent section in config to uppercase? As in:
*PING
DefaultPacketSize = 46
...

Dani@M3T

I changed "[ping]" to "[PING]", but the problem is unchanged.

I have all agent config files and policies in INI style, with sections like this:

[ping]
...


Is it necessary to change to the "*PING" style? I couldn't find anything about that in the changelog.