PING Subagent

Started by gmonk63, August 21, 2016, 09:10:39 PM


gmonk63

Is it possible to make a node go into critical or down status based on the results from ICMP stats? I have a subagent set up to monitor the latency of several hosts, but when the threshold is reached the alarm is generated for the agent node and not for the instance being monitored. I would like to generate a critical or down status if the latency shows 10k for that particular instance.

Tatjana Dubrovica

It is possible to use a DCI for status calculation. This option is enabled for each DCI separately (DCI Properties -> Other options -> Use this DCI for node status calculation).
From the documentation:
"DCI can be used to calculate object status. Such kind of DCI should return integer number from 0 till 4 representing object status. To use DCI for status calculation, there should be selected this option in it's properties (Other Options -> Use this DCI for node status calculation)."

gmonk63

I did try this at first, but it does not seem to have any bearing on the status of the node. I also created a transform to return 4 if the value is over 5000 and 0 if under. Maybe I am misunderstanding the way the Ping subagent works.

As I mentioned before, I have set up a handful of ICMP targets that I need to monitor second by second for high latency, and if the latency goes above the threshold an alert should be created. Right now I only see the alarm created for the node that the ICMP targets are configured on. Example: nodeA has the subagent pinging node1 and node2 every second. If I reboot node1, I only receive an alarm (sys_threshold_reached) for nodeA, whereas I would like the alarm to be created for node1, as it gets confusing when all alarms are generated for nodeA.

I would rather not have the NetXMS server ping the node, since I am already monitoring latency via the subagent. Because of this I would like to generate the critical alert based on the ping latency, since the value will go well above 3000 if the node is actually down. I see there is a threshold instance, but the documentation is vague on its usage. Could this be what I need?


Thanks
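
For readers following along, the transform described above (4 if the value is over 5000, 0 otherwise) can be sketched as below. This is only an illustration in Python and not the NXSL script that was actually used; the function and constant names are made up, and the threshold and the two return values are taken from the post.

```python
# Illustrative sketch only: names are hypothetical, and a real NetXMS
# transformation would be written in NXSL, not Python.
LATENCY_THRESHOLD_MS = 5000  # threshold mentioned in the post
STATUS_LOW = 0               # value returned when latency is at or under the threshold
STATUS_HIGH = 4              # value returned when latency is over the threshold


def latency_to_status(latency_ms: int) -> int:
    """Map a ping latency to a status code in the 0..4 range."""
    if latency_ms > LATENCY_THRESHOLD_MS:
        return STATUS_HIGH
    return STATUS_LOW


if __name__ == "__main__":
    for sample in (12, 5000, 10000):
        print(sample, latency_to_status(sample))
```

A value of 10000 (the "10k" mentioned in the first post) maps to 4, while ordinary latencies map to 0.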

Tatjana Dubrovica

You should create DCIs on node1 and node2 with DCI source nodeA. In this case data will be collected from nodeA, but thresholds will be generated for node1 and node2.

"Use this DCI for node status calculation" works for me. What exactly does not work?

Added:
There is a description of this solution (note that "Proxy node" was renamed to "Source node"): https://www.netxms.org/forum/feature-requests/icmp-ping-as-internal-parameter/

gmonk63

It works now; I was not setting the source node on node1. But it does not seem to work unless both nodeA and node1 have the DCI node status option checked, and this causes nodeA to show critical as well.

gmonk63

On a side note, I have noticed that the DCI instance IDs continue to increment. Out of curiosity I ran a manual instance discovery poll and got this:
Running DCI instance discovery
[29.08.2016 16:13:01]    Updating instances for Icmp.LastPingTime({instance}) [28794]
[29.08.2016 16:13:01]       Existing instance "10.9.61.11 26 29 0 46 10.9.61.11" not found and will be deleted
[29.08.2016 16:13:01]       Existing instance "10.9.61.13 1 0 0 46 10.9.61.13" not found and will be deleted
[29.08.2016 16:13:01]       Existing instance "10.9.61.14 31 28 1 46 10.9.61.14" not found and will be deleted
[29.08.2016 16:13:01]       Creating new DCI for instance "10.9.61.11 39 28 0 46 10.9.61.11"
[29.08.2016 16:13:01]       Creating new DCI for instance "10.9.61.13 0 0 0 46 10.9.61.13"
[29.08.2016 16:13:01]       Creating new DCI for instance "10.9.61.14 4 32 0 46 10.9.61.14"

This seems to happen every time, so I'm assuming the DCIs get recreated after every InstancePollingInterval, which I think is set to 600. Is this normal?

Tatjana Dubrovica

Shouldn't the instance be just "10.9.61.11" and not "10.9.61.11 26 29 0 46 10.9.61.11"? Can you please provide your instance discovery configuration? It looks like it is set up incorrectly.

In your situation nodeA should not contain any DCIs that calculate status for node1 or node2. You should create the DCIs on node1 and node2, setting nodeA as the source node. Can you configure your system like this and explain what does not work with this setup?
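
To illustrate why the instances in the discovery log above get deleted and recreated on every poll, here is a small Python sketch. It is not NetXMS code: the helper names are hypothetical, and the assumption is simply that an instance whose name differs from the previous poll is treated as new. The sample rows are copied from the log.

```python
# Illustrative sketch only: helper names are hypothetical and this does not
# reproduce how NetXMS performs instance discovery internally.

def instance_key(row: str) -> str:
    """Return only the first whitespace-separated field (the target address)."""
    fields = row.split()
    return fields[0] if fields else ""


def diff_instances(existing, discovered):
    """Return (to_delete, to_create) by comparing instance names exactly."""
    to_delete = [name for name in existing if name not in discovered]
    to_create = [name for name in discovered if name not in existing]
    return to_delete, to_create


previous_poll = [
    "10.9.61.11 26 29 0 46 10.9.61.11",
    "10.9.61.13 1 0 0 46 10.9.61.13",
    "10.9.61.14 31 28 1 46 10.9.61.14",
]
current_poll = [
    "10.9.61.11 39 28 0 46 10.9.61.11",
    "10.9.61.13 0 0 0 46 10.9.61.13",
    "10.9.61.14 4 32 0 46 10.9.61.14",
]

# Whole rows as instance names: the changing counters make every name differ.
raw_delete, raw_create = diff_instances(previous_poll, current_poll)

# Address only as instance name: the names stay stable between polls.
key_delete, key_create = diff_instances(
    [instance_key(r) for r in previous_poll],
    [instance_key(r) for r in current_poll],
)

if __name__ == "__main__":
    print(len(raw_delete), len(raw_create))
    print(len(key_delete), len(key_create))
```

With the whole row as the instance name, all three instances are dropped and recreated; with only the address, nothing changes between polls.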

gmonk63

Here is the configuration of nodeA (actual node name is RB1-ERP-POLLER), which has the agent running ping.nsm.


gmonk63

And here is node1, where I would like to calculate the status based on the nodeA DCI.


Tatjana Dubrovica

Now I understand where the problem is.

Additional requirements:
Disable "CheckTrustedNodes" or add node1 and node2 as trusted for nodeA

I attached screenshots showing how the configuration should look for node1. You can do a similar thing for node2. No DCI configuration is needed on nodeA.

Tatjana Dubrovica

Also, "Use this DCI for node status calculation" should be enabled for this DCI.

gmonk63

It's working now. Thanks for all the help.