Table Thresholds not working(?)

Started by Tursiops, July 25, 2016, 04:32:48 AM


Tursiops

Hi,

We are making extensive use of SNMP tables for monitoring things like CPU, RAM, PSU, fan and HDD/SSD health.
Threshold triggering in tables appears to be a bit hit and miss though.

For example right now I have a system with a disk that reports "Predictive Failure" in the HP System Management Homepage on the server itself.
Our NetXMS picks this up, transforms the reported numerical value into the string "Predictive Failure" - and should then alert on this.
The latter part is not working. I have seen this before with string matching and "equal to" and sometimes switching to "like" worked. In this particular case that didn't work either. Changing to "not like" just for testing triggered an immediate alarm, so it sure looks like what I am looking for does not match what NetXMS finds in the database?

See attached images showing the collected data (note: Display Name matches Column Name) as well as the configured Thresholds.
I also attached the template - nothing special really.

Now I am not sure if there may be other items in an unhealthy state right now, where NetXMS just doesn't create an alarm. The only reason I picked this one up was because I just added the system as a node to NetXMS while it was in that state already.

Thanks

Victor Kirhenshtein

Hi,

From the screenshots and exported config everything seems to be correct. I cannot reproduce this myself - I don't have an iLO 3 board to test your template on, but I created a sample table with the same transformation and the threshold works just fine. Which NetXMS version are you using? Also, please try modifying the event by adding a script call like this: "%[TableData]", and create a script called TableData in the script library as follows:


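// Fetch the current table value of the DCI named in the event (parameter 3)
table = GetDCIValue($node, $event->parameters[3]);
// Row that triggered the threshold (parameter 4)
row = $event->parameters[4];
// Return the "Smart Status" cell of that row
return table->get(row, table->getColumnIndex("Smart Status"));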


Then change the threshold to "not like" so that the event triggers. You should see the actual value of the cell in place of %[TableData], which makes it possible to check whether it is actually correct.

Best regards,
Victor

Tursiops

Hi,

The results are somewhat unexpected.
I created the script, I created an event and an event processing policy.
Then I removed the system from the HP template, but kept the DCIs for testing.
Changed the Threshold to !like - and got nothing at all.
I created a duplicate of the DCI - and received three alerts: the three "OK" disks (with "OK" without quotes being passed via %[TableData]). But I didn't get one for the "Predictive Failure" one.
I switched back to == and still did not get any alerts. Changed to !=, again three alerts for "OK" disks.
I changed to != BLUBB (literally BLUBB) and still only got three alerts.

In my original message I mentioned that switching to "not like" triggered an immediate alarm. I should've been more thorough at the time; now I am wondering if I only got three alerts back then as well. I'm out of ideas. Is this some odd off-by-one problem? The row with the Predictive Failure is the first one.

Cheers

Victor Kirhenshtein

Hi,

can you please try with == OK and != OK?

Best regards,
Victor

Tursiops

Hi,

== OK resulted in three alerts for the "OK" ones.
!= OK resulted in no alerts.

System has four disks. The first one is in "Predictive Failure" status. The actual text is a simple string which is the result of using a transformation, i.e. I know what text is being stored as I defined it in that transformation. The fact that it displays text and not numbers means the transformation itself was also working.

Cheers,

Tursiops

I am starting to wonder if this is related to this one: https://www.netxms.org/forum/general-support/clearing-threshold-column-re-alerting/ ?

How can I, if necessary manually in the database, ensure that NetXMS doesn't think that it may have alerted on this already?

The issue is still present on my system.
First row is in Predictive Failure state for Smart Status and Status. No alert in the system that I can see. Any changes to alerting only affect the last three rows, not the first one.

Tatjana Dubrovica

Please check that the correct column is used as the instance column (key). In your case it could be index.

One more note - thresholds are checked once per row (if the first column of the second row violates its threshold, the other columns of the second row will not be checked).
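
To illustrate the "once per row" behavior, here is a minimal sketch in Python. It is only a model of the rule described above, not NetXMS code: the function names, the condition format and the sample rows are made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical threshold condition: (column name, operator, value to compare).
Condition = Tuple[str, str, str]


def matches(cell: str, op: str, value: str) -> bool:
    """Evaluate one simple string comparison."""
    if op == "==":
        return cell == value
    if op == "!=":
        return cell != value
    raise ValueError("unsupported operator: " + op)


def check_rows(rows: List[Dict[str, str]],
               conditions: List[Condition]) -> List[Tuple[int, str]]:
    """Return (row index, column) for each violating row, at most one per row."""
    hits = []
    for index, row in enumerate(rows):
        for column, op, value in conditions:
            if matches(row[column], op, value):
                hits.append((index, column))
                break  # first violation wins; remaining columns are skipped
    return hits


rows = [
    {"Smart Status": "Predictive Failure", "Status": "Predictive Failure"},
    {"Smart Status": "OK", "Status": "OK"},
]
conditions = [("Smart Status", "!=", "OK"), ("Status", "!=", "OK")]
print(check_rows(rows, conditions))  # [(0, 'Smart Status')]
```

Both columns of the first row violate their condition here, but only one hit is reported for that row.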

Updated.
Also please check that the "Smart Status" and "Status" columns have the String data type.

After fixing those two things (instance column and data type) I managed to get correct behavior.