One kind of trap that covers dozens of alarms - how to configure

Started by Yarda, November 19, 2012, 04:52:30 PM


Yarda

Hello geeks,

I've been playing with processing of traps and generation of events and alarms recently. All works as described. However, I am facing a situation that I cannot cover, or even work out in my head, with my basic knowledge of the framework.

For all the possible situations, our lovely device emits traps with the same trap OID and just by analyzing the content of these traps we know what is going on underneath... The device consists of many cards/subcomponents where each can have a different status at the same moment. One of the trap parameters tells exactly the current status of the subcomponent related to the trap... Identification of the subcomponent is another parameter. Of course there are more parameters inside each trap telling more about sub-subcomponents... :) (But these are probably not important at the moment.)

To describe my issue more practically I will say that I need to see in the Alarm browser (for example) that subcomponent A of my monitored device has a "minor" problem, B ~ "minor", C ~ "critical", no problem on D, and so on. Of course when I receive a trap clearing one of the problems, I need to remove its related alarm in NetXMS too (not from the log, of course; just from the list of active alarms.)

I tried experimenting with the Event Processing Policy panel, registering a set of policies for the single event that is issued upon receiving my special kind of trap.

I even tried to look at the NXSL (for the Filtering script)... whether that is the way to go or not. But that appeared to be a bit above my "101" lesson... :)

Would you have any suggestions for me, please?

Thank you,
Yarda.

Victor Kirhenshtein

Hi!

There are a few different ways to achieve the required functionality. The first is to create alarm message texts and keys based on the subcomponent. Let's assume that you have mapped the subcomponent identifier to parameter number 2 of your event. Then you can create alarms with a key like DEVICETRAP_%i_%2. With such a key, events related to different subcomponents will create separate alarms, and events related to a subcomponent with an already active alarm will replace the existing alarm's text, severity, etc. and update the repeat count.

Creating alarms of different severity is a bit more tricky: you will need a separate rule for each severity. The rules have to be identical except for the filtering script, which should check the severity in the trap. For example, if you have mapped severity to the event's parameter number 3, and it is a text string like "minor", "major", etc., you can use filtering scripts like this:


return $3 == "minor";


or simply


$3 == "minor"

(note the absence of the semicolon in the last case).
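
To also cover the clearing traps from the original question, one possible sketch is an additional rule whose filter matches the "all clear" status and whose alarm action terminates alarms with the same key, DEVICETRAP_%i_%2. This is an assumption rather than something confirmed in this thread, and the status string "normal" below is hypothetical; use whatever your device actually sends in parameter 3.

```nxsl
// Filter script for a hypothetical "clear" rule.
// Assumes the severity text is mapped to event parameter 3, as above.
// "normal" is an assumed value, not taken from the device's MIB.
return $3 == "normal";
```

Because the key contains %2, only the alarm for the subcomponent named in the clearing trap would be terminated; alarms for other subcomponents stay active.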

Another possible way is to create a single rule in the event processing policy that will call an NXSL script for the trap-based event and stop processing, and in that script create a different new event based on the trap parameters (using the PostEvent function).
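
A minimal sketch of such a script might look like the following. The CARD_STATUS_* event names are hypothetical and would have to be created in the event configuration first. The sketch also assumes that the script can see the originating event as $event, that the subcomponent identifier and severity text are mapped to parameters 2 and 3, and that the server accepts event names in PostEvent (older versions may need numeric event codes).

```nxsl
// Re-post a trap-based event as a severity-specific event.
// Assumed mapping: parameter 2 = subcomponent id, parameter 3 = severity text.
component = $event->parameters[2];
severity = $event->parameters[3];

// Pick a hypothetical event name for each severity reported by the device.
if (severity == "minor")
   name = "CARD_STATUS_MINOR";
else if (severity == "major")
   name = "CARD_STATUS_MAJOR";
else if (severity == "critical")
   name = "CARD_STATUS_CRITICAL";
else
   name = "CARD_STATUS_NORMAL";

// Post the new event on the same node, with no user tag,
// passing the subcomponent and severity on as its parameters 1 and 2.
PostEvent($node, name, null, component, severity);
```

With this layout each posted event can carry its own severity, so the alarm rules can simply use the severity of the event instead of needing one filtered rule per severity.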

And the hardcore way is to create a server module that will process traps and do whatever is needed: generate events, update object attributes, etc.

Best regards,
Victor

Yarda

That's promising. Thank you for the cookbook, I'll try your suggestions from the top. Hoping not to reach the hard-core... ;-)

Just a confirmation of my understanding. Every configured Event has a severity; this one will play no role in my scenario, will it? I am going to set my Event as Normal, to have a lot of relaxing green color in the Event Monitor. :)

Victor Kirhenshtein

Alarm severity may or may not be the same as the severity of the originating event. When you create an alarm from an event in the event processing policy, you have the option to use the same severity as the event or any other fixed severity.

Best regards,
Victor

Yarda

Yes, that's what I meant; in my case, where I am gonna generate multiple Alarms out of one Event, I will decide manually what their statuses will be. So my Event's status is nothing more than a color label in the Event Monitor.

Yarda

I was successful with my configuration of multiple event policies, and I've got a good feeling about it. Tho' there is one extension to my situation. Imagine that for some of my numeric trap parameters I need to see a written description in the alarm's message.

ID parameter is a good example in my case. We have around 100 possible ids... It would be nice to be able to map numbers to some hardcoded/configured string values.

So, I need to go for the hard-core way, at least partially, anyway, don't I...?

What approach would you recommend? I am thinking of tweaking (somehow :)) every relevant trap and adding some special parameter that I can refer to as $(X+1) (where X is the number of properly mapped parameters in my SNMP trap mapping config)...

Thank you for your opinion.

Yarda.

Yarda

Also, I noted, and cannot explain it myself (yet), that when I generate my alarms, I have CRITICAL, MAJOR, MINOR and NORMAL all triggered by one "NORMAL" event, only sometimes I get change of the node's status.

Only sporadically (but not rarely :)) I see the following events in the Event Monitor:

Major   SYS_NODE_MAJOR   Node status changed to MAJOR
Critical   SYS_NODE_CRITICAL   Node status changed to CRITICAL
Warning   SYS_NODE_WARNING   Node status changed to WARNING
Normal   SYS_NODE_NORMAL   Node status changed to NORMAL


Victor Kirhenshtein

Hi!

For showing text instead of a code in events and alarms, you can use a script in the script library and a macro like %[script_name]. For example, if you have some code as parameter number 5 in a trap-generated event, you can create a script called CodeToText in the library:


switch($5)
{
   case 1: return "CODE 1";
   case 2: return "CODE 2";
   // and so on...
   default: return "Unknown code " . $5;
}


and then use

%[CodeToText]

in the event and alarm message templates.

Best regards,
Victor

Yarda

If my script contains:

return "HELLO";

...I get this string properly in my alarm's message.

In case of your suggested code that I simplified (for test purposes) down to:

return "HELLO" . $3;

I get:


Script (CodeToText) execution error: Error 5 in line 1: Invalid operation with NULL value


...the same failure with any other parameters, like $2, $4, $5, $6 ... Only $1 doesn't complain, cos' it is an empty string.

Definition of my alarm's message follows:


Alarm Card%5 %4: %3 Status: %2, Direction: %8, Port Level: %6, Port Number: %7 %[CodeToText]


Did I misunderstand your comments?

Victor Kirhenshtein

Sorry, that was my mistake. A script called via %[] doesn't get the event's parameters in $1, $2, etc. Instead, you should use the $event variable to access event attributes. The Event class is described here: http://wiki.netxms.org/wiki/NXSL:Event. To access parameters, you can use the $event->parameters array. So, the correct code in my example would be


default: return "Unknown code " . $event->parameters[5];
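
Put together, the whole CodeToText library script with this correction applied might look like the sketch below. It assumes, as in the example earlier in the thread, that the numeric code is mapped to parameter 5, and it uses the same parameter indexing as the line above; the "CODE 1" and "CODE 2" texts are placeholders.

```nxsl
// Library script CodeToText, referenced from a message template as %[CodeToText].
// Event parameters are read through $event, because $1, $2, etc.
// are not set for scripts called via %[].
code = $event->parameters[5];

switch(code)
{
   case 1: return "CODE 1";
   case 2: return "CODE 2";
   // and so on...
   default: return "Unknown code " . code;
}
```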


Best regards,
Victor

Yarda

The $event->... works like a charm. And that seems to cover all my needs. I am a happy hippo.

Would you also have any advice on the previously posted sub-question:

Quote from: Yarda on November 21, 2012, 02:15:33 PM
Also, I noted, and cannot explain it myself (yet), that when I generate my alarms, I have CRITICAL, MAJOR, MINOR and NORMAL all triggered by one "NORMAL" event, only sometimes I get change of the node's status.

Only sporadically (but not rarely :)) I see the following events in the Event Monitor:

Major   SYS_NODE_MAJOR   Node status changed to MAJOR
Critical   SYS_NODE_CRITICAL   Node status changed to CRITICAL
Warning   SYS_NODE_WARNING   Node status changed to WARNING
Normal   SYS_NODE_NORMAL   Node status changed to NORMAL

Thank you,
Yarda.

Victor Kirhenshtein

Hi!

Can you describe this situation in more detail? I'm not sure that I understand it correctly...

Best regards,
Victor