Messages - Mystery

#1
Hello, Ok:)

Here is the Offline EPP rule for SYS_NODE_DOWN (event SYS_NODE_DOWN, with source objects defined):



+ Alarm




EPP for SYS_NODE_UP Alarm:



+ Server Actions:



+ Timer cancellations for all 3 delay timer keys.

I think the problem is that SYS_NODE_DOWN never arrives first; I only get SYS_NODE_UNREACHABLE and SYS_ICMP_UNREACHABLE.
#2
Hello, I have the same problem. The SYS_NODE_DOWN emails are sent after a few polls, and the SYS_NODE_UP rule uses a blocking timer key, so the result is that I am getting spammed with SYS_NODE_UP emails.


#3
Hello, yes, the timer was running. I think I found the issue: I had used the same timer name in two policies, so the first policy cancelled the timer and the second one sent the email. Dumb error. Thank you :)
#4
This one is bugged way more.

Offline email:
Condition: IF source object is one of the following: Ustredna.CPU1.Policka AND SYS_NODE_DOWN



Online email:
Condition: IF source object is one of the following: Ustredna.CPU1.Policka AND SYS_NODE_UP



This is what happened with node:



And we got only ONLINE emails :-(



There should be a 180 sec delay with timer key NODE_DOWN_NOTIFICATION_%i.
That part behaves correctly: the offline email is not received.
BUT the "Do not run if timer with key NODE_DOWN_NOTIFICATION_%i is active" condition is not working, and the online email was sent :-(

Any idea? Thank you.
#5
Thank you. I will apply EPP to the container — that should do the trick.
The DCI thresholds are working fine and repeatable without any issues, but sending repeated emails via EPP could be a problem. I don't want to configure 100 server actions with hourly delays.
I'll try to handle that part with a script. Thanks again.
By the way, there might be a bug in the SYS_NODE_DOWN event.
I followed the documentation on how to send an email when a node goes down and then comes back up.
Specifically, this part:
Quote: If, in addition, we want to send a notification when a node comes up, but only if a notification about it going down was sent: https://netxms.org/documentation/adminguide/_images/delayed_action_2.png
Somehow, I'm receiving "ONLINE" emails even though there was no "OFFLINE" event.
For example, in one case, there was no SYS_NODE_DOWN event at all, only SYS_NODE_UP, which triggered the email.
The blocking part, NODE_DOWN_NOTIFICATION_%i, doesn't seem to be working.
This node has only ICMP polling.
Do you have any idea how to fix this?

Thank you:)
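
To make the expected behaviour concrete, here is a minimal toy model of the delayed-action scheme described above. It is only a sketch, not NetXMS code: the class `TimerModel`, the function `run`, the node name `node1`, and the order of "check blocking key, then cancel timer" are all assumptions made for illustration. In this simplified model, an UP event that arrives without any earlier DOWN event is not blocked, because no timer with that key exists at that moment.

```python
class TimerModel:
    """Toy model of delayed actions keyed by a timer key (not NetXMS code)."""

    def __init__(self):
        self.now = 0
        self.timers = {}  # timer key -> (fire time, email text)
        self.sent = []    # emails "sent" so far, in order

    def advance(self, seconds):
        """Move the clock forward and fire every timer that has expired."""
        self.now += seconds
        for key, (fire_at, message) in list(self.timers.items()):
            if fire_at <= self.now:
                self.sent.append(message)
                del self.timers[key]

    def node_down(self, node_id, delay=180):
        """DOWN rule: schedule the OFFLINE email behind a delay timer."""
        key = f"NODE_DOWN_NOTIFICATION_{node_id}"
        self.timers[key] = (self.now + delay, f"OFFLINE {node_id}")

    def node_up(self, node_id):
        """UP rule: send ONLINE unless the timer is still active, then cancel it."""
        key = f"NODE_DOWN_NOTIFICATION_{node_id}"
        if key not in self.timers:  # "do not run if timer with key is active"
            self.sent.append(f"ONLINE {node_id}")
        self.timers.pop(key, None)  # timer cancellation


def run(events):
    """Replay (event, seconds to wait afterwards) pairs and return the emails sent."""
    model = TimerModel()
    for event, wait in events:
        if event == "down":
            model.node_down("node1")
        elif event == "up":
            model.node_up("node1")
        model.advance(wait)
    return model.sent


short_outage = run([("down", 60), ("up", 0)])   # back up before the 180 s delay expires
long_outage = run([("down", 300), ("up", 0)])   # delay expires while the node is down
up_only = run([("up", 0)])                      # UP arrives with no DOWN before it

print(short_outage)
print(long_outage)
print(up_only)
```

If the real server evaluates the rule the same way, the third case would match what is described here: when only SYS_NODE_UNREACHABLE is generated and SYS_NODE_DOWN never is, the timer is never started and the ONLINE email goes out.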



#6
Hello,
Thank you for the link to the documentation. It seems that EPP is a much better approach.
May I ask a few more questions?
  • If I want to assign source objects to a list of objects, is that possible? Or do I have to select them one by one?
  • Is it possible to select an entire container with nodes in the Infrastructure section?
  • In the Server Actions section, I have a mail action with a delay. Can I configure it to repeat, for example, every hour while a node is down?
Thank you for the information :)
#7
Hello, I have a few more questions. When SYS_NODE_DOWN triggers the EPP to send an email, is it possible to send it only after some time, for example once the node has been down for 3-5 polls (3-5 min), instead of sending this notification instantly (or merely delayed)? I would like to prevent spam when a node is flapping.

Is there any option to send an email from EPP when a node has been down (for example for 60 min) and then comes back online? For example, the device is offline for 5 min (send an email), then the device comes online (send me an email about that as well).

I still think Data Collection Configuration from a template bound to nodes is better: I can poll these values faster and trigger a deactivation event when the node is online. Like this one:



Can I get SYS_NODE_DOWN into Data Collection Configuration as an internal metric?
Thank you for your help.
#8
Hello,

can you help me figure out how to send emails from Data Collection Configuration when a node goes down?
I have a mailing template with some collected parameters.
I would like to set up a parameter with a threshold on something like Internal: ICMP ping: packet loss that fires when it is >99%, but this takes too long (when a node goes down, the value climbs by something like 1, 3, 5% per poll). Is there a better approach?
Some nodes have only ICMP ping (no NetXMS agent or SNMP). When I browse metrics to collect from the Internal origin, there is only ICMP ping, which doesn't work well.

Thank you:)
#9
Thank you, it helped:)
#10
I tried disabling (+save) and re-enabling (+save) and the EPP rule is still not working. If you don't mind, a database query would be helpful.
Thank you:)
#11
OK, thank you for the info.
Is there any way to fix these badly imported EPPs, such as disabling and re-enabling them? Or should I delete them and create them again?

I couldn't upgrade the server; there were many errors in the upgrade process, so I had to do it this way.
#12
MariaDB [netxms]> select rule_id, flags, comments from event_policy where rule_id in (59, 78);
+---------+-------+-----------------------------------------+
| rule_id | flags | comments                                |
+---------+-------+-----------------------------------------+
|      59 |  7936 | Network Device is OFFLINE - Send E-mail |
|      78 |  7936 | Test new                                |
+---------+-------+-----------------------------------------+
2 rows in set (0,001 sec)
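
As a side note, the flags value can be broken down into its individual bits with a few lines of Python. This sketch only does the arithmetic; the helper name `set_bits` is made up for illustration, and what each bit means would have to be checked against the NetXMS source for the server version in use. What it does show is that the working and the non-working rule carry exactly the same flags.

```python
def set_bits(value):
    """Return the positions of the bits set in value, lowest bit first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# rule_id -> flags, copied from the query output above
flags = {59: 7936, 78: 7936}

for rule_id, value in flags.items():
    print(rule_id, hex(value), set_bits(value))

# True when every rule has the same flags value
same_flags = len(set(flags.values())) == 1
print(same_flags)
```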
#13
Yes, the XML (from version 2.2) of this rule is here:
Do you need some more details?

<rule id="5">
<guid>0cdde918-dc08-4096-9418-0197355b2833</guid>
<flags>7936</flags>
<alarmMessage>%m</alarmMessage>
<alarmKey></alarmKey>
<alarmSeverity>5</alarmSeverity>
<alarmTimeout>0</alarmTimeout>
<alarmTimeoutEvent>43</alarmTimeoutEvent>
<script></script>
<comments>Network Device is OFFLINE - Send E-mail</comments>
<sources>
</sources>
<events>
<event id="100032">
<name>Network_Device_not_OK</name>
</event>
</events>
<actions>
<action id="16">
<guid>9b8807fa-f8d4-4ad7-ac5d-61f85cd27795</guid>
<timerDelay>
0</timerDelay>
<timerKey>
</timerKey>
</action>
</actions>
<timerCancellations>
</timerCancellations>
<pStorageActions>
</pStorageActions>
</rule>
<event id="100032">
<name>Network_Device_not_OK</name>
<guid>06cdfff3-bce6-4bbd-8c02-0a444a29e90f</guid>
<code>100032</code>
<severity>4</severity>
<flags>1</flags>
<message>NetXMS - Cisco Router/Switch/ASA IS OFFLINE!</message>
<description></description>
</event>
#14
I exported all EPPs from the old version of NetXMS and imported them into the new NetXMS instance.
Somehow, the imported ones are not working (not sending emails).
I created a new EPP with the same definitions and the email is sent.



As you can see, rule 60 is not working while rule 79 is working. Strange.
I also see a new row in the notification log.
There is no rule with the "stop event processing" checkbox set.
#15
In the notification log, there are only rows from test events.
In the event log, there is the correct event, which should send these emails as the event processing policy defines.

{
  "id": 11134,
  "rootId": 0,
  "code": 100074,
  "name": "Network_Device_not_OK",
  "timestamp": 1747638739,
  "originTimestamp": 1747638739,
  "origin": 0,
  "source": 356,
  "zone": 0,
  "dci": 1137,
  "severity": 4,
  "message": "NetXMS - OFFLINE :-( ",
  "lastAlarmKey": "",
  "lastAlarmMessage": "",
  "tags": null,
  "parameters": [
    {
      "name": "dciName",
      "value": "ICMP.PacketLoss"
    },
    {
      "name": "dciDescription",
      "value": "ICMP ping: packet loss"
    },
    {
      "name": "thresholdValue",
      "value": "99"
    },
    {
      "name": "currentValue",
      "value": "100"
    },
    {
      "name": "dciId",
      "value": "0x00000471"
    },
    {
      "name": "instance",
      "value": ""
    },
    {
      "name": "isRepeatedEvent",
      "value": "1"
    },
    {
      "name": "dciValue",
      "value": "100"
    },
    {
      "name": "operation",
      "value": "4"
    },
    {
      "name": "function",
      "value": "0"
    },
    {
      "name": "pollCount",
      "value": "5"
    },
    {
      "name": "thresholdDefinition",
      "value": "last(5) \u003e 99"
    }
  ]
}
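
For anyone reading the dump above, a few fields are easier to interpret once parsed. The sketch below loads a shortened copy of the event (only three of the parameters, to keep it brief) with Python's standard `json` module; the variable names are arbitrary. It shows that `\u003e` is just the JSON escape for `>`, and that the hexadecimal `dciId` parameter is the same number as the top-level `dci` field.

```python
import json

# Shortened copy of the event above; the raw string keeps \u003e as a JSON escape
fragment = r'''
{
  "dci": 1137,
  "parameters": [
    {"name": "dciId", "value": "0x00000471"},
    {"name": "isRepeatedEvent", "value": "1"},
    {"name": "thresholdDefinition", "value": "last(5) \u003e 99"}
  ]
}
'''

event = json.loads(fragment)

# Turn the name/value list into a dictionary for easy lookup
params = {p["name"]: p["value"] for p in event["parameters"]}

threshold = params["thresholdDefinition"]  # escape decoded by the JSON parser
dci_from_hex = int(params["dciId"], 16)    # int() accepts the 0x prefix with base 16

print(threshold)
print(dci_from_hex, event["dci"])
print(params["isRepeatedEvent"])
```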