Messages - Millenium7

#46
I noticed NetXMS will not parse syslog messages unless it can map the source IP address to a known node that is being monitored. Is there a way to change this behavior?
I have some scripts on customer routers that send syslog error messages, and I want to get alerts for them. Currently that's not possible unless I add the node to NetXMS, which I don't want to do.

Also, I've had some issues where a device sends from one source address, e.g. 1.2.3.4, but NetXMS is using 5.6.7.8 for that node, so it won't parse the syslog message because it thinks it doesn't belong to that node.

I understand that for syslog storage it has to know which node to put the messages under. But I mostly only care about realtime alerts, so I don't care if the syslog messages get dropped immediately afterwards. I just want the parser to work on them so I can push a notification to Slack.
#47
Awesome work, thanks guys
#48
General Support / Can't use multiple functions in script?
December 11, 2019, 06:16:49 AM
This is not working.

I created this as a script called SetInitialAttributes. It compiles and saves just fine with no errors:
// Set initial device attributes
// If an attribute does not exist, will create it with -1 as value
// If attribute already exists, will not change it


sub AirFiber()
{
   if (GetCustomAttribute($node,"TestVar1") == null) SetCustomAttribute($node,"TestVar1",-1);
}

sub AirMax()
{
   if (GetCustomAttribute($node,"TestVar2") == null) SetCustomAttribute($node,"TestVar2",-1);
}


Then I try and execute a function. I start off by testing with this

use SetInitialAttributes;
AirFiber();


Works fine

Then I try

use SetInitialAttributes;
AirMax();


And I get "Error 1 in line 13: Data stack underflow"
I tried switching the functions around so that AirMax() is on top i.e.

// Set initial device attributes
// If an attribute does not exist, will create it with -1 as value
// If attribute already exists, will not change it


sub AirMax()
{
   if (GetCustomAttribute($node,"TestVar2") == null) SetCustomAttribute($node,"TestVar2",-1);
}

sub AirFiber()
{
   if (GetCustomAttribute($node,"TestVar1") == null) SetCustomAttribute($node,"TestVar1",-1);
}



And it works fine when I call AirMax(), because it's the first function. But then I try calling AirFiber() and I get the same thing: error in line 13, data stack underflow.

Am I doing something wrong? Is it not possible to use more than 1 function in a single script?

Edit: I think this might be a bug. If I remove the 'SetCustomAttribute' command from both functions and replace it with something like 'println "test";' then I can run both functions just fine. But if I use 'SetCustomAttribute' in the 2nd function, it causes the data stack underflow error. I can place lots of 'SetCustomAttribute' commands in the first function and they all run just fine, but when there are 2+ functions, running it in anything other than the first function causes this error.
#49
Thanks. Scripts are potentially even better, that would give me enough flexibility
I'll have to think of a way to ignore a particular DCI on some nodes. I don't want to use maintenance mode because I want to ignore only 1 DCI for a node, but still get alerts about other serious problems

I'm thinking that for now I'll do it with a combination: a custom parameter, plus a monthly script that looks for any nodes with that custom value set higher than e.g. 999999, or set to -1, and emails me a summary report. That way I get a regular notification and can see whether any should be changed back to normal monitoring.
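As a rough illustration of the monthly report idea, here is a minimal Python sketch. It is not NetXMS code: the node names, the dictionary layout and the function names are hypothetical, and it assumes "ignored" means the custom value is either -1 or greater than 999999, which is one reading of the post.

```python
def is_ignored(custom_value):
    """Assumed convention: -1 or anything above 999999 means 'threshold disabled'."""
    return custom_value == -1 or custom_value > 999999


def build_summary(node_values):
    """Return a plain-text report listing nodes whose custom value marks them as ignored.

    node_values maps a (hypothetical) node name to its custom threshold value.
    """
    ignored = sorted(name for name, value in node_values.items() if is_ignored(value))
    if not ignored:
        return "No nodes have this DCI threshold disabled."
    lines = [f"{len(ignored)} node(s) have this DCI threshold disabled:"]
    lines += [f"  - {name}" for name in ignored]
    return "\n".join(lines)


# Example data; in practice the values would be read from the monitoring system.
example = {"LinkA-master": -60, "LinkB-master": -1, "LinkC-master": 1000000}
print(build_summary(example))
```

The report text could then be handed to whatever mail action is already in use; that part is left out because it depends on the installation.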

Are there any plans to have a simple 'ignore DCI thresholds' value implemented into the right click menu though?
#50
Additionally, is there an obvious way to disable threshold processing for a particular DCI? I know I can use sticky acknowledge to disable event processing of thresholds, but the warning/alert/error icon remains. I also know I can disable that DCI, but I don't want to do that. I want to continue collecting data and just stop all visible notifications for a particular DCI on a particular node, with some way to know that it's disabled.

The use case: maybe we have a link that we know is bad, but they haven't paid their bill, they aren't under contract, etc. We're happy to just leave it alone and don't want any alerts about it. Fast forward 6 months and we go and improve the link, chop down trees, do whatever. Now suddenly we do want alerting.
I don't want to have a difficult time 'resetting' the alerts on this node. Ideally I would just like a simple option when right-clicking on a DCI to "Disable threshold processing", with the colour changing from green to purple or something different so it stands out as 'this is monitored, but ignored', and without changing the entire node icon (since I do care whether the node is entirely down or suffering other issues, and want to see that).

Another use case: I'm about to implement FCS monitoring to detect cable errors. But there's a known issue with some vendors' equipment (Ubiquiti AirFiber 24 and Cambium PTP) + MikroTik routers that will report a false FCS error consistently every 30/60/90 seconds. I want to disable FCS monitoring on just those interfaces. But again, I need some way to know that it's disabled, so that later on, if we change the radio or use a different port, it's easy to see that e.g. ether2 FCS monitoring was disabled. Then I can re-enable it because that radio is no longer on that port, and I want FCS monitoring for the new radio to detect a cable error in the future.
#51
Is it possible to use custom parameter values or variables in thresholds?
Alternatively, if I use a template to apply DCIs and then manually edit the DCI threshold values on a particular node, will it still be linked to the template? Or will it be converted to a standalone DCI, so that any further changes to the template won't be applied on that node?

I want to be able to use templates for monitoring RSSI and SNR on radios, but enter known baselines as threshold values.
For example, LinkA is installed and we know that the master and slave are both -55dB RSSI on install; the master has 34dB SNR, the slave has 28dB SNR.
LinkB is installed and it's -68dB; the master has 20dB SNR, the slave has 35dB SNR.

I want different thresholds for each of these nodes. If LinkA drops to -60dB, that's bad: it's gone out of alignment, an obstruction has gone up, trees have started growing, etc. But LinkB would need approx -72dB as its threshold. We will never get it to -60dB, so that's not a suitable threshold and I can't apply it to every node. Likewise, they all have different known SNR margins. I want to alert on progressive changes over time, so we can schedule spectrum scans and pick a new frequency as new radios go up and cause more interference. I want to do this well before the link actually has an issue, not just wait until it starts dropping packets and customers start complaining.
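To make the per-link requirement concrete, here is a small Python sketch of the comparison being asked for. It is only a model of the logic, not a NetXMS feature: the dictionary of per-link floors and the function name are hypothetical, and the numbers are the ones given in the post.

```python
# Per-link RSSI floors taken from the post; the dictionary layout is hypothetical
# and stands in for a per-node value such as a custom attribute.
RSSI_FLOOR_DB = {"LinkA": -60, "LinkB": -72}


def rssi_breached(link, current_rssi_db, floors=RSSI_FLOOR_DB):
    """True when the link's RSSI has dropped to or below that link's own floor."""
    return current_rssi_db <= floors[link]


# LinkB sits at -68 on install, so a single shared floor of -60 would alarm
# permanently, while its own floor of -72 does not.
print(rssi_breached("LinkA", -61))  # LinkA has degraded past its floor
print(rssi_breached("LinkB", -68))  # LinkB at its install baseline is fine
```

The point of the sketch is that the comparison is identical for every link and only the floor differs, which is why a template with a per-node value would fit.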
#52
/Facepalm

Ok, the terminate event processing policy had no events defined. I'm thinking maybe I renamed or deleted the event and recreated it at some point, and didn't re-apply it to the terminate EPP. Hence it was 'always' triggering, as it had no match criteria. Added it and it's working.
However, I still think I needed to enter some number in the 'repeat event' setting in the DCI configuration/threshold for each entry.

Anyway, I've learned a few things along the way, and it's working now.
#53
Ok, so to troubleshoot this I've done 2 things.

Lowered the 'repeat interval' to 5 seconds. This seems to override everything. Even if the alarm is terminated it just will not occur again unless that timer is over and done with. So setting it to 5 seconds lets me do a Force DCI Poll on that entry and have the alarm re-trigger. Otherwise if that is set to 86400 (which I do want, but not for testing) then it will not trigger again whatsoever until 86400 seconds / 1 day has passed. Doesn't matter if you terminate/delete the alarm or not. You'd have to delete the node entirely and re-add it, or just lower the timer temporarily

The second thing I did was change the terminate event to a 'resolve' event, so it keeps the alarm instead of deleting it. I'm finding this triggers immediately and the alarm instantly resolves. When I disable that Event Processing Policy, the alarm stays in its intended state. So there is something wrong.
Maybe my Alarm Key syntax is wrong? TBH I find scripting incredibly frustrating and confusing at times due to a lack of simple, easy troubleshooting tools. For instance, I cannot see what the alarm key is. Maybe there's some convoluted way to find it through the console and filtering for it, but I can't just see it in the alarms tab. I'm thinking maybe TEMPALERT_%i_%<dciId> is wrong.
TBH I don't know exactly what %i and %<dciId> mean, or whether they are the correct variables to use. Maybe this is the problem.

What I want to happen is for 1 alarm to be created per temperature alert. Right now I have 5 temperatures I monitor on a node, and I want individual alarms to be created for each sensor/DCI rather than having them grouped together (which is what happens if I just use TEMPALERT_%i).
If I need to use a different syntax I'm happy to, but I don't know what to put. I thought this would be correct: I'm guessing %i is the node and %<dciId> would be the ID number of that particular DCI, so the key would be something like TEMPALERT_[NodeID]_[DCI]. That would create 1 alarm for each DCI, thus 5 different alarms if all 5 sensors had an issue, right?
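The reasoning about key uniqueness can be checked with a small Python model. This is not how NetXMS expands macros internally; it assumes, as the post guesses, that %i stands for the source node's ID and %<dciId> for the ID of the DCI that raised the event. The function name and the sample IDs are hypothetical.

```python
import re


def expand_key(template, node_id, params):
    """Replace %i with the node ID and %<name> with the named event parameter."""
    def substitute(match):
        name = match.group(1)
        if name is None:           # the bare %i macro matched
            return str(node_id)
        return str(params[name])   # a %<name> macro matched
    return re.sub(r"%(?:i|<(\w+)>)", substitute, template)


# Five temperature DCIs on one node (IDs are made up for the example).
dci_ids = [101, 102, 103, 104, 105]
per_dci = {expand_key("TEMPALERT_%i_%<dciId>", 123, {"dciId": d}) for d in dci_ids}
per_node = {expand_key("TEMPALERT_%i", 123, {"dciId": d}) for d in dci_ids}
print(len(per_dci), len(per_node))
```

Under those assumptions the first template yields five distinct keys and the second yields one, which matches the grouping behaviour described above. Whether a terminating rule matches depends on it producing exactly the same key as the rule that created the alarm.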
#54
I tried disabling the event processing rule that is associated with the 'clear temperature alert' and I still don't get alarms.

Note that I actually do get Slack notifications of high temperature. So it is triggering, it's just not adding itself as an alarm to the node.
I'm wondering if maybe there's some internal timer related to 'repeat interval' that's stuck and never repeating, thus not generating the alarm again. I wonder if there's a place I can see all pending timers and clear them if that's the case.
#55
I don't know how to troubleshoot this. I did have alarms working for a particular DCI but now they are not.
The DCI shows its over a threshold on the 'last values' page, but no alarm is ever generated anymore
Maybe they are stuck in some state and not allowing new alarms to be generated for the DCI (don't know, and don't know how to check this)

Here's the way I've set it up




#56
General Support / Re: CRON schedule doesn't work?
December 04, 2019, 10:18:54 AM
Would be nice to have that documented somewhere. I searched for ages and found zero mention of it. In fact, I found a few posts on these forums pointing to 'CRON generators', including the one that generated that schedule.
I'll test it out tomorrow.

Edit: Yes, using only 5 fields in the CRON schedule works as intended. Thank you. I just wish there was some information in the manual/guides, or at least a little ? symbol in Schedule areas that, when moused over, says something like "CRON format: [minute] [hour] [day of month] [month] [day of week], e.g. 5 3 * * * will run at 3:05am every day".
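The field layout quoted in the post can be illustrated with a short Python sketch. It is a simplified model rather than the NetXMS scheduler: it assumes exactly five space-separated fields, supports only `*` and plain numbers, and numbers Sunday as 0 as in traditional cron. All function names are hypothetical.

```python
from datetime import datetime

FIELD_NAMES = ("minute", "hour", "day of month", "month", "day of week")


def parse_schedule(expr):
    """Split a schedule into named fields, rejecting anything but five fields."""
    fields = expr.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(fields)}: {expr!r}")
    return dict(zip(FIELD_NAMES, fields))


def field_matches(field, value):
    """Only '*' and plain integers are handled; ranges, lists and steps are not."""
    return field == "*" or int(field) == value


def matches(expr, when):
    """True if the datetime satisfies every field of the schedule."""
    schedule = parse_schedule(expr)
    actual = {
        "minute": when.minute,
        "hour": when.hour,
        "day of month": when.day,
        "month": when.month,
        # isoweekday() numbers Sunday as 7; cron traditionally numbers it as 0
        "day of week": when.isoweekday() % 7,
    }
    return all(field_matches(schedule[name], actual[name]) for name in FIELD_NAMES)


print(matches("5 3 * * *", datetime(2019, 12, 4, 3, 5)))
print(matches("* * * * *", datetime(2019, 12, 4, 10, 18)))
```

In this model a seven-field expression such as the one produced by the generator, `0 * * ? * * *`, is rejected by `parse_schedule` with a `ValueError` instead of being silently ignored.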
#57
General Support / CRON schedule doesn't work?
November 29, 2019, 04:15:52 AM
I've created a scheduled task to run a script. I set CRON schedule to 0 * * ? * * * as a test to run the script every minute, but it never runs

Is there something I'm doing wrong? Do I need to turn on CRON scheduling in server options or something?
There's no problem in the script, as it runs fine with one-time execution (sends me an email), but if I use CRON it doesn't run at all.

NetXMS version is 3.0.2329

Edit: Tried creating a dummy node and adding the script as a polling method with a CRON schedule; that doesn't work either. Yet setting the polling interval to e.g. 60 seconds works fine. So it seems CRON is completely broken in my instance, or I'm doing something wrong that I'm just not seeing.
#58
Web UI
3.0.2329
#59
Ah, found the issue
The DCI is not actually missing. It's just obscured by the bottom half of the 'Filter' header. If you look very closely at the photo on the right, you can see a little bit of black text. If I click on that pixel or use the up arrow, it will select that DCI entry.
So this is a bug with the UI not laying itself out correctly. I imagine it's placing the list assuming a single-row header like in the first photo, but the filter header is 2 rows tall and hence obscures the first entry.
#60
Experimenting with template graphs (I have never used them before). I've created one with a regular expression as the template source, right-clicked on a node and chose graph->test, and it simply says
"Get last values of [node name] has encountered a problem
Not possible to get last values for [node name]: Invalid thread access"

It does this with any node I choose to apply the graph to. I even tried getting rid of the template sources entirely so it's just a blank graph, and it still does the same thing. I can't use graphs on anything.
Have checked my user profile and I have full privileges to everything