Show posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.

Messages - Millenium7

#1
If you are going to improve the maps I highly suggest the ability to 'bend' links in order to visualize multiple links, as well as be able to draw links on a map that don't get in the way (2 straight lines between distant routers on a map might go through several other routers and be very confusing, easy to just bend it around the outer edge of the map)

Links should also be able to display bandwidth in both the Tx and Rx direction. I use this on asymmetrical links to balance them better or to know when to upgrade them
Please have a look at the Weathermap tool, as it's quite well done and has a simple config syntax. It's widely used by other monitoring platforms because of this, but it uses RRD files to pull the SNMP data. If NetXMS could have another plugin that can dump SNMP data into RRD files, then it's easy to just run a separate Weathermap instance that pulls from these files. But ideally I'd also like to have it integrated so I can directly add nodes on the visual map, pick them from a drop-down list with appropriate DCIs, etc
#2
The primary idea is 'at a glance' monitoring that is so simple that anyone can make sense of it
If correctly defined, then any links shown in orange or red are a problem, especially if they are constantly red


I'll give you 3 real use case examples

1) Site to Site Links: Simple active/backup where the primary might be a 1gbit link and the secondary is 200mbit/s. I expect the secondary link to always be near idle. Very easy to spot on the map if the backup is in use as it'll almost certainly be red, and the big fat primary link is shown with no traffic going over it

2) Traffic Engineering: I have 2x 1gbit and 1x 500mbit links between a site. These are balanced, not bonded (too much CPU usage). So traffic will never be perfectly split between them and it relies on adjusting the traffic flow ratios to get the best outcome. I'm seeing LinkA often sitting at 200mbit/s, LinkB at 850mbit/s and LinkC at 350mbit/s. I might glance at the map periodically over a few hours/days and it's pretty evident that I need to change the balancing ratio to something like 4:2:1 instead of 2:2:1

3) Upstream link failures: Combination of both of the above. 2+ primary links go down somewhere in the network and cause the network to be split. It's not actually 'down' and no major alerts are fired because we have backup Layer2 links through other carriers (they are visualized as a direct link between routers on a map) but they are low bandwidth, they should never be used unless absolutely necessary and in this case they're the only viable transit paths. They light up red because they are saturated, all other links in that down network segment are showing very low bandwidth usage because those upstream links can't keep up



Some of this can be done with traditional monitoring, but the idea is that it's a very simple and highly effective tool that doesn't require complicated thresholds/alerting/monitoring to be set up. Just add all of the links and define the bandwidth, and anybody can glance at the network map and see roughly what's going on and whether there are problems, without diving into the network
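
To make the 'define the bandwidth and color the link' idea concrete, here is a minimal Python sketch. It is not Weathermap or NetXMS code; the function names and the 60%/85% color bands are assumptions picked for illustration only.

```python
# Hypothetical color bands: (lower bound in percent, color). Adjust to taste.
BANDS = [(85.0, "red"), (60.0, "orange"), (0.0, "green")]


def link_utilization(rate_mbps, capacity_mbps):
    """Return utilization as a percentage of the defined link bandwidth."""
    if capacity_mbps <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * rate_mbps / capacity_mbps


def link_color(tx_mbps, rx_mbps, capacity_mbps):
    """Color Tx and Rx separately so asymmetric load stays visible."""
    def color(pct):
        for lower, name in BANDS:
            if pct >= lower:
                return name
        return "green"

    return (color(link_utilization(tx_mbps, capacity_mbps)),
            color(link_utilization(rx_mbps, capacity_mbps)))


# Use case 1: the 200 Mbit/s backup link is carrying failover traffic
# while the 1 Gbit/s primary sits idle.
backup = link_color(190, 40, 200)
primary = link_color(5, 2, 1000)
print(backup, primary)
```

With these assumed bands the backup link shows red in the Tx direction and the primary shows green in both, which is the 'at a glance' signal described above.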
#3
Is there a way to get Weathermap working with NetXMS?
https://www.network-weathermap.com/


I have not (yet) found anything better than the old Weathermap tool for visualizing network utilization. I would prefer to consolidate and run this directly with NetXMS if possible - but am also open to other map visualization tools I may not have heard of
The biggest thing for me that I absolutely need is the ability to display multiple links between nodes/routers, which Weathermap can do by simply bending the links. It is vital to see throughput on multiple transit paths; a single link between nodes is just not useful
#4
Alternatively, can a transform script be used? I'm not sure how the polling engine works but if I did an SNMP poll every second, can I use something simple like

if ($1 == null) return -1;
return 1;

Will null (or any other value) be returned in the case of a failed poll? Or does NetXMS not consider it 'polled' until a response comes back, and thus will not log anything in history unless it actually receives a result?
#5
Occasionally I have a need to troubleshoot devices with very rapid pings/polls (at least once a second, ideally every 100ms) and need to be able to see every dropped packet. I want to override the default behavior on that DCI so if it doesn't receive a response within a set period of time, it doesn't just ignore it and try again on the next poll. It will still log the result (simple transform script to show -1 or some other value would be fine) so it easily stands out on a line graph and in the history

Is this possible to do?
#6
I want to create a DCI on nodes that have leased fiber lines. We pay for a particular speed, e.g. 500/500mbit, but the line is actually burstable much higher, e.g. to 10gbit/10gbit. We do not pay for bursting, however we do get billed significantly extra if we exceed the provisioned speed for more than 5% of the billing cycle

I can see there is an example NetXMS script for 95th percentile packet loss, but this isn't directly applicable
I'm already polling interface speeds every 5 minutes. How can I create another DCI that can calculate the average speed so far in the current month, and then use thresholds to alert at 50%, 85%, 95% etc?
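
Since the extra charge depends on how much of the cycle is spent above the provisioned speed, one way to frame the calculation is to count over-limit polls against the 5% allowance. The Python sketch below only illustrates that arithmetic. It is not NXSL and uses no NetXMS API, and the function names, sample data and billing dates are made up.

```python
from datetime import datetime, timedelta


def burst_budget_used(samples_mbps, committed_mbps, cycle_start, cycle_end,
                      poll_interval=timedelta(minutes=5), allowed_percent=5):
    """Fraction of the burst allowance consumed so far in this billing cycle.

    samples_mbps holds the throughput readings collected since cycle_start.
    A result of 1.0 means the allowance (5% of the cycle) is exhausted.
    """
    total_polls = (cycle_end - cycle_start) / poll_interval
    allowed_polls = total_polls * allowed_percent / 100
    over = sum(1 for s in samples_mbps if s > committed_mbps)
    return over / allowed_polls


def alert_level(used, thresholds=(0.5, 0.85, 0.95)):
    """Return the highest threshold crossed, or None if below all of them."""
    crossed = [t for t in thresholds if used >= t]
    return max(crossed) if crossed else None


# Hypothetical 30-day cycle: 8640 five-minute polls, so 432 may exceed 500 Mbit/s.
start, end = datetime(2024, 6, 1), datetime(2024, 7, 1)
samples = [400] * 1000 + [700] * 216   # 216 polls above the committed rate
used = burst_budget_used(samples, 500, start, end)
level = alert_level(used)
print(used, level)
```

In this example half of the allowance is gone, so the 50% alert would fire. Inside NetXMS the same counting would have to be done by a script DCI over the stored history of the interface DCI.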

#7
Memory usage by NetXMS is a linear upward usage pattern. And System Available Physical memory is exactly the opposite. So it definitely appears like a memory leak





Database writes are pretty consistent and spike up to 1.4k and then back down to 0, only very occasionally going higher (highest was 3.4k, just a one off)

What's interesting is that 'Agent communications: unsupported requests' and 'Agent communications: failed requests' perfectly correlate with the memory usage graph. They only go up, never down (until the server is restarted)
Vast majority of what I monitor is through SNMP. I don't use any other NetXMS agents - just the server itself

Could the agent be continually holding open failed connections indefinitely?
#8
4.2.461 (problem has been happening for a few versions now)

Have just updated and rebooted the server but will get back to you


NetXMS Server, agent, webGUI and DB all running on same server (has been fine for years though)

The NetXMS server process memory usage is interesting. I only have retention going back 31 days so it doesn't paint a full picture (and I've only restarted it once in the past month, it was down for a few days), but it appears to be very steadily rising until it crashes. It's a very linear climbing line
#9
This might seem obvious.... add more memory. Thing is, I've bumped the VM from 4GB (which was fine a few months ago) up to 8GB without adding much more to the network, and it's still running out of memory after a few days. So I'm unsure if there's a memory leak, or a patch has made NetXMS consume a whole lot more memory all of a sudden, or what.....

This is on a hosted instance so memory upgrades are not cheap
How do I go about troubleshooting and finding out why NetXMS is consuming so much memory?
Otherwise are there any obvious things to look for in my main config or polling templates to adjust to bring the memory usage down?
#10
General Support / SysLog Parser not working after update
November 08, 2022, 07:57:19 AM
Not sure if this is solely because of an update, but I went from 4.0.x to the current latest (4.2.395) and shortly after I noticed I was not getting SysLog messages via Slack

This was working perfectly before and I've not changed anything in the parser
I can definitely see SysLog messages in NetXMS by right clicking on a node and choosing Logs->SysLog, so they are still being received just fine, but the parser doesn't seem to be doing anything

In SysLog Parser I have 'Always process all rules' ticked (always have)

As an example, the very first rule is...

system,error,critical login failure for user (.*) from (.*) via (.*)

And to generate an event

This matches perfectly with an actual SysLog message - and has been working for years
i.e.
Quote: system,error,critical login failure for user [email protected] from 1.2.3.4 via winbox
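
As a sanity check of the expression on its own, the rule can be run against a sample line outside NetXMS. This Python sketch uses the `re` module, which is not the regex engine NetXMS uses, and the username `someuser` is a placeholder.

```python
import re

# The first parser rule, copied as-is.
PATTERN = r"system,error,critical login failure for user (.*) from (.*) via (.*)"

# Sample message with a placeholder username.
message = ("system,error,critical login failure for user someuser "
           "from 1.2.3.4 via winbox")

match = re.search(PATTERN, message)
fields = match.groups() if match else None
print(fields)
```

The three capture groups come back as the user, source address and service, so the expression itself matches this kind of line. That only rules out the pattern. It says nothing about how the parser is applying it after the update.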

That event does not appear to be created though
If I go to View->Event Log it's not there

Has something changed? bug?
#11
Can another platform use data from NetXMS by reading values from NetXMS via SNMP? Is this possible? Or only via API?
#12
Thanks, would be great if this was prioritized as this will massively enhance the potential of NetXMS. One of its greatest strengths for us is the ability to very rapidly run summary DCIs to get all sorts of useful information in a simple clean page. Being able to then select multiple nodes out of compliance and simply right click->tools->run SSH script to resolve issues or push mass config updates is very powerful

Ideally I'd like to see further updates to SSH down the track, such as credentials list, config backups, diff notifications (against last, or template baseline). But for now a simple working SSH command execution would be nice  ;D
#13
So this has a problem. Going by https://track.radensolutions.com/issue/NX-1649 I imagine the command 'executeSSHCommand' was implemented to get around the 1024 character limit. Is this correct, or is there another command? The bug tracker doesn't say which command was implemented

But it seems there's actually a ~232 character limit on executeSSHCommand. At least I can't get it to run anything at all when I exceed that

command = (":log info \"this is line01\"\n".
":log info \"this is line02\"\n".
":log info \"this is line03\"\n".
":log info \"this is line04\"\n".
":log info \"this is line05\"\n".
":log info \"this is line06111111111111111111\"\n".
"");

$node->executeSSHCommand(command);


This is the longest possible string I can pass to the command; 1 character longer and nothing executes at all. Is this a bug or am I just doing this whole thing incorrectly?
I have some scripts that are well over 2500 characters in length so this is a big problem
#14
I think I figured this out. Edit: nope, there's a problem, see next post


$node->executeSSHCommand("log warning \"This is line1\"\n".
"log warning \"This is line2\"\n".
"log warning \"This is line3\"\n");


It can also be passed inside a string parameter. But the rules are
1) all quotation marks to be passed need to be escaped with a '\'
2) the end of each line needs to have \n inside the quotations (otherwise they just continue on the same line i.e. "log warning This is line1log warning This is line2")
3) each line needs to end with . to concatenate the lines together

It's not the end of the world, but it's a lot of manual manipulation of script files. Would be nice if I could just paste text blocks inside parentheses, but I imagine this is a limitation of the NetXMS scripting language. If there is a better/cleaner method I'm all ears

edit: to make life easy, in notepad++ paste the script then do the following...
search->replace, pick extended mode
1)
find what: "
Replace with: \"
replace all

2)
find what: \r\n
replace with: \\n".\r\n"
replace all

3) add (" to beginning of first line

4) add "); to very last line

Now copy/paste into NetXMS

This is fast and effective enough for me, some of my scripts that I want to use NetXMS with are well over a hundred lines long. Doing this manually would have been a flapping nightmare
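
The same conversion can be scripted instead of done in Notepad++. Below is a rough Python equivalent of the steps above. The function name is made up and, like the manual steps, it only escapes double quotes, so backslashes already in the script are left alone. The one difference is that the last line also gets a trailing \n.

```python
def to_nxsl_literal(script: str) -> str:
    """Turn a multi-line device script into a concatenated NXSL string argument.

    Mirrors the manual steps: escape double quotes, end every line with a
    literal \\n inside the quotes, join the lines with '.', wrap in ( and );.
    """
    quoted = []
    for line in script.splitlines():
        escaped = line.replace('"', '\\"')        # step 1: " becomes \"
        quoted.append('"' + escaped + '\\n"')     # step 2: add \n and close quote
    return "(" + ".\n".join(quoted) + ");"        # steps 3 and 4: wrap


example = 'log warning "This is line1"\nlog warning "This is line2"'
converted = to_nxsl_literal(example)
print(converted)
```

The printed text can be pasted straight after `$node->executeSSHCommand` in a NetXMS script.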
#15
Is there a way for NetXMS to try multiple credentials rather than just the single one stored in properties->communications->SSH?
It's not feasible to enter credentials for every device manually, nor to manually change them periodically
Most will use common LDAP username/password but many are individual. I'd like to create a list of all known credentials and it will try until it finds a match then stores it

Is there a way to have a list of credentials and have NetXMS try each of them sequentially until it can successfully login? The same as SNMP credentials are enumerated until one is found, then store it
(not enumerate the entire list every time a command is run, because it will generate a ton of invalid login attempt error messages)

And if the password changes down the track and stored credentials are no longer valid, how can I have NetXMS notify me when I try and run scripts against a bunch of nodes that some of them failed because of incorrect details?
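
To illustrate the behavior I'm after (try each known credential once, remember the one that works, and report nodes where nothing works), here is a small Python sketch. It is not how NetXMS works today. `try_login` is a hypothetical stand-in for a real SSH login attempt, and all names and credentials are made up.

```python
from typing import Callable, Dict, List, Optional, Tuple

Credential = Tuple[str, str]  # (username, password)


def find_working_credential(
    node: str,
    candidates: List[Credential],
    cache: Dict[str, Credential],
    try_login: Callable[[str, Credential], bool],
) -> Optional[Credential]:
    """Return a working credential for node, trying the stored one first."""
    cached = cache.get(node)
    if cached is not None and try_login(node, cached):
        return cached                      # stored credential still valid
    for cred in candidates:
        if cred == cached:
            continue                       # already tried above
        if try_login(node, cred):
            cache[node] = cred             # store it for the next run
            return cred
    cache.pop(node, None)                  # nothing works any more
    return None


# Fake login function standing in for a real SSH attempt.
valid = {"router1": ("admin", "s3cret")}
attempts = []


def fake_login(node, cred):
    attempts.append((node, cred))
    return valid.get(node) == cred


known = [("ldapuser", "pw1"), ("admin", "s3cret")]
cache: Dict[str, Credential] = {}
find_working_credential("router1", known, cache, fake_login)   # walks the list
find_working_credential("router1", known, cache, fake_login)   # uses the cache
failed = [n for n in ["router1", "router2"]
          if find_working_credential(n, known, cache, fake_login) is None]
print(failed, len(attempts))
```

The full list is only walked when there is no stored credential or the stored one stops working, which keeps the invalid login noise down, and `failed` is the list of nodes that would need a notification.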