Messages - paul

#91
Well - here is the first cut of my Windows generic template. It includes the events that are triggered.

For explanation of the thresholds that apply in this template, they are implemented as per here:
https://www.netxms.org/forum/general-support/simple-question-cdm-monitoring-using-snmp-anybody-doing-this/msg25843/#msg25843

#92
Now that I have enough things added that I would not want to have to start again - what is the consensus on backups?

Just the DB? And if so, is there a standard command to use - or a sequence of commands?

The closest I found was this
https://www.netxms.org/forum/installation/netxms-server-backup-and-restore-process/

So for my postgres on Windows, it would be

1) Backup netxmsd.conf / nxagentd.conf
2) Backup /netxms (keep everything - much easier)
3) Backup database (pg_dump)
4) Backup copy of DB and NetXMS install files (for bare metal restore)

Restore procedure is:
1) Restore or recreate config
2) Restore files, if any
3) Restore database
4) Check data consistency: "nxdbmgr check"

Bare Metal restore:
1) Build server
2) Install DB - same DB as old
3) Install NetXMS - same version as old
4) Restore config
5) Restore DB
6) Check data consistency: "nxdbmgr check"

All good!!

Is this what others are doing?
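
The backup and restore steps above can be sketched as a small Python helper that only builds the command lines, so they can be reviewed before anything is run. This is a sketch under assumptions, not a tested procedure: the database name, user, and directories are hypothetical, it assumes netxmsd.conf and nxagentd.conf live under the NetXMS install directory, and it assumes a custom-format pg_dump is acceptable. Only "nxdbmgr check" comes from the steps above.

```python
from datetime import date
from pathlib import PureWindowsPath


def backup_plan(db_name, db_user, backup_dir, netxms_dir, day=None):
    """Return the backup commands as argument lists without running them."""
    day = day or date.today()
    target = PureWindowsPath(backup_dir)
    dump_file = target / f"{db_name}-{day:%Y%m%d}.dump"
    return [
        # Steps 1 and 2: copy the whole NetXMS directory (assumed to hold the configs).
        ["robocopy", str(PureWindowsPath(netxms_dir)), str(target / "netxms"), "/E"],
        # Step 3: custom-format dump that pg_restore can read back.
        ["pg_dump", "-U", db_user, "-F", "c", "-f", str(dump_file), db_name],
    ]


def restore_plan(db_name, db_user, dump_file):
    """Return the restore commands: load the dump, then check consistency."""
    return [
        ["pg_restore", "-U", db_user, "-d", db_name, str(dump_file)],
        ["nxdbmgr", "check"],
    ]


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    for cmd in backup_plan("netxms_db", "netxms", r"D:\backup", r"C:\NetXMS"):
        print(" ".join(cmd))
```

Each list can be passed to subprocess.run once the paths have been checked. Stopping the NetXMS server before the dump is probably the safer choice, but that is not covered here.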
#93
Great work Tursiops - fantastic contributions!!

DRAC is on my list as we have a couple of hundred of them scattered around. As many have links about as fast as a tin can and some string, we were judicious in what we monitored. Funny thing - again - we pick things not included normally such as LCD status - our global OK / NOTOK item.

Will also end up with a Xen and Centos applicable template once I get around to the Dells.
#94
It might help if I show the "after" with the trap now in NetXMS. Excluding the effort to get it there - the Alert Detail is now absolutely fantastic from a readability perspective and usability perspective.
#95
Well - I bit the bullet and set up the main two traps I need. The fun and games are as follows:

1. Selecting the trap did not automatically import the OIDs. Had to manually enter them.
2. When selecting by OID - make sure you get your order right. Move up / Move down does not work. I got the first one wrong - and therefore the following 12 as well. No way to fix.
3. If you know the order the OIDs will come in - much quicker just to select by position and do them in order.
4. As the OID text is not carried across - don't waste your time entering it.
5. When formatting the Alarm in Event Processing Policy - watch out for 256 char limit. UCS traps, when including /n for new line, exceed 256 chars.
6. Copy and paste within Event Processing Policy when you have used all 256 chars gives an error. Copy and paste from Notepad++ instead.
7. Alarm log shows Event before Message however Alarm Browser does not - makes for a very busy and messy screen.
8. Alarm Detail is weird - it allows me to show my shortened version of Message in Related Events (without the event title) - in Overview - gives me the whole message only.
9. When hovering over the alarm in Alarm Browser - I get the line by line message - but again, I do not get the Event Message or the Event.
10. Nearly forgot - Creation Time and Modification time are Octet Strings - no way to have NetXMS interpret them - come through as rubbish chars.

So there it is - I managed to get two traps in - with significant effort. Hopefully some of these will be improved.

Here is an example of a correctly formatted trap with the times in corrected format.
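
On point 10 above: if the Creation Time and Modification Time varbinds use the standard SNMPv2-TC DateAndTime encoding (an assumption, the MIB is not quoted here), the raw octets can be decoded outside NetXMS as sketched below. The function name is hypothetical and this does not show how to do the conversion inside NetXMS itself.

```python
from datetime import datetime, timedelta, timezone


def decode_date_and_time(raw: bytes) -> datetime:
    """Decode an 8- or 11-octet SNMPv2-TC DateAndTime value."""
    if len(raw) not in (8, 11):
        raise ValueError(f"expected 8 or 11 octets, got {len(raw)}")
    # Octets 1-2: year, big-endian. Octets 3-8: month .. deci-seconds.
    year = int.from_bytes(raw[0:2], "big")
    month, day, hour, minute, second, deci = raw[2:8]
    tz = None
    if len(raw) == 11:
        # Octet 9 is '+' or '-', octets 10-11 are hours and minutes from UTC.
        sign = 1 if raw[8:9] == b"+" else -1
        tz = timezone(sign * timedelta(hours=raw[9], minutes=raw[10]))
    return datetime(year, month, day, hour, minute, second,
                    deci * 100_000, tzinfo=tz)


if __name__ == "__main__":
    sample = bytes([0x07, 0xE3, 6, 4, 11, 40, 3, 0])  # made-up sample value
    print(decode_date_and_time(sample).isoformat())
```

A leap second (second value 60) is legal in DateAndTime but would raise an error here, so this is only good enough for a quick readability check.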
#96
OK - Windows threshold is working with only having to use the first three chars of the file system name at either Node level or at Persistent Storage level.

[code]if ( left($dci->instance,3) != NULL ) { shortdci = left($dci->instance,3); }
if ( GetCustomAttribute($node,"fileSystem_Warning_".shortdci) != NULL ) { threshold = GetCustomAttribute($node,"fileSystem_Warning_".shortdci); }
else if ( GetCustomAttribute($node,"fileSystem_Warning") != NULL ) { threshold = GetCustomAttribute($node,"fileSystem_Warning"); }
else if ( ReadPersistentStorage("fileSystem_Warning_".shortdci) != "" ) { threshold = ReadPersistentStorage("fileSystem_Warning_".shortdci); }
else { threshold = ReadPersistentStorage("fileSystem_Warning"); }
if ( ( threshold == null ) || ( threshold == "" ) ) { return null; }
if ( $1 >= threshold ) return 1;
else return 0;[/code]
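
For anyone who wants to check the lookup order without a NetXMS server to hand, here is a rough Python model of the same precedence: node attribute for the drive, then node attribute for all drives, then Persistent Storage for the drive, then Persistent Storage for all drives. It is not NXSL. The two dictionaries stand in for custom attributes and Persistent Storage, the function names are hypothetical, and missing and empty values are both treated as "not set", which is slightly stricter than the script.

```python
def resolve_warning_threshold(instance, node_attrs, persistent):
    """Return the warning threshold for a file system instance, or None."""
    short = instance[:3]  # e.g. "C:\" from the long Windows instance name
    candidates = [
        node_attrs.get("fileSystem_Warning_" + short),   # node, this drive
        node_attrs.get("fileSystem_Warning"),            # node, all drives
        persistent.get("fileSystem_Warning_" + short),   # global, this drive
        persistent.get("fileSystem_Warning"),            # global, all drives
    ]
    for value in candidates:
        if value not in (None, ""):
            return float(value)
    return None


def warning_breached(used_percent, instance, node_attrs, persistent):
    """Return True/False for the threshold check, or None if no threshold is set."""
    threshold = resolve_warning_threshold(instance, node_attrs, persistent)
    if threshold is None:
        return None
    return used_percent >= threshold


if __name__ == "__main__":
    # Made-up values: a node-level override for C:\ and a global default of 65.
    node = {"fileSystem_Warning_C:\\": "80"}
    store = {"fileSystem_Warning": "65"}
    print(warning_breached(70, "C:\\ Label:OS", node, store))    # uses the node value
    print(warning_breached(70, "D:\\ Label:Data", node, store))  # uses the global value
```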


And this is what it looks like from Object Details - Infrastructure Services as the object - Thresholds tab.
I have set the overall warning threshold to 65% - and - on that first node, I set node specific Custom Attribute settings for C:\, D:\ and P:\  with different severity levels so I would know that it was each setting being picked up and each setting was being applied independently only to the file system specified.

#97
General Support / Re: Node part of DCI Template
June 04, 2019, 11:40:03 AM
I don't use that template but others have had this type of problem and there are a number of ways you can do this.

Given the drive monitoring under the NetXMS Windows template is generated based on instances, here are two ways I know of.

Better option:
To add override capability to that template - this thread explains how to do that.
https://www.netxms.org/forum/configuration/dci-template-manual-override/

Not so good option: You may have to do this for all your drives as the template item to disable may also need to be the instance version.
If you want to disable the template and simply add your own - the following thread explains the disable part. Duplicate the template version first - update the duplicated one - then disable the template one.
https://www.netxms.org/forum/configuration/changing-template-reapply/

As Victor explained - the warning is correct - you are trying to update the template version - which can, and will, get overridden.

What you need - Duplicate, update the duplicate, and disable the template one - try that first. Below is how I would do it - or at least try it.

1. Find DCI item you want to change.
2. Right click - Duplicate
3. Right click - Disable (Goes yellow) - Template version now disabled.
4. Right click on the Duplicate (At bottom of the list of DCI's) - Edit
5. Make changes in Thresholds as per what you want
6. Click Apply and then OK

All done :)

This should work on the basis that disabled template items stay disabled even after a template update.




#98
Fantastic - thanks again :)

Deprecated but still included. As much as no changes will be made - not really expecting any. Things like this will just have to be lived with.

Will import and see how I go :)

Funny thing is that my switch that I apply it to - only picks up Fa0 and Nu0

Will update the server config and see what happens then :)

*** Magic ***

Both the Windows devices and the switch are showing all ports now - excluding those that are disabled, as per the script's design.

Thanks again Tursiops - another fantastic piece of assistance that I had no chance on my own of ever working out.



Note to self - remember the 5 second polling and change to 600 :)
#99
Hi Victor - thanks for that.

The only trick is that I have POSTGRES on old machine and POSTGRES on new machine - both installed locally.

Running on the new machine - how would I do that?
#100
Well, between the excitement of success and the agony of defeat, of which there has been plenty of both - I am still here - just!

NetXMS is a fantastic product with some fantastic capabilities - but - the initial learning curve is hard - for most, probably too hard.

I persisted due to a particular annoyance with traps but even then, without the invaluable assistance from Tursiops, absolutely invaluable, I would have also given up.

All I needed was two generic templates - Windows and Linux - both based on HOST-RESOURCES-MIB - tailored for the nuances of Linux and Windows. Although templates exist for agents, the fact none exist for SNMP suggests potential SNMP based users never got over that initial hump and simply went elsewhere and found something else. A shame really as NetXMS can do so much!

I am at the point where my Linux generic SNMP template is about 98% there (just needs some tidy up) and my Windows generic SNMP template is about 80% there - some cleanup plus getting customized threshold per drive using first three letters of drive working - and that is it. Elapsed time for this part is just 8 days from when I first got my hands on the old Linux SNMP template. With the assistance of both Tursiops and Victor as I bumbled my way forward with questions that to others would seem very simple, but to me - stumped - these templates will now become a reality. An immense thank you to both of you.

Had these templates been available on day one - I would never have considered leaving. It is not that they do everything I would like - but they do provide me with the minimum that I need and that is the point - unless I can see a clear path from the problem I am trying to solve and the minimum that I need - I move on to any one of the many alternatives.

I was lucky - very lucky. I hope to return the favor once the templates are complete and post them where others can use them. As Tursiops has pointed out - these templates do not claim to be the best / fastest / neatest / cleanest ways of doing this - they are simply a start that can be built on and evolve over time so that better ones can be created.

So, two weeks on - I am still here. There is still plenty for me to do, but at least I know now that those who follow, who use SNMP for monitoring, will find a much easier experience getting up and going.
#101
Ok - global per drive is working - another successful step forward.

For Linux, this is working fine.

As originally stated (and implied) - the DCI instance is an exact match. Windows SNMP in its infinite wisdom returns the volume serial, which means we need an exact match on that.

Once I get my head around the syntax of the code a bit further, I expect that for Windows we can strip the volume back to the first three chars using something like
$short_DCI = left($dci->instance,3) as line 1 to create the short version of the file system name, and use that to compare against the Persistent Storage entry of "P:\" so as to avoid the need to have the whole volume name as the Persistent Storage variable.

Other than that though - this is working fine - absolutely fine :)

My many thanks again to Tursiops for this contribution which has been an incalculable benefit.


Oh - in case you are wondering why my alerts say Linux on a Windows server? - I leave the description as Linux so I can tell if my existing Windows template is triggering the exception or the new template is.

#102
Well - another fantastic piece of assistance - in and already working!! :) :)

What is nice is that when opening Alarm details - the DCI is shown AND each of the threshold alarms that have triggered are shown in events.

The only observation to make is that it does actually include the space usage in the message - it just does not include the threshold from the Persistent Storage variable.

I assume - somehow - that this can be inserted / embedded into the alert somewhere.  Worst case - a feature request to be able to specify a Persistent Stored variable as the "exception" in Threshold definition screen for script thresholds so that the custom threshold is passed across in the event / alarm title rather than "script(1)".

I have not started on specific drive level yet - just getting it up and going is 95% of all battles for me - so very happy so far.
#103
If I want to move to a new server - both old and new are using NetXMS and Postgres - what are my options?

From the doco - there are options if the new DB is local and old DB is local and is connected to existing installation.
https://wiki.netxms.org/wiki/How_to_migrate_to_another_database

I know I can do export import of templates etc, but what about collected data - how do I transfer that across?

#104
Thanks again - another template :0 :0 :0 - getting started is all I need :)

Thanks also for putting this in perspective. For Windows / Linux - the agent option was clearly the preference - and at least there were templates for them.

Switches / Routers - they get a device driver but no template - even basic examples - that just seems weird. Does EVERY body build their templates from scratch???

My senior Network guy is already pissed at me from when I mistook Maintenance for Unmanage whilst reading up on what active discovery does - and started trawling our network for devices - with 60 second polling intervals - with no scope limit. I think that 5 second interface polling would really make his day - will make sure that I change that one :) :)

As for displaying - showing on performance tab makes sense - but the DCI selections did not have that option - just the DCI's available for collection - tick box. I suppose that under the DCI view for the device I could set it there - but a template looks to be better - which I now have :).

I have four switches I can smash - old HP Blade Centre on its way out - so will see how far I can go with it.

Thanks again for the immense help :)

OK - in, up, and running :) - modified for 60 second polling and changed the auto discover so only gets picked up by switches with cisco driver.
if ( $node->driver imatch "CATALYST-GENERIC") return True;
*** Note to self - put a comment in front of isSNMP so as not to pick up everything else ***

The good news - it works!! Both the switch and the Linux servers are showing interfaces as per the template.
The bad news - only picking up two interfaces.
The worst (but expected) news - Windows devices are not returning anything.

For now - this is close enough, actually, more than close enough!! I will nut out the rest, the important thing was getting a base to start with that actually was returning correct data.

Thanks again Tursiops - fantastic assistance and fantastic help.





#105
That template saved my sanity - including those *extras*. I have cleaned it out already, however, it was mentioned as, like you pointed out, we also need different thresholds.

It may have been an old template - but there is nothing newer. It was a fantastic start and without it, I would have given up and moved on - seriously!!

Where has it now progressed to?
Windows CPU % busy is now working universally and unilaterally. Thanks to Victor for the assist with this - pulled a few handfuls out on this!!
Windows memory I am getting from the mounts table and as long as I remember that Virtual = Physical + Paging - memory monitoring is also fine.
Uptime in days is collected - both numerically for exception monitoring and in easy to read format for display on Overview
Windows file systems - automatic discovery of each file system is working.
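
The memory and uptime arithmetic behind the items above can be sketched as follows. This assumes the values come from the HOST-RESOURCES-MIB storage table (hrStorageAllocationUnits, hrStorageSize, hrStorageUsed) and that uptime arrives as SNMP TimeTicks, which are hundredths of a second. The function names are hypothetical and this is plain Python, not a NetXMS transformation script.

```python
TICKS_PER_DAY = 100 * 60 * 60 * 24  # TimeTicks are 1/100 s, so 8,640,000 per day


def storage_bytes(allocation_units, blocks):
    """Convert a block count (hrStorageSize or hrStorageUsed) to bytes."""
    return allocation_units * blocks


def used_percent(size_blocks, used_blocks):
    """Percentage used for one storage row, or None for an empty row."""
    if size_blocks == 0:
        return None
    return 100.0 * used_blocks / size_blocks


def paging_bytes(virtual_bytes, physical_bytes):
    """Virtual = Physical + Paging, so Paging = Virtual - Physical."""
    return virtual_bytes - physical_bytes


def uptime_days(timeticks):
    """Uptime in days from a TimeTicks value."""
    return timeticks / TICKS_PER_DAY


if __name__ == "__main__":
    # Made-up sample row: 64 KiB allocation units, 262144 blocks in total.
    print(storage_bytes(65536, 262144))
    print(used_percent(262144, 65536))
    print(uptime_days(3 * TICKS_PER_DAY))
```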

For me - less than 2 weeks into NetXMS - script thresholds and default alert values in persistent storage are still beyond me. I have yet to write a script and the only script I have updated is the discovery hook to populate object name from sysdescription if name is not resolved on discovery.

So what is next?
The different defaults depending on the drive: system drive (C:) no more than 70%, data drive (D:) can go to 90%, and paging drive (P:) no threshold as it only has the paging file.
Override the default - some servers we never change will allow for a higher utilization - no need to add space just to get under a threshold.

I do need different thresholds for individual mount points so I will need to work that out.

I have no preference which way to go - I just want to get there in the quickest and easiest manner. Once finished (or sooner) I will post the template so that any other new users can get quickly up to speed if they use SNMP based system monitoring.



There are thousands out there like me - inherited SNMP based monitoring along with entrenched mindsets against agents - so I live within what I can. There are also those that are frustrated with the monitoring that they have (SolarWinds ORION(NPM) and ManageEngine OpManager) - my two current hair pullers.

Making NetXMS the easy and obvious choice would be fantastic - all it needs, really, are a few additional basic templates. Everyone I have shown it to is impressed - the loading time is fantastic, the ease of navigation (once you get the hang of it), multi line trap display, multi line emails of selected alarms, comments on alarms that can be updated and deleted. So many of the things with my other products that cause my angst are solved with NetXMS.

So, if you would like to do it differently - feel free to make some suggestions. The scope is as follows:
SNMP only
1000+ devices
Windows (server / Desktop / CE)
Linux (RHEL/CentOS/Debian/Ubuntu/Solaris/Raspbian)
CISCO switches
Other devices that support SNMP monitoring such as Firewalls, load balancers, etc.
SNMP Traps - need  to actively support multi line, 15+ varbind traps - including presentation in readable format in both console and via email.

I know NetXMS can do all the above and I am about 95% there already.

I am happy to grind away and work out the rest - however - specific thresholds for DCI's contained within an instance populated table requires NXSL and NetXMS knowledge I simply do not have(yet), and without examples that I can copy and modify, it is an uphill grind. I have 2,000+ traps to work out how to get into NetXMS - NetXMS only includes 8 by default - so I already have my work cut out.

Am I complaining - no way!! - NetXMS is fantastic and the template you supplied was / is awesome. I just need to work out thresholds both the old way and the new way - (we have NAS stuff as well) - and then I can actually go live with this.