Messages - Tursiops

#106
I did some work on thresholds on my test system, as I've had a similar internal request for a while and can put this to use on our own system at some point.

For your Volume Space Used (%) DCI (on the instance DCI in your template), add a script threshold as follows (the Value for script thresholds needs to be "1"):
// Look up the Warning threshold, most specific source first.
if ( GetCustomAttribute($node,"fileSystem_Warning_".$dci->instance) != null ) { threshold = GetCustomAttribute($node,"fileSystem_Warning_".$dci->instance); }
else if ( GetCustomAttribute($node,"fileSystem_Warning") != null ) { threshold = GetCustomAttribute($node,"fileSystem_Warning"); }
else if ( ReadPersistentStorage("fileSystem_Warning_".$dci->instance) != "" ) { threshold = ReadPersistentStorage("fileSystem_Warning_".$dci->instance); }
else { threshold = ReadPersistentStorage("fileSystem_Warning"); }
// No threshold configured anywhere: nothing to compare against.
if ( ( threshold == null ) || ( threshold == "" ) ) { return null; }
// $1 is the current DCI value.
if ( $1 >= threshold ) return 1;
else return 0;

That'll be for your "Warning" Threshold.
Create two more, one using fileSystem_Error and one using fileSystem_Critical instead of fileSystem_Warning. Those will be your Error and Critical thresholds; assign events as required.

This will allow you to use the following "Persistent Storage" variables (which you must create first):

  • fileSystem_Critical
  • fileSystem_Error
  • fileSystem_Warning
These will be the default thresholds for those states.

You can override them in three ways:

  • A global override for a specific file system, i.e. a Persistent Storage variable called "fileSystem_Critical_P:" with a value of 98 will mean any file system with instance "P:" will trigger a critical threshold at 98 or higher.
  • A per-node override. This overrides all global thresholds and will be the new default for this node for all file systems. Simply create a "fileSystem_Critical" (or _Error or _Warning) Custom Attribute on the node and set it to the required value.
  • A per-node, per-file system override. Same as with the global override for a file system, but create this as a custom attribute as opposed to a Persistent Storage variable. For example, creating a "fileSystem_Critical_P:" custom attribute with a value of 50 will trigger the critical threshold for file system P: at 50 or higher on this particular node only.

Mix and match as required.

Thresholds are resolved in the following order of priority, highest first:

  • Per-Node & Per-FileSystem Override
  • Per-Node Override
  • Global Per-FileSystem Override
  • Global Default
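
The lookup order can be modelled outside NetXMS to check which value wins. The following is a rough Python sketch, not NXSL: the two dictionaries stand in for node custom attributes and Persistent Storage, and the function names are made up for illustration. It also treats empty strings as unset at every level, which is slightly more lenient than the script above.

```python
def resolve_threshold(level, instance, node_attrs, storage):
    """Return the first threshold found, most specific source first, else None."""
    key = f"fileSystem_{level}"
    candidates = (
        node_attrs.get(f"{key}_{instance}"),  # per-node, per-file system override
        node_attrs.get(key),                  # per-node override
        storage.get(f"{key}_{instance}"),     # global per-file system override
        storage.get(key),                     # global default
    )
    for value in candidates:
        if value not in (None, ""):
            return float(value)
    return None


def is_breached(used_percent, threshold):
    """Mirror the script's final comparison; no threshold means no alert."""
    return threshold is not None and used_percent >= threshold


# Hypothetical values: a global default, a global override for P:,
# and a per-node default that beats both.
storage = {"fileSystem_Critical": "95", "fileSystem_Critical_P:": "98"}
node_attrs = {"fileSystem_Critical": "90"}

print(resolve_threshold("Critical", "P:", node_attrs, storage))  # per-node override wins
print(resolve_threshold("Critical", "P:", {}, storage))          # global per-file system override
```

Note that the per-node default beats the global "P:" override here, which is exactly the "overrides all global thresholds" behaviour described above.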

As always, there may be easier/faster/better/more efficient ways of doing this.
The above worked on my test system (I was testing with Linux, i.e. /, /var and similar instances, but it should work with P:, C:, etc. as well) and should get you started.
If nothing else, it might serve as an example of Persistent Storage & Custom Attribute usage. :)

Note: What the above will not do is include the actual disk space usage value in the alert message.
#107
I believe the instance discovery script relies on the "UseInterfaceAliases" Server Configuration (Configuration -> Server Configuration) being set to "Concatenate name and alias" (which is not the default, but is the way our server is configured).
If that's not set, any interface with a name/description will probably not work.

I've just made a change to the instance discovery script for the 32-bit counters, which should make it work for Windows. As far as I recall, Windows doesn't have 64-bit counters for network traffic. And as SNMP in Windows has been deprecated since Windows Server 2012, they won't be adding those counters either. :(
#108
Drivers are required to populate the Components and Ports tabs for a node. Not sure if they are also used for additional topology related things under the hood.
They have nothing to do with templates though.

Data from a DCI is visible in the Last Values window and nowhere else by default. You can either build Dashboards, build & save a graph or reconfigure the DCI to "Show on performance tab". The latter option is available under the "Performance Tab" settings of the DCI. I stopped using that quite a while ago and I know there have been changes since then, so I can't comment on the correct way to add data from multiple DCIs to one graph (which I assume you'd want for network traffic, i.e. showing in- and outbound on the same performance graph).

Other than that, yes, you are missing a template.
I quickly hacked something together based on the template we have. No guarantees on how well it'll work for you, but it should get you started I guess.
It will only monitor interfaces which you expect to be UP. If an interface is administratively down, it won't show. If your expected status is DOWN or IGNORE, it won't show. It should pick up 32- and 64-bit (i.e. If and IfX table) interfaces.
Last but not least, this one's a bit crazy on the poll interval (took it from a test system of mine) in that it checks the interfaces every 5 seconds. You'll probably want to adjust that...
I'm pretty sure it'll blow up on counter overflows, too. :P
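
For reference, the usual way to survive a counter wrap is to take the difference modulo the counter size. This is a generic Python sketch of that idea, not what the template does; the function names are made up, and it assumes the counter wraps at most once between two polls (a device reboot that resets the counter would still produce a bogus value).

```python
def counter_delta(previous, current, bits=32):
    """Difference between two counter samples, allowing for a single wrap."""
    return (current - previous) % (1 << bits)


def rate_bits_per_second(previous, current, interval_seconds, bits=32):
    """Convert an octet counter delta into bits per second."""
    return counter_delta(previous, current, bits) * 8 / interval_seconds


# Normal case: the counter simply increased.
print(counter_delta(100, 350))
# Wrap case: the 32-bit counter rolled over between the two samples.
print(counter_delta(4294967290, 10))
```

With a fast link and 32-bit counters, a short poll interval makes a double wrap between samples less likely, so the 5 second interval is not entirely crazy in that respect.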
#109
I probably should've cleaned that template up a bit more before posting it.
As I said, this is an older template I built before we switched to agents for Linux.

However, we do still use it for NAS devices and the driveAlertCustom is something that was added to allow for different threshold alerts depending on mount points. It's a combination of auto-bind rules, instance discovery and custom attributes.
By now, I would do this very differently and use script thresholds with default alert values in persistent storage and individual overrides as custom attributes.

So my suggestion for you: remove any reference to that from the instance discovery script. It doesn't do anything without the other stuff. And that other stuff is overly complicated. :)
#110
Quite the opposite for us. We're mostly using agents and are still adding new items to our templates from ExternalParameters, etc.
So from our perspective it works great and I wouldn't want to have to switch from SNMP every time. :)

Allowing the default to be changed via client preferences or server configuration option could be a nice tweak though.
#111
Glad to hear it works for you. :)

The reason some data didn't work right away is that some items are polled every five minutes or so, while others are polled only once an hour (like total space, as there was no expectation of that changing a lot).
So when the first check failed due to an incorrect community string, it would've taken another hour to fix itself. Right-click and force poll would've fixed it immediately.
You'll probably want to adjust the instance discovery script to filter out file systems you don't care about.
There are some examples on the forum I believe on how to filter by file system type (not just instance name).

Good luck. :)
#112
I don't have the old default one, nor do I have a particularly great template. I never had a need to clean it up with regard to auto-bind or instance discovery, as we're not really using this anymore (it still gets auto-applied to a few NAS devices). But maybe it helps get you started. I've removed our threshold configuration as it uses a number of custom events.
#113
General Support / Re: Todays dumb question
May 27, 2019, 06:19:07 AM
The following pages might be useful for scripting:
For the node object, there's documentation on attributes here: https://wiki.netxms.org/wiki/NXSL:Node (for class reference, see https://wiki.netxms.org/wiki/Category:NXSL_Class_Reference)
Comparison operators are documented here: https://wiki.netxms.org/wiki/UM:NetXMS_Scripting_Language_(NXSL)#Comparison_Operators
Function Reference (not sure how up to date that is in regards to new functions): https://wiki.netxms.org/wiki/NXSL_Function_Reference
#114
General Support / Re: Todays dumb question
May 26, 2019, 02:29:25 PM
I am assuming this line is only part of your full script? Otherwise, the or statement doesn't really do anything.
If you want to bind a node to a container based on name starting with rawd or rawq, you will have to add a "return" in front of your line, so that the result of your OR statement is actually used.
Alternatively you can also use
return $node->name ~= "^raw[dq]";
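
If you want to try the pattern before putting it into the bind rule, the same regular expression can be checked in plain Python. This is only a quick sanity check of the regex, not NXSL, and the node names are made up.

```python
import re

# Same pattern as in the NXSL one-liner: names starting with "rawd" or "rawq".
PATTERN = re.compile(r"^raw[dq]")


def should_bind(node_name):
    """Return True if the node name starts with rawd or rawq."""
    return PATTERN.search(node_name) is not None


# Hypothetical node names.
for name in ("rawd01", "rawq-db", "rawx01", "myrawd01"):
    print(name, should_bind(name))
```

The ^ anchor is what stops a name like "myrawd01" from matching.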
#115
We did originally use SNMP for Linux systems (never for Windows, as we needed the agent for other purposes anyway), but switched to agents.
Not that it didn't work, we just found that agents provided additional useful functionality for our Linux systems as well.

In other words: I'm sure you can make it work with SNMP only.
#116
If a DCI is in a template that is applied to a device, then that DCI will be applied to the device.

That leaves you with a couple of options:

  • Hide it from view
  • Move it into a different template

Option 1 could be as simple as right-clicking inside your Last Values tab and unchecking "Show unsupported items". You simply won't see the DCI (or any other unsupported DCI) anymore.

Option 2 could be accomplished by creating another template and using the same auto-apply rule with one minor exception: You add something like this as the first line:
if (ignoreMikrotikVoltage@$node != null) return false;
Then you add a custom attribute called ignoreMikrotikVoltage to the nodes in question and set it to 1, true or whatever. The template will no longer be applied.

You could also add something more complex, e.g. an actual connection to the device to check if that SNMP OID exists. If it doesn't, then don't apply the template. That means you don't have to fiddle around with custom attributes.
That solution would look more like this (again, something you'd add to your existing auto-apply rule):
transport = CreateSNMPTransport($node);
if ( transport == null ) return null;
voltageCheck = SNMPGetValue(transport,".1.3.6.1.4.1.14988.1.1.3.8.0");
if ( voltageCheck == null ) return null;
else return true;

You could return false instead of null in line 4, but in this case, null is probably safer.

I'll attempt to explain the difference between return null and return false in this scenario.
Note that all this assumes you have ticked both the "Apply this template automatically..." and "Remove this template automatically..." checkboxes in the Automatic Apply Rules section.
return null means the node will neither be added to nor removed from the template. If the template is currently applied to the node, it won't be removed. If it is not applied, it won't be added.
return false on the other hand means the node will be removed from the template. As a consequence any DCIs from the template would be removed. All data would be lost.
In situations where an auto-apply rule looks at an immutable feature of a device, return null is usually the safer approach. You are not expecting a device to suddenly lose that feature. For example a Windows system should not suddenly return a Linux OS. If there is some glitch and the device does not return "Windows", you don't want the template to be removed.
In situations where an auto-apply rule looks at something variable, e.g. some software that's installed on a computer which could be removed at any time, you'd want to use return false. That way if the software is uninstalled the template is removed as expected.
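
To summarise the three outcomes, here is a tiny Python model of what happens to a template binding depending on what the rule returns. It is only an illustration of the behaviour described above (with both checkboxes ticked), not NetXMS code; None stands in for NXSL null and the function name is made up.

```python
def template_applied_after(currently_applied, rule_result):
    """Return whether the template is applied after the auto-apply rule ran.

    rule_result is True, False or None (None standing in for NXSL null).
    """
    if rule_result is None:
        return currently_applied  # null: leave the binding exactly as it is
    return bool(rule_result)      # true applies, false removes


# A glitch on an already-bound node: null keeps the template (and its data),
# false would remove it.
print(template_applied_after(True, None))
print(template_applied_after(True, False))
```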
#117
You can configure your agents in two ways:
1. Agent to Server connection
2. Server to Agent connection

Option 1 will require certificates and agent tunnels (see https://www.netxms.org/documentation/adminguide/server-management.html#server-configuration-for-agent-to-server-connection-tunnel-connection). This setup is a bit more involved, but makes a lot of sense if you are connecting to agents behind routers/firewalls which the NetXMS server cannot talk to directly.
Option 2 will simply require that the NetXMS server is able to talk to your NetXMS agent on TCP port 4700. If you have a VPN between your server and the agents, that's probably the easiest setup. Just make sure your Windows firewall doesn't block the incoming connection from the NetXMS server to your agent. You won't need the ServerConnection parameter for this either; MasterServers is enough.
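
For Option 2, a quick way to rule out firewall or routing problems is to test from the server whether the agent port can be opened at all. A minimal Python sketch follows, assuming Python is available on the server; the function name is made up. It only checks TCP reachability and says nothing about whether the agent will accept your server (that is what MasterServers is for).

```python
import socket


def agent_port_reachable(host, port=4700, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run it on the NetXMS server with the agent's address, e.g. agent_port_reachable("192.0.2.10"), where 192.0.2.10 is just a placeholder. False usually points at a firewall, the VPN or the agent not running.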

I recommend reading through the documentation to get a better idea of how the above works: https://www.netxms.org/documentation/adminguide/agent-management.html#
#119
I am pretty sure that is by design, we ran into the same issue quite a while ago.
We're presently not using Cache Mode, as Force DCI Polls (especially when you have DCIs you poll once an hour but you want an update "right now") is more important to us than caching on the agent.

You can always lodge a feature request on https://track.radensolutions.com.

Cheers
#120
Did you enable Agent Cache mode?
Force DCI Poll does not work on nodes with Cache enabled.