Topics - lweidig

#1
General Support / asterisk.nsm
June 23, 2022, 10:24:44 PM
We are installing from nxagent-4.1.377-linux-x86_64.tar.gz on a server running FreePBX and were hoping to start monitoring the Asterisk server running on it. However, the subagent for this does not appear to be included in the build. We would prefer not to compile from source and are wondering what our options might be.
#2
General Support / Overview Tab SLOW (4.1.283)
May 20, 2022, 02:23:46 PM
If we select a node in the Objects tree, look at any tab under Object Details, and then click back to the Overview tab, it can take upwards of 10 seconds before anything besides the header tabs is displayed. However, when clicking from the Objects tree the Overview is immediately visible for the node. This is the latest version of the console / server (4.1.283) and the console is running on macOS. I have attached a screenshot of what we see.

#3
General Support / SSH Monitoring "Sensitive"
May 20, 2022, 02:18:41 PM
While I am all in favor of the new SSH monitoring that was implemented, it just seems to be reporting false positives way too frequently. Not sure if it is timing out too quickly, but there does not seem to be any sort of "control" for how it works. For us, every time this has triggered the SYS_SSH_UNREACHABLE event we have been able to SSH in from the NetXMS server.
#4
General Support / SNMP Communication Properties
April 29, 2022, 09:06:13 PM
We are in the process of making some SNMP changes and during the process many devices will continue responding on v2 as well as v3 (new).  What we have discovered is that since NetXMS discovered them with v2 it just continues to chug along using that which for the most part is fine until we turn v2 off which is the plan.  HOWEVER, once we enabled v3 on the devices they started sending out SNMP Traps using v3 and NetXMS just blindly ignored them because it did not have SNMP v3 setup for the devices.

So, we were looking to mass update these under the Properties -> Communications -> SNMP section of the node. We are talking thousands of devices and looked to automate this through an NXSL script. However, those properties do not appear to be exposed from what we have been able to discern and therefore cannot be updated.

Looking for thoughts on how to accomplish this without adjusting the settings one by one on each of the nodes. Thanks!
#5
General Support / Service.Check
April 20, 2022, 10:46:36 PM
We are working to monitor a number of services that per the manual have cURL support and hence should be using the Service.Check agent command.

We are working to get POP3 working at the moment and have:

Service.Check(pop3://username:[email protected]/,^.*OK Authenticated.*)

If we put in actual values and run this from the command line using the curl -v command we get valid results. However, when we run it from a DCI it always comes up with 3. How can we see the value being returned from the check? Also, the documentation states that the Service.Check.XXXXX checks are to be considered deprecated. Wondering why, since they seem to offer more complete testing of common protocols?

Thanks!
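One way to narrow this down is to test the pattern outside the agent against the text the server actually returns. The sketch below is plain Python, not NetXMS code. Both sample responses are made up, and whether the check sees the protocol greeting lines or only the response body is an assumption that needs to be verified.

```python
import re

# Pattern copied from the Service.Check call above.
PATTERN = r"^.*OK Authenticated.*"


def matches(pattern: str, text: str) -> bool:
    """Return True if any line of text matches the pattern."""
    # MULTILINE lets ^ anchor at the start of every line, not just the first.
    return re.search(pattern, text, re.MULTILINE) is not None


# Hypothetical samples: protocol chatter versus a message listing body.
verbose_chatter = "+OK POP3 ready\r\n+OK Authenticated\r\n"
list_body = "1 1024\r\n2 2048\r\n"

print(matches(PATTERN, verbose_chatter))
print(matches(PATTERN, list_body))
```

Note that curl -v prints the protocol exchange in addition to the response body, so a pattern that matches the verbose output will not necessarily match the body alone.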
#6
General Support / Cascading Custom Attributes
March 31, 2022, 06:29:48 PM
We seem to be having a strange issue, that likely has been recently introduced.  We are running the latest 4.0.2227 version.

We have a number of Custom Attributes defined on the Infrastructure Services folder with values that we use on nodes to control scripts.  They are all setup to be Inheritable.  Underneath we have a folder called "Telephone Customers" where we setup some custom attributes with the same names but override the values to be specific to that group. 

The issue is that nodes under that folder seem at times to be getting the values from the Infrastructure Services folder and at other times getting them, as expected, from the Telephone Customers folder. If we open that folder, edit the attribute, and click Apply & Close, it is applied to all nodes. We just need nodes to get the attributes from Telephone Customers from the start and then not change. One of them controls where messages might be posted in Slack, and when they go to the wrong spot it confuses the responsible people.

Looking for suggestions on what might be happening and how to resolve this.
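The behavior we expect can be sketched as a nearest-ancestor lookup. This is only a Python illustration of the expected result, not how NetXMS resolves attributes internally. The folder names come from the description above; the node name `pbx-01`, the attribute name `slackChannel`, and its values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Obj:
    name: str
    parent: Optional["Obj"] = None
    attrs: dict = field(default_factory=dict)


def resolve(obj: Obj, key: str) -> Optional[str]:
    """Walk up the parent chain and return the first value found."""
    current = obj
    while current is not None:
        if key in current.attrs:
            return current.attrs[key]
        current = current.parent
    return None


root = Obj("Infrastructure Services", attrs={"slackChannel": "#infra"})
phones = Obj("Telephone Customers", parent=root, attrs={"slackChannel": "#telephone"})
node = Obj("pbx-01", parent=phones)

# The closest folder that defines the attribute should win.
print(resolve(node, "slackChannel"))
```

The sketch assumes a single parent per object; an object reachable through more than one parent would make "nearest" ambiguous.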
#7
Now I understand that this is going to seem like an odd question but can somebody explain exactly what this controls?  My guess is this is how often instance discovery would be run? 

However, the default value is 10 minutes and that seems REALLY short, at least in our environment. We have a ton of templates; for something like a switch we report only on the interfaces we are using and have to run filters to get to that point. Then we might also monitor 5+ values per port. This is a LOT of polling on what seems like pretty static information, if we are understanding it correctly.

I was thinking about adjusting this to something like 86400 (1 day) but when it defaults to such a low number wonder if I am missing something. 

Appreciate any further details on what this value does.  Thanks!
#8
General Support / [SOLVED] ActionShellExec Confusion...
February 18, 2022, 08:23:38 PM
I am trying to do a simple update of node status into an MS-SQL server when the status of a node changes. I have written a really simple .cmd file to handle passing in the hostname, IP address, status and comment. I can run this from the command prompt on the server and it works as expected.

On the server running the agent we have the following:


ActionShellExec = Sql.StatusUpdate:D:\nodeUpdate.cmd $1 $2 $3 $4


If I enable debugging on the agent can see this is getting fired off:

2022.02.18 13:06:23.802 *D* [comm.cs.232        ] Received message CMD_ACTION (4)
2022.02.18 13:06:23.811 *D* [procexec           ] ProcessExecutor::execute(): process "CMD.EXE /C D:\nodeUpdate.cmd host.com 1.2.3.4 down Yikes" started
2022.02.18 13:06:23.812 *D* [actions            ] Execution of external action Sql.StatusUpdate (D:\nodeUpdate.cmd host.com 1.2.3.4 down Yikes) started
2022.02.18 13:06:23.812 *D* [actions            ] ExecuteAction(Sql.StatusUpdate): requestId=4, RCC=0
2022.02.18 13:06:23.812 *D* [comm.cs.232        ] Sending message CMD_REQUEST_COMPLETED (ID 4; size 32; uncompressed)


One of the first things the script does is echo the parameters to a temporary file, so we have some history. That is never getting triggered, and the sqlcmd statement in the script is definitely not running. Have been pulling my hair out for a while and cannot figure out why this is not working.

Ideally, though at this point it is becoming less of a big deal, that last parameter should allow for a string with spaces. At this point that is icing on the cake; I just want it working.

Thoughts?
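On the question of spaces, the general idea is that an argument containing spaces has to be quoted to survive as a single parameter. The Python sketch below only shows what a Windows-style command line looks like with that quoting applied. Whether the agent passes such quotes through to the script unchanged is an assumption to test, and the comment text is made up.

```python
import subprocess

# Arguments as a list: script path, hostname, IP address, status, comment.
args = ["D:\\nodeUpdate.cmd", "host.com", "1.2.3.4", "down", "Yikes it broke"]

# list2cmdline joins the arguments using the Windows quoting rules,
# wrapping any argument that contains spaces in double quotes.
command_line = subprocess.list2cmdline(args)
print(command_line)
```

Inside a .cmd file, `%~4` gives the fourth parameter with the surrounding quotes removed.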
#9
General Support / Node / Interface interaction
November 24, 2021, 10:41:33 PM
Looking for some suggestions on the best way to handle a situation we currently have in our alarm monitoring.  If a device is down it will generate the Critical alarm from SYS_NODE_DOWN and that is excellent and what we want.  However, it will also generate a Minor alarm from SYS_IF_DOWN on the switch that the device is connected to.  Looking for ideas on the best way to correlate these two events so that only the Critical one is triggered or when it triggers it clears / prevents the Minor interface alarm.

Thanks!
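The correlation we are after can be sketched as follows. This is Python used purely to illustrate the logic, not a NetXMS event processing rule. The `peers` mapping (which node is plugged into which switch port) and all node and interface names are hypothetical.

```python
from typing import NamedTuple, Optional


class Alarm(NamedTuple):
    event: str
    node: str
    interface: Optional[str] = None


def suppress_interface_alarms(alarms, peers):
    """Drop SYS_IF_DOWN alarms whose connected peer node is itself down.

    peers maps (switch, interface) to the node plugged into that port.
    """
    down_nodes = {a.node for a in alarms if a.event == "SYS_NODE_DOWN"}
    kept = []
    for alarm in alarms:
        peer = peers.get((alarm.node, alarm.interface))
        if alarm.event == "SYS_IF_DOWN" and peer in down_nodes:
            continue  # the Critical node alarm already covers this outage
        kept.append(alarm)
    return kept


peers = {("switch-1", "ge-0/0/5"): "camera-7"}
alarms = [
    Alarm("SYS_NODE_DOWN", "camera-7"),
    Alarm("SYS_IF_DOWN", "switch-1", "ge-0/0/5"),
    Alarm("SYS_IF_DOWN", "switch-1", "ge-0/0/9"),
]
remaining = suppress_interface_alarms(alarms, peers)
for alarm in remaining:
    print(alarm)
```

An interface alarm with no known peer, or with a peer that is still up, is kept.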
#10
General Support / SSH DCI Collection
November 13, 2021, 12:24:52 AM
We have some scripts on Mikrotik routers that generate a JSON string output and would like to start using that in NetXMS for monitoring.  However, any combination I try to get this working seems to end with the DCI being in << ERROR >> state.  I ran the agent with debug 7 and this is what we get:

2021.11.12 16:16:01.638 *D* [comm.cs.3          ] Requesting metric "SSH.Command(10.0.xxx.xxx:22,"admin","NOTREALPW",":global inPool ""client"";/system script run dhcpPoolMon","",0)"
2021.11.12 16:16:01.638 *D* [ssh                ] AcquireSession: acquired existing session [email protected]:22/2
2021.11.12 16:16:01.657 *D* [ssh                ] SSHSession::execute: read error: Remote channel is closed.
2021.11.12 16:16:01.657 *D* [ssh                ] SSH output is empty


I did a bunch of searching and currently have an ssh agent config file which looks like:

# cat /etc/nxagentd-ssh-config
HostKeyAlgorithms +ssh-dss
KexAlgorithms +diffie-hellman-group1-sha1


Like most others, I can of course SSH from the command line on this server to the Mikrotik just fine, using the exact same login / command, and get the desired results. Need some assistance getting this going. We have the server / agent at 3.9.344 and this is running on an Ubuntu 20.04.3 server.
#11
General Support / Find MAC Address
September 28, 2021, 12:32:28 AM
Is there any way to pass this a partial MAC address and allow it to find all instances that match?  We typically record the last 4 of a MAC address for certain things as it provides enough uniqueness where needed.  Have tried regex and other variations, seems to want all 12 hex digits.  Just curious.
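The kind of matching we have in mind looks like the sketch below. It is plain Python run over a list of already known addresses, not a NetXMS search feature, and the sample MAC addresses are made up.

```python
import re


def normalize(mac: str) -> str:
    """Strip separators and lowercase, leaving only hex digits."""
    return re.sub(r"[^0-9a-fA-F]", "", mac).lower()


def find_by_suffix(macs, partial):
    """Return every MAC whose hex digits end with the partial value."""
    suffix = normalize(partial)
    if not suffix:
        return []
    return [m for m in macs if normalize(m).endswith(suffix)]


known = ["00:1A:2B:3C:4D:5E", "00-1a-2b-aa-4d-5e", "F0:9F:C2:11:22:33"]
print(find_by_suffix(known, "4D5E"))
```

Normalizing first means the separator style and letter case of the recorded last four digits do not matter.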
#12
Feature Requests / Node Properties from Object Details
August 25, 2021, 02:08:47 PM
Many times when looking through the Alarm Browser we will right click an alarm and select the Show object details option to dig in further. However, there are times we would like to examine the node's Properties, but in order to do that we need to jump over to the Object Browser and find the node there to right click and view the Properties. It would be nice if there were an option to pull these up right from the Overview tab of the details.

My suggestion would be a button in the General section header over to the right, but ANYWHERE would be great.  Thanks for considering this enhancement.
#13
Feature Requests / ios Management Console
August 17, 2021, 03:25:39 PM
I realize that this has been brought up in the past, but I have never really seen a definitive response. Not having a client for iOS devices is really a negative, as these are most of the devices in our organization. I noticed there is an inactive project on GitHub (2 years) but am wondering if this has any hope of being resurrected?
#14
Feature Requests / Server Console - Copy & Paste
August 03, 2021, 04:10:47 PM
Currently if you bring up the server console window inside of the Management Console there is no way that I could find to copy or paste into that window.  When providing information from this it would be nice to be able to copy the text instead of having to transcribe it all.
#15
General Support / [SOLVED] Active Discovery Issue
August 02, 2021, 04:41:54 PM
We are having some problems with active discovery and looked at messages that might help.  Running debug level 6 I see the nodes we are hoping to get discovered showing IP address already queued for polling.  From the server console and show queues I see the Node discovery poller with 6,370 entries and growing!  It does go down 1 or 2 once in a while, but for the most part it is going up.  Server is not loaded.   Can this be cleared and start again?  We are running the latest 3.9.156 but the issue was happening prior to that.
#16
General / Cleanup missing items?
July 31, 2021, 12:31:30 AM
We just performed an nxdbmgr -d check and allowed all items to be cleaned up.  Restarted the server and immediately received a large number of these:

2021.07.30 16:27:36.329 *E* [                   ] Failed to load access point object with ID 21714 from database
2021.07.30 16:27:36.329 *E* [                   ] Inconsistent database: access point MYSSID [21715] linked to non-existent node [2371]


Seems the cleanup tool might need to expand into this and clean things up.  Thanks!
#17
Feature Requests / Debug on node / dci
July 16, 2021, 11:03:07 PM
It would be great if there was a facility to allow you to debug on just one node or maybe just one DCI.  If we are looking for something and turn up the debug level (say 6 for example) it generates about 4K+ lines per second!  Not really useful to get information on a specific issue.
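Until something like this exists, one workaround is to filter the log after the fact. The sketch below is Python, not a NetXMS feature. It assumes the usual debug log layout (timestamp, level marker such as `*D*`, bracketed tag, message), and the sample lines are invented.

```python
import re

# timestamp, level letter, bracketed tag (padded with spaces), message
LINE_RE = re.compile(r"^(\S+ \S+) \*(\w)\* \[([^\]]*)\] (.*)$")


def filter_log(lines, tag_prefix=None, needle=None):
    """Keep lines whose tag starts with tag_prefix and whose message contains needle."""
    kept = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not follow the expected layout
        tag, message = m.group(3).strip(), m.group(4)
        if tag_prefix is not None and not tag.startswith(tag_prefix):
            continue
        if needle is not None and needle not in message:
            continue
        kept.append(line)
    return kept


sample = [
    "2021.07.16 16:00:00.001 *D* [ssh                ] SSH output is empty",
    '2021.07.16 16:00:00.002 *D* [comm.cs.3          ] Requesting metric "Agent.Version"',
    "2021.07.16 16:00:00.003 *D* [ssh                ] AcquireSession: acquired existing session",
]
for line in filter_log(sample, tag_prefix="ssh"):
    print(line)
```

Passing a node name or IP address as `needle` narrows the output to messages that mention it, which only helps for messages that actually include that text.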
#18
General Support / [SOLVED] SNMP UNSUPPORTED
July 13, 2021, 03:40:57 PM
The dreaded "Status of DCI nnnnn (SNMP oid) changed to UNSUPPORTED" has been one of the items we have always struggled with on this excellent product, and once again this is the case. Really want to get to the root of this so we can monitor and move forward. The specific issue we are having this time is with monitoring Cambium ePMP LAN / WAN TX / RX byte counters. We are monitoring hundreds of these devices and the issue only shows up on about 15-20 of them.

To help narrow this down I have picked a single radio where these are not working, located at a site that also has one working perfectly. I have compared the configurations of these two devices: they are running the same firmware release and, besides IP address information, are identical (including hardware model of course). None of the four counters work on this device, and we get messages like:

Status of DCI 28941 (SNMP: .1.3.6.1.4.1.17713.21.2.1.65.0) changed to UNSUPPORTED

If I use the built in MIB explorer on the working device and WALK starting at .1.3.6.1.4.1.17713.21.2.1 this value is part of the list retrieved.  On the "broken" device it is not displayed in the MIB explorer.  HOWEVER, if I simply go to the command line of the server running NetXMS and run some SNMP commands against the failing device they respond just fine:

# snmpwalk -v 2c -c NOTIT 10.0.XXX.XXX .1.3.6.1.4.1.17713.21.2.1.65.0
iso.3.6.1.4.1.17713.21.2.1.65.0 = Counter64: 106460802368
# snmpget -v 2c -c NOTIT 10.0.XXX.XXX .1.3.6.1.4.1.17713.21.2.1.65.0
iso.3.6.1.4.1.17713.21.2.1.65.0 = Counter64: 106460803992


Hoping to finally be able to resolve this "random" issue that we have always faced.

#19
Feature Requests / Resize Edit Script window
May 04, 2021, 03:44:26 PM
It would be nice if you could expand the size of the Edit Script window anywhere it shows up.  Currently, the pain point is in the Edit Threshold section of the application, but if there are other areas it would probably be helpful as well.  If you have any significant amount of code it is difficult to write within the small window which only allows about 50 characters before scrolling.
#20
Wondering if it would be possible to allow scripts to be processed in thresholds on tables, like you can for a single value collection. Our specific case is that we have a device with a fan speed table (simply index / fan speed in RPM), and devices may have anywhere from 1-5 fans depending on model. We have another polled value which represents a temperature. If the temperature exceeds a threshold the fans should start. If a fan shows 0 RPM while that temperature is exceeded, this tells us the fan has failed and we need to look at replacing it. Looking for an elegant solution to trigger these events. I suppose I could put it into the processing for the board temperature, but logically it seems to be better placed with the fan table.

Hoping this makes sense, thanks in advance for any suggestions.
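The check itself is simple, as the sketch below shows. It is written in Python only to pin down the logic, not as NXSL or an existing threshold feature. The 55 degree threshold and the sample readings are hypothetical.

```python
def failed_fans(temperature_c, fan_rpm, threshold_c):
    """Return indexes of fans at 0 RPM while the temperature exceeds the threshold.

    fan_rpm maps the fan index from the table to its speed in RPM.
    """
    if temperature_c <= threshold_c:
        return []  # fans are not expected to be running yet
    return [index for index, rpm in sorted(fan_rpm.items()) if rpm == 0]


fans = {1: 4200, 2: 0, 3: 3900}
print(failed_fans(61, fans, threshold_c=55))
print(failed_fans(40, fans, threshold_c=55))
```

Returning the fan indexes rather than a single flag would allow one event per failed fan.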