Messages - paul

#61
Similar setup - no problem here though - one huge difference.

The console does not time out.

We went the console path instead of WebUI as it required no install, we have created Object Tools (which only work on consoles), and it was one less thing I needed to worry about compared to the four browsers I currently have, depending on which program is picky about which browser.

Does not solve your web problem but it does provide a workaround - once and for all :)



#62
Don't you just hate it when the answer is right in front of you!!

Thanks for pointing this out - working much, much better now.

I have included my original settings so everyone knows what NOT to use :)

*** I worked out where my bad settings came from - they are the default when you untick use server settings ***

#63
I have a similar request for a different reason.

I have a couple of hundred routers that have a fallback connectivity option that automatically switches. I do not particularly care to have these devices show a status of critical when all that has happened is that we have flipped from the primary to the secondary connectivity.

I would like the option of yes/no - include Interface status in Node status calculation.
I would also like the option on the General panel to see why status is the way it is. Status = Critical still involves me having to check whereas the following would be better:

Current
Status = Critical

Proposed:
Connectivity State = (UP/DOWN) - from status polling - including last connectivity time - one for each item listed in capabilities.
Interface status =  whatever the status is
DCI threshold status =  whatever the status is
Status included in overall node status - Connectivity + DCI (do not include Interface status)
Show Nodes down as separate severity and colour (yes / no) - so that for nodes that are down, I can see this clearly.
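
To make the proposal concrete, here is a minimal sketch of the status calculation I have in mind. It is plain Python, not NetXMS code: the `Severity` names, the `node_status` function and the `include_interfaces` flag are all made up for illustration and are not existing NetXMS settings.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical severity scale for illustration, lowest to highest
    NORMAL = 0
    WARNING = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4


def node_status(connectivity_up, dci_status, interface_status,
                include_interfaces=False):
    """Overall status = worst of the components that are switched on.

    Connectivity and DCI thresholds always count; interface status
    only counts when include_interfaces is True.
    """
    components = [
        Severity.NORMAL if connectivity_up else Severity.CRITICAL,
        dci_status,
    ]
    if include_interfaces:
        components.append(interface_status)
    return max(components)
```

With the flag off, a router that has flipped to its secondary link (one interface down, node still reachable, DCIs fine) would stay at normal instead of going critical.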

#64
Perhaps I just seem to find unusual situations.

Transitioning from three other NMSs to NetXMS meant adding devices by IP - and then working on those that did not discover.

I am fine that unmanaged means no automatic retry or discovery - that is consistent with unmanaged, but manually running status or configuration poll, I would expect that it would do the polling as requested, and then, if the node was contactable, to ask if I would want it managed again.

End result - I now have snmpget and ping as object tools that achieve the equivalent - tell me if the node is manageable again - and I act on that.

This came about when I was trying to get SNMP to work through to an EC2 instance. ICMP was not working, the node had been added, but polling just would not seem to work, no matter what settings we changed along the way. It was only when the comment was made - no SNMP traffic requests for over 30 minutes - that I realized that NetXMS is making up its own mind about whether it would actually do what was requested, based on the connectivity status and discovered status.

Like all NMS tools, each one has its peculiarities and with NetXMS, this is one of them.

I added the Ping and SNMPGET to Node Tools so these show up on the Object Details page and are quicker to execute than right click ==> poll ==> status
#65
I added an Object Command that issues command ping %n . I have two versions, a local command and a server command. Both have the same fault.

On a normal node it works
On an unmanaged node it works
On a node in maintenance mode it works.

On a node that I have added but does not respond to anything, and has yet to respond to anything, I get the following response "Bad parameter SNMP".

Why does it not just simply run the command instead of checking SNMP, which at this point in the discovery process has yet to succeed?

Output:

Bad parameter SNMP.


*** TERMINATED ***
#66
I have created a snmp get script - line for line out of the admin guide - tweaked with clearer answers.

The problem is - if a node is unmanaged, create transport fails and I am unable to run the script.
If it is in maintenance - it is fine.

Therefore the question is - if unmanaging a node prevents me from creating a transport, is there an alternate option to check snmp response whilst node is unmanaged?



transport = CreateSNMPTransport($node); // Create SNMP transport for node
println $node->name;
if (transport == null)
{
    // This is the branch taken when the node is unmanaged
    println "Failed to create SNMP transport, NetXMS is very unhappy, DONT poll unmanaged nodes exit";
    return 1;
}
// Request sysDescr.0
value = SNMPGetValue(transport, ".1.3.6.1.2.1.1.1.0");
if (value == null)
{
    println "Failed to issue SNMP GET request as no SNMP response - might be community, might be connectivity - try pinging the node first!!";
    return 2;
}
else
{
    println "SNMP request responded to our request for sysDescr - and here it is ::" . value;
    return 0;
}
#67
When I status poll an unmanaged node, I get a "Node is Connected" but node status remains unmanaged - however - what I am looking for is if the node is now in a state that I can manage it again. The Status poll does not tell me that - at all.

I went low tech and simply added a tool called ping - local command - ping %n - and I use that as my status poller instead. I have a second one which pings from NetXMS server which I use alternatively, depending on where I think the issue might be.

Next thing to add is an snmpget of sysDescr to the tools so I can see if I get SNMP responses as, again, Status polling does not trigger polling if it thinks the node is unreachable.

#68
When using the filtering option in alarm log (right click on node does not have this option - it has alarms which is already available in Object Details), it starts to search from the first character typed.

Unfortunately for me, with 3500 nodes, this then hangs my NetXMS console till it can populate the list based on that first character - a pointless exercise considering the time it wastes.

Of all the things I like and dislike - this is the one thing that all users get really annoyed about and they vent at me to which there is nothing I can do.
Either let me enter the source node freehand completely, or let me enter it partially and then invoke search. Using the last search is great when adding traps and events, but for source, having over 3000 entries in the tree to traverse and then select from - when that is not near enough anyway - is a nightmare.

Is there a setting somewhere where I can turn off "search once first letter is typed" or can I set numberofcharstypedtobeginsearch to 3 instead of 1?
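
What I am asking for boils down to something like the sketch below. It is plain Python to show the behaviour only, not console code; `filter_nodes` and `min_chars` are names I made up, and I am not aware of an existing NetXMS setting with this effect.

```python
def filter_nodes(node_names, typed, min_chars=3):
    """Return matching node names, or None if too little has been typed.

    Returning None means "do not search yet", so the expensive walk over
    thousands of nodes is skipped for the first one or two characters.
    """
    if len(typed) < min_chars:
        return None
    needle = typed.lower()
    return [name for name in node_names if needle in name.lower()]
```

With `min_chars=3`, typing a single letter does nothing and the list is only built once three characters are in.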
#69
After running NetXMS for a bit now, I have come to the view that State is different to Status. Status is an escalating level of severity aggregating upwards.

State is separate and indicates whether a node is contactable via any of its configured mechanisms.

State should be shown as up / down relating to communication with the node and should show "since"

Status shows criticality of alarms assigned to the node.

A dependency should be able to be set globally and overridden at the node level - suspend DCI if State = down. This prevents DCI alarms and also prevents the template DCI's being disabled / removed for nodes that lose connectivity.

For SNMP DCIs, a node that drops connectivity should drop back to status polling (sysDescr only) and once State = UP (response received), DCI polling resumes.

The settings that are already present use Status interchangeably between Status Polling (Up/Down) and Status Alarms (Minor/Major/Critical) when Status Polling should be reflected in a variable called State or NodeState and displayed separately in the General box on the Overview page.

Status polling should also check for any Alarms for Node=Down and automatically clear them if found.
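
As a rough sketch of the dependency I am describing (plain Python, purely illustrative; `plan_poll`, `suspend_dci_when_down` and `node_override` are hypothetical names, not NetXMS settings):

```python
def plan_poll(state_up, suspend_dci_when_down=True, node_override=None):
    """Decide which polls to run for a node on this cycle.

    suspend_dci_when_down is the global default; node_override, when not
    None, replaces it for this one node.
    """
    suspend = suspend_dci_when_down if node_override is None else node_override
    if not state_up and suspend:
        # Node unreachable: fall back to the lightweight status poll only
        return ["status"]
    return ["status", "dci"]
```

While State = down only the status poll runs, so no DCI alarms are raised, and as soon as a status poll gets a response the next cycle includes DCI collection again.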
#70
I see weird things like this as well.

I have Windows devices discovered and set to v1 even though all Windows devices have supported 2c since 2000

https://docs.microsoft.com/en-us/windows/desktop/snmp/supported-versions

It gets tricky when considering traps. NetXMS assumes v2c and takes out the first varbinds as time as per v2c - but if a trap is v1, its first varbinds are now inaccessible if mapping by position.
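
To show what I mean by position mapping going wrong, here is a small Python sketch of version-aware handling. It is not how NetXMS does it internally (I do not know that); `payload_varbinds` is a made-up helper and the 99999 enterprise OIDs are placeholders. The two header OIDs are the standard sysUpTime.0 and snmpTrapOID.0 that lead every v2c trap.

```python
SYS_UPTIME_OID = "1.3.6.1.2.1.1.3.0"        # sysUpTime.0
SNMP_TRAP_OID = "1.3.6.1.6.3.1.1.4.1.0"     # snmpTrapOID.0


def payload_varbinds(version, varbinds):
    """Return only the payload varbinds as a list of (oid, value) pairs.

    A v2c trap carries sysUpTime.0 and snmpTrapOID.0 as its first two
    varbinds, so they are dropped. A v1 trap keeps that information in
    PDU header fields, so all of its varbinds are payload and none may
    be dropped.
    """
    oids = [oid.lstrip(".") for oid, _ in varbinds[:2]]
    if version == "2c" and oids == [SYS_UPTIME_OID, SNMP_TRAP_OID]:
        return list(varbinds[2:])
    return list(varbinds)
```

Handled this way, position 1 is the first real payload varbind regardless of which version the trap arrived in.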

As to your original question, when would you back the version off from 2c to 1 automatically? - other than receiving a v1 trap and deciding that the device must be v1 (incorrect assumption), I cannot imagine when this would occur. If a v1 trap was received and you have enabled discovery from traps, then NetXMS "might" treat that as a new device and then merge it with your existing device - changing v2c back to v1.

I have a W2K16 server that NetXMS decided was V1. No idea why - no trap ever received - just weird.

Traps will state which version SNMP they are sent in, but that is not the same as what version you are expecting snmp get to be responded with.

http://www.tcpipguide.com/free/t_SNMPVersion1SNMPv1MessageFormat.htm
http://www.tcpipguide.com/free/t_SNMPVersion2SNMPv2MessageFormats-3.htm

As to why v2 is v2 and not v2c:
http://www.tcpipguide.com/free/t_SNMPVersion2SNMPv2MessageFormats.htm

So, in a nutshell - I expect that the SNMP version shown in Capabilities should not change - but NetXMS behaves as if a setting such as "allowdynamicsnmpverchange" (no such setting exists) were set to yes - allowing the version to be changed dynamically.

I did test doing a configuration poll based on manually setting v1 and then v2c and back to v1  - configuration polling did not change the version - it happily polled and returned SNMP active and the version it used - so it is not that. Configuration poll(Full) also did not change version.

#71
I was wondering what the outcome of this is as I somehow seem to have a similar problem.

Network discovery = off.
Add device if trap received = on

About 300 CISCO Routers added - SNMP is working.

Every 30 minutes (topology polling interval) - auth failure trap received.

Disable topology polling - alerts go away.

Given I have not touched these devices, but in my list of communities there will be at least one that does not work, why would topology polling not be working? And on not receiving a correct response, why not trigger an event for topology polling failure?

More importantly though - having not set anything on these auto discovered devices - how did I break it and how do I fix it?
#72
General Support / Re: Alarm Key bug
June 28, 2019, 10:18:22 AM
Fantastic reply and explanation. Too late in the day to try today - adding EC2 node fun instead - so will try this Monday.
#73
It's been a hectic week, very hectic.

Went live with my estimated 600-700 devices and as of today, 3300+ devices have decided they want to be monitored. Found out that 10 years ago, with the intention of "in the future, when we have the capacity....", trap settings were made in the hope that something would arrive that would listen. NetXMS's arrival has now done that.

With traps lagging at over 90 minutes to appear in the console, plus 5 times the number of devices, it took some emergency tuning and lots of head scratching.

For traps - sending them through with matching varbinds was a killer. APC alerts with 19 varbinds were harsh - and with a minimum of 8 alerts to configure, I simply gave up as too hard - and not worth it. Now it is pos 1 pos 2 pos 3 - and I will go back and tune any trap that has flexible varbinds.

I must also be pretty unlucky in that for the traps I get, not only are most MIBs missing, the ones I found would not compile for one reason or another.

As of now, 3,400 nodes, 108,000 DCI's and down to 489 alarms. I have managed to limp by at 80 traps and 109 event policies which I thought would be a lot worse.

Worst moment was when bouncing NetXMS and I got the DB locked by NetXMS problem - 05:45 in the morning, alerts an hour behind - looking like backing out. Found a few posts for the check option which worked - NetXMS back up and 500 Million events to process - but now with enough pollers and collectors to grind through so all good.

At the end of the day, I have ended up doing pretty much the same - Event Processing Policy - but that is a crappy solution as I am still limited to 255 chars in the Message to display, the formatting is limited, the description of the trap from the mib, along with the OID description of the varbinds, both intended to be visible to the trap reader - are not visible. Most trap descriptions have extra info regarding the trap, helpful on many occasions, but getting NetXMS to display it is simply not possible.

The funniest thing so far - after first doing the nxdbmgr check which cleared the lock - started the console which failed - cannot connect - and I just froze. Then I remembered - helps if the service is started - and all good from that point onwards.

Real shame about the trap information not being available and me having to hard code OID descriptions into the alarm message field. For something as advanced and progressive as NetXMS - I am really surprised at how hard turning traps into alarms is. I suppose that goes hand in hand with the Alarm Browser view - filtering by severity / status was requested and noted as a feature request years ago but never got anywhere.

Don't get me wrong - I am still absolutely stoked with NetXMS - it is just that this part sux - really, really sux :(

Perhaps Version 3 with its much improved interface???


#74
General Support / Re: Alarm Key bug
June 27, 2019, 04:44:11 AM
I have had one shot at Event Processing Policy filtering scripts but without an example, I syntactically sucked - so decided to look into that later.

Tried something rocket science based as follows - varbind in 3rd parm is severity so do not want alarms created where severity  = 3.
if $2 !=  3 return;

As for this problem,
I concur that this is the logical place to trigger the transform, and using custom_message makes sense, just need to work out the sha1.

Doco pages are all blank for sha1 sha256 and md5.
https://wiki.netxms.org/index.php?title=NXSL_Function_Reference&mobileaction=toggle_view_desktop

The other option if getting this to work takes too much time is to do similar via the left option and set custom message to be left (%4,255)
https://wiki.netxms.org/wiki/NXSL:left
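
For comparison, the two options look roughly like this outside NetXMS. This is plain Python using the standard hashlib module, on the assumption that the aim is a bounded-length value built from a long trap parameter; `key_from_hash`, `key_from_left` and the prefix are made-up names, and it says nothing about what the NXSL functions actually do.

```python
import hashlib


def key_from_hash(prefix, message):
    """Prefix plus SHA-1 hex digest: always 40 hex characters after the prefix."""
    return prefix + hashlib.sha1(message.encode("utf-8")).hexdigest()


def key_from_left(prefix, message, max_len=255):
    """Prefix plus the leftmost max_len characters of the message."""
    return prefix + message[:max_len]
```

The hash keeps the value a fixed length however long the message is, and identical messages always give the same value. The left option is simpler, but two long messages that only differ after character 255 end up with the same value.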
#75
General Support / Re: Alarm Key bug
June 25, 2019, 09:49:51 AM
OK - at least I understood the problem is real..
As for sha1 hash - shame there is no script option for hash like there is for file watching.
https://www.netxms.org/documentation/adminguide/appendix.html

Were you able to do it using straight netxms scripting or did you have to resort to invoking something external to NetXMS?