Messages - lweidig

#136
General Support / Re: Custom Attributes
July 17, 2012, 10:47:15 PM
Ok, back to the typo issue.  What I have found is that if the script contains a typo when you save it, no errors are generated, BUT the compiler must be rejecting it (as it should).  This in turn causes the "Cannot find script" errors.

Is there some way, when you save a script in the console and it cannot be compiled, to pop up errors / warnings with line numbers so we can properly correct them?  Also, if a compiled version of the script exists, I would suggest not removing it until after a successful compilation of the modified script.  Maybe include line numbering in the left "grey" section (where you might set breakpoints in other IDEs) so we can hunt down messages more easily.
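
To make the suggestion concrete, here is a minimal sketch of the "compile first, swap only on success" idea. It is not NetXMS code: `ScriptLibrary` and `SaveResult` are hypothetical names, and Python's built-in `compile()` stands in for the NXSL compiler purely for illustration.

```python
from dataclasses import dataclass, field
from types import CodeType
from typing import Dict, Optional


@dataclass
class SaveResult:
    ok: bool
    line: Optional[int] = None      # line number of the first error, if any
    message: Optional[str] = None   # compiler message, if any


@dataclass
class ScriptLibrary:
    """Hypothetical script store, used only to illustrate the idea."""
    compiled: Dict[str, CodeType] = field(default_factory=dict)

    def save(self, name: str, source: str) -> SaveResult:
        try:
            # Stand-in for the real script compiler (assumption).
            code = compile(source, name, "exec")
        except SyntaxError as err:
            # Compilation failed: report where, and leave the previously
            # compiled version untouched so it can still be found and run.
            return SaveResult(ok=False, line=err.lineno, message=err.msg)
        # Replace the stored version only after a successful compilation.
        self.compiled[name] = code
        return SaveResult(ok=True)
```

With something like this, saving a script with a typo would return the line number to show in the editor, and any action that refers to the script would keep running the last good version instead of failing with "Cannot find script".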

Thanks
#137
General Support / Re: Server Configuration Options
July 17, 2012, 02:36:12 PM
lindaemon:

Inside the console if you go into the Configuration -> Server Configuration menu you get all of the system settings displayed.  In the upper right corner is a green circle with a + sign that allows you to ADD variables that are not displayed and values.

Victor:

Yes, I think a function for accessing them would be great and, as mentioned, pretty trivial.  There are some global settings that we refer to in scripts so that, if we want to change them, it requires only a single place.  At this point we just store them as custom attributes in a known node, but to us they really represent server configuration, and that would seem to be the more appropriate location for them.


Thanks!
#138
General Support / Server Configuration Options
July 17, 2012, 01:56:58 AM
I see that within the client I can add my own Server Configuration variable / value pairs.  How can I retrieve these values within a script, and/or how would I use my own additions?
#139
General Support / Re: Server Performance
July 12, 2012, 05:37:36 PM
Ok, this has been resolved.  While I had checked for the leap second bug on the machine, I had not checked the host it was virtualized on top of.  This was being affected by the issue and after fixing the host the NetXMS server machine returned to normal, in fact crazy LOW CPU usage (0.06) even though it is running perfectly from what I can tell.  Odd, because other machines on this host did not seem to be having this issue.   

Sorry about this and do appreciate all of the assistance!
#140
Announcements / Re: High CPU usage after 1/7/2012
July 12, 2012, 03:16:15 PM
For those affected, jumping with the date command can be accomplished with:

     date; sudo date `date +"%m%d%H%M%C%y.%S"`; date;

One additional note: if this is a virtualized machine, make sure to check the host server it is running on for the issue as well; it was the cause for us.
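
For anyone wondering what the inner `date +"%m%d%H%M%C%y.%S"` produces: it prints the current time in the `MMDDhhmmCCYY.ss` form that `date` accepts for setting the clock, so the command simply sets the clock to its current value. The sketch below rebuilds that string in Python for a fixed timestamp. The function name is made up for illustration, and `%Y` is used in place of `%C%y` because `%C` is not supported by every platform's `strftime`; for four-digit years the two give the same digits.

```python
from datetime import datetime


def date_set_argument(moment: datetime) -> str:
    """Build the MMDDhhmmCCYY.ss string that the shell command passes to date."""
    # %C%y (century + two-digit year) in the shell format is written as %Y here.
    return moment.strftime("%m%d%H%M%Y.%S")


# Timestamp of this post, used only as sample input.
print(date_set_argument(datetime(2012, 7, 12, 15, 16, 15)))
```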
#141
General Support / Re: Server Performance
July 12, 2012, 03:12:01 PM
Yeah, I had already run into the July 1st post stuff and have ensured this is not the issue we are running into.

I have attached the gdb output, thanks!
#142
General Support / Re: Server Performance
July 12, 2012, 06:07:15 AM
Ok, so on the core router I checked:
   - Disable routing table polling
   - Disable topology polling
   - Disable network discovery polling

Stopped netxmsd and verified that load did indeed drop to very near zero which it did almost right away.  Restarted the server and then waited a few hours in case it had to calm down.  Had no effect, load averages were at the same level.

Downloaded the source code to the machine and compiled using:
   ./configure --prefix=/usr --with-server --with-mysql --with-agent
All went well and then installed/restarted again.  Had no positive / negative effect other than what appears to be a minimal amount more RAM being used - so you apparently have better compiler settings than I do :) 

Open to any further suggestions.  Thanks!
#143
General Support / Re: Server Performance
July 11, 2012, 09:56:02 PM
Yes, I have multiple full bgp tables on the core router.  Also, depending what you consider large some of the internal routers have a lot of routes as well. 

What are the routing table and topology polling being used for at this point?  Since you are not auto creating dependencies, I am not sure, other than potentially discovery.

Potentially we can just turn this off on all devices.  I did turn those two off on the core router and have not seen any change, but maybe it has a lot queued up it needs to process.  Will give it some time.
#144
General Support / Re: Server Performance
July 11, 2012, 08:01:09 PM
The two clients are my Android phone and Windows machine.  The Android app is very nice by the way, but is definitely battery intensive.



netxmsd: show sessions
ID  STATE                    CIPHER   USER [CLIENT]
0   idle                     AES-256  [email protected].249 [nxjclient/1.2.1 (Linux 2.6.32.9-00005-g2440aba; libnxcl 1.2.1)]
1   idle                     AES-128  [email protected].131 [nxjclient/1.2.1 (Windows 7 6.1; libnxcl 1.2.1)]

2 active sessions

netxmsd: show pollers
PT  TIME                   STATE
S   11/Jul/2012 11:55:57   wait
S   11/Jul/2012 11:55:59   wait
S   11/Jul/2012 11:56:01   wait
S   11/Jul/2012 11:56:01   wait
S   11/Jul/2012 11:56:01   wait
S   11/Jul/2012 11:55:54   wait
S   11/Jul/2012 11:55:57   wait
S   11/Jul/2012 11:56:01   wait
S   11/Jul/2012 11:55:54   wait
S   11/Jul/2012 11:55:58   wait
S   11/Jul/2012 11:56:02   wait
S   11/Jul/2012 11:55:57   wait
S   11/Jul/2012 11:55:58   wait
S   11/Jul/2012 11:56:02   wait
S   11/Jul/2012 11:55:53   wait
S   11/Jul/2012 11:56:01   wait
S   11/Jul/2012 11:55:54   wait
S   11/Jul/2012 11:56:01   wait
S   11/Jul/2012 11:56:01   wait
S   11/Jul/2012 11:56:02   wait
S   11/Jul/2012 11:56:01   wait
S   11/Jul/2012 11:55:59   wait
S   11/Jul/2012 11:55:58   wait
S   11/Jul/2012 11:56:01   wait
S   11/Jul/2012 11:55:58   wait
C   11/Jul/2012 11:54:11   wait
C   11/Jul/2012 11:51:45   wait
C   11/Jul/2012 11:55:59   wait
C   11/Jul/2012 11:55:57   poll: xxx00-au-nn.excel.net [2268] - capability check
C   11/Jul/2012 11:54:04   wait
C   11/Jul/2012 11:56:01   poll: xxx00-au-nn.excel.net [2275] - capability check
C   11/Jul/2012 11:51:51   wait
C   11/Jul/2012 11:52:04   wait
C   11/Jul/2012 11:55:18   wait
C   11/Jul/2012 11:55:18   wait
C   11/Jul/2012 11:51:56   wait
C   11/Jul/2012 11:55:35   wait
C   11/Jul/2012 11:56:03   wait
C   11/Jul/2012 11:55:25   poll: xxx03-au-xx.excel.net [2210] - capability check
C   11/Jul/2012 11:56:03   wait
R   11/Jul/2012 11:51:33   wait
R   11/Jul/2012 11:50:57   wait
R   11/Jul/2012 11:51:41   wait
R   11/Jul/2012 11:52:23   wait
R   11/Jul/2012 11:51:29   wait
R   11/Jul/2012 11:51:29   wait
R   11/Jul/2012 11:51:29   wait
R   11/Jul/2012 11:56:00   wait
R   11/Jul/2012 11:52:04   wait
R   11/Jul/2012 11:55:49   poll: xxx05-rtr-00.excel.net [2043]
D   11/Jul/2012 11:53:08   wait
N   10/Jul/2012 16:39:09   wait
N   10/Jul/2012 16:39:09   wait
N   10/Jul/2012 16:39:09   wait
N   10/Jul/2012 16:39:09   wait
N   10/Jul/2012 16:39:09   wait
N   10/Jul/2012 16:39:09   wait
N   10/Jul/2012 16:39:09   wait
N   10/Jul/2012 16:39:09   wait
N   10/Jul/2012 16:39:09   wait
N   10/Jul/2012 16:39:09   wait
T   11/Jul/2012 11:54:14   wait
T   11/Jul/2012 11:55:19   wait
T   11/Jul/2012 11:55:13   wait
T   11/Jul/2012 11:55:19   wait
T   11/Jul/2012 11:55:35   wait
T   11/Jul/2012 11:54:41   wait
T   11/Jul/2012 11:56:03   wait
T   11/Jul/2012 11:56:01   poll: xxx00-au-nn.excel.net [2268]
T   11/Jul/2012 11:55:08   wait
T   11/Jul/2012 11:54:13   wait
B   10/Jul/2012 16:41:09   wait
B   10/Jul/2012 16:41:09   wait
B   10/Jul/2012 16:41:09   wait
B   10/Jul/2012 16:41:09   wait
B   10/Jul/2012 16:41:09   wait
B   10/Jul/2012 16:41:09   wait
B   10/Jul/2012 16:41:09   wait
B   10/Jul/2012 16:41:09   wait
B   10/Jul/2012 16:41:09   wait
B   10/Jul/2012 16:41:09   wait
A   11/Jul/2012 05:11:51   wait

netxmsd: show queue
Condition poller                 : 0
Configuration poller             : 0
Topology poller                  : 0
Data collector                   : 0
Database writer                  : 0
Database writer (IData)          : 0
Event processor                  : 0
Network discovery poller         : 0
Node poller                      : 1587
Routing table poller             : 0
Status poller                    : 0

netxmsd: show stats
Total number of objects:     1580
Number of monitored nodes:   172
Number of collectable DCIs:  1396

netxmsd: show flags
Flags: 0x4310067D
  AF_DAEMON                        = 1
  AF_USE_SYSLOG                    = 0
  AF_ENABLE_NETWORK_DISCOVERY      = 1
  AF_ACTIVE_NETWORK_DISCOVERY      = 1
  AF_LOG_SQL_ERRORS                = 1
  AF_DELETE_EMPTY_SUBNETS          = 1
  AF_ENABLE_SNMP_TRAPD             = 1
  AF_ENABLE_ZONING                 = 0
  AF_SYNC_NODE_NAMES_WITH_DNS      = 0
  AF_CHECK_TRUSTED_NODES           = 1
  AF_WRITE_FULL_DUMP               = 0
  AF_RESOLVE_NODE_NAMES            = 1
  AF_CATCH_EXCEPTIONS              = 0
  AF_INTERNAL_CA                   = 0
  AF_DB_LOCKED                     = 1
  AF_ENABLE_MULTIPLE_DB_CONN       = 1
  AF_DB_CONNECTION_LOST            = 0
  AF_NO_NETWORK_CONNECTIVITY       = 0
  AF_EVENT_STORM_DETECTED          = 0
  AF_SERVER_INITIALIZED            = 1
  AF_SHUTDOWN                      = 0

netxmsd: show watchdog
Thread                                           Interval Status
----------------------------------------------------------------------------
Item Poller                                      20       Running
Syncer Thread                                    130      Running
Poll Manager                                     60       Running
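
In the console output above, the `show queue` section is the only place with a non-zero backlog (Node poller: 1587). A small sketch for picking that out of pasted output automatically is shown below. The function names and the default threshold are assumptions for illustration, and the parser expects only the `name : number` lines of that section.

```python
from typing import Dict


def parse_queues(text: str) -> Dict[str, int]:
    """Parse 'name : size' lines as printed by 'show queue'."""
    queues: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        # Skip lines without a colon or without a plain number after it.
        if sep and value.strip().isdigit():
            queues[name.strip()] = int(value)
    return queues


def backlog(queues: Dict[str, int], threshold: int = 0) -> Dict[str, int]:
    """Return only the queues whose size is above the threshold."""
    return {name: size for name, size in queues.items() if size > threshold}


sample = """netxmsd: show queue
Condition poller                 : 0
Database writer (IData)          : 0
Node poller                      : 1587
Status poller                    : 0
"""
print(backlog(parse_queues(sample)))
```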
#145
General Support / Server Performance
July 11, 2012, 07:29:33 PM
I know from the many NMS systems that I have implemented / tested that something is up.  Just not sure how to start digging to find the issue and hoping to get some pointers.  We are running on a quad core Xeon / 4GB RAM machine running Ubuntu 12.04, NetXMS 1.2.1 (from .deb files).  Currently have about 200 nodes and 1500 DCIs set up.  The machine is running NOTHING other than NetXMS and services to support the installation.  Here is a screen grab from top:


# top
top - 11:15:55 up 18:36,  1 user,  load average: 14.94, 14.19, 14.36
Tasks:  30 total,   1 running,  29 sleeping,   0 stopped,   0 zombie
Cpu(s): 23.2%us, 48.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si, 28.0%st
Mem:   4194304k total,   436432k used,  3757872k free,        0k buffers
Swap:  2097152k total,        0k used,  2097152k free,   176968k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                             
  284 root      20   0 2282m  24m 3592 S  400  0.6   3679:06 netxmsd                                             
  236 mysql     20   0 1654m 211m 8000 S   24  5.2 617:58.69 mysqld                                               
  291 root      20   0  632m 4884 1460 S   10  0.1 194:04.51 nxagentd                                             
    1 root      20   0 24024 2024 1340 S    0  0.0   0:00.15 init                                                 
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd/105
...
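
One detail worth reading out of the `Cpu(s)` line above is the `st` (steal) field, which is time the hypervisor spent on something other than this guest. The sketch below splits that line into its fields and flags a high steal value. The function names are made up, and the 5% threshold is an arbitrary assumption rather than a documented limit.

```python
import re
from typing import Dict


def parse_cpu_line(line: str) -> Dict[str, float]:
    """Split a top 'Cpu(s):' summary line into {field: percent}."""
    return {name: float(value) for value, name in re.findall(r"([\d.]+)%(\w+)", line)}


def steal_is_high(fields: Dict[str, float], threshold: float = 5.0) -> bool:
    """True when the steal percentage exceeds the (assumed) threshold."""
    return fields.get("st", 0.0) > threshold


cpu_line = ("Cpu(s): 23.2%us, 48.7%sy,  0.0%ni,  0.0%id,  0.0%wa,"
            "  0.0%hi,  0.0%si, 28.0%st")
fields = parse_cpu_line(cpu_line)
print(fields["st"], steal_is_high(fields))
```

A non-trivial steal value on a guest is a hint to look at the virtualization host as well as the guest itself.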


Have considered recompiling from source, but prefer to use packages if available for ease of installation and hopefully optimal settings in build.  Can anybody else confirm they are running the .deb files under the latest Ubuntu LTS? 

I do have a question though about the software itself.  Are containers inside the "Infrastructure Services" node just considered logical groupings?  That is what I have assumed, so we do have nodes that can appear under 3+ containers.  We have set up an "all node" container, grouped by device type, grouped by location and a few other ways we like to analyze the nodes in our network.  I am assuming the software polls a node once, no matter how many times it appears under various containers.  If that is NOT the case then I probably have too much activity going on.

Thanks!
#146
General Support / Re: Discovered "ghost" devices
July 11, 2012, 12:20:30 AM
Yeah, I suspect something like that.  We do have some bonded interfaces, but that is not where they are in our case.  Have searched arp, routing, hosts, ... on MANY devices and not coming up with anything yet. 

Just a bit embarrassed that I cannot track it down, but the software keeps detecting them.  If I delete them they get discovered again, so it was not some sort of one time fluke.  Also, it is always these three out of hundreds of devices on many subnets.  I will keep hunting...
#147
General Support / Discovered "ghost" devices
July 10, 2012, 06:11:22 PM
I am getting three devices "discovered" but they are not for anything real.  It simply finds the IP address, but then of course cannot detect anything by Agent or SNMP as they are not really devices.  There is nothing on these addresses and they do not show up in any ARP tables or respond to ping.  Looked through the logs, but they are not really showing anything. 

What I am looking for is how these devices are being discovered.  I suspect some device on my network is wrongly making a reference to them and this is getting picked up on during examination of routing / arp tables.  But that is a guess. 
#148
General Support / Re: Custom Attributes
July 10, 2012, 02:58:40 PM
The problem with the typo theory, which I thought of as well, is that when you are selecting the action you pick it from a list and do not type anything in.  In fact I even tried renaming things but the same issue persisted.  Both events are using the same action, picked from the list.  SYS_NODE_ADDED works, SYS_NODE_UNMANAGED does not.  Screens are attached.

Now however, after taking the screen grabs I thought that I should crank up debugging to 9 and see if that gives any more information.  So I did that, restarted the server and ran it again.  My new problem is that it now worked:

[10-Jul-2012 06:49:45] Event 24997 match EPP rule 24
[10-Jul-2012 06:49:45] Event::expandText(event=0x7f40ac007c50 sourceObject=3184 template='%m' alarmMsg='(null)')
[10-Jul-2012 06:49:45] Event::expandText(event=0x7f40ac007c50 sourceObject=3184 template='' alarmMsg='Node status changed to UNMANAGED')
[10-Jul-2012 06:49:45] Event::expandText(event=0x7f40ac007c50 sourceObject=3184 template='Event::xlSetCustomAttributes' alarmMsg='Node status changed to UNMANAGED')
[10-Jul-2012 06:49:45] *actions* Executing NXSL script "Event::xlSetCustomAttributes"
[10-Jul-2012 06:49:45] [CLSN-0] Sending message CMD_OBJECT_UPDATE
[10-Jul-2012 06:49:45] [CLSN-0] Sending message CMD_OBJECT_UPDATE
[10-Jul-2012 06:49:45] [CLSN-0] Sending message CMD_OBJECT_UPDATE
[10-Jul-2012 06:49:45] [CLSN-0] Sending message CMD_OBJECT_UPDATE
[10-Jul-2012 06:49:45] ExecuteActionScript: script Event::xlSetCustomAttributes successfully executed
[10-Jul-2012 06:49:45] Event 24997 with code 12 passed event processing policy

I am assuming the 4 CMD_OBJECT_UPDATE messages are related to the 4 custom attributes I set up.

At any rate it is working now, just a bit confused what the cause was with both events using the same action.
#149
General Support / Re: Custom Attributes
July 09, 2012, 06:27:37 PM
I set this up; however, it seems to be causing problems.  I have the following in the event log:


[09-Jul-2012 10:22:53] *actions* Executing NXSL script "Event::xlSetCustomAttributes"
[09-Jul-2012 10:22:53] ExecuteActionScript(): Cannot find script Event::xlSetCustomAttributes

The odd thing is that this is the EXACT same script / action we use for SYS_NODE_ADDED events, and that is able to find it / run it just fine.  Is it possible that once a node is unmanaged it cannot have scripts run against it?
#150
General Support / Re: Custom Attributes
July 09, 2012, 05:56:38 PM
Thanks, the unmanaged trick should work fine for the time being, though the 1.2.2 solution is more elegant!