Topics - bdefloo

#1
General Support / Netxmsd hangs during startup
February 21, 2024, 07:14:51 PM
Hi,

We have had NetXMS 4.4.2 running on a Windows 2019 server for a few months. However, after the server was restarted yesterday we are no longer able to connect. The netxmsd.exe process is running and taking ~50% CPU, but judging by netxmsd.log it appears to freeze mid-startup (see attached).

I ran nxdbmgr check and check-data-tables with clean results, tried reinstalling NetXMS, installed on a different machine and pointed to same database, but all with the same results.
I suspect something got corrupted in the database and is causing the server to lock up.

We're using MSSQL Server 2016 and the ODBC driver. The database is accessible through SQL management studio and appears to be otherwise healthy (as also attested by nxdbmgr)

Are there any further troubleshooting steps I can take?
I can provide debuglevel 9 logging or other detailed information via private message if needed. I haven't found any "smoking gun" error messages so far, unfortunately.

Should I try an update to a later version?

Thanks in advance!
#2
Hi,

We had some problems after upgrading to v2.0.8 from v2.0.2 on Windows 2008 R2. (DB: SQL Server Express 2012)
After the upgrade, we would get an error message "the connection timed out" when trying to connect with the NetXMS Console.

After setting DebugLevel in netxmsd.conf to 7 and restarting the NetXMS Core service, the following error message started appearing in the log:
[16-May-2017 14:49:38.523] [DEBUG] Database Connection Pool exhausted (call from .\config.cpp:344)

I stopped the service, increased DBConnectionPoolMaxSize in the config table of the database from 40 to 100 and restarted it after which the problem appears to be solved.
Posted here for reference for other users and any suggestions.

Server statistics:
Total number of objects: 53241
Number of monitored nodes: 1016
Number of collectable DCIs: 6447
#3
Hi,

Our NetXMS v2.0.2 server on Windows 2008 R2 stopped working today and kept crashing on startup. The event log showed
[ERROR] EXCEPTION 0xC0000005 (Access violation) at 0x00659F20 (crash dump was generated);
From the crash dump I found it was caused by nxcore!ScheduledMaintenance+0x150, which appears to have something to do with nodes that have been set to maintenance mode.

I checked the database and found some entries in the scheduled_tasks table pointing to IDs of nodes that had since been deleted. Unfortunately, running "nxdbmgr check" did not clean these up, so I deleted them manually and the problem was gone.
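For anyone wanting to look for such leftover rows, here is a minimal sketch of the check described above. It runs against a toy in-memory SQLite database, not a real NetXMS schema: the column names (`object_id`, `taskid`), the task names and the `nodes` lookup table are assumptions made for illustration, so verify them against your own database before running anything similar, and take a backup first.

```python
import sqlite3

# Toy stand-in for the real database; table layout is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (id INTEGER PRIMARY KEY);
CREATE TABLE scheduled_tasks (
    id INTEGER PRIMARY KEY,
    taskid TEXT,
    object_id INTEGER
);
INSERT INTO nodes (id) VALUES (100), (101);
INSERT INTO scheduled_tasks (id, taskid, object_id) VALUES
    (1, 'enter-maintenance', 100),
    (2, 'leave-maintenance', 100),
    (3, 'enter-maintenance', 555);  -- node 555 no longer exists
""")


def find_orphaned_tasks(conn):
    """Return IDs of scheduled tasks whose target node is missing."""
    rows = conn.execute("""
        SELECT t.id
        FROM scheduled_tasks t
        LEFT JOIN nodes n ON n.id = t.object_id
        WHERE n.id IS NULL
        ORDER BY t.id
    """)
    return [row[0] for row in rows]


def delete_orphaned_tasks(conn):
    """Delete the orphaned tasks and return the IDs that were removed."""
    ids = find_orphaned_tasks(conn)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "DELETE FROM scheduled_tasks WHERE id = ?", [(i,) for i in ids]
        )
    return ids


print(find_orphaned_tasks(conn))
```

In a real installation a task may target objects other than nodes, so the comparison would probably need to cover every object table rather than only the nodes.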

Not sure if this is something already solved in a later version or not, but thought I'd let everyone know in any case!
#4
Hi,

Since upgrading to v2.0.2 the "upload file" tool in the console no longer seems to work. (server, console and node all running on Windows Server 2008 R2)

The upload file dialog opens when clicking "upload file" in the context menu of a node with the agent installed, and we can select a file from the list of files available on the server, but nothing happens when we click OK. The dialog simply remains open.
I tried this on nodes with various agent versions (2.0.2, 1.2.13)

We're uploading the files manually as workaround for now.

Kind regards,
Bastiaan Defloo
#5
Hi,

After upgrading to v2.0.2 on Win2008R2 x64 we're having an issue in the NetXMS Console:
We can no longer select, right click or drag and drop objects in network maps, if the map has an image as background.

Also, with the "display objects as -> small labels" option, the text is now very hard to read against the background, because the label background is now transparent rather than opaque. Can this be configured somewhere?

Thanks!
#6
Hi,

I'm having a small issue in the Win x64 NetXMS v1.2.13 console.

Sometimes, if I add a new custom attribute on a node, close the properties and re-open them, it appears to be gone.
When trying to add the attribute again, it doesn't give an error or anything, but still disappears again after closing the properties window.

After closing and re-opening the console the newly created attributes reappear and everything works normally again.

Maybe the behavior only pops up after the console has been open some time.

Any clues?

Kind regards,
bdefloo
#7
General Support / Memory usage console
February 07, 2014, 03:49:47 PM
Hi,

I was playing around with the max memory settings for the v1.2.11 NetXMS Console when I noticed something peculiar:

My console usually uses around 240MB out of ~600MB allocated heap space (as seen in the status bar of the console). If I reduce the -Xmx parameter from 1024m to 512m, this becomes 240MB out of ~450MB.

I tried reducing it further to 256m. This caused an error to pop up saying it failed to load the MIB file, but the console then only used 60MB! Further tests revealed that if I rename the MIB file on the server and start the console with an adequate heap size (e.g. 1024MB), only 80-100MB is used.

Is the MIB file loaded in its entirety and expanded into the heap every time the console is started? My MIB file is only 13MB on disk, but appears to account for 100+ MB of used heap space.

If so, isn't this a bit wasteful of resources? After all, I can't imagine most people use the MIB browser daily. Maybe it could be loaded when opening the MIB browser tab?

Thanks for your input!

Kind regards,
bdefloo
#8
General Support / Request timed out
February 05, 2014, 12:50:14 PM
Hi,

Since upgrading to v1.2.11 I've been getting more "Request timed out" errors in the management console when doing various changes. (applying templates, creating a container, moving nodes, ...)

It doesn't always happen, and sometimes the operation goes through anyway. Other times I have to do it again, or it only succeeds partially (e.g. 3 out of 20 nodes are in the template, the others aren't).

Has something changed that could be causing resource locks or delays?

Thanks!

Kind regards,
bdefloo
#9
General Support / searching on serial number
January 06, 2014, 05:07:42 PM
Hi,

Is there any way to search for a node based on the serial number listed under the "Components" tab in Object Details?

I've tried via the database but can't seem to find the table the components are stored in.

Thanks!
#10
General Support / Can't unselect node in map
June 10, 2013, 10:02:58 AM
Hi,

I'm having a small problem with maps in the NetXMS v1.2.7 Windows console (x64).

If I open a map and click on a node, everywhere I right-click I get the menu for that node. I can't get back to the menu for the map itself without closing and reopening the map.
In previous versions if you right-clicked the background you got the map menu.

Might be a bug?

While on the topic of maps; I notice
https://www.radensolutions.com/chiliproject/issues/204#note-3
is marked as resolved. Where can I find this option?

Kind regards,
bdefloo
#11
General Support / Request timed out
June 06, 2013, 05:54:52 PM
Hi,

Since I upgraded to v1.2.7 yesterday I've been having some small problems with creating nodes, containers, templates, template groups, etc.
Sometimes (not always), the operation takes 10-30 seconds, or even gives a message such as "Request timed out". However, if I manually refresh the object browser a while later, the requested operation is usually done.

The NetXMS server is running on Windows 2008 R2 (x64) with MS SQL Express 2012, with 16GB of RAM and two quad-core CPUs. CPU usage of netxmsd is about 1-2%, 400MB RAM, sqlservr about 1-3%, 4GB RAM in use.

The queue lengths are all within reason, except for "Database writer's request queue (other queries) for last minute" which sometimes goes up to 1K-2K+. I have 8 database writers configured, and there are 933 nodes with 18296 objects and 8350 DCIs.

Any ideas?
Thanks in advance!
#12
Hi,

First of all, great work on the new update!
I love the monitoring on our WS5100 wireless switches.

However, we also have a number of Motorola RFS6000 switches in use. It'd be great if we could monitor these in the same way. These are the successors to the WS series, and appear to use the same MIBs.

But to be sure, is there any way I can force the SYMBOL-WS driver on such a switch to test? Would you prefer I give you SNMP walk results? (Of which OIDs?)

Thanks in advance,
bdefloo
#13
Hi,

We have a number of DCIs with thresholds of the type "last( # ) samples equal to a specific value". I noticed that every time the NetXMS service restarts, these thresholds are deactivated and then reactivated # polls later if they were active before NetXMS went down.

Probably this has to do with the server not knowing what the previous values were because it was down. However, it creates the false impression that the threshold was no longer reached at a certain point.

Seeing as threshold activation/deactivation status is kept in the database, would it be possible to retain the old status of a threshold if the current DCI value(s) do not change it, and last values are not available due to a server restart?

For example, let's assume a DCI with threshold last(2) > 10. In the following list of polled values, assume X is the point where the server restarts:
15, 26, 30, X, 40, 50, 60

Currently (tested on v1.2.6), the threshold would be active before X, be deactivated when it polls 40 the first time after the restart, and be reactivated at value 50.
With the change I suggested, it would remain active throughout the test.

Another example:
15, 26, 30, X, 0, 1, 2
In this case, both the current version and my proposal should deactivate the threshold at 0.

15, 26, 30, X, 40, 0, 1
Here, the current version would deactivate the threshold on 40, but my proposal would delay this until the next poll (0).
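The three sequences above can be replayed with a small simulation. This is only a sketch of the behaviour as described in this post, not NetXMS code: the function name, the `retain_on_restart` switch and the "X" restart marker are made up for illustration, and it assumes the threshold counts as reached when the last 2 values are all above 10, which is what the example sequences imply.

```python
RESTART = "X"  # marker for a server restart, as in the lists above


def simulate(samples, n, limit, retain_on_restart):
    """Return the threshold state (True = active) after each polled value.

    Hypothetical model: the threshold is reached when the last n values
    are all above limit.
    """
    window = []      # values polled since the last restart
    active = False
    states = []
    for value in samples:
        if value == RESTART:
            window = []          # cached values are lost on restart
            continue
        window = (window + [value])[-n:]
        reached = all(v > limit for v in window)
        if len(window) == n:
            active = reached     # enough samples: normal evaluation
        elif retain_on_restart:
            # Proposal: keep the stored state unless a new value already
            # shows that the threshold can no longer be reached.
            active = active and reached
        else:
            # Behaviour described for v1.2.6: too few samples means inactive.
            active = False
        states.append(active)
    return states


for samples in (
    [15, 26, 30, RESTART, 40, 50, 60],
    [15, 26, 30, RESTART, 0, 1, 2],
    [15, 26, 30, RESTART, 40, 0, 1],
):
    print("current: ", simulate(samples, 2, 10, retain_on_restart=False))
    print("proposed:", simulate(samples, 2, 10, retain_on_restart=True))
```

Each list holds one state per polled value, so the fourth entry is the first poll after the restart. That is where the two variants differ in the first and third sequence.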

Kind regards,
bdefloo
#14
General Support / Deleted nodes still throwing events
January 31, 2013, 10:33:27 AM
Hi,

I've been having a strange issue on my v1.2.5 Windows 2003 x86 setup (but I think it's since v1.2.4):
I have a DCI on all my server nodes which checks if the last 5 ICMP ping results are above a certain value.
When the NetXMS core service starts, and one of these nodes is down, it will be triggered and post an event after 5 polls, which then sends an e-mail.

However, I appear to be getting these events and the related e-mails even for nodes I already deleted weeks ago! The node is nowhere to be found in the console object browser, and the alarms I'd expect to see aren't there either, but when I look in the database I can still find the deleted node in the object_properties and nodes tables, the DCIs are still there in the items table, ...
The is_deleted field is also still set to 0 in the object_properties table.

Could something be going wrong with the node deletion process?

Thanks in advance for any help you can provide!

Kind regards,
bdefloo
#15
Hi,

Thanks for fixing the bug in the Image library that caused it to refresh over and over whenever something changed, once for every image in the library. However, the same thing seems to be happening when opening the "Select Image" dialog, for example under Map Properties->Map background tab, causing an unnecessary delay in selecting an image.

Keep up the great work!

Kind regards,
bdefloo
#16
Hi,

Just for future reference of anyone having the same problems as we did upgrading to v1.2.5 on MS SQL 2008. We upgraded from v1.2.4 on a 32bit Windows 2003 server. When upgrading the database I got the following error:
C:\NetXMS\bin>nxdbmgr.exe upgrade
NetXMS Database Manager Version 1.2.5

Upgrading database...
Upgrading from version 265 to 266
SQL query failed ([Microsoft][SQL Server Native Client 10.0][SQL Server]Cannot create index. Object 'event_log' was created with the following SET options off: 'ANSI_NULLS'.):
CREATE INDEX idx_event_log_root_id ON event_log(root_event_id) WHERE root_event_id > 0
Rolling back last stage due to upgrade errors...
Database upgrade failed


Most likely, this was because we've been upgrading the same NetXMS instance since v1.0.13 or older. The only solution we found was to rename the event_log table, and to recreate it with ANSI_NULLS on:
USE [netxms]
GO

/****** Object:  Table [netxms].[event_log]    Script Date: 01/11/2013 10:37:06 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING OFF
GO

CREATE TABLE [netxms].[event_log](
    [event_id] [bigint] NOT NULL,
    [event_code] [int] NOT NULL,
    [event_timestamp] [int] NOT NULL,
    [event_source] [int] NOT NULL,
    [event_severity] [int] NOT NULL,
    [event_message] [varchar](255) NULL,
    [root_event_id] [bigint] NOT NULL,
    [user_tag] [varchar](63) NULL,
    PRIMARY KEY CLUSTERED
    (
        [event_id] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

SET ANSI_PADDING OFF
GO


Probably you could then move all the data back to the new table, but we didn't really have any need for it so we just started with an empty table.

Hope this helps anyone coming across the same issue.

Kind regards,
bdefloo
#17
Feature Requests / Network map label type
January 09, 2013, 06:09:23 PM
Hi,

A small thing I've noticed in using network maps is that selecting a different label type via "Display objects as..." is stored locally, so a different user on a different machine may see objects on a map by their icon rather than e.g. small labels.

Problem with this is, in our setup, that the icons are a tad large, and the text below is unreadable due to the map background. Also the placement on our maps is done according to the small label display type.

Would it be possible to have a "Default label type" property or such on the map properties, where you could select how the objects are shown on the map for all users? To not break with current functionality, you could add a "let user select" option that uses the current locally stored setting.

Thanks in advance,

Kind regards,
bdefloo
#18
General Support / No Windows event log message text
December 31, 2012, 12:28:24 PM
Hi,

I noticed a small problem on a number of nodes that have the NetXMS agent v1.2.4 installed.
The Windows system event log shows messages like:
The description for Event ID 23 from source NetXMS Win32 Agent cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Counter set B is empty, collector thread for that set will not start

the message resource is present but the message is not found in the string/message table

instead of
Counter set B is empty, collector thread for that set will not start

Some googling led me to the registry key
HKLM\System\CurrentControlSet\Services\Eventlog\system\NetXMS Win32 Agent\EventMessageFile
which I found set to "C:\NetXMS\bin" rather than "C:\NetXMS\bin\nxagentd.exe". Changing it and restarting Event Viewer solved the problem on that node.

Could this be a bug in the installer somewhere? I can't pinpoint in which version exactly it went wrong, as I've upgraded through several versions of the agent, but I have just had the problem with a clean install of the x64 v1.2.4 agent on a Windows 2008 R2 server.

Kind regards,
bdefloo
#19
General / NetXMS - SQL Performance
December 18, 2012, 05:35:22 PM
Hi,

In the process of investigating why our NetXMS setup still crashes a few times a day, I noticed the database writer queue got very high. It starts climbing about 30min after the crash, and goes up to about 400K to 750K before the server crashes due to a memory access violation in a random module.

Our NetXMS server is running on a Windows 2003 x86 server, with MSSQL 2008 Express. I noticed in the activity monitor that a particular query was taking over a minute to complete, and causing a lot of disk read activity:
SELECT event_source FROM event_log WHERE event_source=50220

Searching in the code led me to the CleanDeletedObjects function in the housekeeping module of the server. The reason it's so expensive to run is that SQL Server has to scan all the records in the event log (in my case, about 8 million of them for the default 90 days) to check whether that particular event_source is used somewhere, as it's not in the index. Some things can probably be optimized here.

First off, it's searching for the object ID of an interface in the event log, while the event source is always a node ID, if I'm not mistaken. Could a filter be added based on the object_class field of the deleted_objects table?

Secondly, if a record does exist for that particular event source, all the remaining records are still processed. It would seem this can be resolved by using the EXISTS condition, which returns as soon as one record is found:
e.g. IF EXISTS (SELECT event_source FROM event_log WHERE event_source=50220) SELECT 1 ELSE SELECT 0
http://msdn.microsoft.com/en-us/library/ms188336.aspx
However, I'm not sure if this keyword is supported in all of NetXMS' supported DB environments.
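To illustrate the idea in a form that can be tried without a NetXMS database, here is a small sketch using Python's built-in sqlite3 module. SQLite does not support the T-SQL IF EXISTS statement, so it uses the SELECT EXISTS (...) form instead. The table is a cut-down, made-up version of event_log and the helper name is hypothetical.

```python
import sqlite3

# Cut-down stand-in for event_log; only the columns needed for the check.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log ("
    "event_id INTEGER PRIMARY KEY, event_source INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO event_log (event_source) VALUES (?)",
    [(50220 if i % 1000 == 0 else i,) for i in range(10000)],
)


def source_in_use(conn, object_id):
    """Return True if at least one event refers to object_id.

    EXISTS yields a single 0/1 row and lets the engine stop at the
    first match instead of returning every matching record.
    """
    row = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM event_log WHERE event_source = ?)",
        (object_id,),
    ).fetchone()
    return bool(row[0])


print(source_in_use(conn, 50220))
print(source_in_use(conn, 999999))
```

Note that EXISTS only saves work when a matching record is found. For an ID with no events at all, the whole table still has to be scanned unless event_source is indexed.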

Thirdly, I'm seeing the query repeating multiple times for the same object ID. Could it be timing out, and trying over and over to delete the same object? My deleted_objects table is at 2722 records, so it does seem like something's going wrong.

Whether this is related to our crashes I don't know, but it might be a good performance improvement for anyone with an environment large enough for it to be a problem. In the meantime I have reduced my event log size to 14 days to see if it alleviates some of the stress on the SQL server.
#20
Feature Requests / per-user visibility of Alarms
October 05, 2012, 03:17:10 PM
Hi,

Would it be possible to define a specific group of users that can view an alarm in the NetXMS console/web interface?
This would allow a single NetXMS server to be more easily used by multiple people, but with different interests on the same nodes.

Perhaps even some sort of security settings for alarms, so you can select who can view/acknowledge/resolve/terminate alarms coming from a specific event processing rule? These permissions could then be ANDed with the existing permission set for alarms at node level.