Hey all,
we are currently evaluating multiple monitoring solutions with demo instances.
Today a problem arose with our NetXMS server (Ubuntu 16.04 LTS).
First we tried to log in with the client and it refused our connection; the same happened with the web interface.
Then we noticed that the server had stopped listening on port 4701. Setting the port manually in the config didn't change anything.
After consulting Google and trying suggestions from some older forum posts, we are at our wits' end.
The connection to the database works smoothly, and in debug mode the server runs many queries successfully, but it shows no errors explaining why it isn't listening on port 4701.
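For anyone wanting to reproduce the "not listening" check without the client, here is a minimal Python sketch that simply tries to open a TCP connection. The function name `is_port_open` and the host 127.0.0.1 are assumptions for illustration; 4701 is the port mentioned above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts connections on that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (host is an assumption, run on the NetXMS server itself):
#   is_port_open("127.0.0.1", 4701)
```

A result of False only says that nothing accepted the connection; it does not say why the server is not listening.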
			
			
			
				There is a stale lock in the database.
Make sure that netxmsd is not running and remove it with "nxdbmgr unlock".
You can also run "nxdbmgr check" for consistency check.
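The two steps above could be scripted roughly as follows. This is only a sketch: the helper name `unlock_and_check` is made up, it assumes `pgrep` and `nxdbmgr` are on the PATH, and it assumes the server process is named `netxmsd`. The commands are run without extra flags, so if `nxdbmgr` asks for confirmation it will do so on the terminal as usual.

```python
import subprocess


def unlock_and_check(run=subprocess.run):
    """Run 'nxdbmgr unlock' and 'nxdbmgr check', but only if netxmsd is stopped.

    'run' is injectable so the sequence can be exercised without NetXMS installed.
    """
    # pgrep exits with status 0 when at least one process has exactly this name.
    if run(["pgrep", "-x", "netxmsd"]).returncode == 0:
        raise RuntimeError("netxmsd is still running; stop it first")
    for args in (["nxdbmgr", "unlock"], ["nxdbmgr", "check"]):
        # check=True raises CalledProcessError if the command fails.
        run(args, check=True)
```

The guard is there because removing the lock while the daemon is running would defeat the purpose of the lock.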
			
			
			
				Hey Alex,
thanks for the tip. I tried it as you instructed, but still get the same error.
Any other ideas why this could be happening?
			
			
			
				The DB lock is there to prevent multiple instances of the server from corrupting data integrity. If the daemon is not running but the lock is there, that usually means that it either crashed or the database was shut down first.
In your case, is netxmsd not starting at all, or does it crash at some point?
			
			
			
				Hey Alex,
I watched it and saw something interesting.
The moment I start the netxms service, it takes 96% of one core and its memory use grows until it has taken nearly everything that is left.
We currently have 1 GB of RAM in the virtual machine. I'm going to ask one of the ops people to increase it and see if that fixes the issue.
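To put numbers on the memory growth described above, one option on Linux is to read the resident set size from `/proc/<pid>/status`. A small sketch, with hypothetical helper names `rss_kib` and `read_rss_kib`; the PID has to be looked up separately.

```python
import re
from typing import Optional


def rss_kib(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def read_rss_kib(pid: int) -> Optional[int]:
    """Read the current resident memory of a running process by PID."""
    with open(f"/proc/{pid}/status") as f:
        return rss_kib(f.read())
```

Calling `read_rss_kib` every few seconds while the service starts would show how fast memory is being consumed.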
			
			
			
				Yes, I'd recommend at least 2 GB of memory when the database is on the same server.
How many servers do you have configured in NetXMS?
			
			
			
				Also check the output of "dmesg" - most likely there will be records like "Out of memory: Kill process 29957 (netxmsd) score 366 or sacrifice child"
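If the dmesg output is long, records like the one quoted above can be filtered out with a small script. This is a sketch; the name `find_oom_kills` is made up, and the pattern also accepts the "Killed process" wording, which is an assumption about how other kernel versions phrase the message.

```python
import re

# Matches e.g. "Out of memory: Kill process 29957 (netxmsd) score 366 ..."
OOM_RE = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(dmesg_text: str):
    """Return a list of (pid, process_name) for every OOM-killer record found."""
    return [(int(pid), name) for pid, name in OOM_RE.findall(dmesg_text)]


# Example call on live output:
#   import subprocess
#   out = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
#   print(find_oom_kills(out))
```

An empty list means no such record was found in the text passed in, which is not proof that memory was never the problem.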
			
			
			
				Hey Alex,
we did expand the RAM, but sadly there was no real improvement.
There were also no messages in dmesg.
I emptied the database and reinitialized it today. Everything seems to work fine now.
Is it possible that some entry in the database could cause such behaviour?
UPDATE: I investigated further and discovered that netxms runs through quite a lot of queries concerning the table alarm_notes, which is empty in our DB. After emptying the alarm table I was able to log in, so it seems like a misconfiguration on my side. Shortly after login, all the alarms were there again.
UPDATE2: After truncating alarm and alarm_events it seems to work again. It looks like it was just a flood of events netxms couldn't process.