Occasional issues with NetXMS Agents after service restart

Started by Tursiops, January 22, 2018, 04:50:31 AM


Tursiops

Hi,

On Windows, I am at times seeing an issue where the NetXMS service will not restart (either after an upgrade or a manual restart attempt to load a new configuration).
Checking the logs, I see this is generally due to the NetXMS agent port still being in use by PowerShell (which we use extensively for Parameters, Actions and ParametersProviders). The actual NetXMS agent service is stopped and its process is not running, but a PowerShell process which was started by NetXMS is still active and for some reason managed to hold the agent's network port open. The system does not recover from this by itself; we have to go in manually and kill powershell.exe, after which the NetXMS service will start again.
Not sure if this is an issue where the NetXMS agent either does not clean up all external commands (properly) prior to shutdown or if at some point it just loses track of one of those commands, which is then never killed and just keeps sitting there.
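
For anyone who wants to detect this state automatically before attempting a restart, here is a minimal sketch in Python. It only tests whether a TCP listener can currently be bound to the agent port; it does not identify or kill the process holding it. The port number 4700 is assumed to be the agent's listen port (adjust it if your agent is configured differently), and `port_is_free` is a hypothetical helper name, not part of NetXMS.

```python
import socket

# Assumption: the agent listens on TCP 4700; change this if your config differs.
AGENT_PORT = 4700


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Bind failed: something (for example a leftover child process)
            # still holds the port, or we lack permission to bind it.
            return False
    return True


if __name__ == "__main__":
    if port_is_free(AGENT_PORT):
        print(f"Port {AGENT_PORT} is free; the agent should be able to start.")
    else:
        print(f"Port {AGENT_PORT} is still in use; check for leftover processes.")
```

Run it while the agent service is stopped. If it reports the port as in use, a leftover process is a likely suspect.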

We've had this issue for a while and it appears to impact maybe a handful of systems a week (I have two systems which seem to have this problem pretty much every week; other systems appear to be more random). Considering we have 2000+ agents, it's certainly not a huge issue, but it is quite a nuisance to pick up on and resolve (with some systems which we do not expect to be on all the time, this problem may be ongoing and we wouldn't even notice).

I am pretty sure we had this problem prior to starting to implement ParametersProviders, and we are not regularly running actions against the two recurring systems, so I have to assume it's just an external parameter problem.

Has anyone else encountered a similar problem?

Cheers


Filipp Sudanov

Quote from: nichky on July 30, 2025, 12:41:16 PM
have u resolv it?

Looks like the original post was 7 years ago. Since then, external process execution has been reworked; there is a mechanism that should terminate external processes on timeout, but I am not sure whether this also happens during shutdown.

Are you experiencing some issues now? With what NetXMS version?

nichky

Hi Filipp


I think that I got that sorted.

Would you be able to help with connecting via SSH?

My device uses a non-default port for SSH. Even though I have specified it in NetXMS, when I choose to connect via SSH from NetXMS it still looks for the default port, 22.

How can I change that?

nichky

Also, from time to time I'm getting timeouts.
Are you aware that NetXMS has a lot of issues?

cbwecomm

I think you're being a little harsh... NetXMS is the only product heavily relying on Java that I don't hate working with, haha (seriously, Java sucks almost 100% of the time).

NetXMS is not an "out of the box" product. I disagree somewhat with a lot of the marketing and sales stuff on this... It CAN be that easy, but 99% of networks (especially complex ones) are going to take a lot of effort.
BUT that's why we're using it: the complexity of the network demands a complex and flexible product.

If you want something easy that you can install and have working in a few minutes on Windows, PRTG is a common favorite.  

NetXMS is the best, but be ready to put in some time to configure, and there's a LOT to learn about, tweak, configure, fine tune, etc.   

Regarding your timeout issues, this could be dozens of things, and most of them have nothing to do with NetXMS. You could have network issues, driver issues, cabling issues, Wi-Fi issues, etc. You'll have to troubleshoot through your various configurations and setup and make sure you don't have any connectivity issues to the server you're trying to work with.
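
As a starting point for that kind of troubleshooting, here is a rough sketch of a plain TCP reachability check in Python. It is independent of NetXMS and only distinguishes "open", "refused", and "timeout" at the TCP level. The host name and port in the usage part are placeholders, and `tcp_check` is a hypothetical helper name.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and return 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within the timeout: often a firewall drop or a routing problem.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or unreachable network.
        return f"error: {exc}"


if __name__ == "__main__":
    # Placeholder values: replace with your own server address and port.
    print(tcp_check("netxms.example.local", 4701))
```

A "refused" result usually points at the service not running or not listening, while a "timeout" more often points at something in the network path.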

nichky

Thanks. From the looks of it, maybe I've been a bit harsh. However, I do agree that NetXMS is a very good product.

What I've found is that most of the issues I experienced — like timeouts and connection refused errors — were resolved simply by stopping and starting the NetXMS core service. In most cases, I have to start it manually.

Maybe Windows isn't the best platform to run it on.

I've been trying to get it running in a MikroTik container (the container is running, but I can't log into it). Unfortunately, I can't seem to get it working properly, and I'm not sure if there's a proper tutorial out there for doing this. Maybe we need to open a new topic.

nichky

One thing which I really don't understand is that SNMP works, as seen by poll -> status.

However, the graph doesn't work.
Can you help with that?