Problem with centralized agent upgrade

Started by npoljak, June 26, 2014, 04:09:42 PM

npoljak

Hi,

We are having problems with the centralized agent upgrade (NetXMS Package Manager, "Deploy to managed nodes" option) for all agents.
The NetXMS server is running on Windows Server 2012 R2 Standard Edition; the agents are running on various versions of Windows Server.
Our NetXMS server was running version 1.2.13 and was upgraded to version 1.2.14.
All agents are on version 1.2.13.
We have the correct agent package and .NPI file.
When I run the "Deploy to managed nodes" option, the file is uploaded to the agent, the installation status hangs, and after 10 minutes the upgrade process fails with the following error:
Deploy agent package has encountered a problem.
Cannot start package deployment: Request timed out

The agent is left stopped and has to be started manually.
We tried the same upgrade scenario in our test environment and it works fine.
Upgrading the agents manually also works without problems.
Could someone advise on the best way to troubleshoot this issue?

Thanks for the help
Nikola


Dani@M3T

I may be seeing the same problem on 64-bit Windows nodes (Win2012 R2, Win7, Win8.1).
When I start an agent upgrade from Package Manager in the NetXMS console (V1.2.15 to V1.2.16), this happens:

  • Agent is stopped on the node
  • New install package is uploaded to the node
  • Windows Event Log: source RestartManager, ID 10006: Application or service 'nxagentd' could not be shut down
  • Package Deployment Monitor (after a few minutes): Cannot start package deployment: Request timed out
  • Agent is not updated and not started again
  • On 32-bit WinXP there is no problem

In the agent log (DebugLevel=9):
[INFO ] Watchdog process stopped
[WARN ] Communication session broken: A request to send or receive data was disallowed because the socket had already been shut down in that direction with a previous shutdown call.
[INFO ] WINPERF: Collector thread for counter set B terminated
[INFO ] WINPERF: Collector thread for counter set C terminated
[INFO ] WINPERF: Collector thread for counter set A terminated
[INFO ] NetXMS Agent stopped


For now I have to update the agents manually.

lindeamon

Hi Victor,

They have the same problem that I have had for the past few versions.
The agent downloads the respective agent package and stops the current agent, but the installation fails.
Here is my post:
https://www.netxms.org/forum/general-support/agent-deployment-2948/msg13347/#msg13347

Best Regards,
Lindeamon

Dani@M3T

I also tried with the virus scanner disabled; same result.
With Sysinternals Process Monitor I can see the update process start (nxagent-1.2.16-x64.exe and nxagent-1.2.16-x64.tmp).
Maybe the attached Process Monitor trace (in CSV format) can help you with troubleshooting.


Victor Kirhenshtein

According to the trace, the installer was unable to overwrite some files because of a sharing violation: the system didn't allow overwriting them because they were already open (presumably by some process). It looks like nxagentd.exe didn't exit before the upgrade started. I'll try to reproduce this in my network.

Best regards,
Victor

Dani@M3T

I didn't see any file locks on NetXMS files. But when I set "EnableWatchdog = no" the update works!
Maybe the watchdog process has not yet stopped when the files are supposed to be replaced. Is the second process in the attached screenshot the watchdog process? When I check the running processes after the update has failed, neither of these two processes is still running.
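
For reference, the workaround described above comes down to a single line in the agent configuration file. The fragment below is only a sketch: the option name and value are taken from this post, while the file name (commonly nxagentd.conf) and its location are assumptions that depend on the installation. Note that this switches the watchdog off entirely, so it is a workaround rather than a fix.

```ini
# Agent configuration file (assumed to be nxagentd.conf)
# Workaround from this thread: disable the watchdog process so that
# it cannot still be running when the installer replaces the files.
EnableWatchdog = no
```

The agent normally has to be restarted for a configuration change to take effect.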

Victor Kirhenshtein

It makes sense. Theoretically it is possible that the main process terminates before the watchdog process has completely stopped, so the files are still locked when the installer starts copying new files. I've changed the agent so that the main process waits until the watchdog process terminates and only then exits.
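
The shutdown ordering described above can be illustrated with a minimal Python sketch. This is not the actual NetXMS agent code; the child process here is a hypothetical stand-in for the watchdog, and the function names and the 30-second timeout are assumptions chosen for illustration. The point is only the ordering: the parent signals the child to stop, then blocks until the child has really exited, and only then returns.

```python
import subprocess
import sys


def start_watchdog():
    """Start a stand-in 'watchdog' child that runs until its stdin is closed."""
    # Hypothetical stand-in: the real watchdog is part of nxagentd.
    return subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdin.read()"],
        stdin=subprocess.PIPE,
    )


def shutdown(watchdog, timeout=30.0):
    """Ask the watchdog to stop, then block until it has really exited."""
    watchdog.stdin.close()  # closing the pipe is the "please stop" signal
    try:
        watchdog.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Last resort: kill the child so its file handles are released.
        watchdog.kill()
        watchdog.wait()
    return watchdog.returncode


if __name__ == "__main__":
    wd = start_watchdog()
    code = shutdown(wd)
    # Only at this point is it safe for the parent to exit and for an
    # installer to start replacing files.
    print("watchdog exited with", code)
```

If the parent exited right after sending the stop signal instead of waiting, the child could still hold files open for a short time, which matches the race suspected in this thread.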

Was the watchdog enabled in all cases of failed upgrades?

Best regards,
Victor

Dani@M3T

I think so, but I'm not 100% certain (95% :-)