Situation:
In a test environment I run NetXMS version 2.0RC2.
netxmsd: sh sta
Total number of objects: 489
Number of monitored nodes: 68
Number of collectable DCIs: 272
Four dashboards have been configured and are up and running. For most of the day everything works properly: DCI values are polled and displayed correctly.
Issue:
About once a day the server seems to crash. The Windows event viewer shows the error: Thread "Item Poller" does not respond to watchdog thread.
From that moment on, the dashboards are empty and show an error message. Likewise, when I open 'last values', an error message is shown and no values can be displayed. See attached screenshots.
When I view the Windows services, the NetXMS core service still appears to be running. The only way I have found to temporarily resolve the issue is to stop and start the NetXMS core service, which leaves the dashboard graphs without history.
Any suggestions for fully solving this issue?
Thanks in advance.
Kindest regards.
Markus
Definitely upgrade to 2.0.2.
There were many fixes between 2.0-RC2 and 2.0.2.
If the problem still persists on 2.0.2, let us know :)
Ok thanks for the advice and support.
I will upgrade to version 2.0.2 and then further evaluate if this issue is resolved.
Again thanks for your help.
Yesterday I successfully upgraded to 2.0.2 at around 14:00.
Unfortunately the issue (daily server crash, Thread "Item Poller" does not respond to watchdog thread) occurred again 5 hours after the upgrade to version 2.0.2.
* application and services still running, but no DCI polling values
* Error viewing last values for each node, screenshot attached in this post.
* empty dashboard graphs
Any advice on fixing this issue is more than welcome.
Hi,
can you please add the following to netxmsd.conf:
DebugLevel = 7
CreateCrashDumps = yes
Also make sure that LogFile points to a file, not {syslog}, and restart the server. When the server hangs, run
nxadm -c "raise access"
The server process will crash and a dump will be generated. Send the dump file and the log file to us for analysis.
Best regards,
Victor
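To know when the hang started before running nxadm, the log file can be searched for the watchdog message. Below is a minimal Python sketch, not part of NetXMS. The marker text is the message quoted from the Windows event viewer earlier in this thread; that the same text is also written to the file configured as LogFile is an assumption, and the function name find_watchdog_hangs is made up for illustration.

```python
from typing import Iterable, List, Tuple

# Message text as quoted from the Windows event viewer in this thread.
# Assumption: the same text also appears in the server log file.
WATCHDOG_MARKER = "does not respond to watchdog thread"


def find_watchdog_hangs(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (1-based line number, line without trailing newline)
    for every line that contains the watchdog marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if WATCHDOG_MARKER in line:
            hits.append((number, line.rstrip("\r\n")))
    return hits
```

It accepts any iterable of lines, so an open file object can be passed directly, for example `with open(path, errors="replace") as log: hits = find_watchdog_hangs(log)`, where `path` is whatever file LogFile points to.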
Followed the instructions.
The server is running again. Waiting for the next crash; then I will run: nxadm -c "raise access"
I will send the log and dump files afterwards.
The server crashed again: graphs empty and no values. The NetXMS core service still appears to be running.
Executed the command: nxadm -c "raise access"
Dump files were created.
The most recent (zipped) log file is attached to this reply.
The 2 dump files are attached as well.
Awaiting your response.
Thanks in advance.
Kind regards,
Mark
Attached to this reply are the (zipped) log file and dump files from a more recent crash.
I hope this addition to the troubleshooting logs and dump files helps in solving the issue.
Another crash.
More (zipped) log and dump files are attached to this reply.
Again, I hope this addition to the troubleshooting logs and dump files helps in solving the issue.
Thanks in advance.
Kind regards,
Mark
Different location, different network, different NetXMS server (running the new version 2.0.2).
Same issue: daily server crash, Thread "Item Poller" does not respond to watchdog thread, empty graphs, no polling values.
Created a dump file and log files; they are attached to this reply.
Hope you can find some leads within the new information regarding this issue.
This issue has not been solved.
I have done some new troubleshooting.
- Enabled MySQL logging and reviewed the logs.
Maybe someone can tell me what to look for besides the usual error search.
- Changed some server configuration parameters. In one troubleshooting run I increased StatusPollingInterval from 60 seconds to 300 seconds.
In another troubleshooting run I increased PollerThreadPoolMaxSize from 250 to 500, PollerThreadPoolBaseSize from 10 to 20 and NumberOfDataCollectors from 25 to 50.
Unfortunately the server still keeps 'crashing' daily.
Hope there is a solution for solving this issue.
Still crashing.
Added some screenshots to this post showing the situation after the issue occurred.
FYI
For one location I configured the following settings on a NetXMS server:
StatusPollingInterval 600
DefaultDCIPollingInterval 600
ConditionPollingInterval 600
Although these are unrealistic values for daily operations, the server has now been running for more than one day without crashing.
Unfortunately the configuration mentioned in my last post (StatusPollingInterval 600, DefaultDCIPollingInterval 600, ConditionPollingInterval 600) did not last.
The server just logged the error "Thread "Item Poller" does not respond to watchdog thread"; it no longer polls new values and displays empty graphs again.
This issue has been solved by installing the pre-release of the next version.
NetXMS servers have been running for more than a week without any form of crashing.
Thanks Victor and others for your support with this matter.