Crash Dumps on Windows Server 1.2.5

Started by aron, February 06, 2013, 02:03:11 PM


aron

Hello

Unfortunately, I have experienced a few crashes since upgrading to 1.2.5. The crashes have not shown any clear cause and effect. Please find attached the .info files from the crash dumps. I am unsure of the content of the .mdmp files and whether there would be any security implications in posting them publicly.

Kind regards

Aron

Victor Kirhenshtein

Hi!

The stack traces from the different .info files seem unrelated to each other and don't give enough information. Can you please run the server with debug level 6 (set the LogFile parameter in netxmsd.conf to some file) and, after the next crash, send me the .info file together with the log file? You can send them to my email to avoid posting sensitive information on the forum.

Best regards,
Victor

bdefloo

Hi,

We've been having crashes similar to these for quite some time (from v1.2.0 up to v1.2.5) on Windows 2003 x86.
I'm curious whether we're having the same problem. Are there multiple events from the NetXMS Agent in the event log, starting minutes to hours before the actual crash, with the message:
Communication session broken: An existing connection was forcibly closed by the remote host.
Or does the server act strangely before crashing (slow response, node statuses changing to unknown, failing to show graphs/history, ...)?
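
If it helps, here is a rough sketch of how one might check for those events in the hours before a crash. It assumes the event log has been exported to plain text with each line starting with a "YYYY-MM-DD HH:MM:SS" timestamp. That format, the function name events_before_crash, and the sample lines are all made up for illustration and are not anything NetXMS or Windows produces as-is.

```python
from datetime import datetime, timedelta

# Text to look for in each exported event line.
MESSAGE = "Communication session broken"


def events_before_crash(lines, crash_time, window=timedelta(hours=6)):
    """Return timestamps of matching events within `window` before crash_time.

    Assumes (hypothetically) that each line starts with a
    'YYYY-MM-DD HH:MM:SS' timestamp; lines that do not are skipped.
    """
    hits = []
    for line in lines:
        if MESSAGE not in line:
            continue
        try:
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # no parseable timestamp at the start of the line
        if crash_time - window <= stamp <= crash_time:
            hits.append(stamp)
    return hits


# Made-up sample data, only to show how the function is called.
sample = [
    "2013-02-05 01:00:00 NetXMS Agent: Communication session broken: "
    "An existing connection was forcibly closed by the remote host.",
    "2013-02-06 08:15:02 NetXMS Agent: Communication session broken: "
    "An existing connection was forcibly closed by the remote host.",
    "2013-02-06 09:40:10 NetXMS Agent: Communication session broken: "
    "An existing connection was forcibly closed by the remote host.",
    "2013-02-06 09:45:00 Some unrelated event",
]
crash = datetime(2013, 2, 6, 10, 0, 0)
hits = events_before_crash(sample, crash)
print(len(hits), "matching event(s) in the 6 hours before the crash")
```

Several hits clustered shortly before the crash time would suggest the same pattern we are seeing; the 6-hour window is an arbitrary choice and can be widened.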

In our case, I've never been able to pick anything useful out of the server log file or the crash dumps, possibly because the logs fill up very quickly and because memory corruption can take place long before the server actually tries to use the corrupted memory and crashes.

How often does your server crash? The rate at which the problem occurs seems tied to queue lengths and memory usage in general. When we had large database writer queue lengths (> 100,000), the server was crashing 2-3 times a day, but after improving database performance somewhat (more database writers, etc.), it's now down to once every 2-3 weeks.

I'm also hoping
https://www.netxms.org/forum/general-support/out-of-memory-netxms-v1-2-5/
is related to this problem, as on Linux there's a chance that Valgrind might catch the culprit.

Kind regards,
bdefloo