[18476348.206574] netxmsd[1485]: segfault at 10680c2c0 ip 00007f61b406a050 sp 00007f6164180978 error 4 in libc-2.13.so[7f61b3f44000+181000]
[19513061.150308] netxmsd[19644]: segfault at f563b5e0 ip 00007ff538c4c9f0 sp 00007ff4deeed978 error 4 in libc-2.13.so[7ff538b24000+184000]
[19677206.312203] netxmsd[867]: segfault at b6b7cd50 ip 00007f92c1d499f0 sp 00007f927ca89978 error 4 in libc-2.13.so[7f92c1c21000+184000]
[20116663.843012] netxmsd[30912]: segfault at e7a81c78 ip 00007f2c047679f0 sp 00007f2bb8685978 error 4 in libc-2.13.so[7f2c0463f000+184000]
For some time now, netxmsd has been crashing every few days, as shown above. Everything else on this machine works fine. Any ideas?
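A note on reading the kernel lines above: on x86 Linux the `error` field is the page-fault error code (printed in hex), and subtracting the mapping base in the square brackets from `ip` gives the offset of the faulting instruction inside the library. The following is a minimal sketch of that arithmetic, not a NetXMS tool; the function name `parse_segfault` and the returned keys are made up for illustration.

```python
import re

# Matches the x86 kernel "segfault at ..." message format seen in the log lines above.
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_segfault(line):
    """Decode one kernel segfault line (hypothetical helper, for illustration only)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)
    info = {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        # x86 page-fault error code bits:
        # bit 0: 0 = page not present, 1 = protection violation
        # bit 1: 0 = read access,      1 = write access
        # bit 2: 0 = kernel mode,      1 = user mode
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
        "library": m.group("lib"),
        "offset": None,
    }
    if m.group("base") is not None:
        # Offset of the faulting instruction relative to the start of the mapping.
        info["offset"] = int(m.group("ip"), 16) - int(m.group("base"), 16)
    return info


if __name__ == "__main__":
    sample = (
        "[19513061.150308] netxmsd[19644]: segfault at f563b5e0 "
        "ip 00007ff538c4c9f0 sp 00007ff4deeed978 error 4 "
        "in libc-2.13.so[7ff538b24000+184000]"
    )
    result = parse_segfault(sample)
    print(result["library"], hex(result["offset"]), result)
```

Applied to the four lines above, `error 4` decodes as a user-mode read of an unmapped page, and the last three crashes share the same offset into libc (0x1289f0), which suggests the same libc routine is failing each time. The first line comes from a mapping of a different size, so its offset (0x126050) is not directly comparable.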
Hi.
Can you run NetXMS under gdb?
Also, after a crash, please provide information about the OS and the NetXMS version and, if it is the server that crashes, about the database.
Thank you!
OK, it's running under gdb. What should I do if a crash occurs? Anything other than bt?
bt will be enough.
Best regards,
Victor
Happened sooner than I expected.
[New Thread 0x7fffb4ccc700 (LWP 10037)]
[New Thread 0x7fffaad2d700 (LWP 10067)]
[New Thread 0x7fffaac2c700 (LWP 10095)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffaf777700 (LWP 25348)]
0x00007ffff43039f0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff43039f0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff68d78b7 in AbstractMessageReceiver::readMessage(unsigned int, MessageReceiverResult*) () from /usr/lib/x86_64-linux-gnu/libnetxms.so.2
#2 0x00007ffff7aad0c7 in ClientSession::readThread() () from /usr/lib/x86_64-linux-gnu/libnxcore.so.2
#3 0x00007ffff7aad8d9 in ClientSession::readThreadStarter(void*) () from /usr/lib/x86_64-linux-gnu/libnxcore.so.2
#4 0x00007ffff5106b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007ffff42b930d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x0000000000000000 in ?? ()
(gdb)
Could you please run it again under the debugger (with the debug level set to 6), and when it crashes, run the following:
bt
frame 1
print *this
frame 2
print *this
and send the output as well as the last part of the log file (the last few kilobytes).
Best regards,
Victor
I kept gdb running after the last crash and tried running the rest of the commands:
(gdb) frame 1
#1 0x00007ffff68d78b7 in AbstractMessageReceiver::readMessage(unsigned int, MessageReceiverResult*) () from /usr/lib/x86_64-linux-gnu/libnetxms.so.2
(gdb) print *this
No symbol table is loaded. Use the "file" command.
(gdb) frame 2
#2 0x00007ffff7aad0c7 in ClientSession::readThread() () from /usr/lib/x86_64-linux-gnu/libnxcore.so.2
(gdb) print *this
No symbol table is loaded. Use the "file" command.
I've now restarted, as per your instructions.
Did you build the binaries yourself, or are you using the deb packages?
Best regards,
Victor
I'm using deb packages.
...and it crashed again.
[New Thread 0x7fffa78f8700 (LWP 11125)]
[New Thread 0x7fffa7fff700 (LWP 11130)]
[New Thread 0x7fff9c040700 (LWP 11132)]
[New Thread 0x7fff9fb7b700 (LWP 11139)]
[09-Mar-2016 10:36:52.579] [DEBUG] [CLSN-1] Received message CMD_GET_SERVER_INFO
[09-Mar-2016 10:36:52.579] [DEBUG] [CLSN-1] Server time zone: CET+01CEST
[09-Mar-2016 10:36:52.579] [DEBUG] [CLSN-1] Sending message CMD_REQUEST_COMPLETED
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff9b535700 (LWP 12974)]
0x00007ffff43039f0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff43039f0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff68d78b7 in AbstractMessageReceiver::readMessage(unsigned int, MessageReceiverResult*) () from /usr/lib/x86_64-linux-gnu/libnetxms.so.2
#2 0x00007ffff7aad0c7 in ClientSession::readThread() () from /usr/lib/x86_64-linux-gnu/libnxcore.so.2
#3 0x00007ffff7aad8d9 in ClientSession::readThreadStarter(void*) () from /usr/lib/x86_64-linux-gnu/libnxcore.so.2
#4 0x00007ffff5106b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007ffff42b930d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x0000000000000000 in ?? ()
Same problem with other commands.
btw.
Reading symbols from /usr/lib/x86_64-linux-gnu/libnetxms.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libnxcore.so.2...(no debugging symbols found)...done.
You can install the debug packages, which will provide the debug symbols.
netxms-server-dbg, netxms-agent-dbg, etc.
I can't find these debug packages anywhere, and a plain apt-get install doesn't work. :(
What version of NetXMS are you running?
Are you using the beta or the main apt repo?
Install netxms-base-dbg and netxms-server-dbg
Sure, I would -- but where do I get them?
I'm using the beta repo, as per:
https://www.netxms.org/download/
"Debian packages
Both server and agent packages for wheezy are available in our repository.
1. Add the repository to your sources.list:
deb http://packages.netxms.org/debian wheezy beta"
# dpkg -l | grep netxms
ii netxms-agent:amd64 2.0-RC2-2 amd64 NetXMS agent
ii netxms-base:amd64 2.0-RC2-2 amd64 NetXMS core libraries
ii netxms-dbdrv-pgsql:amd64 2.0-RC2-2 amd64 PostgreSQL driver for netxms-server
ii netxms-dbdrv-sqlite3:amd64 2.0-RC2-2 amd64 SQLite3 driver for netxms-server
ii netxms-server:amd64 2.0-RC2-2 amd64 meta package
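As a quick check of a listing like the one above, the following sketch scans `dpkg -l` output and reports which of the debug packages named earlier in the thread (netxms-base-dbg and netxms-server-dbg) are not installed. The function name `missing_debug_packages` is hypothetical, and the code assumes the usual whitespace-separated `dpkg -l` row layout (status, name with optional `:arch` suffix, version, ...).

```python
def missing_debug_packages(dpkg_output,
                           wanted=("netxms-base-dbg", "netxms-server-dbg")):
    """Return the wanted packages that do not appear as installed ("ii") rows.

    Hypothetical helper for illustration; `wanted` defaults to the two
    debug packages suggested in this thread.
    """
    installed = set()
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Expected row layout: status, name[:arch], version, arch, description...
        if len(fields) >= 3 and fields[0] == "ii":
            installed.add(fields[1].split(":")[0])
    return [pkg for pkg in wanted if pkg not in installed]


if __name__ == "__main__":
    listing = """\
ii netxms-agent:amd64 2.0-RC2-2 amd64 NetXMS agent
ii netxms-base:amd64 2.0-RC2-2 amd64 NetXMS core libraries
ii netxms-server:amd64 2.0-RC2-2 amd64 meta package
"""
    print(missing_debug_packages(listing))
```

An empty result would mean both debug packages are installed, so gdb should then be able to load symbols for libnetxms and libnxcore.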
Add "main" as well, not only "beta".
Fine. Will do; please fix the website :)
I'm now running 2.0.2, will see if the problem persists. Should netxms crash again, I'll install debug packages and run netxmsd under gdb again.
There's still no mention of main, only beta, on https://www.netxms.org/download/
2.0.2 has been stable so far. Thanks for your help and the repository suggestion. :)