libc segfault crash

Started by ofca, March 08, 2016, 11:50:16 AM

ofca

[18476348.206574] netxmsd[1485]: segfault at 10680c2c0 ip 00007f61b406a050 sp 00007f6164180978 error 4 in libc-2.13.so[7f61b3f44000+181000]
[19513061.150308] netxmsd[19644]: segfault at f563b5e0 ip 00007ff538c4c9f0 sp 00007ff4deeed978 error 4 in libc-2.13.so[7ff538b24000+184000]
[19677206.312203] netxmsd[867]: segfault at b6b7cd50 ip 00007f92c1d499f0 sp 00007f927ca89978 error 4 in libc-2.13.so[7f92c1c21000+184000]
[20116663.843012] netxmsd[30912]: segfault at e7a81c78 ip 00007f2c047679f0 sp 00007f2bb8685978 error 4 in libc-2.13.so[7f2c0463f000+184000]

For some time now, netxmsd has been crashing every few days with segfaults like the ones above. Everything else on this machine works fine. Any ideas?
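The kernel lines above already narrow things down a little. Below is a minimal Python sketch that decodes one such line. It assumes the usual x86 Linux format, in which the numbers are hexadecimal and the low three bits of the error code describe the page fault. The function name `decode_segfault` and the returned keys are made up for illustration. Subtracting the mapping base from `ip` gives the offset of the faulting instruction inside the library, which can be compared across crashes.

```python
import re

# Matches the x86 Linux kernel "segfault at ..." line format quoted above.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing one kernel segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "program": m.group("prog"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "library": m.group("lib"),
        # Offset of the faulting instruction inside the mapped library.
        "offset": ip - base,
        # x86 page fault error code bits: 0 = present, 1 = write, 2 = user mode.
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    sample = (
        "[19513061.150308] netxmsd[19644]: segfault at f563b5e0 "
        "ip 00007ff538c4c9f0 sp 00007ff4deeed978 error 4 "
        "in libc-2.13.so[7ff538b24000+184000]"
    )
    info = decode_segfault(sample)
    print(info["library"], hex(info["offset"]), info["access"], info["user_mode"])
```

Run against the four lines above, this reports error 4 as a user-mode read of an unmapped page, and the last three crashes land at the same libc offset (0x1289f0). The first one is at 0x126050, in a libc mapping of a different size.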

Tatjana Dubrovica

Hi.

Can you run NetXMS under gdb?

Also, when reporting a crash, please provide the OS and the NetXMS version, and, if it is the server that crashes, information about the database.

Thank you!

ofca

OK, it's running under gdb. What should I do if a crash occurs? Anything other than bt?

Victor Kirhenshtein

bt will be enough.

Best regards,
Victor

ofca

Happened sooner than I expected.

[New Thread 0x7fffb4ccc700 (LWP 10037)]
[New Thread 0x7fffaad2d700 (LWP 10067)]
[New Thread 0x7fffaac2c700 (LWP 10095)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffaf777700 (LWP 25348)]
0x00007ffff43039f0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff43039f0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff68d78b7 in AbstractMessageReceiver::readMessage(unsigned int, MessageReceiverResult*) () from /usr/lib/x86_64-linux-gnu/libnetxms.so.2
#2  0x00007ffff7aad0c7 in ClientSession::readThread() () from /usr/lib/x86_64-linux-gnu/libnxcore.so.2
#3  0x00007ffff7aad8d9 in ClientSession::readThreadStarter(void*) () from /usr/lib/x86_64-linux-gnu/libnxcore.so.2
#4  0x00007ffff5106b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff42b930d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()
(gdb)

Victor Kirhenshtein

Could you please run it again under the debugger (with the debug level set to 6), and when it crashes, run the following:

bt
frame 1
print *this
frame 2
print *this

and send the output along with the last few kilobytes of the log file.

Best regards,
Victor

ofca

I kept gdb running after the last crash and tried running the rest of the commands:

(gdb) frame 1
#1  0x00007ffff68d78b7 in AbstractMessageReceiver::readMessage(unsigned int, MessageReceiverResult*) () from /usr/lib/x86_64-linux-gnu/libnetxms.so.2
(gdb) print *this
No symbol table is loaded.  Use the "file" command.
(gdb) frame 2
#2  0x00007ffff7aad0c7 in ClientSession::readThread() () from /usr/lib/x86_64-linux-gnu/libnxcore.so.2
(gdb) print *this
No symbol table is loaded.  Use the "file" command.

I've now restarted, as per your instructions.

Victor Kirhenshtein

Did you build the binaries yourself, or are you using the deb packages?

Best regards,
Victor

ofca

...and it crashed again.

[New Thread 0x7fffa78f8700 (LWP 11125)]
[New Thread 0x7fffa7fff700 (LWP 11130)]
[New Thread 0x7fff9c040700 (LWP 11132)]
[New Thread 0x7fff9fb7b700 (LWP 11139)]
[09-Mar-2016 10:36:52.579] [DEBUG] [CLSN-1] Received message CMD_GET_SERVER_INFO
[09-Mar-2016 10:36:52.579] [DEBUG] [CLSN-1] Server time zone: CET+01CEST
[09-Mar-2016 10:36:52.579] [DEBUG] [CLSN-1] Sending message CMD_REQUEST_COMPLETED

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff9b535700 (LWP 12974)]
0x00007ffff43039f0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff43039f0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff68d78b7 in AbstractMessageReceiver::readMessage(unsigned int, MessageReceiverResult*) () from /usr/lib/x86_64-linux-gnu/libnetxms.so.2
#2  0x00007ffff7aad0c7 in ClientSession::readThread() () from /usr/lib/x86_64-linux-gnu/libnxcore.so.2
#3  0x00007ffff7aad8d9 in ClientSession::readThreadStarter(void*) () from /usr/lib/x86_64-linux-gnu/libnxcore.so.2
#4  0x00007ffff5106b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff42b930d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Same problem with the other commands.

ofca

btw.
Reading symbols from /usr/lib/x86_64-linux-gnu/libnetxms.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libnxcore.so.2...(no debugging symbols found)...done.
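When gdb loads many shared libraries, these "Reading symbols" lines are easy to miss in the startup output. A small Python sketch like the one below can pull out every library gdb reported as having no debugging symbols. It assumes the output has been saved as text and uses the same wording as the two lines above. The helper name `libs_without_symbols` is hypothetical.

```python
import re

# Matches gdb lines of the form:
#   Reading symbols from <path>...(no debugging symbols found)...done.
NO_SYMBOLS_RE = re.compile(
    r"Reading symbols from (\S+?)\.\.\.\(no debugging symbols found\)"
)


def libs_without_symbols(gdb_output):
    """Return the paths gdb reported as lacking debugging symbols, in order."""
    return NO_SYMBOLS_RE.findall(gdb_output)


if __name__ == "__main__":
    sample = (
        "Reading symbols from /usr/lib/x86_64-linux-gnu/libnetxms.so.2..."
        "(no debugging symbols found)...done.\n"
        "Reading symbols from /usr/lib/x86_64-linux-gnu/libnxcore.so.2..."
        "(no debugging symbols found)...done.\n"
    )
    for path in libs_without_symbols(sample):
        print(path)
```

Any NetXMS library on that list explains why `print *this` fails in the corresponding frame.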

tomaskir

You can install the debug packages, which provide the debug symbols:
netxms-server-dbg, netxms-agent-dbg, etc.

ofca

I can't find these debug packages anywhere, and a plain apt-get install doesn't work. :(

tomaskir

What version of NetXMS are you running?

Are you using the beta or the main apt repo?

Alex Kirhenshtein

Install netxms-base-dbg and netxms-server-dbg