Segfaults after upgrading to 1.2.17

Started by tuomar, October 27, 2014, 01:02:52 PM

tuomar

Hi,

We are getting segfaults in netxmsd after upgrading from 1.2.16 to 1.2.17.
The crash occurs about 20 minutes after the service starts.

The server is SLES 11 SP3 with kernel 3.0.101-0.40-default x86_64 and all patches installed.


Oct 27 09:01:01 kernel: [25988161.002060] netxmsd[10021]: segfault at 3e4 ip 00007f30b08c4985 sp 00007f309baf9d80 error 4 in libnxcore.so.1.0.0[7f30b0810000+13b000]
Oct 27 09:22:15 kernel: [25989435.606785] netxmsd[10678]: segfault at 3e4 ip 00007fafd01c4985 sp 00007fafbf0afd80 error 4 in libnxcore.so.1.0.0[7fafd0110000+13b000]
Oct 27 09:42:40 kernel: [25990660.489147] netxmsd[11255]: segfault at 3e4 ip 00007f40a695c985 sp 00007f4095ad9d80 error 4 in libnxcore.so.1.0.0[7f40a68a8000+13b000]
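
A quick way to check whether the three crashes are the same bug is to subtract the mapping base (the first value in brackets after libnxcore.so.1.0.0) from the instruction pointer. This gives the offset of the faulting instruction inside the library, which does not depend on where the library was loaded. The sketch below is only an illustration of that arithmetic; `libraryOffset` is a made-up helper name, and the constants are copied from the log lines above.

```cpp
#include <cstdint>

// Offset of the faulting instruction inside the shared library:
// instruction pointer (ip) minus the mapping base printed in brackets.
constexpr std::uint64_t libraryOffset(std::uint64_t ip, std::uint64_t base)
{
    return ip - base;
}

// Values copied from the three kernel log lines; each check is done at compile time.
static_assert(libraryOffset(0x00007f30b08c4985ULL, 0x7f30b0810000ULL) == 0xb4985ULL, "09:01:01 crash");
static_assert(libraryOffset(0x00007fafd01c4985ULL, 0x7fafd0110000ULL) == 0xb4985ULL, "09:22:15 crash");
static_assert(libraryOffset(0x00007f40a695c985ULL, 0x7f40a68a8000ULL) == 0xb4985ULL, "09:42:40 crash");
```

All three crashes land on the same offset (0xb4985), so they most likely come from the same instruction. The fault address is also identical each time (3e4).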


With a debugger (gdb) I get this kind of output:


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f9488359700 (LWP 5152)]
Node::topologyPoll (this=0x7f94908f40e0, pSession=0x0, dwRqId=0, nPoller=62) at node.cpp:5329
5329                if (peerNode->isDown())

(gdb) bt
#0  Node::topologyPoll (this=0x7f94908f40e0, pSession=0x0, dwRqId=0, nPoller=62) at node.cpp:5329
#1  0x00007f9499b07d67 in TopologyPoller (arg=0x3e) at poll.cpp:586
#2  0x00007f9497e8f806 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f94970a3e8d in clone () from /lib64/libc.so.6
#4  0x0000000000000000 in ?? ()
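
Reading the backtrace together with the kernel log: the faulting statement is `peerNode->isDown()`, and the fault address 3e4 is a small offset from zero, which is what a member read through a null pointer usually looks like. The sketch below only illustrates the general guard pattern for that situation. `PeerNode`, `findPeer` and `peerIsDown` are hypothetical names invented for this example; they are not NetXMS classes or functions, and this is not the actual fix.

```cpp
#include <cstdint>
#include <map>

// Hypothetical stand-in for a node object; NOT the NetXMS Node class.
struct PeerNode
{
    bool down;
    bool isDown() const { return down; }
};

// Hypothetical lookup: returns nullptr when no object exists for the given id.
inline const PeerNode* findPeer(const std::map<std::uint32_t, PeerNode>& nodes, std::uint32_t id)
{
    std::map<std::uint32_t, PeerNode>::const_iterator it = nodes.find(id);
    return (it != nodes.end()) ? &it->second : nullptr;
}

// Guarded check: a missing peer is treated as "not down" and never dereferenced.
inline bool peerIsDown(const std::map<std::uint32_t, PeerNode>& nodes, std::uint32_t id)
{
    const PeerNode* peer = findPeer(nodes, id);
    if (peer == nullptr)
        return false;  // peer id is set but the object is gone: skip it
    return peer->isDown();
}
```

Treating a missing peer as "not down" is an arbitrary choice made for the sketch; real code might log the condition and clear the stale reference instead.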


I think the libc version is 2.11.3.

Rgs
TM


tuomar

Hi,

Thank you for the fast reply.

I tried to install the new version (latest snapshot) from the git source, but make stopped with an error.
This is my first time using git, so maybe I am missing something.


make[3]: Leaving directory `/usr/local/src/netxms-git/src/libnxmap'
Making all in libnxsl
make[3]: Entering directory `/usr/local/src/netxms-git/src/libnxsl'
bison -b parser -o parser.tab.cpp -d -t -v parser.y
parser.y:106.32-39: syntax error, unexpected type, expecting string or identifier
make[3]: *** [parser.tab.cpp] Error 1
make[3]: Leaving directory `/usr/local/src/netxms-git/src/libnxsl'
make[2]: *** [all-recursive] Error 1


gcc version is 4.3.4, bison version is 2.3 and flex version is 2.5.35.

Finally, I tried to modify the 1.2.17 source code myself (adding the patch by hand), but I only broke it  :-[

Rgs
TM

Alex Kirhenshtein

Sorry, my bad. The patch from my previous post will not apply correctly to 1.2.17. A new patch is attached to this post.

1) Unpack:
tar zxf netxms-1.2.17.tar.gz
2) Download fix.diff
3) cd netxms-1.2.17
4) Apply patch:
patch -p1 < ../fix.diff
5) Run configure with the desired options, then make, then make install

tuomar

Hi,

Thanks for the step-by-step instructions.
This was my first (or second) time applying a patch to source code.

The make command gave the following error:

node.cpp: In member function 'UINT32 Node::getItemFromSNMP(WORD, const char*, size_t, char*, int)':
node.cpp:3188: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'long unsigned int'
node.cpp:3188: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'long unsigned int'
node.cpp: In member function 'void Node::topologyPoll(ClientSession*, UINT32, int)':
node.cpp:5331: error: 'm_name' was not declared in this scope
node.cpp:5331: error: 'm_id' was not declared in this scope
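
A side note on the two `%llu` warnings above, which are separate from the errors: on x86_64 Linux a 64-bit unsigned typedef is often `unsigned long`, so `%llu` (which expects `unsigned long long`) does not match even though both types are 64 bits wide. One portable way to avoid this kind of mismatch is the `PRIu64` macro. The sketch below assumes the value is a `std::uint64_t`, which the compiler output does not confirm; `formatCounter` is a hypothetical helper, not code from node.cpp.

```cpp
#include <cinttypes>
#include <cstdio>
#include <string>

// Formats a 64-bit unsigned value using PRIu64, which expands to the
// correct printf length modifier for uint64_t on the current platform.
inline std::string formatCounter(std::uint64_t value)
{
    char buffer[32];  // 20 digits max for a 64-bit value, plus terminator
    std::snprintf(buffer, sizeof(buffer), "%" PRIu64, value);
    return std::string(buffer);
}
```

An explicit cast of the argument to `unsigned long long` would also match `%llu`.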


I read the source code to find the right variable names. The file is netxms-1.2.17/src/server/core/node.cpp, line 5331.
Original
DbgPrintf(6, _T("Node::topologyPoll(%s [%d]): peer node set but node object does not exist"), m_name, m_id);

Modified
DbgPrintf(6, _T("Node::topologyPoll(%s [%d]): peer node set but node object does not exist"), m_szName, m_dwId);


I'm not sure whether those are correct, but after the change the make command no longer gives any error.

...and yes, the database is backed up before this code is executed ;)

Rgs
TM

Alex Kirhenshtein

Yes, your changes are correct. I attached the wrong file :)
Proper fix attached.