NetXMS 4.5 patch release 2

Started by Victor Kirhenshtein, February 08, 2024, 01:43:49 PM

Victor Kirhenshtein

We just published patch release 2 for version 4.5. It contains a few important fixes and some small improvements. The full change log follows:

- Fixed server crash on client session disconnect
- Fixed update issues in the new web UI
- Cosmetic fixes in UI
- Fixed issues:
   - NX-2490 (Server tries to read from tdata_xxxx table when TimescaleDB is used as backend)
   - NX-2502 (nxagentd uses UDP port 4700 to exchange heartbeat messages and listens on address 0.0.0.0)

Mapik

Quote from: Mapik on February 06, 2024, 04:23:53 PM
Since the update we are experiencing the following segfaults.

Feb  6 14:37:25 netxms kernel: [91647.540569] $POLLERS/WRK[155817]: segfault at 3c8 ip 00007f65e405eef4 sp 00007f65b466e808 error 4 in libc.so.6[7f65e3fef000+195000]
Feb  6 14:37:25 netxms kernel: [91647.540604] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
Feb  6 14:39:59 netxms kernel: [91802.232098] $POLLERS/WRK[156297]: segfault at 3c8 ip 00007fef26191ef4 sp 00007feef6865808 error 4 in libc.so.6[7fef26122000+195000]
Feb  6 14:39:59 netxms kernel: [91802.232140] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
Feb  6 14:47:27 netxms kernel: [92249.652890] $POLLERS/WRK[156910]: segfault at 3c8 ip 00007f85a5807ef4 sp 00007f857532d808 error 4 in libc.so.6[7f85a5798000+195000]
Feb  6 14:47:27 netxms kernel: [92249.652922] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
Feb  6 14:49:40 netxms kernel: [92382.869305] $POLLERS/WRK[157431]: segfault at 3c8 ip 00007ffab7616ef4 sp 00007ffa9166d808 error 4 in libc.so.6[7ffab75a7000+195000]
Feb  6 14:49:40 netxms kernel: [92382.869357] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53

Is there anything we can try to prevent this?

We use Ubuntu Server 22.04 with all updates, PostgreSQL 15.5 and TimescaleDB 2.13.1 extension.

Update, output from debug run:
Thread 196 "$POLLERS/WRK" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd007a640 (LWP 163934)]
___pthread_mutex_lock (mutex=0x3b8) at ./nptl/pthread_mutex_lock.c:80
80      ./nptl/pthread_mutex_lock.c: No such file or directory.

The problem persists in version 4.5.2; the same error occurs when debugging. I also tested with kernel 6.5.0.17, and it behaves the same way.

Dmesg:
[65057.277570] $POLLERS/WRK[96358]: segfault at 3c8 ip 00007f216e231ef4 sp 00007f21480bb808 error 4 in libc.so.6[7f216e1c2000+195000]
[65057.277639] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
[65334.782282] $POLLERS/WRK[96956]: segfault at 3c8 ip 00007f28d6849ef4 sp 00007f28a7932808 error 4 in libc.so.6[7f28d67da000+195000]
[65334.782334] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
[65612.318083] $POLLERS/WRK[98664]: segfault at 3c8 ip 00007f2909292ef4 sp 00007f28da9ae808 error 4 in libc.so.6[7f2909223000+195000]
[65612.318114] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
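
For anyone comparing reports like the ones above, here is a minimal Python sketch that parses the kernel's segfault lines and computes where inside the library the faulting instruction sits (instruction pointer minus mapping base). The names parse_segfault and SAMPLES are made up for this illustration. The error-code decoding assumes the x86 page-fault bit layout (bit 0 = protection fault, bit 1 = write, bit 2 = user mode).

```python
import re

# Matches the x86 kernel segfault report format seen in the logs above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>\d+) in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return details of one kernel segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"))
    return {
        "fault_addr": int(m.group("addr"), 16),
        "library": m.group("lib"),
        # Offset of the faulting instruction inside the mapped library.
        "offset": int(m.group("ip"), 16) - int(m.group("base"), 16),
        # x86 page-fault error code bits (assumed layout, see lead-in).
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }


# Two lines copied from the reports above (one per boot).
SAMPLES = [
    "[91647.540569] $POLLERS/WRK[155817]: segfault at 3c8 ip 00007f65e405eef4 "
    "sp 00007f65b466e808 error 4 in libc.so.6[7f65e3fef000+195000]",
    "[65057.277570] $POLLERS/WRK[96358]: segfault at 3c8 ip 00007f216e231ef4 "
    "sp 00007f21480bb808 error 4 in libc.so.6[7f216e1c2000+195000]",
]

if __name__ == "__main__":
    for line in SAMPLES:
        info = parse_segfault(line)
        print(info["library"], hex(info["offset"]), hex(info["fault_addr"]), info["access"])
```

Both sample lines resolve to the same libc offset (0x6fef4) and the same fault address (0x3c8), which suggests every crash hits the same instruction even though address randomization moves libc between runs.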

Filipp Sudanov

Can you install the netxms-dbg package and run netxms under gdb? It should look like this:

gdb netxmsd
run

When it crashes, issue
bt
and share the output.
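
If an interactive session is inconvenient, the same steps can be scripted. The Python sketch below only builds a gdb command line using gdb's standard -batch, -ex and --args options. The helper name gdb_batch_command is hypothetical, and it assumes netxmsd stays in the foreground when started this way, so that gdb still controls the process at the moment of the crash.

```python
import shlex


def gdb_batch_command(program, program_args=()):
    """Build an argv that runs `program` under gdb and dumps backtraces afterwards."""
    return [
        "gdb", "-batch",
        "-ex", "run",                  # start the program
        "-ex", "bt",                   # backtrace of the thread that stopped
        "-ex", "thread apply all bt",  # backtraces of all other threads
        "--args", program, *program_args,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(gdb_batch_command("netxmsd")))
```

In batch mode gdb runs the queued commands once the program stops, then exits, so redirecting its output to a file captures the backtrace.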

Mapik

Quote from: Filipp Sudanov on February 09, 2024, 08:13:39 PM
Can you install the netxms-dbg package and run netxms under gdb? It should look like this:

gdb netxmsd
run

When it crashes, issue
bt
and share the output.

It seems to be related to the discovery poller. We are using passive discovery mode, and when I disabled discovery it stopped crashing.

Here is the backtrace after the crash with discovery enabled as "passive only":
Thread 191 "$POLLERS/WRK" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffc79f8640 (LWP 179903)]
___pthread_mutex_lock (mutex=0x3b8) at ./nptl/pthread_mutex_lock.c:80
80      ./nptl/pthread_mutex_lock.c: No such file or directory.
(gdb) bt
#0  ___pthread_mutex_lock (mutex=0x3b8) at ./nptl/pthread_mutex_lock.c:80
#1  0x00007ffff7c0dba3 in Mutex::lock (this=<optimized out>, this=<optimized out>) at ../../../include/nms_threads.h:1734
#2  GetAttributeWithLock<MacAddress> (mutex=..., attr=...) at ../../../include/nms_util.h:5420
#3  Interface::getMacAddress (this=<optimized out>, this=<optimized out>) at ../../../src/server/include/nms_objects.h:2103
#4  CheckPotentialNode (node=node@entry=0x7fffe4499810, ipAddr=..., ifIndex=<optimized out>, macAddr=..., sourceType=sourceType@entry=DA_SRC_ARP_CACHE, sourceNodeId=<optimized out>)
    at /build/nxbuild.Bqkhx2oRxN/build/netxms-4.5.2/src/server/core/discovery.cpp:921
#5  0x00007ffff7c0e5a7 in DiscoveryPoller (poller=0x7fffd483aa80) at /build/nxbuild.Bqkhx2oRxN/build/netxms-4.5.2/src/server/core/discovery.cpp:1040
#6  0x00007ffff7bb8eb6 in __ThreadPoolExecute_Wrapper_1<Pollable, PollerInfo*> (arg=0x7fffd48361a0) at ../../../include/nms_threads.h:1191
#7  0x00007ffff7a5a2ed in ProcessSerializedRequests (data=0x7fffd4837140) at /build/nxbuild.Bqkhx2oRxN/build/netxms-4.5.2/src/libnetxms/tp.cpp:483
#8  0x00007ffff7a58979 in WorkerThread (threadInfo=0x7fffdc4703e0) at /build/nxbuild.Bqkhx2oRxN/build/netxms-4.5.2/src/libnetxms/tp.cpp:199
#9  0x00007ffff7a5385f in ThreadCreate_Wrapper_1<WorkerThreadInfo*> (context=0x7fffdc4703f0) at ../../include/nms_threads.h:539
#10 0x00007ffff75e0ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#11 0x00007ffff7672850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
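
A note on how the backtrace lines up with the dmesg output: the kernel marks the faulting instruction as <8b> 47 10, and gdb shows pthread_mutex_lock being called with mutex=0x3b8. The Python sketch below hand-decodes just that one instruction form (opcode 8b with an 8-bit displacement and no SIB byte). The function name is made up for illustration, and it is not a general disassembler.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_load_disp8(code):
    """Decode `8b /r` (mov r32, [base + disp8]); other encodings are rejected."""
    opcode, modrm, disp = code[0], code[1], code[2]
    if opcode != 0x8B:
        raise ValueError("not a mov r32, r/m32 instruction")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only the disp8, no-SIB form is handled")
    disp8 = disp - 256 if disp >= 128 else disp  # sign-extend the displacement
    return REG32[reg], REG64[rm], disp8


if __name__ == "__main__":
    dest, base, disp = decode_mov_load_disp8(bytes.fromhex("8b4710"))
    print(f"mov {dest}, [{base}+{disp:#x}]")
    # rdi carries the first argument on x86-64 Linux, here mutex=0x3b8.
    print(hex(0x3B8 + disp))
```

The instruction reads 4 bytes at rdi+0x10, and 0x3b8 + 0x10 gives 0x3c8, the address in every "segfault at 3c8" line. A mutex address that small is consistent with Interface::getMacAddress being called on a null (or near-null) object pointer in CheckPotentialNode, but that is an inference from the numbers; the thread itself does not state the root cause.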
 

dtk33d

Hi,

[28039.876445] $POLLERS/WRK[15318]: segfault at 3c8 ip 00007f60d32ab2b0 sp 00007f608aaea8c8 error 4 in libc.so.6[7f60d3245000+155000] likely on CPU 5 (core 5, socket 0)
[28039.876460] Code: 30 11 00 ba c2 01 00 00 48 8d 35 57 ac 10 00 48 8d 3d 65 ac 10 00 e8 cf 8b fa ff e8 1a be 08 00 66 2e 0f 1f 84 00 00 00 00 00 <8b> 47 10 89 c2 81 e2 7f 01 00 00 83 e0 7c 0f 85 ac 00 00 00 53 48

I got the same error.

Victor Kirhenshtein

I just fixed that. We will publish a patch release with the fix shortly.

Best regards,
Victor

Mapik

Quote from: Victor Kirhenshtein on February 13, 2024, 09:10:03 AM
I just fixed that. We will publish a patch release with the fix shortly.
Best regards,
Victor
Looks like it's fine: version 4.5.3 has now been running for 10 hours without crashing, with passive discovery active. Thank you so much :-)