Messages - Mapik

#1
Announcements / Re: NetXMS 5.1 patch release 3
February 24, 2025, 09:43:12 AM
Here is a memory graph (LibreNMS) from a virtual machine running only NetXMS and TimescaleDB 15.

The VM now has 48 GB allocated; on the older version it ran without problems on 32 GB.

I don't remember exactly which version that was. The problem started with the versions released around October/November. We update about once a month.

#2
Announcements / Re: NetXMS 5.1 patch release 3
February 05, 2025, 09:26:03 AM
Hello, we have a problem with NetXMS Server crashing: it segfaults after about 1-2 days of running.

Dmesg 1:
Quote:
[58727.484875] $POLLERS/WRK[8933]: segfault at 3800 ip 00007f655c27dc61 sp 00007f64fc55e948 error 6 in libc.so.6[7f655c0fe000+195000]
[58727.484904] Code: 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 48 89 f8 48 83 fa 20 72 33 c5 fe 6f 06 48 83 fa 40 0f 87 b5 00 00 00 c5 fe 6f 4c 16 e0 <c5> fe 7f 07 c5 fe 7f 4c 17 e0 0f 01 d6 74 04 c5 fc 77 c3 c5 f8 77

Dmesg 2:
Quote:
[264395.048604] $DATACOLL/WRK[165295]: segfault at 0 ip 00007f4bb8324b1d sp 00007f4b87a06a58 error 4 in libc.so.6[7f4bb81a2000+195000]
[264395.048638] Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 89 f8 48 89 fa c5 f9 ef c0 25 ff 0f 00 00 3d e0 0f 00 00 0f 87 23 01 00 00 <c5> fd 74 0f c5 fd d7 c1 85 c0 74 57 f3 0f bc c0 e9 2c 01 00 00 66
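
For anyone reading these kernel lines: the `ip` value minus the address in square brackets gives the offset of the faulting instruction inside the libc mapping, and the `error` number is the x86 page-fault error code (bit 0 = protection violation rather than missing page, bit 1 = write, bit 2 = user mode). A minimal Python sketch of that arithmetic follows. The function and field names (`parse_segfault`, `offset_in_mapping`) are made up for illustration, and the offset is relative to the start of the mapping the kernel reports, which is not necessarily the library's load base.

```python
import re

# Matches the x86 "segfault at ..." kernel log line shown in the dmesg output above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<module>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

# x86 page-fault error code bits: (mask, meaning if set, meaning if clear).
ERROR_BITS = [
    (0x1, "protection violation", "page not present"),
    (0x2, "write", "read"),
    (0x4, "user mode", "kernel mode"),
]


def parse_segfault(line):
    """Return the decoded fields of one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)  # the kernel prints this field in hex
    return {
        "thread": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "module": m["module"],
        # Offset of the faulting instruction from the start of the reported mapping.
        "offset_in_mapping": ip - base,
        "error": [if_set if err & bit else if_clear for bit, if_set, if_clear in ERROR_BITS],
    }


line = ("[58727.484875] $POLLERS/WRK[8933]: segfault at 3800 ip 00007f655c27dc61 "
        "sp 00007f64fc55e948 error 6 in libc.so.6[7f655c0fe000+195000]")
info = parse_segfault(line)
print(hex(info["offset_in_mapping"]), info["error"])
# prints: 0x17fc61 ['page not present', 'write', 'user mode']
```

Read this way, the first crash is a user-mode write to unmapped address 0x3800 and the second is a user-mode read from address 0, both inside libc.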

Backtrace after second segfault:
Quote:
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
terminate called recursively
terminate called recursively

Thread 1837 "$POLLERS/WRK" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffba9c8640 (LWP 340969)]
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140736324208192) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140736324208192) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140736324208192) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140736324208192, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff751f476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff75057f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff77cab9e in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff77d620c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007ffff77d6277 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007ffff77d64d8 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007ffff77cd265 in std::__throw_bad_alloc() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00005555555e28f6 in handleOOM(unsigned long, bool) ()
#11 0x00007ffff74ced8d in operator() (__closure=<optimized out>, __closure=<optimized out>, var=0x7fffb650bd20) at /build/nxbuild.kEGh9XOvdg/build/netxms-5.1.3/src/snmp/libnxsnmp/snapshot.cpp:125
#12 std::__invoke_impl<unsigned int, SNMP_Snapshot::create(SNMP_Transport*, const uint32_t*, size_t)::<lambda(SNMP_Variable*)>&, SNMP_Variable*> (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#13 std::__invoke_r<unsigned int, SNMP_Snapshot::create(SNMP_Transport*, const uint32_t*, size_t)::<lambda(SNMP_Variable*)>&, SNMP_Variable*> (__fn=...) at /usr/include/c++/11/bits/invoke.h:114
#14 std::_Function_handler<unsigned int(SNMP_Variable*), SNMP_Snapshot::create(SNMP_Transport*, const uint32_t*, size_t)::<lambda(SNMP_Variable*)> >::_M_invoke(const std::_Any_data &, SNMP_Variable *&&) (
    __functor=..., __args#0=<optimized out>) at /usr/include/c++/11/bits/std_function.h:290
#15 0x00007ffff74ce1fd in std::function<unsigned int (SNMP_Variable*)>::operator()(SNMP_Variable*) const (__args#0=<optimized out>, this=0x7fffba9c6a10) at /usr/include/c++/11/bits/std_function.h:590
#16 SnmpWalk(SNMP_Transport*, unsigned int const*, unsigned long, std::function<unsigned int (SNMP_Variable*)>, bool, bool) (transport=transport@entry=0x7fffd1089600, rootOid=rootOid@entry=0x7fffba9c6aa0,
    rootOidLen=rootOidLen@entry=10, handler=..., logErrors=logErrors@entry=false, failOnShutdown=failOnShutdown@entry=false) at /build/nxbuild.kEGh9XOvdg/build/netxms-5.1.3/src/snmp/libnxsnmp/util.cpp:372
#17 0x00007ffff74ce30f in SNMP_Snapshot::create (transport=transport@entry=0x7fffd1089600, baseOid=baseOid@entry=0x7fffba9c6aa0, oidLen=oidLen@entry=10)
    at /build/nxbuild.kEGh9XOvdg/build/netxms-5.1.3/src/snmp/libnxsnmp/snapshot.cpp:124
#18 0x00007ffff7d48ffe in SNMP_Snapshot::create (baseOid=std::initializer_list of length 10 = {...}, transport=0x7fffd1089600) at ../../../include/nxsnmp.h:1134
#19 SnmpGetRoutingTable (snmp=0x7fffd1089600, node=...) at /build/nxbuild.kEGh9XOvdg/build/netxms-5.1.3/src/server/core/snmp.cpp:214
#20 0x00007ffff7c5eb67 in Node::getRoutingTable (this=this@entry=0x7fffe4376810) at /build/nxbuild.kEGh9XOvdg/build/netxms-5.1.3/src/server/core/node.cpp:10071
#21 0x00007ffff7c5edfa in Node::routingTablePoll (this=0x7fffe4376810, poller=<optimized out>, session=<optimized out>, rqId=<optimized out>)
    at /build/nxbuild.kEGh9XOvdg/build/netxms-5.1.3/src/server/core/node.cpp:10223
#22 0x00007ffff7ce25b2 in Pollable::doRoutingTablePoll (this=0x7fffe4377390, poller=0x7fffd4dbdb00) at /build/nxbuild.kEGh9XOvdg/build/netxms-5.1.3/src/server/core/pollable.cpp:206
#23 0x00007ffff7b73f16 in __ThreadPoolExecute_Wrapper_1<Pollable, PollerInfo*> (arg=0x7fffd4cddb80) at ../../../include/nms_threads.h:1191
#24 0x00007ffff79ef0ef in ProcessSerializedRequests (data=0x7fffc86b2740) at /build/nxbuild.kEGh9XOvdg/build/netxms-5.1.3/src/libnetxms/tp.cpp:495
#25 0x00007ffff79ed784 in WorkerThread (threadInfo=0x7ffff1a2ca70) at /build/nxbuild.kEGh9XOvdg/build/netxms-5.1.3/src/libnetxms/tp.cpp:214
#26 0x00007ffff79e787f in ThreadCreate_Wrapper_1<WorkerThreadInfo*> (context=0x7ffff1a2cae0) at ../../include/nms_threads.h:539
#27 0x00007ffff7571ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#28 0x00007ffff7603850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb)

Version:
Quote:
NetXMS Server Version 5.1.3 Build 5.1-496-g315015358c (UNICODE)
NXCP: 5.62.1.52 (AES-256, 3DES, AES-128)
Built with: g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

OS: Ubuntu Server 22.04.5 LTS
#3
Announcements / Re: NetXMS 4.5 patch release 2
February 16, 2024, 08:12:38 PM
Quote from: Victor Kirhenshtein on February 13, 2024, 09:10:03 AM
I just fixed that. We will publish a patch release with the fix shortly.
Best regards,
Victor

It looks fine: version 4.5.3 has now been running for 10 hours without crashing, with passive discovery active. Thank you so much :-)
#4
Announcements / Re: NetXMS 4.5 patch release 2
February 10, 2024, 02:21:51 PM
Quote from: Filipp Sudanov on February 09, 2024, 08:13:39 PM
Can you install the netxms-dbg package and run netxms under gdb? It should be like:

gdb netxmsd
run

when it crashes, issue
bt
and share the output

It seems to be related to the discovery poller. We are using passive discovery mode, and when I disabled discovery, the crashes stopped.

Here is the backtrace after the crash with discovery enabled as "passive only":
Thread 191 "$POLLERS/WRK" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffc79f8640 (LWP 179903)]
___pthread_mutex_lock (mutex=0x3b8) at ./nptl/pthread_mutex_lock.c:80
80      ./nptl/pthread_mutex_lock.c: No such file or directory.
(gdb) bt
#0  ___pthread_mutex_lock (mutex=0x3b8) at ./nptl/pthread_mutex_lock.c:80
#1  0x00007ffff7c0dba3 in Mutex::lock (this=<optimized out>, this=<optimized out>) at ../../../include/nms_threads.h:1734
#2  GetAttributeWithLock<MacAddress> (mutex=..., attr=...) at ../../../include/nms_util.h:5420
#3  Interface::getMacAddress (this=<optimized out>, this=<optimized out>) at ../../../src/server/include/nms_objects.h:2103
#4  CheckPotentialNode (node=node@entry=0x7fffe4499810, ipAddr=..., ifIndex=<optimized out>, macAddr=..., sourceType=sourceType@entry=DA_SRC_ARP_CACHE, sourceNodeId=<optimized out>)
    at /build/nxbuild.Bqkhx2oRxN/build/netxms-4.5.2/src/server/core/discovery.cpp:921
#5  0x00007ffff7c0e5a7 in DiscoveryPoller (poller=0x7fffd483aa80) at /build/nxbuild.Bqkhx2oRxN/build/netxms-4.5.2/src/server/core/discovery.cpp:1040
#6  0x00007ffff7bb8eb6 in __ThreadPoolExecute_Wrapper_1<Pollable, PollerInfo*> (arg=0x7fffd48361a0) at ../../../include/nms_threads.h:1191
#7  0x00007ffff7a5a2ed in ProcessSerializedRequests (data=0x7fffd4837140) at /build/nxbuild.Bqkhx2oRxN/build/netxms-4.5.2/src/libnetxms/tp.cpp:483
#8  0x00007ffff7a58979 in WorkerThread (threadInfo=0x7fffdc4703e0) at /build/nxbuild.Bqkhx2oRxN/build/netxms-4.5.2/src/libnetxms/tp.cpp:199
#9  0x00007ffff7a5385f in ThreadCreate_Wrapper_1<WorkerThreadInfo*> (context=0x7fffdc4703f0) at ../../include/nms_threads.h:539
#10 0x00007ffff75e0ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#11 0x00007ffff7672850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
 
#5
Announcements / Re: NetXMS 4.5 patch release 2
February 09, 2024, 01:33:02 PM
Quote from: Mapik on February 06, 2024, 04:23:53 PM
Since the update we are experiencing the following segfaults.

Feb  6 14:37:25 netxms kernel: [91647.540569] $POLLERS/WRK[155817]: segfault at 3c8 ip 00007f65e405eef4 sp 00007f65b466e808 error 4 in libc.so.6[7f65e3fef000+195000]
Feb  6 14:37:25 netxms kernel: [91647.540604] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
Feb  6 14:39:59 netxms kernel: [91802.232098] $POLLERS/WRK[156297]: segfault at 3c8 ip 00007fef26191ef4 sp 00007feef6865808 error 4 in libc.so.6[7fef26122000+195000]
Feb  6 14:39:59 netxms kernel: [91802.232140] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
Feb  6 14:47:27 netxms kernel: [92249.652890] $POLLERS/WRK[156910]: segfault at 3c8 ip 00007f85a5807ef4 sp 00007f857532d808 error 4 in libc.so.6[7f85a5798000+195000]
Feb  6 14:47:27 netxms kernel: [92249.652922] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
Feb  6 14:49:40 netxms kernel: [92382.869305] $POLLERS/WRK[157431]: segfault at 3c8 ip 00007ffab7616ef4 sp 00007ffa9166d808 error 4 in libc.so.6[7ffab75a7000+195000]
Feb  6 14:49:40 netxms kernel: [92382.869357] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53

Is there anything we can try to prevent this?

We use Ubuntu Server 22.04 with all updates, PostgreSQL 15.5 and TimescaleDB 2.13.1 extension.

Update, output from debug run:
Thread 196 "$POLLERS/WRK" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd007a640 (LWP 163934)]
___pthread_mutex_lock (mutex=0x3b8) at ./nptl/pthread_mutex_lock.c:80
80      ./nptl/pthread_mutex_lock.c: No such file or directory.

The problem persists in version 4.5.2; the same error occurs when debugging. I also tested with kernel 6.5.0.17, and it behaves the same way.

Dmesg:
[65057.277570] $POLLERS/WRK[96358]: segfault at 3c8 ip 00007f216e231ef4 sp 00007f21480bb808 error 4 in libc.so.6[7f216e1c2000+195000]
[65057.277639] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
[65334.782282] $POLLERS/WRK[96956]: segfault at 3c8 ip 00007f28d6849ef4 sp 00007f28a7932808 error 4 in libc.so.6[7f28d67da000+195000]
[65334.782334] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
[65612.318083] $POLLERS/WRK[98664]: segfault at 3c8 ip 00007f2909292ef4 sp 00007f28da9ae808 error 4 in libc.so.6[7f2909223000+195000]
[65612.318114] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
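
A note on how the gdb output and the dmesg lines fit together: gdb reports `mutex=0x3b8`, while the kernel reports a fault at `3c8`. The faulting instruction bytes marked in the `Code:` line are `<8b> 47 10`, which decode to `mov eax, [rdi+0x10]`, and on x86-64 Linux `rdi` carries the first function argument (here the mutex pointer). So 0x3b8 + 0x10 = 0x3c8. Below is a small Python sketch of that decoding. It handles only this one instruction form, and the helper names (`decode_mov_r32_disp8`, `fault_address`) are hypothetical, not taken from any tool.

```python
# Register names indexed by the 3-bit register fields of the ModRM byte (no REX prefix).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_r32_disp8(code):
    """Decode `8b /r` (MOV r32, r/m32) with an 8-bit displacement.

    Returns (destination register, base register, displacement).
    Any other encoding is rejected; this is not a general disassembler.
    """
    opcode, modrm, disp = code[0], code[1], code[2]
    if opcode != 0x8B:
        raise ValueError("not MOV r32, r/m32")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only the [base + disp8] form without SIB is handled")
    if disp >= 0x80:  # sign-extend the 8-bit displacement
        disp -= 0x100
    return REG32[reg], REG64[rm], disp


def fault_address(base_value, code):
    """Address the instruction reads, given the value held in its base register."""
    _, _, disp = decode_mov_r32_disp8(code)
    return base_value + disp


code = bytes.fromhex("8b 47 10")       # bytes marked <8b> 47 10 in the dmesg Code: line
print(decode_mov_r32_disp8(code))      # ('eax', 'rdi', 16)
print(hex(fault_address(0x3b8, code))) # 0x3c8, the address in "segfault at 3c8"
```

Such a small mutex address would be consistent with a member mutex being reached through a null object pointer, though the output here does not prove that on its own.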
#6
Announcements / Re: NetXMS 4.5 patch release 1
February 06, 2024, 04:23:53 PM
Since the update we are experiencing the following segfaults.

Feb  6 14:37:25 netxms kernel: [91647.540569] $POLLERS/WRK[155817]: segfault at 3c8 ip 00007f65e405eef4 sp 00007f65b466e808 error 4 in libc.so.6[7f65e3fef000+195000]
Feb  6 14:37:25 netxms kernel: [91647.540604] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
Feb  6 14:39:59 netxms kernel: [91802.232098] $POLLERS/WRK[156297]: segfault at 3c8 ip 00007fef26191ef4 sp 00007feef6865808 error 4 in libc.so.6[7fef26122000+195000]
Feb  6 14:39:59 netxms kernel: [91802.232140] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
Feb  6 14:47:27 netxms kernel: [92249.652890] $POLLERS/WRK[156910]: segfault at 3c8 ip 00007f85a5807ef4 sp 00007f857532d808 error 4 in libc.so.6[7f85a5798000+195000]
Feb  6 14:47:27 netxms kernel: [92249.652922] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53
Feb  6 14:49:40 netxms kernel: [92382.869305] $POLLERS/WRK[157431]: segfault at 3c8 ip 00007ffab7616ef4 sp 00007ffa9166d808 error 4 in libc.so.6[7ffab75a7000+195000]
Feb  6 14:49:40 netxms kernel: [92382.869357] Code: 14 00 e8 7f 1f fa ff 48 8d 0d 48 99 14 00 ba 53 02 00 00 48 8d 35 8a 16 14 00 48 8d 3d ae 16 14 00 e8 60 1f fa ff f3 0f 1e fa <8b> 47 10 89 c2 81 e2 7f 01 00 00 90 83 e0 7c 0f 85 a7 00 00 00 53

Is there anything we can try to prevent this?

We use Ubuntu Server 22.04 with all updates, PostgreSQL 15.5 and TimescaleDB 2.13.1 extension.

Update, output from debug run:
Thread 196 "$POLLERS/WRK" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd007a640 (LWP 163934)]
___pthread_mutex_lock (mutex=0x3b8) at ./nptl/pthread_mutex_lock.c:80
80      ./nptl/pthread_mutex_lock.c: No such file or directory.