NetXMS Support Forum

English Support => General Support => Topic started by: troffasky on May 31, 2019, 08:04:07 PM

Title: Crashing after upgrade to 2.2.15
Post by: troffasky on May 31, 2019, 08:04:07 PM
Upgraded from 2.2.13 to 2.2.15 and since then the service isn't staying up. If I start it with netxmsd -D 9, these are the dying gasps:


2019.05.31 18:03:13.434 *D* StatusPoll(PDU): finished child object poll
2019.05.31 18:03:13.434 *D* StatusPoll(PDU): allDown=false, dynFlags=0x00001001
2019.05.31 18:03:13.441 *D* Node(PDU)->GetItemFromSNMP(.1.3.6.1.2.1.1.3.0): dwResult=0
2019.05.31 18:03:13.441 *D* StatusPoll(PDU [1458]): boot time set to 1554531149 from SNMP
2019.05.31 18:03:13.441 *D* Finished status poll for node PDU (ID: 1458)
2019.05.31 18:03:13.441 *D* ConfigReadStr: (cached) name=DeleteUnreachableNodesPeriod value="0"
2019.05.31 18:03:13.441 *D* [poll.conf          ] Starting configuration poll for node Big-Rack-PDU (ID: 1588)
2019.05.31 18:03:13.441 *D* Node is marked as unreachable, configuration poll aborted
2019.05.31 18:03:13.441 *D* Finished configuration poll for node Big-Rack-PDU (ID: 1588)
2019.05.31 18:03:13.441 *D* [poll.status        ] Starting status poll for node Big-Rack-PDU (ID: 1588)
2019.05.31 18:03:13.441 *D* ConfigReadStr: (cached) name=CapabilityExpirationTime value="604800"
2019.05.31 18:03:13.441 *D* [poll.status        ] StatusPoll(Big-Rack-PDU): check SNMP
2019.05.31 18:03:13.448 *D* [snmp.entity        ] Building component tree for BarkS-Core3-POE [1054
Segmentation fault (core dumped)



The output varies, but out of about 10 attempts, more than half of them end with the same line. That node is a Cisco SB switch, which I see mentioned in the release notes for 2.2.15, so I don't think this is a coincidence. How can I troubleshoot this?
nxdbmgr check comes back clear.
Title: Re: Crashing after upgrade to 2.2.15
Post by: troffasky on May 31, 2019, 11:25:50 PM
I have worked around this temporarily by moving cisco.ndd out of the way. I have no idea what effect this is going to have on NetXMS but at least it's staying up!
Title: Re: Crashing after upgrade to 2.2.15
Post by: Victor Kirhenshtein on June 03, 2019, 07:40:18 PM
Hi,

could you please run netxmsd under gdb (with the Cisco driver put back) and, when it crashes, show the output of the bt command? You'll need to install the -dbg packages if you installed NetXMS from deb packages.

Best regards,
Victor
Title: Re: Crashing after upgrade to 2.2.15
Post by: troffasky on June 04, 2019, 12:17:40 AM

2019.06.03 22:12:53.762 *D* [poll.conf          ] ConfPoll(BarkP2P-1): node is wireless controller, reading access point information
2019.06.03 22:12:53.778 *D* [snmp.entity        ] Building component tree for BarkS-Core3-POE [1054

Thread 74 "$POLLERS/WRK" received signal SIGBUS, Bus error.
[Switching to Thread 0x7fffe30ee700 (LWP 593)]
0x00007fffea4777ce in HandlerPhysicalPorts (var=<optimised out>, snmp=snmp@entry=0x7fffebc6e500, arg=arg@entry=0x7fffe30dad50) at sb.cpp:113
113     sb.cpp: No such file or directory.
(gdb) bt
#0  0x00007fffea4777ce in HandlerPhysicalPorts (var=<optimised out>, snmp=snmp@entry=0x7fffebc6e500, arg=arg@entry=0x7fffe30dad50) at sb.cpp:113
#1  0x00007ffff6871da7 in SnmpWalk (transport=transport@entry=0x7fffebc6e500, rootOid=rootOid@entry=0x7fffe30daae0, rootOidLen=14,
    handler=handler@entry=0x7fffea477600 <HandlerPhysicalPorts(SNMP_Variable*, SNMP_Transport*, void*)>, userArg=userArg@entry=0x7fffe30dad50, logErrors=<optimised out>, failOnShutdown=false) at util.cpp:334
#2  0x00007ffff6871eea in SnmpWalk (transport=transport@entry=0x7fffebc6e500, rootOid=rootOid@entry=0x7fffea478310 L".1.3.6.1.4.1.9.6.1.101.53.3.1.5",
    handler=handler@entry=0x7fffea477600 <HandlerPhysicalPorts(SNMP_Variable*, SNMP_Transport*, void*)>, userArg=userArg@entry=0x7fffe30dad50, logErrors=logErrors@entry=false, failOnShutdown=failOnShutdown@entry=false)
    at util.cpp:265
#3  0x00007fffea477a3a in CiscoSbDriver::getPhysicalPortLayout (this=<optimised out>, snmp=0x7fffebc6e500, layout=0x7fffe30dad50) at sb.cpp:126
#4  0x00007fffea477aa2 in CiscoSbDriver::getModuleLayout (this=<optimised out>, snmp=<optimised out>, attributes=<optimised out>, driverData=<optimised out>, module=1, layout=0x7fffe30df340) at sb.cpp:217
#5  0x00007ffff7a22b5c in Node::confPollSnmp (this=0x7fffe671b000, rqId=0) at node.cpp:3486
#6  0x00007ffff7a2cfe4 in Node::configurationPoll (this=this@entry=0x7fffe671b000, pSession=pSession@entry=0x0, rqId=rqId@entry=0, poller=poller@entry=0x7fffec8ee500, maskBits=maskBits@entry=0) at node.cpp:2724
#7  0x00007ffff7a2da93 in Node::configurationPoll (this=0x7fffe671b000, poller=0x7fffec8ee500) at node.cpp:2652
#8  0x00007ffff79b9e41 in __ThreadPoolExecute_Wrapper_1<Node, PollerInfo*> (arg=0x7fffec80e160) at ../../../include/nms_threads.h:1215
#9  0x00007ffff74eb398 in ProcessSerializedRequests (data=0x7fffec80e180) at tp.cpp:466
#10 0x00007ffff74eb19b in WorkerThread (arg=0x7fffec803660) at tp.cpp:186
#11 0x00007ffff6e706db in start_thread (arg=0x7fffe30ee700) at pthread_create.c:463
#12 0x00007ffff6b9988f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95


Title: Re: Crashing after upgrade to 2.2.15
Post by: troffasky on June 04, 2019, 06:40:31 PM
Here's a walk from that OID, for what it's worth:

CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.49 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.50 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.51 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.52 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.53 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.54 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.55 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.56 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.57 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.58 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.59 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.60 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.61 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.62 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.63 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.64 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.65 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.66 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.67 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.68 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.69 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.70 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.71 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.72 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.107 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.108 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.109 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.5.110 = INTEGER: 1
Title: Re: Crashing after upgrade to 2.2.15
Post by: Victor Kirhenshtein on June 06, 2019, 09:48:51 AM
Could you please also show walk output for .1.3.6.1.4.1.9.6.1.101.53.3.1.6 and .1.3.6.1.4.1.9.6.1.101.53.3.1.7?

Best regards,
Victor
Title: Re: Crashing after upgrade to 2.2.15
Post by: troffasky on June 06, 2019, 06:06:31 PM
.1.3.6.1.4.1.9.6.1.101.53.3.1.6

CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.49 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.50 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.51 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.52 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.53 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.54 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.55 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.56 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.57 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.58 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.59 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.60 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.61 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.62 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.63 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.64 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.65 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.66 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.67 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.68 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.69 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.70 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.71 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.72 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.107 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.108 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.109 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.110 = INTEGER: 2

.1.3.6.1.4.1.9.6.1.101.53.3.1.7

CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.49 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.50 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.51 = INTEGER: 3
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.52 = INTEGER: 4
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.53 = INTEGER: 5
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.54 = INTEGER: 6
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.55 = INTEGER: 7
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.56 = INTEGER: 8
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.57 = INTEGER: 9
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.58 = INTEGER: 10
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.59 = INTEGER: 11
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.60 = INTEGER: 12
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.61 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.62 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.63 = INTEGER: 3
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.64 = INTEGER: 4
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.65 = INTEGER: 5
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.66 = INTEGER: 6
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.67 = INTEGER: 7
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.68 = INTEGER: 8
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.69 = INTEGER: 9
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.70 = INTEGER: 10
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.71 = INTEGER: 11
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.72 = INTEGER: 12
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.107 = INTEGER: 13
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.108 = INTEGER: 13
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.109 = INTEGER: 14
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.110 = INTEGER: 14
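
For anyone reading the three walks side by side, here is a minimal Python sketch that joins them per ifIndex. It assumes that sub-OID .6 holds the row and .7 holds the column of each port; that reading is consistent with the values above but is not confirmed anywhere in this thread. The names parse_walk and port_positions are made up for illustration, and the sample is only a few lines copied from the walks above.

```python
import re
from collections import defaultdict

# Matches lines such as
# "CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.49 = INTEGER: 1"
LINE_RE = re.compile(r"rlPhysicalDescription\.3\.1\.(\d+)\.(\d+) = INTEGER: (\d+)")


def parse_walk(text):
    """Return {sub-OID column: {ifIndex: value}} from snmpwalk output."""
    table = defaultdict(dict)
    for match in LINE_RE.finditer(text):
        column, if_index, value = (int(g) for g in match.groups())
        table[column][if_index] = value
    return table


def port_positions(table, row_oid=6, col_oid=7):
    """Map each ifIndex to (row, column). ASSUMPTION: .6 = row, .7 = column."""
    rows, cols = table[row_oid], table[col_oid]
    return {i: (rows[i], cols[i]) for i in sorted(rows) if i in cols}


# A few lines copied from the walks above; paste the full output for real use.
sample = """\
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.60 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.61 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.107 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.6.108 = INTEGER: 2
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.60 = INTEGER: 12
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.61 = INTEGER: 1
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.107 = INTEGER: 13
CISCOSB-Physicaldescription-MIB::rlPhysicalDescription.3.1.7.108 = INTEGER: 13
"""

positions = port_positions(parse_walk(sample))
for if_index, (row, column) in positions.items():
    print(f"ifIndex {if_index}: row {row}, column {column}")
```

With the full walk pasted in, ifIndex 49 to 72 fill two rows of 12 columns, while ifIndex 107 to 110 report columns 13 and 14. Whether that has anything to do with the crash is not established here.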
Title: Re: Crashing after upgrade to 2.2.15
Post by: Victor Kirhenshtein on June 08, 2019, 12:22:31 PM
Quite strange, I don't see anything wrong with the data. Could you please make it crash under the debugger again, and then run the following commands:

bt
print module
print *module
print row
print column
print *var
print *request
print *response

Best regards,
Victor
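
If typing these at the (gdb) prompt is error-prone, the same list can be handed to gdb non-interactively. The Python sketch below only builds the command line; it assumes gdb's standard -batch, -ex and --args options and that netxmsd -D 9 stays in the foreground, as it did in the first post. The helper name gdb_batch_command is made up for illustration and is not part of NetXMS.

```python
import shlex


def gdb_batch_command(program_args, gdb_commands):
    """Build an argv list that runs a program under gdb without a prompt."""
    # "run" starts the program; the remaining commands execute once it stops,
    # for example when it receives SIGSEGV or SIGBUS.
    argv = ["gdb", "-batch", "-ex", "run"]
    for command in gdb_commands:
        argv += ["-ex", command]
    # Everything after --args is the program and its own arguments.
    return argv + ["--args", *program_args]


commands = [
    "bt",
    "print module",
    "print *module",
    "print row",
    "print column",
    "print *var",
    "print *request",
    "print *response",
]

argv = gdb_batch_command(["netxmsd", "-D", "9"], commands)
# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(argv))
```

The resulting list can also be passed to subprocess.run(argv) to capture the output in a file. The print commands are evaluated in the frame where the program stopped, so values that the compiler optimised away will still show as unavailable.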
Title: Re: Crashing after upgrade to 2.2.15
Post by: troffasky on June 10, 2019, 04:06:42 PM
Not sure whether "optimised out" means I'm not using the -dbg packages properly, but they are installed.


2019.06.10 14:00:38.335 *D* [poll.conf          ] Starting configuration poll for node analyzer.example.co.uk (ID: 1586)
2019.06.10 14:00:38.335 *D* [poll.conf          ] ConfPoll(analyzer.example.co.uk): checking for NetXMS agent Flags={02000000} DynamicFlags={00000003}
2019.06.10 14:00:38.335 *D* [poll.conf          ] ConfPoll(analyzer.example.co.uk): calling SnmpCheckCommSettings()

Thread 70 "$POLLERS/WRK" received signal SIGBUS, Bus error.
[Switching to Thread 0x7fffe46bd700 (LWP 13158)]
0x00007fffea4777ce in HandlerPhysicalPorts (var=<optimised out>, snmp=snmp@entry=0x7fffe2309500, arg=arg@entry=0x7fffe46a9d50) at sb.cpp:113
113     sb.cpp: No such file or directory.
(gdb) bt
#0  0x00007fffea4777ce in HandlerPhysicalPorts (var=<optimised out>, snmp=snmp@entry=0x7fffe2309500, arg=arg@entry=0x7fffe46a9d50) at sb.cpp:113
#1  0x00007ffff6871da7 in SnmpWalk (transport=transport@entry=0x7fffe2309500, rootOid=rootOid@entry=0x7fffe46a9ae0, rootOidLen=14,
    handler=handler@entry=0x7fffea477600 <HandlerPhysicalPorts(SNMP_Variable*, SNMP_Transport*, void*)>, userArg=userArg@entry=0x7fffe46a9d50, logErrors=<optimised out>, failOnShutdown=false) at util.cpp:334
#2  0x00007ffff6871eea in SnmpWalk (transport=transport@entry=0x7fffe2309500, rootOid=rootOid@entry=0x7fffea478310 L".1.3.6.1.4.1.9.6.1.101.53.3.1.5",
    handler=handler@entry=0x7fffea477600 <HandlerPhysicalPorts(SNMP_Variable*, SNMP_Transport*, void*)>, userArg=userArg@entry=0x7fffe46a9d50, logErrors=logErrors@entry=false, failOnShutdown=failOnShutdown@entry=false)
    at util.cpp:265
#3  0x00007fffea477a3a in CiscoSbDriver::getPhysicalPortLayout (this=<optimised out>, snmp=0x7fffe2309500, layout=0x7fffe46a9d50) at sb.cpp:126
#4  0x00007fffea477aa2 in CiscoSbDriver::getModuleLayout (this=<optimised out>, snmp=<optimised out>, attributes=<optimised out>, driverData=<optimised out>, module=1, layout=0x7fffe46ae340) at sb.cpp:217
#5  0x00007ffff7a22b5c in Node::confPollSnmp (this=0x7fffe6729000, rqId=0) at node.cpp:3486
#6  0x00007ffff7a2cfe4 in Node::configurationPoll (this=this@entry=0x7fffe6729000, pSession=pSession@entry=0x0, rqId=rqId@entry=0, poller=poller@entry=0x7fffec83c500, maskBits=maskBits@entry=0) at node.cpp:2724
#7  0x00007ffff7a2da93 in Node::configurationPoll (this=0x7fffe6729000, poller=0x7fffec83c500) at node.cpp:2652
#8  0x00007ffff79b9e41 in __ThreadPoolExecute_Wrapper_1<Node, PollerInfo*> (arg=0x7fffec80e160) at ../../../include/nms_threads.h:1215
#9  0x00007ffff74eb398 in ProcessSerializedRequests (data=0x7fffec80e180) at tp.cpp:466
#10 0x00007ffff74eb19b in WorkerThread (arg=0x7fffec803640) at tp.cpp:186
#11 0x00007ffff6e706db in start_thread (arg=0x7fffe46bd700) at pthread_create.c:463
#12 0x00007ffff6b9988f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) print module
$1 = (SB_MODULE_LAYOUT *) 0x7fffe46a9d50
(gdb) print *module
$2 = {index = 1, minIfIndex = 49, maxIfIndex = 107, rows = 2, columns = 12, interfaces = {49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 0 <repeats 20 times>, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
    0 <repeats 20 times>}}
(gdb) print row
$3 = <optimised out>
(gdb) print column
$4 = <optimised out>
(gdb) print *var
value has been optimised out
(gdb) print *requestprint *request
No symbol "requestprint" in current context.
(gdb) print *request
No symbol "operator*" in current context.
(gdb) print *response
$5 = {m_version = 1, m_command = 2, m_variables = 0x7fffe2323780, m_pEnterprise = 0x0, m_trapType = 0, m_specificTrap = 0, m_dwTimeStamp = 0, m_dwAgentAddr = 0, m_dwRqId = 1044, m_dwErrorCode = 0, m_dwErrorIndex = 0,
  m_msgId = 0, m_msgMaxSize = 65536, m_contextEngineId = '\000' <repeats 255 times>, m_contextEngineIdLen = 0, m_contextName = '\000' <repeats 255 times>, m_salt = "\000\000\000\000\000\000\000", m_reportable = true,
  m_flags = 0 '\000', m_authObject = 0x7fffe2301f30 "8eareWa7ching", m_authoritativeEngine = {m_id = '\000' <repeats 255 times>, m_idLen = 0, m_engineBoots = 0, m_engineTime = 0}, m_securityModel = 1,
  m_signature = '\000' <repeats 11 times>, m_signatureOffset = 0}
(gdb) quit
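
To make the print *module output easier to read, here is a small Python sketch that expands gdb's "<repeats N times>" shorthand and splits the interfaces array by row. The row stride is inferred from the dump itself (total entries divided by rows = 2); it is an assumption, not something taken from the NetXMS source. The function name expand_gdb_array is made up for illustration.

```python
import re

# One array item: an integer, optionally followed by gdb's repeat marker.
ITEM_RE = re.compile(r"(-?\d+)(?: <repeats (\d+) times>)?")


def expand_gdb_array(text):
    """Expand a gdb array dump, including '<repeats N times>' runs, into ints."""
    values = []
    for value, count in ITEM_RE.findall(text):
        values.extend([int(value)] * (int(count) if count else 1))
    return values


# The interfaces member exactly as printed by gdb above.
dump = (
    "{49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 0 <repeats 20 times>, "
    "61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 0 <repeats 20 times>}"
)

interfaces = expand_gdb_array(dump)
rows = 2  # "rows = 2" in the same dump
stride = len(interfaces) // rows  # ASSUMPTION: rows are stored back to back

print(f"{len(interfaces)} entries, {stride} per row")
for row in range(rows):
    used = [v for v in interfaces[row * stride:(row + 1) * stride] if v != 0]
    print(f"row {row + 1}: {used}")
```

Read this way, the array has 64 entries, with ifIndex 49 to 60 in the first row and 61 to 72 in the second. Note that maxIfIndex is already 107 at the time of the crash, but 107 does not appear anywhere in interfaces.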
Title: Re: Crashing after upgrade to 2.2.15
Post by: troffasky on September 16, 2019, 07:59:15 PM
Version 3 seems to have fixed this. Been up for 45 minutes now.