nxagentd v3.0.x crashes with signal 11 on CentOS 7.2.1511 64-bit

Started by scomoletti, November 07, 2019, 08:12:14 PM

scomoletti

I've tried nxagent-3.0.2258-linux-x86_64-static.tar.gz, nxagent-3.0.2357-linux-x86_64-static.tar.gz, and nxagent-3.0.2258-linux-x86_64-static.tar.gz, all with the same results: a configuration poll crashes the agent 90% of the time. On the one occasion the poll succeeded, the agent stayed up and polled DCIs correctly as long as I did not run another configuration poll, but as soon as I clicked on the 'user sessions' tab it crashed.

The server is NetXMS 3.0.2258 on CentOS 7.5.1804 and has been stable without issue.

abrt output with core and backtrace is attached.

Full configuration poll output follows:
[07.11.2019 13:00:03] **** Poll request sent to server ****
[07.11.2019 13:00:03] Poll request accepted
[07.11.2019 13:00:03] Starting configuration poll for node nj-prod-analytics
[07.11.2019 13:00:03] Capability reset
[07.11.2019 13:00:03] Checking node's capabilities...
[07.11.2019 13:00:03]    Checking NetXMS agent...
[07.11.2019 13:00:03]    Connectivity with NetXMS agent restored
[07.11.2019 13:00:03]    NetXMS agent version changed to 3.0.2357
[07.11.2019 13:00:03]    Platform name changed to Linux-x86_64
[07.11.2019 13:00:03]    System description changed to Linux nj-prod-analytics-1 3.10.0-327.28.2.el7.x86_64 #1 SMP Wed Aug 3 11:11:39 UTC 2016 x86_64
[07.11.2019 13:00:15] Capability check finished
[07.11.2019 13:00:15] Checking interface configuration...
[07.11.2019 13:00:15] Unable to get interface list from node
[07.11.2019 13:00:15]    Interface "unknown" is no longer exist
[07.11.2019 13:00:15] Interface configuration check finished
[07.11.2019 13:00:15] Checking node name
[07.11.2019 13:00:15] Node name is OK
[07.11.2019 13:00:15] Reading list of installed software packages
[07.11.2019 13:00:15] Unable to get information about installed software packages
[07.11.2019 13:00:15] Reading list of installed hardware components
[07.11.2019 13:00:15] Cannot read hardware component information
[07.11.2019 13:00:15] Finished configuration poll for node nj-prod-analytics
[07.11.2019 13:00:15] Node configuration was changed after poll
[07.11.2019 13:00:15] **** Poll completed successfully ****
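Note that the poll ends with "Poll completed successfully" even though three steps failed (interface list, software packages, hardware components), which would be consistent with the agent going away partway through. A minimal sketch for pulling the failed steps out of saved poll output is below. The function name `failed_steps` and the prefix list are illustrative assumptions based only on the wording seen in this output; other NetXMS versions may phrase failures differently.

```python
import re

# Prefixes taken from the failed steps in the poll output above.
# This is an assumption: other versions may word failures differently.
FAILURE_PREFIXES = ("Unable to", "Cannot")

# "[07.11.2019 13:00:15]    message" -> timestamp and message text
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s*(?P<msg>.*)$")


def failed_steps(poll_output: str) -> list[tuple[str, str]]:
    """Return (timestamp, message) for every poll line that reports a failure."""
    failures = []
    for line in poll_output.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group("msg").startswith(FAILURE_PREFIXES):
            failures.append((m.group("ts"), m.group("msg")))
    return failures


if __name__ == "__main__":
    sample = (
        "[07.11.2019 13:00:15] Checking interface configuration...\n"
        "[07.11.2019 13:00:15] Unable to get interface list from node\n"
        "[07.11.2019 13:00:15] Cannot read hardware component information\n"
        "[07.11.2019 13:00:15] **** Poll completed successfully ****\n"
    )
    for ts, msg in failed_steps(sample):
        print(ts, "|", msg)
```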


Debug log from the server at the time of the crash:
2019.11.07 18:00:03.226 *D* [agent.conn.48      ] Received message CMD_REQUEST_COMPLETED (4) from agent at 10.172.102.228
2019.11.07 18:00:03.226 *D* [client.session.2   ] Sending message CMD_POLLING_INFO (96 bytes)
2019.11.07 18:00:03.226 *D* [client.session.2   ] Message dump:
  ** 000000: 005A5000000000600000084D00000002 .ZP....`...M....
  ** 000010: 0000001C000000000000001700000000 ................
  ** 000020: 0000006C070000000000002D2020204E ...l.......-   N
  ** 000030: 6574584D53206167656E742076657273 etXMS agent vers
  ** 000040: 696F6E206368616E67656420746F2033 ion changed to 3
  ** 000050: 2E302E323335370D0A00000000000000 .0.2357.........
  ** code=0x005A (CMD_POLLING_INFO) version=5 flags=0x0000 id=2125 size=96 numFields=2
  ** 000000: [    28] INT32       23
  ** 000010: [   108] UTF8-STRING "   NetXMS agent version changed to 3.0.2357^M
"

2019.11.07 18:00:03.226 *D* [agent.conn.48      ] Sending message CMD_GET_PARAMETER (5) to agent at 10.172.102.228
2019.11.07 18:00:03.226 *D* [agent.conn.48      ] Received message CMD_REQUEST_COMPLETED (5) from agent at 10.172.102.228
2019.11.07 18:00:03.226 *D* [agent.conn.48      ] Sending message CMD_GET_PARAMETER (6) to agent at 10.172.102.228
2019.11.07 18:00:03.227 *D* [agent.conn.48      ] Received message CMD_REQUEST_COMPLETED (6) from agent at 10.172.102.228
2019.11.07 18:00:03.227 *D* [client.session.2   ] Sending message CMD_POLLING_INFO (88 bytes)
2019.11.07 18:00:03.227 *D* [client.session.2   ] Message dump:
  ** 000000: 005A5000000000580000084D00000002 .ZP....X...M....
  ** 000010: 0000001C000000000000001700000000 ................
  ** 000020: 0000006C070000000000002A20202050 ...l.......*   P
  ** 000030: 6C6174666F726D206E616D6520636861 latform name cha
  ** 000040: 6E67656420746F204C696E75782D7838 nged to Linux-x8
  ** 000050: 365F36340D0A0000 6_64....
  ** code=0x005A (CMD_POLLING_INFO) version=5 flags=0x0000 id=2125 size=88 numFields=2
  ** 000000: [    28] INT32       23
  ** 000010: [   108] UTF8-STRING "   Platform name changed to Linux-x86_64^M
"

2019.11.07 18:00:03.227 *D* [agent.conn.48      ] Sending message CMD_GET_PARAMETER (7) to agent at 10.172.102.228
2019.11.07 18:00:03.227 *D* [agent.conn.48      ] Received message CMD_REQUEST_COMPLETED (7) from agent at 10.172.102.228
2019.11.07 18:00:03.227 *D* [agent.conn.48      ] Sending message CMD_GET_PARAMETER (8) to agent at 10.172.102.228
2019.11.07 18:00:03.227 *D* [agent.conn.48      ] Received message CMD_REQUEST_COMPLETED (8) from agent at 10.172.102.228
2019.11.07 18:00:03.227 *D* [client.session.2   ] Sending compressed message CMD_POLLING_INFO (168 bytes)
2019.11.07 18:00:03.227 *D* [client.session.2   ] Message dump:
  ** 000000: 005A5040000000A80000084D00000002 .ZP@.......M....
  ** 000010: 000000B078DA2D8EB10AC2300044B309 ....x.-....0.D..
  ** 000020: 82B38BC381734293485BDDC45541A8E2 .....sB.H[..UA..
  ** 000030: 28A109355293D2A4D08EFEB985781CBC (..5R........x..
  ** 000040: E5711C21644352D67FB68BC42F806A0A .q.!dCR...../.j.
  ** 000050: D17CA04DA87BDB45EB1DEA97728DD188 .|.M.{.E....r...
  ** 000060: 1E67EB8611EE4DBBDE6BAA9C6AA768EB .g....M..k..j.h.
  ** 000070: 403924E319CBA81405132513CCB4051B @9$.......%.....
  ** 000080: CBFC99EFB0E5A82E573CE689E3D04082 ........W<....@.
  ** 000090: F3C35CB9C7FD7682C8788E64AE96F389 ..\...v..x.d....
  ** 0000A0: 1F816E2622000000 ..n&"...
  ** code=0x005A (CMD_POLLING_INFO) version=5 flags=0x0040 id=2125 size=168 numFields=2
  ** 000000: [    28] INT32       23
  ** 000010: [   108] UTF8-STRING "   System description changed to Linux nj-prod-analytics-1 3.10.0-327.28.2.el7.x86_64 #1 SMP Wed Aug 3 11:11:39 UTC 2016 x86_64^M
"

2019.11.07 18:00:03.228 *D* [agent.conn.48      ] Sending message CMD_GET_PARAMETER (9) to agent at 10.172.102.228
2019.11.07 18:00:03.228 *D* [agent.conn.48      ] Received message CMD_REQUEST_COMPLETED (9) from agent at 10.172.102.228
2019.11.07 18:00:03.228 *D* [agent.conn.48      ] Sending message CMD_GET_PARAMETER_LIST (10) to agent at 10.172.102.228
2019.11.07 18:00:03.229 *D* [                   ] SQL request queued: DELETE FROM alarm_events WHERE alarm_id=67
2019.11.07 18:00:03.229 *D* [db.cpool           ] Handle 0x7f17f4021c20 acquired (call from dbwrite.cpp:257)
2019.11.07 18:00:03.230 *D* [db.cpool           ] Handle 0x7f17f40284c0 released
2019.11.07 18:00:03.230 *D* [event.proc         ] Event 251 with code 32 passed event processing policy
2019.11.07 18:00:03.230 *D* [                   ] NetObj::expandText(sourceObject=179 template='Node status changed to WARNING' alarm=0 event=252)
2019.11.07 18:00:03.230 *D* [event.corr         ] CorrelateEvent: event SYS_NODE_WARNING id 252 source nj-prod-analytics [179]
2019.11.07 18:00:03.230 *D* [db.cpool           ] Handle 0x7f17f4001220 acquired (call from evproc.cpp:113)
2019.11.07 18:00:03.230 *D* [event.corr         ] CorrelateEvent: finished, rootId=0
2019.11.07 18:00:03.230 *D* [event.proc         ] EVENT SYS_NODE_WARNING [7] (ID:252 F:0x0001 S:1 TAGS:"NodeStatus") FROM nj-prod-analytics: Node status changed to WARNING
2019.11.07 18:00:03.230 *D* [event.policy       ] EPP: processing event 252
2019.11.07 18:00:03.230 *D* [event.proc         ] Event 252 with code 7 passed event processing policy
2019.11.07 18:00:03.231 *D* [db.cpool           ] Handle 0x7f17f4021c20 released
2019.11.07 18:00:03.234 *D* [event.proc         ] EventLogger: DBExecute: id=251,code=32
2019.11.07 18:00:03.237 *D* [event.proc         ] EventLogger: DBExecute: id=252,code=7
2019.11.07 18:00:03.385 *D* [agent.conn.48      ] AgentConnection::ReceiverThread(): communication channel shutdown
2019.11.07 18:00:03.385 *D* [agent.conn.48      ] Receiver loop terminated
2019.11.07 18:00:03.385 *D* [agent.conn.47      ] AgentConnection::ReceiverThread(): communication channel shutdown
2019.11.07 18:00:03.385 *D* [agent.conn.47      ] Receiver loop terminated
2019.11.07 18:00:03.385 *D* [agent.conn.47      ] Closing communication channel
2019.11.07 18:00:03.385 *D* [agent.conn.47      ] Receiver thread stopped
2019.11.07 18:00:03.385 *D* [agent.conn.48      ] Closing communication channel
2019.11.07 18:00:03.385 *D* [agent.conn.48      ] Receiver thread stopped
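For anyone reading the message dumps above by hand: the first 16 bytes of each dump line up with the decoded summary line the server prints (code, version, flags, size, id, numFields). The sketch below decodes that header. The layout (big-endian 16-bit code, 16-bit word holding the version in its top 4 bits and the flags in the low 12, then three 32-bit fields) is inferred from the dumps in this post, not taken from NetXMS documentation, and `decode_header` is a hypothetical helper name.

```python
import struct

# Assumed layout, inferred from the dumps in this thread only:
# code (2 bytes), version+flags (2 bytes), size (4), id (4), numFields (4),
# all big-endian.
HEADER = struct.Struct(">HHIII")


def decode_header(hex_line: str) -> dict:
    """Decode the first 16 bytes of a message dump given as a hex string."""
    raw = bytes.fromhex(hex_line)
    code, flags_word, size, msg_id, num_fields = HEADER.unpack_from(raw)
    return {
        "code": code,
        "version": flags_word >> 12,    # top 4 bits (assumption)
        "flags": flags_word & 0x0FFF,   # low 12 bits (assumption)
        "size": size,
        "id": msg_id,
        "numFields": num_fields,
    }


if __name__ == "__main__":
    # First line of the uncompressed and the compressed dump, respectively.
    for line in ("005A5000000000600000084D00000002",
                 "005A5040000000A80000084D00000002"):
        h = decode_header(line)
        print(f"code=0x{h['code']:04X} version={h['version']} "
              f"flags=0x{h['flags']:04X} id={h['id']} "
              f"size={h['size']} numFields={h['numFields']}")
```

Under these assumptions the only difference between the two headers, apart from size, is the 0x0040 flag on the message the server logs as compressed.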


Victor Kirhenshtein

Please provide the agent configuration file and the configure options used for the build.

Best regards,
Victor

scomoletti

The agents were downloaded from https://netxms.org/download/releases/3.0/, not built locally in my environment. The specific filenames are at the beginning of my first message.
The agent was started as "nxagentd -d -r 10.172.102.27"

Initial config file:
MasterServers = 10.172.102.28, 10.172.102.27
LogFile = /var/log/nxagentd.log
FileStore = /var/nxagentd


This is the method I've used without issue in 2.2.x. That said, I'm not opposed to compiling a static agent myself. The agent running on my NetXMS server, which I compiled myself and which is dynamically linked, seems stable.

Victor Kirhenshtein

Sorry, I missed the first part with the file names. I will check the static builds; this looks like an issue specific to either static linking or the non-UNICODE build. Usually you do not need static agents, though: the generic Linux agent (nxagent-3.0.2357-linux-x86_64.tar.gz) should work as well.

Best regards,
Victor