Hello
We have been experiencing netxms crashes since the upgrade to 1.2.17. The crash generally happens after about 10 minutes of running. Setting up the server for CrashDumpLog does not produce a log file for the crash, and running the server at debug level 9 gives no clear indication before netxms stops. We have tried this on a Windows instance and on a Linux instance, and both appear to have the same issue. The only indication of a problem so far is the following:
[03-Dec-2014 08:57:48.626] [ERROR] SQL query failed (Query = "INSERT INTO alarm_events (alarm_id,event_id,event_code,event_name,severity,source_object_id,event_timestamp,message) VALUES (?,?,?,?,?,?,?,?)"): Duplicate entry '405363-30146854' for key 'PRIMARY'
[03-Dec-2014 08:57:48.639] [ERROR] SQL query failed (Query = "INSERT INTO alarm_events (alarm_id,event_id,event_code,event_name,severity,source_object_id,event_timestamp,message) VALUES (?,?,?,?,?,?,?,?)"): Duplicate entry '405363-30146855' for key 'PRIMARY'
[03-Dec-2014 08:57:48.668] [ERROR] SQL query failed (Query = "INSERT INTO alarm_events (alarm_id,event_id,event_code,event_name,severity,source_object_id,event_timestamp,message) VALUES (?,?,?,?,?,?,?,?)"): Duplicate entry '405941-30146859' for key 'PRIMARY'
[03-Dec-2014 08:57:48.695] [ERROR] SQL query failed (Query = "INSERT INTO alarm_events (alarm_id,event_id,event_code,event_name,severity,source_object_id,event_timestamp,message) VALUES (?,?,?,?,?,?,?,?)"): Duplicate entry '405941-30146862' for key 'PRIMARY'
[03-Dec-2014 08:57:48.714] [ERROR] SQL query failed (Query = "INSERT INTO alarm_events (alarm_id,event_id,event_code,event_name,severity,source_object_id,event_timestamp,message) VALUES (?,?,?,?,?,?,?,?)"): Duplicate entry '405363-30146864' for key 'PRIMARY'
[03-Dec-2014 08:57:48.725] [ERROR] SQL query failed (Query = "INSERT INTO alarm_events (alarm_id,event_id,event_code,event_name,severity,source_object_id,event_timestamp,message) VALUES (?,?,?,?,?,?,?,?)"): Duplicate entry '406032-30146865' for key 'PRIMARY'
[03-Dec-2014 08:57:48.737] [ERROR] SQL query failed (Query = "INSERT INTO alarm_events (alarm_id,event_id,event_code,event_name,severity,source_object_id,event_timestamp,message) VALUES (?,?,?,?,?,?,?,?)"): Duplicate entry '405363-30146866' for key 'PRIMARY'
We have made no changes to the database structure. Deleting the entire content of the alarm_events table makes no difference, and 'nxdbmgr check' passes all of its checks.
Not too sure what the next step should be. It is also not clear to me why the BIGINT is shown as a hyphenated number.
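On the hyphenated value: it is probably not a single BIGINT. MySQL reports a duplicate on a multi-column key as the column values joined by a hyphen, and the INSERT above lists alarm_id and event_id as its first two columns. Assuming the primary key of alarm_events is (alarm_id, event_id), which nothing in this thread confirms, the entry can be read as in the sketch below. The struct and function names are made up for illustration and are not part of NetXMS.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper types, not part of NetXMS.
struct AlarmEventKey
{
   uint32_t alarmId;   // assumed first key column (alarm_id)
   uint64_t eventId;   // assumed second key column (event_id)
   bool valid;         // false if the text has no usable hyphen
};

// Splits the text from a "Duplicate entry '...'" message at the first
// hyphen, assuming a two-column key of non-negative integers.
AlarmEventKey SplitDuplicateEntry(const std::string &entry)
{
   AlarmEventKey key = { 0, 0, false };
   std::string::size_type pos = entry.find('-');
   if (pos == std::string::npos || pos == 0 || pos + 1 >= entry.size())
      return key;
   key.alarmId = (uint32_t)std::strtoul(entry.substr(0, pos).c_str(), NULL, 10);
   key.eventId = std::strtoull(entry.substr(pos + 1).c_str(), NULL, 10);
   key.valid = true;
   return key;
}
```

Read this way, '405363-30146854' would be alarm 405363 paired with event 30146854, so the same alarm appears with several different events in the log above.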
Regards
Aron
			
			
			
Hi,
if you have a Linux installation, can you please run it under gdb?
The commands would be like this:
gdb /path/to/netxmsd
This will show the (gdb) prompt. Then type
run -D5
The server will run in the foreground. When the crash happens, the (gdb) prompt will be shown again. Type
bt
and send me the output.
Best regards,
Victor
			
			
			
Hello
I found a core dump and have a backtrace from it. I am still waiting on the gdb run to provide a live version:
(gdb) bt
#0  Node::topologyPoll (this=0xcbd8090, pSession=0x0, dwRqId=0, nPoller=115) at node.cpp:5329
#1  0xb7666ba1 in TopologyPoller (arg=0x73) at poll.cpp:586
#2  0xb7335d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#3  0xb6f999de in clone () from /lib/i386-linux-gnu/libc.so.6
(gdb) bt full
#0  Node::topologyPoll (this=0xcbd8090, pSession=0x0, dwRqId=0, nPoller=115) at node.cpp:5329
        peerNode = 0x0
        ifaceFound = <optimized out>
        iface = 0x18eb61d8
        i = <optimized out>
        fdb = <optimized out>
        nbs = 0xa26a7f60
#1  0xb7666ba1 in TopologyPoller (arg=0x73) at poll.cpp:586
        node = 0xcbd8090
        szBuffer = L"poll: BLC-SW1 [110]\000]\000 [50574]\000]\000\000\060]\000]", '\000' <repeats 89 times>
#2  0xb7335d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
No symbol table info available.
#3  0xb6f999de in clone () from /lib/i386-linux-gnu/libc.so.6
No symbol table info available.
(gdb) thread apply all bt
A cursory look at the node does not suggest any particular issue.
The crash seems to be unconnected with the other SQL errors, which are still happening when running with -D5:
[03-Dec-2014 11:40:44.198] [ERROR] SQL query failed (Query = "INSERT INTO alarm_events (alarm_id,event_id,event_code,event_name,severity,source_object_id,event_timestamp,message) VALUES (?,?,?,?,?,?,?,?)"): Duplicate entry '405363-30157884' for key 'PRIMARY'
[Thread 0xa5107b40 (LWP 28681) exited]
[03-Dec-2014 11:40:44.272] [DEBUG] EVENT 52 (ID:30157885 F:0x0001 S:4 TAG:"") FROM Ldn-NetXMS: Database query failed (Query: INSERT INTO alarm_events (alarm_id,event_id,event_code,event_name,severity,source_object_id,event_timestamp,message) VALUES (?,?,?,?,?,?,?,?); Error: Duplicate entry '405363-30157075' for key 'PRIMARY')
[Thread 0x824fcb40 (LWP 27788) exited]
Regards
Aron
			
			
			
Hi,
it's a bug that is already fixed in the development branch (it will be released as 2.0-M1 soon). In the meantime you can patch the server code manually if you are building it from source:
In the file src/server/core/node.cpp find the line
             Node *peerNode = (Node *)FindObjectById(iface->getPeerNodeId(), OBJECT_NODE);
(should be around line 5329). Immediately after that line add the following code:
            if (peerNode == NULL)
            {
               iface->clearPeer();
               continue;
            }
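For readers without the source at hand, here is a minimal self-contained sketch of what the guard does: when the lookup returns NULL because the peer node no longer exists, the stale peer reference is cleared and the loop moves on instead of dereferencing the pointer. Node, Interface, FindNodeById and CollectPeerNames below are simplified stand-ins written for this example. They are not the real NetXMS classes, and only the getPeerNodeId/clearPeer names are taken from the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only, NOT the NetXMS classes.
struct Node
{
   uint32_t id;
   std::string name;
};

class Interface
{
public:
   explicit Interface(uint32_t peerNodeId) : m_peerNodeId(peerNodeId) {}
   uint32_t getPeerNodeId() const { return m_peerNodeId; }
   void clearPeer() { m_peerNodeId = 0; }   // forget the stale peer
private:
   uint32_t m_peerNodeId;
};

// Stand-in for the object lookup: returns NULL when the id is unknown.
Node *FindNodeById(std::map<uint32_t, Node> &nodes, uint32_t id)
{
   std::map<uint32_t, Node>::iterator it = nodes.find(id);
   return (it != nodes.end()) ? &it->second : NULL;
}

// Walks the interfaces and collects the names of peers that still exist.
std::vector<std::string> CollectPeerNames(std::map<uint32_t, Node> &nodes,
                                          std::vector<Interface> &ifaces)
{
   std::vector<std::string> names;
   for(size_t i = 0; i < ifaces.size(); i++)
   {
      Interface *iface = &ifaces[i];
      Node *peerNode = FindNodeById(nodes, iface->getPeerNodeId());
      if (peerNode == NULL)
      {
         // Same guard as in the patch: drop the dangling reference
         // and skip this interface rather than use a NULL pointer.
         iface->clearPeer();
         continue;
      }
      names.push_back(peerNode->name);   // safe: peerNode is not NULL here
   }
   return names;
}
```

This matches the backtrace above, where peerNode = 0x0 in Node::topologyPoll.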
Best regards,
Victor
			
			
			
Thank you,
Will apply the fix now.
Regards
Aron