XMPP delays.

Started by ofca, December 28, 2014, 01:27:53 AM

Previous topic - Next topic

ofca

I've successfully setup XMPP alerting (no thanks to the documentation, which is quite lacking in this department.)
Big kudos for supporting XMPP out of the box :)

However, there are long delays between log information about sending message and effectively sending the message. Did I do something wrong? I'm sending the message via action configured in event processing policy.

Victor Kirhenshtein

I didn't experience any delay myslef when I was testing this feature. Can you please enable debug logging (level 6) and check log messages started with XMPP:?

Best regards,
Victor

ofca

Relevant log entries below.
Please note, that almost 2 minutes passed between posting the message (15:34:51) and effectively sending it out (15:36:37).
Any ideas?


[28-Dec-2014 15:34:51.911] [DEBUG] Finished status poll for node 999999-TestNode (ID: 323)
[28-Dec-2014 15:34:51.911] [DEBUG] *actions* Executing action 1 (XMPP Test) of type XMPP MESSAGE
[28-Dec-2014 15:34:51.911] [DEBUG] *actions* Sending XMPP message to [JID1];[JID2]: "999999-TestNode - Node down"
[28-Dec-2014 15:35:01.435] [DEBUG] XMPP: RECV: <presence id="7QoiO-274" to="[NetXMS-JID]/20892836741419736745627465" from="[JID2]/androidO2Mp615A"><status/><priority>11</priority><show>away</show><c hash="sha-1" xmlns="http://jabber.org/protocol/caps" ver="xYEd+1ZdePfGl3AaJ23FB7rizRg=" node="http://www.igniterealtime.org/projects/smack/"/></presence>
[28-Dec-2014 15:35:26.959] [DEBUG] Node 323 "999999-TestNode" queued for status poll
[28-Dec-2014 15:35:26.960] [DEBUG] Starting status poll for node 999999-TestNode (ID: 323)
[28-Dec-2014 15:35:26.960] [DEBUG] StatusPoll(999999-TestNode): check SNMP
[28-Dec-2014 15:35:28.961] [DEBUG] StatusPoll(999999-TestNode): SNMP check finished
[28-Dec-2014 15:35:46.964] [DEBUG] StatusPoll(999999-TestNode): bAllDown=true, dynFlags=0x00001015
[28-Dec-2014 15:35:46.964] [DEBUG] Finished status poll for node 999999-TestNode (ID: 323)
[28-Dec-2014 15:36:17.055] [DEBUG] Node 323 "999999-TestNode" queued for status poll
[28-Dec-2014 15:36:17.055] [DEBUG] Starting status poll for node 999999-TestNode (ID: 323)
[28-Dec-2014 15:36:17.055] [DEBUG] StatusPoll(999999-TestNode): check SNMP
[28-Dec-2014 15:36:19.058] [DEBUG] StatusPoll(999999-TestNode): SNMP check finished
[28-Dec-2014 15:36:37.063] [DEBUG] StatusPoll(999999-TestNode): bAllDown=true, dynFlags=0x00001015
[28-Dec-2014 15:36:37.063] [DEBUG] Finished status poll for node 999999-TestNode (ID: 323)
[28-Dec-2014 15:36:37.324] [DEBUG] XMPP: SENT: <message to="[JID1]" type="chat"><body>999999-TestNode - Node down</body></message>
[28-Dec-2014 15:36:37.324] [DEBUG] XMPP: SENT: <message to="[JID2]" type="chat"><body>999999-TestNode - Node down</body></message>
[28-Dec-2014 15:36:37.324] [DEBUG] CorrelateEvent: event SYS_SNMP_UNREACHABLE id 2313744 source 999999-TestNode [323]

Victor Kirhenshtein

So far looks like problem inside XMPP library we are using (libstrophe). I'll try to setup XMPP on my test system and reproduce this issue.

Best regards,
Victor

ofca

#4
Just a note, that sending XMPP messages holds up the event processing thread. 7 events with xmpp action held up the queue for 10 minutes and 'show queues' on server console shown 206 waiting events.

tcpdump shows no communication with xmpp server until the exact moment the message is actually sent, so it looks like the problem is internal to netxms.

Oh, and it may be worth mentioning, that server console over xmpp works just fine, since above mentioned 'show queues' results were gathered via xmpp, so it would seem, like the library itself isn't the culprit.

Victor Kirhenshtein

Thanks for update, I'll look more carefully into XMPP action then.

Best regards,
Victor

Victor Kirhenshtein

Still seems that problem is with XMPP library. XMPP sender action waits on mutex to acquire XMPP handle. On what operating system your server is running? If it is Linux, can you attach gdb to server process after

*actions* Sending XMPP message to

line appears in log but before actual message is sent, and create dump of all threads using thread apply all bt command?

Best regards,
Victor

volkoff

I have the same problem.
Server 2008 R2.
Event queue is frozen for almost 10 minutes.
XMPP user subscribed and server answers to console command instantly even during "freeze".
Log fragments:

[27-Jan-2015 11:46:04.231] [DEBUG] Event 402 match EPP rule 15
[27-Jan-2015 11:46:04.231] [DEBUG] AlarmManager: adding new active alarm, current alarm count 14
[27-Jan-2015 11:46:04.231] [DEBUG] *actions* Executing action 2 (Send XMPP message to ***) of type XMPP MESSAGE
[27-Jan-2015 11:46:04.231] [DEBUG] *actions* Sending XMPP message to v***: "Test: Threshold reached for data collection item "Toner left" (Parameter: Dummy; Threshold value: 10; Actual value: 7.000000)"
...
[27-Jan-2015 11:47:21.856] [DEBUG] XMPP: RECV: <message id="***" to="e***" type="chat" from="v***"><body>Show wat</body><thread>**</thread><request xmlns="urn:xmpp:receipts"/></message>
[27-Jan-2015 11:47:21.856] [DEBUG] XMPP: Incoming message from v***: Show wat
[27-Jan-2015 11:47:21.856] [DEBUG] User admin authenticated for XMPP commands
[27-Jan-2015 11:47:21.856] [DEBUG] XMPP: SENT: <message to="v***" type="chat"><body>Thread                                           Interval Status
----------------------------------------------------------------------------
Item Poller                                      20       Running
Syncer Thread                                    130      Running
Poll Manager                                     60       Running

</body></message>
...
[27-Jan-2015 11:54:28.869] [DEBUG] XMPP: RECV: <message id="***" to="e***" type="chat" from="v***"><body>Show que</body><thread>***</thread><request xmlns="urn:xmpp:receipts"/></message>
[27-Jan-2015 11:54:28.869] [DEBUG] XMPP: Incoming message from v***: Show que
[27-Jan-2015 11:54:28.869] [DEBUG] User admin authenticated for XMPP commands
[27-Jan-2015 11:54:28.869] [DEBUG] XMPP: SENT: <message to="v***" type="chat"><body>Condition poller                 : 0
Configuration poller             : 0
Topology poller                  : 0
Data collector                   : 0
DCI cache loader                 : 0
Database writer                  : 1
Database writer (IData)          : 0
Database writer (raw DCI values) : 0
Event processor                  : 5
Network discovery poller         : 0
Node poller                      : 0
Routing table poller             : 0
Status poller                    : 0
Syslog processing                : 0
Syslog writer                    : 0

</body></message>
...
[27-Jan-2015 11:54:33.783] [DEBUG] XMPP: SENT: <message to="v***" type="chat"><body>Test: Threshold reached for data collection item &quot;Toner left&quot; (Parameter: Dummy; Threshold value: 10; Actual value: 7.000000)</body></message>
[27-Jan-2015 11:54:33.783] [DEBUG] Event 402 with code 17 passed event processing policy
[27-Jan-2015 11:54:33.783] [DEBUG] CorrelateEvent: event SYS_NODE_WARNING id 403 source Kyocera TA 181 [146]
[27-Jan-2015 11:54:33.783] [DEBUG] Database Connection Pool: handle 0000000000FBA400 acquired (call from .\evproc.cpp:103)
[27-Jan-2015 11:54:33.783] [DEBUG] CorrelateEvent: finished, rootId=0

ofca

I'll probably get to debugging with gdb this weekend, but if you could do it sooner then go ahead. Details are in Victor's post.

volkoff

Quote from: Victor Kirhenshtein on January 07, 2015, 12:57:39 AM
Still seems that problem is with XMPP library. XMPP sender action waits on mutex to acquire XMPP handle. On what operating system your server is running? If it is Linux, can you attach gdb to server process after
Quite unlikely. I did a quick review of the code. Mutex (which actually is a critical section) object is never passed to the XMPP library.
Remove the MutexLock/MutexUnlock in SendXMPPMessage since it looks single-threaded anyway and we'll see if it helps. Not very many of us, XMPP users :)

ofca

Yeah, that's sad. People use pointless things like google/facebook chat, and these aren't xmpp-compatible because 'why not'. I'm only missing http://xmpp.org/extensions/xep-0280.html - next to no client implements it.

djex81

Has any progress been made on this issue?

I am having this exact issue as well. ~5 min delay with the event processor stuck at 1 until the message is received in the XMPP client then it clears to 0. Sending commands to netxms via XMPP works instantly.

Windows Server 2008 R2 Standard.
NetXMS version 2.0-M3

What XMPP server is everyone using?


ofca

I'm using ejabberd, not that it matters. Since everybody using XMPP seeem to have this problem, I strongly believe that main problem with XMPP support in NetXMS is Victor not using XMPP :)

djex81

Quote from: ofca on April 10, 2015, 03:49:46 AM
I'm using ejabberd, not that it matters. Since everybody using XMPP seeem to have this problem, I strongly believe that main problem with XMPP support in NetXMS is Victor not using XMPP :)

Yea I'm using talk.google.com for now. I used XMPP in place of SMS notifications on my mobile phone. Works well besides the 5 - 7 min delay of the event processor in netxms. Maybe I'll work on a quick external program to send the XMPP alerts until this is fixed. Are you using Windows or Linux server for your netxms server?

ofca