Table DCI fails when the result has a lot of rows

Started by civieroc, April 08, 2016, 11:43:12 AM

civieroc

Hi all,

I have a table DCI. When the result has more than, for example, 150 rows, the agent can't retrieve the data. It retries the operation with the same result, and in the end the agent restarts itself and then skips the table DCI.

This is with NetXMS version 1.7. Do you know if this problem is resolved in version 2.3?

Thanks a lot!
Claudio.

civieroc

Hi, I updated NetXMS to version 2.0.3, but the problem is the same.

I lose the result of the table DCI query, not for all my queries, but for a query that returns a lot of rows (e.g. 150 rows).
It does not happen every time, but too often, so the result data is not reliable.

I started the agent from the command line (cmd) with the log at level 5, and I could see that the agent requests the query but the result is an error:

MSGRECV_CLOSED

and the result list of the query table is empty.

civieroc

The log (query names and IP addresses are fictional):

Can somebody help me? Many thanks!

[13-Apr-2016 09:45:28.649] [session:6] Requesting table "query_test1"
[13-Apr-2016 09:45:32.643] [session:6] Message receiving error (MSGRECV_CLOSED)
[13-Apr-2016 09:45:32.643] Incoming connection from 192.168.7.1
[13-Apr-2016 09:45:32.643] Connection from 192.168.7.1 accepted
[13-Apr-2016 09:45:32.659] [session:7] Server ID set to 2B865D6C80000D2D
[13-Apr-2016 09:45:32.659] 0 SNMP targets received from server 2B865D6C80000D2D
[13-Apr-2016 09:45:32.659] 0 data collection elements received from server 2B865D6C80000D2D
[13-Apr-2016 09:45:32.659] Data collection for server 2B865D6C80000D2D reconfigured
[13-Apr-2016 09:45:32.659] [session:7] Requesting table "query_test1"
[13-Apr-2016 09:45:33.548] [session:0] Session with 192.168.7.1 closed
[13-Apr-2016 09:45:36.653] [session:7] Message receiving error (MSGRECV_CLOSED)
[13-Apr-2016 09:45:36.653] Incoming connection from 192.168.7.1
[13-Apr-2016 09:45:36.653] Connection from 192.168.7.1 accepted
[13-Apr-2016 09:45:36.668] [session:0] Server ID set to 2B865D6C80000D2D
[13-Apr-2016 09:45:36.668] 0 SNMP targets received from server 2B865D6C80000D2D
[13-Apr-2016 09:45:36.668] 0 data collection elements received from server 2B865D6C80000D2D
[13-Apr-2016 09:45:36.668] Data collection for server 2B865D6C80000D2D reconfigured
[13-Apr-2016 09:45:36.668] [session:0] Requesting table "query_test1"
[13-Apr-2016 09:45:37.011] [session:6] Session with 192.168.7.1 closed
[13-Apr-2016 09:45:37.105] [session:1] Session with 192.168.7.1 closed
[13-Apr-2016 09:45:40.662] [session:0] Message receiving error (MSGRECV_CLOSED)
[13-Apr-2016 09:45:40.662] Incoming connection from 192.168.7.1
[13-Apr-2016 09:45:40.662] Connection from 192.168.7.1 accepted
[13-Apr-2016 09:45:40.662] [session:1] Server ID set to 2B865D6C80000D2D
[13-Apr-2016 09:45:40.678] 0 SNMP targets received from server 2B865D6C80000D2D
[13-Apr-2016 09:45:40.678] 0 data collection elements received from server 2B865D6C80000D2D
[13-Apr-2016 09:45:40.678] Data collection for server 2B865D6C80000D2D reconfigured
[13-Apr-2016 09:45:40.678] [session:1] Requesting parameter "query_test2"
[13-Apr-2016 09:45:44.671] [session:1] Message receiving error (MSGRECV_CLOSED)

Victor Kirhenshtein

Hi,

It seems that the query runs for too long and the server resets the connection to the agent because of the request timeout. You can try increasing the timeout by changing AgentCommandTimeout on the server (it is in milliseconds), but note that too large a timeout may cause delays in data collection.

Best regards,
Victor
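
As a rough check of the timeout explanation, the request and error lines from the log above can be paired up per session and the gap between them measured. This is only a sketch in Python: the helper names (parse_ts, request_to_error_gaps) are made up for illustration, and the timestamp format is assumed to be exactly as shown in the log.

```python
from datetime import datetime

# Request/error line pairs copied from the agent log posted above.
LOG = """\
[13-Apr-2016 09:45:28.649] [session:6] Requesting table "query_test1"
[13-Apr-2016 09:45:32.643] [session:6] Message receiving error (MSGRECV_CLOSED)
[13-Apr-2016 09:45:32.659] [session:7] Requesting table "query_test1"
[13-Apr-2016 09:45:36.653] [session:7] Message receiving error (MSGRECV_CLOSED)
[13-Apr-2016 09:45:36.668] [session:0] Requesting table "query_test1"
[13-Apr-2016 09:45:40.662] [session:0] Message receiving error (MSGRECV_CLOSED)
"""


def parse_ts(line):
    """Parse the leading [dd-Mon-yyyy HH:MM:SS.mmm] timestamp of a log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S.%f")


def request_to_error_gaps(log_text):
    """Return seconds between each request and the error on the same session."""
    started = {}  # session id -> time of the last request seen on it
    gaps = []
    for line in log_text.splitlines():
        if "[session:" not in line:
            continue
        session = line.split("[session:")[1].split("]")[0]
        if "Requesting" in line:
            started[session] = parse_ts(line)
        elif "MSGRECV_CLOSED" in line and session in started:
            delta = parse_ts(line) - started.pop(session)
            gaps.append(delta.total_seconds())
    return gaps


if __name__ == "__main__":
    for gap in request_to_error_gaps(LOG):
        print(f"{gap:.3f} s between request and MSGRECV_CLOSED")
```

Every gap comes out just under 4 seconds, which is what one would expect if the connection is being closed by a fixed request timeout rather than by a failure at a random point in the query.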

civieroc

Many thanks Victor! You are very kind! :)

OK, there were two problems. The first was the timeout of 4000 ms; my query takes 8 seconds to run. But there is another problem:

My query returns too much data for the agent: 150 rows.
If I limit it to the top 20 rows it works, but I need the full result.

Now, with the agent running in cmd (and AgentCommandTimeout set to 90000 ms for testing), I get this log message (1 minute and 1 second after the table query request):

Session disconnected by timeout (last activity timeout is ....)

If I test my query with nxget.exe using this command:
nxget.exe -l ipmymachine -T -w 30 query_test

I get this result:

Received too large message CMD_REQUEST_COMPLETED (15703016 bytes)
408: Request timeout

Is it possible to increase the message limit?

Thanks,
Claudio.
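
For a sense of scale, the size reported by nxget above can be divided by the row count. This is only back-of-the-envelope arithmetic in Python, assuming the test query really returned about 150 rows; the function name is made up for illustration, and the calculation says nothing about what the agent's actual message limit is.

```python
MESSAGE_BYTES = 15703016  # size reported by nxget for CMD_REQUEST_COMPLETED
ROWS = 150                # assumed row count of the query result


def bytes_per_row(total_bytes, rows):
    """Average message size per table row."""
    return total_bytes / rows


if __name__ == "__main__":
    per_row = bytes_per_row(MESSAGE_BYTES, ROWS)
    print(f"total:   {MESSAGE_BYTES / 1024 ** 2:.2f} MiB")
    print(f"per row: {per_row / 1024:.1f} KiB")
```

That works out to roughly 15 MiB in total and about 100 KiB per row, so it may be worth checking whether the query returns very wide columns or more rows than expected.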


civieroc