Data Collector queue size

Started by matthias, March 31, 2016, 03:26:43 PM


matthias

Hello guys,

Our data collector queue fluctuates between 3000 and 5000 items. Is that normal, or should it be closer to 0? If it is too high, do you have any idea how we can fine-tune the settings to increase data collection throughput?

Thanks!

Kind regards,

Matthias

tomaskir

What version of NetXMS are you running?
How many nodes and how many DCIs in the system?

'Data collection poller's request queue for last minute' should always be pretty low, and should definitely not keep growing for longer periods.

matthias

Hi,

The version is 2.0.2.
Total objects: 61187
Monitored nodes: 1263
DCIs: 10430

'Data collection poller's request queue for last minute' isn't growing, but it keeps going up and down between 3000 and 5000.

tomaskir

In 'Server Configuration', what are your settings for:
'PollerThreadPoolBaseSize'
'PollerThreadPoolMaxSize'
'NumberOfDataCollectors'

What do the graphs for the 'Server thread pool POLLERS' DCIs on the NetXMS server look like?

matthias

'PollerThreadPoolBaseSize' = 10
'PollerThreadPoolMaxSize' = 500 (I increased this from 250 today to troubleshoot the issue)
'NumberOfDataCollectors' = 512 (I increased this from 256 today to troubleshoot the issue)

I don't have those DCIs on the NetXMS server. I am going to create them now. Which ones should I create? Current Size and Load?
When I do show threads in the console I get the following value for POLLERS:


tomaskir

OK, so after a bit of digging: the poller thread pool is not related to your issue.
The poller thread pool is used for all polls (topology, status, etc.).

NumberOfDataCollectors is the number of threads used for data collection.
There shouldn't be a need to increase it above 25 for an installation like yours.

What could cause this is an agent/SNMP timeout that is set too high, combined with multiple DCIs timing out.
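
As a rough back-of-the-envelope illustration of why timeouts matter so much here (this is not NetXMS code, just a sketch using Little's law: busy threads = request rate x average time per request). The DCI count comes from this thread; the 60 s polling interval, the 50 ms collection time, the 5 s timeout, the 10% timeout share and the headroom factor are all assumptions picked for illustration.

```python
import math


def collectors_needed(dci_count, polling_interval_s, avg_collect_time_s, headroom=1.5):
    """Estimate how many collector threads keep the queue from backing up.

    Little's law: threads kept busy = arrival rate * average service time.
    The headroom factor is an arbitrary safety margin (assumption).
    """
    requests_per_second = dci_count / polling_interval_s
    busy_threads = requests_per_second * avg_collect_time_s
    return math.ceil(busy_threads * headroom)


DCI_COUNT = 10430          # from this thread
POLLING_INTERVAL_S = 60    # assumed polling interval
FAST_COLLECT_S = 0.05      # assumed time for a normal request
TIMEOUT_S = 5.0            # assumed agent/SNMP timeout
TIMEOUT_SHARE = 0.10       # assumed fraction of requests that time out

# Case 1: every request answers quickly.
healthy = collectors_needed(DCI_COUNT, POLLING_INTERVAL_S, FAST_COLLECT_S)

# Case 2: a share of requests blocks a thread for the full timeout.
avg_degraded_s = (1 - TIMEOUT_SHARE) * FAST_COLLECT_S + TIMEOUT_SHARE * TIMEOUT_S
degraded = collectors_needed(DCI_COUNT, POLLING_INTERVAL_S, avg_degraded_s)

print("healthy:", healthy, "degraded:", degraded)
```

With these assumed numbers the estimate comes out at 14 threads when everything answers quickly and 143 when a tenth of the requests run into the timeout, so a handful of unreachable targets can dominate the thread budget.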

Other than that, I suggest going directly to Raden and opening a support ticket with them to resolve the issue, since it's quite strange.
You can use the https://www.netxms.org/contact/ form.

StanHubble

I know this is an old thread but what a difference tuning these values makes!!!!!

The data collection queue went from around 8000-10000 to less than 100.

The symptoms were timeouts in nxmc when changing anything, or even just when getting the current DCI values on a node.

So I went looking at the DCIs on the server. They looked normal, or at least there had not been much change since we started using NetXMS (or at least since 2.1RC1). But we have added a lot of objects lately, so you need to keep an eye on these values. I increased the value of NumberOfDataCollectors from 20 -> 40 -> 80 -> 160, which leveled the queue off at about 1000. I finally increased it to 200, and now it seems to be able to keep up.
--------------

FYI:
netxmsd: sho stat
Total number of objects:     38382
Number of monitored nodes:   4535
Number of collectable DCIs:  22988

netxmsd: sho q
Data collector                   : 0
DCI cache loader                 : 0
Database writer                  : 0
Database writer (IData)          : 0
Database writer (raw DCI values) : 0
Event processor                  : 0
Node poller                      : 0
Syslog processing                : 0
Syslog writer                    : 0
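
If you want to keep an eye on these values automatically, output in the format above is easy to parse. Here is a minimal sketch, assuming you have already captured the queue listing as text somehow; `parse_queues`, `backlogged`, the threshold of 1000 and the sample numbers are all made up for illustration and are not part of NetXMS.

```python
def parse_queues(text):
    """Turn 'Name : number' lines into a dict of queue name -> size."""
    queues = {}
    for line in text.splitlines():
        # Split on the last colon so names containing colons still work.
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            queues[name.strip()] = int(value)
    return queues


def backlogged(queues, threshold=1000):
    """Return only the queues whose size is above the threshold."""
    return {name: size for name, size in queues.items() if size > threshold}


# Made-up sample in the same layout as the console output.
SAMPLE = """\
netxmsd: sho q
Data collector                   : 8500
Database writer                  : 0
Database writer (IData)          : 0
Node poller                      : 12
"""

queues = parse_queues(SAMPLE)
print(backlogged(queues))
```

The command line itself is skipped because the text after its colon is not a number.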

Hope this helps someone.