NetXMS CPU Usage High

clifford · August 10, 2018, 10:01:09 AM

Hi Victor,

I rebuilt the server with the with-zlib option, but the situation is the same: after a few minutes the CPU usage goes high. I have captured the threads with the script you shared earlier. Please find attached the screenshot and the thread output files.

Thank you
Clifford.

Victor Kirhenshtein · August 10, 2018, 11:11:46 AM

It hangs in the same place. It looks like a zlib bug; for deeper debugging we would need the actual data being compressed. As a workaround, you can try to disable NXCP compression for client sessions by commenting out lines 1894 and 1895 in src/server/core/session.cpp, which should look like this:

            m_dwFlags |= CSF_COMPRESSION_ENABLED;
            msg.setField(VID_ENABLE_COMPRESSION, true);

Then recompile the server.

Best regards,
Victor

clifford · August 10, 2018, 12:57:33 PM

Hi Victor,

I checked src/server/core/session.cpp, and lines 1894 and 1895 are already commented out. Please find the attached snapshot.

Victor Kirhenshtein · August 10, 2018, 01:54:43 PM

No, they are not. Put // in front of each line.

Best regards,
Victor
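
To make the instruction concrete: assuming lines 1894 and 1895 match the snippet quoted earlier in this thread, the commented-out version would look roughly like the following. The exact indentation and line numbers may differ in your source tree.

```cpp
// Workaround sketch: both statements disabled with a leading "//"
// so that NXCP compression is never enabled for client sessions.
//            m_dwFlags |= CSF_COMPRESSION_ENABLED;
//            msg.setField(VID_ENABLE_COMPRESSION, true);
```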

clifford · August 10, 2018, 02:00:12 PM

Hi Victor,

I recompiled after commenting out the two lines as suggested. However, the problem remains the same.
Actually, we have two different servers that we recently upgraded from 2.2.5 to 2.2.7, and one of them is running fine after the upgrade. The issue is only with this server, which was working well earlier.

Thank you
Clifford

clifford · August 21, 2018, 08:15:46 AM

Hello Victor,

Any update regarding the issue which we are facing?

Thanks
Clifford.

Tatjana Dubrovica · August 22, 2018, 05:46:20 PM

What is the difference between the servers? Number of nodes, different OS?

clifford · August 23, 2018, 02:17:53 PM

Hi Tatjana,

Both servers are virtual machines with the same configuration and OS (CentOS 7). Please see the node stats of both servers below.

Server with issue:

NetXMS Server Remote Console V2.2.7 Ready
Enter "help" for command list

netxmsd: show stats
Objects............ 35197
Monitored nodes.... 1122
Collectible DCIs... 2146
Active alarms...... 416

Server working fine:

NetXMS Server Remote Console V2.2.7 Ready
Enter "help" for command list

netxmsd: show stats
Objects............ 32410
Monitored nodes.... 853
Collectible DCIs... 288
Active alarms...... 232

Thanks
Clifford

Tatjana Dubrovica · August 27, 2018, 01:44:07 PM

It looks like you have a really large number of object updates, and the server is unable to deliver all of them to the client, so everything just gets stuck. I'll rework the object update messages, but I can't promise which release it will get into.

clifford · August 28, 2018, 07:34:31 AM

Hi

I didn't get the "server is unable to deliver all updates to the client" part, because almost all objects are switches and routers, and all we expect from the NMS is to report when a node or link is down. So all of it is SNMP queries to the clients to check their status.

Also, it is important for me to view the link status, as described in my query below:

https://www.netxms.org/forum/configuration/unable-to-get-bandwidth-details-on-map/msg23166/#msg23166

I hope this gets implemented; it would be the greatest thing for me.

My NMS would be super complete.

Regards
Clifford

Tursiops · August 28, 2018, 08:04:35 AM

I believe Tatjana means updates to be sent from the server to the Management Console (client), i.e. either the server can't send new information to the client fast enough or the client can't accept it fast enough, thus throttling the server. Once that happens, the queue of updates to send just keeps increasing as it cannot catch up. It'll probably do that until the server runs out of resources and trips over - or until the console is closed. The latter explains why your server performance came good whenever you closed the console.

Have you considered using the Web Console for testing (sorry, we're using Ubuntu, so I can't give any guidance on the CentOS process)?
As that can be installed on the server itself, it would effectively remove the network from the server to console communication.
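
The backlog effect described above can be sketched with a toy model. This is only an illustration, not NetXMS code: the function name `backlogAfter` and the per-second rates are hypothetical, and the real server queues and sends updates differently. It shows that as long as updates are produced faster than the console accepts them, the queue keeps growing.

```cpp
#include <cassert>
#include <cstdint>

// Toy model (hypothetical, not NetXMS code): returns how many updates are
// still queued after `seconds`, when the server enqueues `producedPerSec`
// updates every second and the console accepts at most `consumedPerSec`.
std::int64_t backlogAfter(std::int64_t producedPerSec,
                          std::int64_t consumedPerSec,
                          std::int64_t seconds)
{
    assert(producedPerSec >= 0 && consumedPerSec >= 0 && seconds >= 0);
    std::int64_t queued = 0;
    for (std::int64_t t = 0; t < seconds; ++t)
    {
        // New object updates are added to the outgoing queue.
        queued += producedPerSec;
        // The console drains at most consumedPerSec of them in this second.
        std::int64_t sent = (queued < consumedPerSec) ? queued : consumedPerSec;
        queued -= sent;
    }
    return queued;
}
```

With made-up rates of 1000 updates per second produced and 800 per second consumed, `backlogAfter(1000, 800, 60)` gives a backlog of 12000 after one minute. If the console keeps up (800 produced, 1000 consumed), the backlog stays at 0.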

clifford · August 28, 2018, 08:44:14 AM

Hi

Cool! I'll try the web console.

I was wondering, will shifting to Ubuntu help?

Regards

Clifford Dsouza

Tursiops · August 29, 2018, 05:59:00 AM

I haven't run a NetXMS server on anything other than Ubuntu, so I really can't tell if this is any better/worse than the other options out there.
One of our reasons for using Ubuntu (other than personal preference) was the availability of NetXMS packages directly from the developers.

clifford · September 10, 2018, 09:33:56 AM

Hi,

We have just upgraded the NetXMS server to 2.2.8. The response from the Management Console now seems normal; however, CPU utilization is showing 300% and above. Below are the stats after the upgrade. I just wanted to know if this is normal.

NetXMS Server Remote Console V2.2.8 Ready
Enter "help" for command list

netxmsd: show threads
MAIN
Threads.............. 256 (8/256)
Load average......... 7838118.55 7292039.09 5187107.27
Current load......... 3084405%
Usage................ 100%
Active requests...... 7896078
Scheduled requests... 0
Total requests....... 12741590
Thread starts........ 248
Thread stops......... 0
Average wait time.... 1641274 ms

POLLERS
Threads.............. 250 (10/250)
Load average......... 103.76 109.10 117.88
Current load......... 83%
Usage................ 100%
Active requests...... 208
Scheduled requests... 0
Total requests....... 30786539
Thread starts........ 240
Thread stops......... 0
Average wait time.... 221 ms

DATACOLL
Threads.............. 96 (10/250)
Load average......... 4.58 5.48 5.53
Current load......... 1%
Usage................ 38%
Active requests...... 1
Scheduled requests... 0
Total requests....... 17575738
Thread starts........ 382
Thread stops......... 296
Average wait time.... 0 ms

SCHEDULER
Threads.............. 1 (1/64)
Load average......... 0.00 0.00 0.00
Current load......... 0%
Usage................ 1%
Active requests...... 0
Scheduled requests... 0
Total requests....... 1750
Thread starts........ 0
Thread stops......... 0
Average wait time.... 0 ms

AGENT
Threads.............. 4 (4/256)
Load average......... 0.00 0.00 0.00
Current load......... 0%
Usage................ 1%
Active requests...... 0
Scheduled requests... 0
Total requests....... 0
Thread starts........ 0
Thread stops......... 0
Average wait time.... 0 ms

CLIENT
Threads.............. 16 (16/512)
Load average......... 0.00 0.00 0.00
Current load......... 0%
Usage................ 3%
Active requests...... 0
Scheduled requests... 0
Total requests....... 4586
Thread starts........ 0
Thread stops......... 0
Average wait time.... 0 ms

Thank you
Clifford.

Victor Kirhenshtein · September 10, 2018, 10:47:22 AM

Hi,

The load on thread pool MAIN is definitely not normal. Could you please capture thread stack traces using the attached script (you will need gdb installed)?

Best regards,
Victor
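
The numbers in the posted output appear consistent with "Current load" being the active requests divided by the thread count, expressed as a percentage. That relationship is inferred from the figures in this thread only, not taken from the NetXMS source, and the helper names below are hypothetical. A small sketch to check the arithmetic:

```cpp
#include <cassert>

// Hypothetical helper: "Current load" as active requests per thread, in
// percent (integer division, truncated). Inferred from the posted figures
// only; not the actual NetXMS implementation.
long long currentLoadPercent(long long activeRequests, long long threads)
{
    assert(threads > 0);
    return activeRequests * 100 / threads;
}

// Hypothetical helper: share of all submitted requests that are still
// active (queued or running) at the time the statistics were captured.
double pendingFraction(long long activeRequests, long long totalRequests)
{
    assert(totalRequests > 0);
    return static_cast<double>(activeRequests) / static_cast<double>(totalRequests);
}
```

For MAIN, `currentLoadPercent(7896078, 256)` gives 3084405, and for POLLERS, `currentLoadPercent(208, 250)` gives 83, matching the output above. `pendingFraction(7896078, 12741590)` is about 0.62, so roughly 62% of all requests ever submitted to MAIN were still active when the output was captured, which fits the very long average wait time.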
