node down status

Started by k_teru, January 04, 2011, 07:35:29 AM

k_teru

Dear all,

Agent Status and status still return the normal value "0" even though five to ten minutes have passed since the server went down.

Why is the failure not detected?

Which configuration setting do I need to change so that a server failure is detected within about one minute?

Detection of a "NIC" going down is also slow. It takes about 5 to 15 minutes.
Is there a way to improve this?

Manager ----- OS: Windows Server 2008 (64-bit)
              NetXMS Manager 1.0.8 (32-bit)
              NetXMS Agent 1.1.0 (64-bit)
Agent   ----- OS: Red Hat 5.5 (64-bit)
              NetXMS Agent 1.0.8 (64-bit)

Victor Kirhenshtein

Hi!

Most likely the server is overloaded for some reason. How many nodes are you monitoring? Please check the queue sizes by running

nxadm -c "sh q"

on the NetXMS server.

Best regards,
Victor

k_teru

Hello!

Thank you for your reply.

There are 30 monitored nodes.
There are 380 interfaces in total.
The node in question has 110 interfaces.

nxadm -c "sh q"
--------------------
Condition poller                 : 0
Configuration poller             : 0
Data collector                   : 0
Database writer                  : 0
Event processor                  : 0
Network discovery poller         : 0
Node poller                      : 0
Routing table poller             : 0
Status poller                    : 0
-----------------------
The above is the output under normal conditions.

-----------------------
Condition poller                 : 0
Configuration poller             : 0
Data collector                   : 710
Database writer                  : 0
Event processor                  : 0
Network discovery poller         : 0
Node poller                      : 0
Routing table poller             : 0
Status poller                    : 0
---------------------
The above is the output after the node went down, before NetXMS recognized it as down.
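To compare listings like the two above without reading them line by line, something like the following could be used. It is only a rough sketch: it assumes the output has been saved as plain text in the "name : number" layout shown here, and the function names parse_queue_sizes and backlogged are made up for illustration, not part of NetXMS.

```python
import re

# Sample text copied from the second listing above (after the node went down).
SAMPLE = """\
Condition poller                 : 0
Configuration poller             : 0
Data collector                   : 710
Database writer                  : 0
Event processor                  : 0
Network discovery poller         : 0
Node poller                      : 0
Routing table poller             : 0
Status poller                    : 0
"""


def parse_queue_sizes(text):
    """Parse lines of the form 'Queue name : N' into a dict of sizes.

    Lines that do not match (separators, blank lines) are skipped.
    """
    sizes = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?)\s*:\s*(\d+)\s*$", line)
        if match:
            sizes[match.group(1)] = int(match.group(2))
    return sizes


def backlogged(sizes, threshold=0):
    """Return only the queues whose size is above the threshold."""
    return {name: size for name, size in sizes.items() if size > threshold}


if __name__ == "__main__":
    for name, size in backlogged(parse_queue_sizes(SAMPLE)).items():
        print(f"{name}: {size} items waiting")
```

With the sample above, only the Data collector queue is reported; all other queues are empty.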

nxadm -c "sh p"
---------------------
S   05/Jan/2011 22:05:48   poll: xxxxx [89] - child poll      <----"Cluster"
---------------------
This is the poller state from the time the server was shut down until NetXMS detected it as down.

Best regards,

Victor Kirhenshtein

Do I understand correctly that the problem is with a cluster node? Or can this happen when any node is down? Also, do you use DCI transformation scripts?

Best regards,
Victor

k_teru

Hi Victor,

I re-tested with both a Cluster and a Container. The result was the same in both cases.
With the Cluster, the poll of the Cluster stayed at "check agent" with no response for about 5 minutes.

With the Container, the poll of the NODE stayed at "check agent" with no response for about 10 minutes.

Cluster
nxadm -c "sh p"
PT  TIME                   STATE
S   07/Jan/2011 12:10:02   wait
------- cut -------
S   07/Jan/2011 12:04:45   poll: CLUSTER [89] - check agent
S   07/Jan/2011 12:10:17   wait
------- cut -------
R   07/Jan/2011 12:05:32   poll: SERVER NODE [50]
------- cut -------

Container
nxadm -c "sh p"
PT  TIME                   STATE
------- cut -------
S   07/Jan/2011 12:36:57   wait
S   07/Jan/2011 12:28:15   poll: SERVER NODE [50] - check agent
------- cut -------
R   07/Jan/2011 12:28:17   poll: SERVER NODE [50]
------- cut -------
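The time a poller has been stuck can be estimated from listings like the ones above. The sketch below rests on two assumptions that are not confirmed in this thread: that the TIME column shows when a poller entered its current state, and that the newest timestamp in a listing is close to the moment the listing was taken. The function names are made up for illustration.

```python
from datetime import datetime

STAMP_FORMAT = "%d/%b/%Y %H:%M:%S"  # e.g. 07/Jan/2011 12:10:02

# Lines copied from the Cluster listing above.
CLUSTER_SAMPLE = """\
PT  TIME                   STATE
S   07/Jan/2011 12:10:02   wait
------- cut -------
S   07/Jan/2011 12:04:45   poll: CLUSTER [89] - check agent
S   07/Jan/2011 12:10:17   wait
------- cut -------
R   07/Jan/2011 12:05:32   poll: SERVER NODE [50]
------- cut -------
"""


def parse_poller_line(line):
    """Return (poller_type, timestamp, state), or None for other lines."""
    parts = line.split(None, 3)
    if len(parts) < 4:
        return None
    poller_type, date, time, state = parts
    try:
        stamp = datetime.strptime(f"{date} {time}", STAMP_FORMAT)
    except ValueError:
        return None
    return poller_type, stamp, state.strip()


def stuck_durations(text):
    """Map each 'poll: ...' state to its age relative to the newest timestamp.

    Assumption: the newest timestamp approximates the time of the listing.
    """
    rows = [row for row in map(parse_poller_line, text.splitlines()) if row]
    if not rows:
        return {}
    latest = max(stamp for _, stamp, _ in rows)
    return {
        state: latest - stamp
        for _, stamp, state in rows
        if state.startswith("poll:")
    }


if __name__ == "__main__":
    for state, age in stuck_durations(CLUSTER_SAMPLE).items():
        print(f"{state}: {int(age.total_seconds())} s")
```

For the Cluster listing this gives 332 seconds for the "check agent" poll, which matches the roughly 5 minutes observed.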

I am using DCI transformation scripts.
However, they are not used on the node in question.