Status Return Codes for Windows Services

Started by maxknight, December 18, 2007, 03:25:12 PM


maxknight

Hi,

I just set up NetXMS in a test lab and I must say that it is a wonderful product. The problem I face is that when I add a service check for the World Wide Web Publishing service with the following parameter - System.ServiceState("W3SVC") - I do not get the result I expect.

Can anyone help or suggest what status codes need to be used?

Thanks,
Maxknight

Victor Kirhenshtein

Hello!

System.ServiceState can return the following codes:

0 - service running;
1 - service paused;
2 - service starting (start pending);
3 - service pausing (pause pending);
4 - service starting after pause (continue pending);
5 - service stopping (stop pending);
6 - service stopped;
255 - unable to get current service state.
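
As a rough illustration, the codes above can be kept in a small lookup table by whatever script or report consumes the collected value. The sketch below is in Python; the names SERVICE_STATES and describe_state are made up for this example and are not part of NetXMS.

```python
# Illustrative lookup for the System.ServiceState return codes listed above.
# SERVICE_STATES and describe_state are hypothetical names, not NetXMS APIs.
SERVICE_STATES = {
    0: "running",
    1: "paused",
    2: "starting (start pending)",
    3: "pausing (pause pending)",
    4: "starting after pause (continue pending)",
    5: "stopping (stop pending)",
    6: "stopped",
    255: "unable to get current service state",
}


def describe_state(code: int) -> str:
    """Return a readable label for a service state code."""
    return SERVICE_STATES.get(code, f"unrecognized code {code}")


if __name__ == "__main__":
    for code in (0, 6, 255):
        print(code, "-", describe_state(code))
```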

Hope this helps!

Best regards,
Victor

maxknight

Hi Victor,

Can't thank you enough.  :)

Wish you a Happy New Year.

StarryTripper

How would I create alarms for a monitored service?

If I use "get" with the ServiceState, I get back 0...which means running.

So I create a threshold that says when not equal to 0.

That's all good: when it's true, it triggers an event I created, COP_SERVICE_DOWN. But I guess it would actually be better to say "unknown" instead of "down", because if 1 were returned the service would be paused and not down.

Recommendations?

Victor Kirhenshtein

Hello!

If you wish to distinguish between different states, you can create multiple thresholds and events - for example, create a separate event SERVICE_PAUSED, add a threshold "when equal 1" to the service status DCI, and generate SERVICE_PAUSED from it. It depends on your environment and requirements - there are no universal tips. In my experience, in most cases it is ok to consider a service down if it's not running, because it cannot process requests anyway, whatever state it is in. That is the most important information - the actual state is less important.

Best regards,
Victor

StarryTripper

I guess for the most part, if it's not "started" then an alert should be generated.

Let's say I did want the detailed monitoring capability, and I create events for the different conditions...what would be appropriate to set the "when false" event to?

So let's say I have "when equal to 2" generating event COP_SERVICE_STARTING.

What would false be, COP_SERVICE_UNKNOWN?

But then it would report unknown for 0, which means started?

Are all of the thresholds evaluated (I know there's a check box, but I didn't really understand what it changed) and the one that matches best triggered?

I am making this difficult aren't I?

The only reason I ask is because some of the VMware Services on the Virtual Infrastructure Server I have had trouble with getting stuck in "Starting" and I would like to see that logged.

Victor Kirhenshtein

I suggest the following scheme:


condition      event when true        event when false
equal 2        COP_SERVICE_STARTING   COP_SERVICE_OK
not equal 0    COP_SERVICE_DOWN       COP_SERVICE_OK

Uncheck "always process all thresholds". Threshold order is important.

Then you will get COP_SERVICE_STARTING if the service is in the starting state, COP_SERVICE_DOWN if the service is in any state other than running or starting, and COP_SERVICE_OK when it returns to the running state.
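
To see why the order matters, here is a minimal Python sketch of first-match threshold checking. It is a simplified model, not the NetXMS implementation: it only picks the "when true" event for a single value, it assumes the "starting" threshold compares against code 2 (start pending) from the code list earlier in the thread, and the names THRESHOLDS and first_match are invented for the example.

```python
# Simplified model of ordered threshold checking with
# "always process all thresholds" unchecked: stop at the first match.
# Illustration only; THRESHOLDS and first_match are hypothetical names.
from typing import Callable, List, Optional, Tuple

Threshold = Tuple[Callable[[int], bool], str]  # (condition, event when true)

THRESHOLDS: List[Threshold] = [
    (lambda v: v == 2, "COP_SERVICE_STARTING"),  # assumed: 2 = start pending
    (lambda v: v != 0, "COP_SERVICE_DOWN"),      # anything but running
]


def first_match(value: int, thresholds: List[Threshold]) -> Optional[str]:
    """Return the event of the first threshold whose condition holds."""
    for condition, event in thresholds:
        if condition(value):
            return event
    return None  # no threshold matched, i.e. the service is running


if __name__ == "__main__":
    for state in (0, 2, 6):
        print(state, first_match(state, THRESHOLDS))
    # With the order reversed, state 2 is caught by "not equal 0" first.
    print(2, first_match(2, list(reversed(THRESHOLDS))))
```

With the thresholds in the suggested order, a starting service maps to COP_SERVICE_STARTING. With the order reversed, the broader "not equal 0" condition would match first and the starting state would be reported as COP_SERVICE_DOWN.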

If you wish to create and terminate appropriate alarms accordingly, you can add the following rules to event processing policy:

Rule 1
Event: COP_SERVICE_STARTING, COP_SERVICE_DOWN
Alarm: Generate alarm, text: %m, key: SERVICE_PROBLEM_%i_%5

Rule 2
Event: COP_SERVICE_OK
Alarm: terminate alarm with key SERVICE_PROBLEM_%i_%3


Then you will have an active alarm with the appropriate text while the service is not running, and it will be terminated automatically when the service goes back online. If the service goes from starting to, for example, the stopped state, the text of the alarm generated by the earlier COP_SERVICE_STARTING event will be replaced by the message text of the COP_SERVICE_DOWN event, so your alarm browser will always show current information.
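
The behaviour described above (one alarm per key, text replaced by a later event with the same key, alarm removed on termination) can be pictured with a toy model in Python. This is only a sketch of the idea, not NetXMS code: the AlarmBrowser class, its methods, and the sample key and texts are all made up.

```python
# Toy model of key-based alarm handling; not NetXMS code.
# AlarmBrowser, generate and terminate are hypothetical names.
from typing import Dict


class AlarmBrowser:
    def __init__(self) -> None:
        self.active: Dict[str, str] = {}  # alarm key -> current alarm text

    def generate(self, key: str, text: str) -> None:
        """Create the alarm, or replace the text of one with the same key."""
        self.active[key] = text

    def terminate(self, key: str) -> None:
        """Remove the alarm with this key, if there is one."""
        self.active.pop(key, None)


if __name__ == "__main__":
    alarms = AlarmBrowser()
    key = "SERVICE_PROBLEM_node1_42"  # made-up, already expanded key
    alarms.generate(key, "Service is starting")  # from COP_SERVICE_STARTING
    alarms.generate(key, "Service is down")      # from COP_SERVICE_DOWN
    print(alarms.active)
    alarms.terminate(key)                        # from COP_SERVICE_OK
    print(alarms.active)
```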

Hope this helps!

Best regards,
Victor

P.S. Also there is a description of threshold checking algorithm in NetXMS user manual, in section 5.2.3.3.

StarryTripper

Thanks

Couple more questions

in Rule 1 there is "SERVICE_PROBLEM_%i_%5" but in Rule 2 that can clear the alarm generated by Rule 1 it is "SERVICE_PROBLEM_%i_%3"

What are the %5 and %3 doing? Why are they different?

In the built-in event processing, for example service down and the event that clears it both end in %i_%1.

I understand %i is the unique ID of the event, so isn't that all that is really needed?

Victor Kirhenshtein

Hello!

%i is a unique identifier of the event's source object (usually a node). If you plan to monitor only one service per node with these events, then using just %i is ok. But if you monitor more than one service on the same node and generate the same events for different services, then you have to distinguish alarms not only by node ID but also by service. %3 and %5 are event-specific parameters, number 3 and number 5 respectively. For events generated when a threshold condition becomes true, the parameters are as follows:

1) Parameter name
2) Item description
3) Threshold value
4) Actual value
5) Data collection item ID
6) Instance

For events generated when a threshold condition returns to false, the parameters are as follows:

1) Parameter name
2) Item description
3) Data collection item ID
4) Instance

So in my example I construct the alarm key from the node ID and the DCI ID.
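
A small Python sketch may make this concrete. It expands only %i and %1 to %9, which is a tiny subset of the real macros, and everything in it is illustrative: the expand function, the node identifier "node7", the DCI ID 42 and the other parameter values are invented, and the real format of %i may differ.

```python
# Sketch of how the two alarm keys resolve to the same string.
# expand() and all sample values are hypothetical; only %i and %1..%9 are handled.
import re
from typing import List


def expand(template: str, source_id: str, params: List[str]) -> str:
    """Replace %i with the source object ID and %N with event parameter N."""
    def repl(match):
        token = match.group(1)
        if token == "i":
            return source_id
        return params[int(token) - 1]  # parameters are numbered from 1
    return re.sub(r"%(i|[1-9])", repl, template)


# Parameters of a "condition became true" event (6 items, DCI ID is number 5).
true_params = ['System.ServiceState("W3SVC")', "W3SVC state", "0", "6", "42", ""]
# Parameters of a "condition returned to false" event (4 items, DCI ID is number 3).
false_params = ['System.ServiceState("W3SVC")', "W3SVC state", "42", ""]

if __name__ == "__main__":
    print(expand("SERVICE_PROBLEM_%i_%5", "node7", true_params))
    print(expand("SERVICE_PROBLEM_%i_%3", "node7", false_params))
```

Both calls produce the same key, which is why the terminating rule finds the alarm created by the first rule even though the parameter numbers differ.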

You can find the list of parameters for any predefined event by opening Control Panel -> Events, editing the appropriate event record, and looking at the description field.
All possible macros for event processing policy can be found in NetXMS user manual or here: https://www.netxms.org/documentation/macros.shtml

Best regards,
Victor

StarryTripper

Everything is starting to make more sense now and you already answered my next question about the variables when a threshold is false, I didn't remember seeing that in the manual (I'm guessing I just missed it).

Thanks for the quick response.

I am currently evaluating network management solutions to implement. So far I have installed and configured HypericHQ, OpenNMS (and the two aforementioned integrated with one another), Zabbix and ZenOSS. I actually only stumbled upon NetXMS while looking for how to do something in OpenNMS.

Victor Kirhenshtein

Quote from: StarryTripper on March 14, 2008, 02:05:32 PM
I am currently evaluating network management solutions to implement. So far I have installed and configured HypericHQ, OpenNMS (and the two aforementioned integrated with one another), Zabbix and ZenOSS. I actually only stumbled upon NetXMS while looking for how to do something in OpenNMS.

And what's your impression so far? What is good or bad?

StarryTripper

The need for an agent is certainly a turn-off, especially since it seems that many of the values the agent allows you to retrieve are available through SNMP (CPU utilization, network utilization, disk space), while I understand others are not (service status).

The lack of a robust Web interface and a Win32-only management console are also downsides. Though, I am happy with the speed of the client-server model as opposed to an AJAX GUI.

The ability to easily visualize the chain of events is nice. The alarm features in OpenNMS for example are cumbersome.

I will continue to work with it in my spare time as well as OpenNMS. I hope to reach a conclusion by July and purchase a support agreement with whomever I choose.

Victor Kirhenshtein

Quote from: StarryTripper on March 14, 2008, 09:45:03 PM
The need for an agent is certainly a turn-off, especially since it seems that many of the values the agent allows you to retrieve are available through SNMP (CPU utilization, network utilization, disk space), while I understand others are not (service status).

Usage of NetXMS agents is not mandatory - you can use NetXMS in an SNMP-only environment as long as the installed SNMP agents provide all the information you require. We create our own agents because it is usually easier to configure data collection from an agent, and using an agent gives you some additional benefits:
- Strong encryption of the connection between server and agents if needed (using AES-256, IDEA, Blowfish or 3DES);
- Proxy functionality - you can access the agent on host A via the agent on host B if host A is not directly reachable from the NetXMS server (firewalled, NATed, etc.);
- SNMP proxy functionality - access remote SNMP devices not directly but via a NetXMS agent, which can be useful if these SNMP devices are not accessible directly or you wish to improve security;
- Execute commands on remote servers in reaction to events;
- You can extend agents easily;
- You can have centralized agent configs if you need.

Best regards,
Victor

Alex Kirhenshtein

Quote from: Victor Kirhenshtein on March 16, 2008, 09:05:15 PM
... and usage of an agent gives you some additional benefits:

Also, I should note that you need to deploy them by hand only once. Later, when the agent is up and running, you can upgrade it from the management console in a few clicks.