Clustering with SNMP Proxy

Started by Sumit Pandya, November 11, 2010, 08:06:34 AM


Sumit Pandya

Hi,
   As we all know, NetXMS does not support a custom port for SNMP, so we end up setting up Net-SNMP as a front-end. I spent almost 3 days getting a good Net-SNMP setup for community-based forwarding. Note that you need Net-SNMP 5.4 or higher for community-based routing to a remote agent.
   Now my real problem starts, but first I'd like to explain my RADIUS (Remote Authentication Dial-In User Service) cluster setup:

NetXMS-1.0.6 -> Net-SNMP-Proxy -> Firewall ---> RADIUS-1 (SNMP-Agent1-1:1161 and SNMP-Agent1-2:1162)
                                                                    RADIUS-2 (SNMP-Agent1-3:1161 and SNMP-Agent1-4:1162)
                                                                    RADIUS-3 (SNMP-Agent1-5:1161 and SNMP-Agent1-6:1162)
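
The community-based forwarding described above could be expressed in the Net-SNMP proxy's snmpd.conf roughly as follows. This is only a sketch, not the poster's actual configuration. It assumes the `com2sec -Cn` and `proxy -Cn` directives documented in snmpd.conf(5). The host addresses, the security name `proxyuser`, the context names, the remote community `public`, and the mapping of each community to a single agent on port 1161 are all hypothetical. Access-control directives (group/view/access) are left out.

```python
# Sketch: print Net-SNMP proxy directives that route requests by community string.
# All hosts, context names and the security name below are hypothetical.

RADIUS_MIB = ".1.3.6.1.2.1.67"  # subtree of the RADIUS OID polled later in this thread


def proxy_directives(routes, remote_community="public", oid=RADIUS_MIB):
    """Return snmpd.conf lines mapping each front-end community to one remote agent."""
    lines = []
    for community, (host, port) in routes.items():
        context = "ctx-" + community  # hypothetical context name, one per community
        # Tie the incoming community string to a context ...
        lines.append(f"com2sec -Cn {context} proxyuser default {community}")
        # ... and forward requests in that context to the remote agent.
        lines.append(
            f"proxy -Cn {context} -v 2c -c {remote_community} {host}:{port} {oid}"
        )
    return lines


# Assumed mapping: one community per RADIUS host, first agent only.
routes = {
    "RADIUS-Auth-1": ("192.0.2.1", 1161),
    "RADIUS-Auth-2": ("192.0.2.2", 1161),
    "RADIUS-Auth-3": ("192.0.2.3", 1161),
}

if __name__ == "__main__":
    print("\n".join(proxy_directives(routes)))
```

The second agent on each host (port 1162) would need its own community and context in the same way.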

I created one node for "SNMP-Proxy" and a "RADIUS-cluster". I registered the IPs of all 3 RADIUS servers under "Object properties -> Resources" of RADIUS-Cluster. I created 3 nodes within RADIUS-cluster, with the "Community string" set to RADIUS-Auth-1, RADIUS-Auth-2 and RADIUS-Auth-3 respectively. I disabled all polling except the "usage of SNMP for all polls" and "data collection" check-boxes.

Observation 1:
   Initially I defined the SNMP OID at the global RADIUS-cluster level, and the Net-SNMP log shows that the requests carry the cluster-level community name "public".

Observation 2:
   I removed the cluster-level SNMP setting and defined it at the individual node level only. I still cannot see get-requests with the defined community names in the Net-SNMP log.

   Please help me set up NetXMS.

P.S. When I start my Windows XP machine without a network cable connected, NetXMS hangs my login/Explorer. Once I even saw an "Illegal reference" error as well. Please fix this bug.

Victor Kirhenshtein

Hi!

Do I understand you correctly: you have one three-node cluster, and each node has two SNMP agents?

Right now the only working solution I can imagine is to configure a lot of external parameters on the agent which will call nxsnmpget, but this cannot be considered a good solution. I think the easiest way is to allow changing the SNMP port on a per-node and per-DCI basis. This is a relatively easy change and will solve this problem. I'll try to implement it in the 1.0.8 release.

Best regards,
Victor

Sumit Pandya

Quote from: Victor Kirhenshtein on November 11, 2010, 07:36:09 PM
Hi!

Do I understand you correctly: you have one three-node cluster, and each node has two SNMP agents?

Right now the only working solution I can imagine is to configure a lot of external parameters on the agent which will call nxsnmpget, but this cannot be considered a good solution. I think the easiest way is to allow changing the SNMP port on a per-node and per-DCI basis. This is a relatively easy change and will solve this problem. I'll try to implement it in the 1.0.8 release.

Best regards,
Victor

Yes, you have understood my setup correctly. Furthermore, I'm sure that a per-DCI (per SNMP OID) port setting is eagerly awaited by many NetXMS users. In parallel, please look into the possible bug of "not using the configured community name while communicating with the SNMP proxy".

Sumit Pandya

Quote from: Victor Kirhenshtein on November 11, 2010, 07:36:09 PM
I'll try to implement it in 1.0.8 release.
Victor, when will 1.0.8 be made available with the SNMP port option? Without it I cannot use NetXMS at all. I'm quite eager to monitor our software with NetXMS, and I have to prepare my team as end users as well.
Thanks for your kind consideration.

Victor Kirhenshtein

Hi!

I plan 1.0.8 to be mostly a bugfix release (plus the custom SNMP port feature), and if everything goes as expected, I will release it in a week.

Best regards,
Victor

Sumit Pandya

I installed 1.0.8 and configured a node with a custom port number. I see that NetXMS is still not able to do data collection over custom SNMP ports. As soon as I add the node, NetXMS sends SNMP queries for 1.3.6.1.2.1.1.2.0 (4 retransmissions with v1 and 4 retransmissions with v2c) and 1.3.6.1.4.1.2620.1.1.10.0 (4 retransmissions with v1 and none with v2c).
Then I went into the node properties and set the SNMP UDP port to 1161 under the "Communication" tab. I also added my RADIUS OID, i.e. .1.3.6.1.2.1.67.1.1.1.1.5, with the proper custom SNMP port selected. Now I could see SNMP requests (on the custom port) for OIDs .1.3.6.1.2.1.67.1.1.1.1.5 and 1.3.6.1.2.1.1.2.0 in Wireshark. I could see a proper response from my agent as well.
Note that my SNMP agent supports only the RADIUS MIB and sends a "noSuchName" error response for unsupported OIDs like 1.3.6.1.2.1.1.2.0 and 1.3.6.1.4.1.2620.1.1.10.0. My agent is behind a firewall, and ping is also blocked.
I think NetXMS depends on a response for those 2 basic OIDs, i.e. 1.3.6.1.2.1.1.2.0 and 1.3.6.1.4.1.2620.1.1.10.0.
I also see some UDP packets with destination port 260. Why is that traffic generated?
Please advise on how to configure a node with a custom SNMP port.

Sumit Pandya

On my last try, I recreated the node as an "Unmanaged object". I first configured everything and disabled the NetXMS agent, ICMP ping, status poll, configuration poll, routing table poll, and ifXTable usage for polling interfaces in the "Polling" tab of the properties. Then I set the node to "Manage". I'm happy to see NetXMS working properly. Now the SNMP query for my OID is the only traffic between NetXMS and our RADIUS server.


Victor, for your kind attention: I'd like to see a graph with the combined (summed) value from 2 (or more) similar nodes. As you know, I have 2 machines running the same software behind a load balancer. There is no "official" clustering, but it can be seen as a server farm. Please guide me on how to achieve that.

Victor Kirhenshtein

Hi!

During the configuration poll, the NetXMS server tries to detect an SNMP agent by sending requests for OID .1.3.6.1.2.1.1.2.0. If an agent is present but does not respond to this request, it is considered unavailable. Later, during data collection, if the node is not marked as SNMP-capable, SNMP DCIs are not collected. The only exception is DCIs with custom SNMP ports. If your SNMP agent does not support the standard MIB-2 OIDs, the better way is to disable SNMP for the given node and set the port number explicitly for each SNMP DCI.

SNMP requests to port 260 are generated during the configuration poll in an attempt to detect a Check Point SNMP agent.

Best regards,
Victor
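
The collection rule Victor describes above can be modelled in a few lines. This is only an illustrative sketch of the logic as stated in his post, not NetXMS source code. The class and function names (`Node`, `SnmpDci`, `configuration_poll`, `should_collect`) are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

SYS_OBJECT_ID = ".1.3.6.1.2.1.1.2.0"  # OID probed during the configuration poll


@dataclass
class Node:
    snmp_capable: bool  # set by the configuration poll


@dataclass
class SnmpDci:
    oid: str
    custom_port: Optional[int] = None  # None means "use the node's SNMP settings"


def configuration_poll(agent_answers_sys_object_id: bool) -> Node:
    """Mark the node SNMP-capable only if the probe for SYS_OBJECT_ID succeeded."""
    return Node(snmp_capable=agent_answers_sys_object_id)


def should_collect(node: Node, dci: SnmpDci) -> bool:
    """SNMP DCIs are collected on SNMP-capable nodes, or when they carry a custom port."""
    return node.snmp_capable or dci.custom_port is not None


# An agent that only implements the RADIUS MIB fails the probe ...
radius_node = configuration_poll(agent_answers_sys_object_id=False)
# ... so only DCIs with an explicit port are still collected.
with_port = SnmpDci(".1.3.6.1.2.1.67.1.1.1.1.5", custom_port=1161)
without_port = SnmpDci(".1.3.6.1.2.1.67.1.1.1.1.5")
```

With these definitions, `should_collect(radius_node, with_port)` is true and `should_collect(radius_node, without_port)` is false, which matches the advice to set the port on each SNMP DCI.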

Sumit Pandya

Victor,
    Any thoughts or suggestions for the combined/sum graph?

Victor Kirhenshtein

Hi!

You can use transformation scripts to get values from different nodes and add them to the value of the current DCI. For example, if you have DCIs named "Param1" on nodes node1 and node2, you can create the following transformation script on node1 to add the value from node2:


node2 = FindNodeObject($node, "node2");
value = GetDCIValue(node2, FindDCIByDescription(node2, "Param1"));
return $1 + value;


See also the following topics:
https://www.netxms.org/forum/configuration/findnodeobject-not-working-%28for-me%29/
https://www.netxms.org/forum/configuration/nxsl-and-dci/

Please also note that if you are using a server older than 1.0.8, you must explicitly define the main function in your script, i.e.


sub main()
{
   node2 = FindNodeObject($node, "node2");
   value = GetDCIValue(node2, FindDCIByDescription(node2, "Param1"));
   return $1 + value;
}


Best regards,
Victor

Sumit Pandya

Dear Victor, the script you shared did not work. When I "Test" the above transformation with a test value of 2 (random), I get the error "Error 14 in line 2: Left argument of -> must be a reference to object". I then added a check
if (node2 == null)
      return 0;   // No such node or access denied

and surprisingly I get this "0" returned. Please note that my nodes are part of a cluster. Could that be a limitation?

Using the approach you suggest, I may lose the information for one of my nodes. I wonder if there could be some push method from scripting. Can I create a dummy node and define data collection with origin "Push Agent", then define a schedule to run every minute, "1 * * * *", and finally do something like the following in the transformation?

node1 = FindNodeObject($node, "system-1");
value1 = GetDCIValue(node1, FindDCIByDescription(node1, "Auth-Req"));
node2 = FindNodeObject($node, "system-2");
value2 = GetDCIValue(node2, FindDCIByDescription(node2, "Auth-Req"));
return value1 + value2;

I'd even be fine if this can be done with a database trigger-level tweak as some very advanced option.
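
A note on the schedule string quoted above: in standard five-field cron syntax (minute, hour, day of month, month, day of week), "1 * * * *" matches minute 1 of every hour, while "* * * * *" matches every minute. The sketch below illustrates this with a deliberately small matcher. It assumes that the advanced schedule follows standard cron semantics, supports only `*`, single numbers, ranges and comma lists, and requires all five fields to match. The function names are hypothetical.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field: '*', a number, a range 'a-b', or a comma list of these."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def schedule_matches(expr: str, when: datetime) -> bool:
    """Return True if the five-field expression fires at the given minute."""
    minute, hour, day, month, weekday = expr.split()
    cron_weekday = (when.weekday() + 1) % 7  # cron counts Sunday as 0
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        and field_matches(weekday, cron_weekday)
    )


def fires_per_hour(expr: str) -> int:
    """Count how many minutes of one sample hour the expression matches."""
    return sum(
        schedule_matches(expr, datetime(2010, 12, 6, 16, m)) for m in range(60)
    )
```

Under these assumptions, `fires_per_hour("1 * * * *")` gives 1 and `fires_per_hour("* * * * *")` gives 60.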


Victor Kirhenshtein

Quote from: Sumit Pandya on December 06, 2010, 04:33:54 PM
Dear Victor, the script you shared did not work. When I "Test" the above transformation with a test value of 2 (random), I get the error "Error 14 in line 2: Left argument of -> must be a reference to object". I then added a check
if (node2 == null)
      return 0;   // No such node or access denied

and surprisingly I get this "0" returned. Please note that my nodes are part of a cluster. Could that be a limitation?

Did you set trusted nodes as described here: https://www.netxms.org/forum/configuration/nxsl-and-dci/msg4633/#msg4633? Most likely it's an access control issue.

Quote from: Sumit Pandya on December 06, 2010, 04:33:54 PM
Using the approach you suggest, I may lose the information for one of my nodes. I wonder if there could be some push method from scripting. Can I create a dummy node and define data collection with origin "Push Agent", then define a schedule to run every minute, "1 * * * *", and finally do something like the following in the transformation?

node1 = FindNodeObject($node, "system-1");
value1 = GetDCIValue(node1, FindDCIByDescription(node1, "Auth-Req"));
node2 = FindNodeObject($node, "system-2");
value2 = GetDCIValue(node2, FindDCIByDescription(node2, "Auth-Req"));
return value1 + value2;

I'd even be fine if this can be done with a database trigger-level tweak as some very advanced option.

You can create an additional DCI with source "Internal" and name "Dummy" (the description can of course be anything). It will always have a value of 0, and you can schedule it like any other DCI and use a transformation script to get the values from the two "real" DCIs and sum them.

Best regards,
Victor
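
The two points in Victor's reply above (the trusted-nodes check behind the null result, and a "Dummy" DCI whose transformation sums two real DCIs) can be modelled together. The sketch below is a simplified illustration, not NetXMS code. The `Node` class, `find_node_object`, `summed_value`, and the sample values are all hypothetical stand-ins for the server-side behaviour.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    name: str
    last_values: Dict[str, float] = field(default_factory=dict)  # DCI description -> last value
    trusted_nodes: Set[str] = field(default_factory=set)  # nodes allowed to read this one


def find_node_object(current: Node, name: str, registry: Dict[str, Node]) -> Optional[Node]:
    """Return the target node, or None if it does not exist or does not trust `current`."""
    target = registry.get(name)
    if target is None or current.name not in target.trusted_nodes:
        return None  # "no such node or access denied"
    return target


def summed_value(current: Node, names: List[str], description: str,
                 registry: Dict[str, Node]) -> Optional[float]:
    """Sum the last value of one DCI across nodes; None if any lookup fails."""
    total = 0.0
    for name in names:
        node = find_node_object(current, name, registry)
        if node is None or description not in node.last_values:
            return None
        total += node.last_values[description]
    return total


# Sample data (made-up values): both real nodes trust the dummy node.
dummy = Node("dummy")
registry = {
    "system-1": Node("system-1", {"Auth-Req": 120.0}, {"dummy"}),
    "system-2": Node("system-2", {"Auth-Req": 80.0}, {"dummy"}),
}
total = summed_value(dummy, ["system-1", "system-2"], "Auth-Req", registry)
```

If either real node drops "dummy" from its trusted set, the lookup fails and the function returns None, which is the situation the null check in the quoted script guards against.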

Sumit Pandya

Script testing works, but the DCI does not get scheduled. After some time the DCI status changes to "Not Supported". I tried both leaving the interval at the default of 60 seconds and using the advanced schedule "1 * * * *". Please guide me further.

As for all the other problems, I think that database changes take a very long time to be reflected in other modules/components. I request that you check the commit behaviour on configuration changes. I suspect the same could be causing database corruption during shutdown!

Victor Kirhenshtein

Could you please send me screenshots of your DCI configuration?

Best regards,
Victor

Sumit Pandya

Please find the screenshot attached. Note that the transformation script returns the proper result while testing. Below is the script for your quick reference (though it is not required):
node1 = FindNodeObject($node, "SSTLDel52");
value1 = GetDCIValue(node1, FindDCIByDescription(node1, "AuthReq"));
node2 = FindNodeObject($node, "SSTLDel54");
value2 = GetDCIValue(node2, FindDCIByDescription(node2, "AuthReq"));
return value1 + value2;

After posting, I left the "Parameter" name as "Dummy". Now I see that I get proper values automatically on schedule. Cheers.

On another note: I followed your instructions about "Cluster node disappear" with the patch. Now my nodes remain after restart.

I hope you now have enough entries for the bug register, though all of my problems got resolved by workarounds :-)