NetXMS Support Forum

English Support => General Support => Topic started by: Michal Hanajik on January 10, 2019, 03:23:35 PM

Title: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: Michal Hanajik on January 10, 2019, 03:23:35 PM
Hello,

We recently upgraded to version 2.2.11, and since then we have been getting a massive number of warnings at night, around 1:30am (see attached picture).
Devices are mainly UPS but there are servers as well - different hardware with different versions of Windows.

The weird thing is, we have 3 identical servers (hardware and software), but this alarm comes from only one of them. We tried updating the agents to the most recent version, but that did not help.
All database upgrades, checks and fixes were applied.

Any tips or hints, where to look for problem or solution?

Thank you!
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: Victor Kirhenshtein on January 11, 2019, 07:39:23 PM
Hi,

Do those DCIs recover by themselves afterwards? Could you set the debug level for the server to 7 around that time and check for lines with the text GetItemFromSNMP? (But keep in mind that debug level 7 will produce a huge amount of logging, so depending on your system size it may not be a good idea.) Could it be that the housekeeper start time is set to around 1:30?

Best regards,
Victor
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: Michal Hanajik on January 18, 2019, 01:17:36 PM
Hello, and sorry for the somewhat late answer. We have been trying to figure this out over the last few days.

So it definitely has something to do with the housekeeper. It was set to 2am; when we changed it to a different time, all those warnings appeared around that new time.

So after that I was able to easily back up the logs :)
These are some of those SNMP errors from the log. If it would be more helpful, I can attach the whole log file.
We even updated to version 2.2.12, but this is still happening. If you need any more information, let me know.


2019.01.18 11:02:43.357 *D* Node::connectToAgent(TLA-RPi-agent [1283]): already connected
2019.01.18 11:02:43.358 *D* [agent.conn.113     ] Sending message CMD_DATA_COLLECTION_CONFIG (1304) to agent at 87.197.117.89
2019.01.18 11:02:43.359 *D* [db.cpool           ] Handle 0x7f9bfae45e60 released
2019.01.18 11:02:43.359 *D* [db.cpool           ] Handle 0x7f9bfae46220 acquired (call from dbwrite.cpp:250)
2019.01.18 11:02:43.359 *D* [db.cpool           ] Handle 0x7f9bfae46040 released
2019.01.18 11:02:43.359 *D* [event.proc         ] Event 9218196 with code 53 passed event processing policy
2019.01.18 11:02:43.359 *D* [event.corr         ] CorrelateEvent: event SYS_DCI_UNSUPPORTED id 9218197 source GIM-CCTV-UPS1 [1365]
2019.01.18 11:02:43.359 *D* [event.corr         ] CorrelateEvent: finished, rootId=0
2019.01.18 11:02:43.359 *D* [event.proc         ] EVENT SYS_DCI_UNSUPPORTED [53] (ID:9218197 F:0x0001 S:2 TAG:"") FROM GIM-CCTV-UPS1: Status of DCI 9009 (SNMP: .1.3.6.1.4.1.935.1.1.1.3.2.1.0) changed to UNSUPPORTED
2019.01.18 11:02:43.359 *D* [event.policy       ] EPP: processing event 9218197
2019.01.18 11:02:43.359 *D* [event.policy       ] Event 9218197 match EPP rule 22
2019.01.18 11:02:43.359 *D* AlarmManager: adding new active alarm, current alarm count 18



2019.01.18 11:02:43.330 *D* [agent.conn.174     ] Sending message CMD_SNMP_REQUEST (43037) to agent at 212.55.237.190
2019.01.18 11:02:43.335 *D* [db.cpool           ] Handle 0x7f9bfae45aa0 released
2019.01.18 11:02:43.335 *D* [db.cpool           ] Handle 0x7f9bfae45e60 acquired (call from dbwrite.cpp:250)
2019.01.18 11:02:43.335 *D* [db.cpool           ] Handle 0x7f9bfae45c80 released
2019.01.18 11:02:43.335 *D* [event.proc         ] Event 9218195 with code 53 passed event processing policy
2019.01.18 11:02:43.335 *D* [event.corr         ] CorrelateEvent: event SYS_DCI_UNSUPPORTED id 9218196 source GIM-GXSW26-IVZ-EKON RIADITEL [1393]
2019.01.18 11:02:43.335 *D* [event.corr         ] CorrelateEvent: finished, rootId=0
2019.01.18 11:02:43.335 *D* [event.proc         ] EVENT SYS_DCI_UNSUPPORTED [53] (ID:9218196 F:0x0001 S:2 TAG:"") FROM GIM-GXSW26-IVZ-EKON RIADITEL: Status of DCI 11388 (SNMP: .1.3.6.1.4.1.25506.2.6.1.1.1.1.12.12) changed to UNSUPPORTED
2019.01.18 11:02:43.336 *D* [event.policy       ] EPP: processing event 9218196
2019.01.18 11:02:43.336 *D* [event.policy       ] Event 9218196 match EPP rule 22
2019.01.18 11:02:43.336 *D* AlarmManager: adding new active alarm, current alarm count 17
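
For anyone sifting through a debug-level-7 log for these events, here is a minimal Python sketch that pulls the node name, DCI ID and parameter out of the "EVENT SYS_DCI_UNSUPPORTED" lines. The regular expression is modelled only on the two excerpts above, so it is an assumption that other server versions format the line the same way; the names `EVENT_RE` and `unsupported_events` are made up for this example.

```python
import re

# Pattern modelled on the "EVENT SYS_DCI_UNSUPPORTED" lines quoted above.
# Assumption: the log line layout matches those excerpts exactly.
EVENT_RE = re.compile(
    r"EVENT SYS_DCI_UNSUPPORTED \[\d+\] \(ID:(?P<event_id>\d+) .*?\) "
    r"FROM (?P<node>.+?): Status of DCI (?P<dci_id>\d+) "
    r"\((?P<origin>[^:]+): (?P<name>[^)]+)\) changed to UNSUPPORTED"
)

# Three lines copied from the excerpts above (the first one must not match).
SAMPLE_LINES = [
    '2019.01.18 11:02:43.359 *D* [event.corr         ] CorrelateEvent: event SYS_DCI_UNSUPPORTED id 9218197 source GIM-CCTV-UPS1 [1365]',
    '2019.01.18 11:02:43.359 *D* [event.proc         ] EVENT SYS_DCI_UNSUPPORTED [53] (ID:9218197 F:0x0001 S:2 TAG:"") FROM GIM-CCTV-UPS1: Status of DCI 9009 (SNMP: .1.3.6.1.4.1.935.1.1.1.3.2.1.0) changed to UNSUPPORTED',
    '2019.01.18 11:02:43.335 *D* [event.proc         ] EVENT SYS_DCI_UNSUPPORTED [53] (ID:9218196 F:0x0001 S:2 TAG:"") FROM GIM-GXSW26-IVZ-EKON RIADITEL: Status of DCI 11388 (SNMP: .1.3.6.1.4.1.25506.2.6.1.1.1.1.12.12) changed to UNSUPPORTED',
]


def unsupported_events(lines):
    """Return (node, dci_id, origin, parameter) for each matching log line."""
    events = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            events.append((
                match.group("node"),
                int(match.group("dci_id")),
                match.group("origin"),
                match.group("name"),
            ))
    return events


if __name__ == "__main__":
    for event in unsupported_events(SAMPLE_LINES):
        print(event)
```

Feeding it the lines of a real log file instead of `SAMPLE_LINES` should give a quick list of which nodes and DCIs are affected each time the housekeeper runs.
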


Thank you,
Michal
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: pzandvoort on February 06, 2019, 04:17:23 PM
We have the exact same issue.

Every time the housekeeper runs (2:00am by default) it seems to re-apply the templates. Since not every DCI in the template is supported by all nodes, "SYS_DCI_UNSUPPORTED" fires and we get the "Status of DCI changed to UNSUPPORTED" alarm. This makes perfect sense the first time the template gets applied, since it's a new DCI for that node and it just figured out that the DCI isn't supported. But it shouldn't happen if the node already has that DCI and that DCI is already known to be unsupported or disabled.

2.2.10 did this correctly. The logic seems broken in 2.2.11 and up.

Did you figure out a workaround to this? We can obviously suppress the alarm by changing the response to SYS_DCI_UNSUPPORTED, but that seems wrong.

Peter
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: Michal Hanajik on February 06, 2019, 04:51:21 PM
Hey Peter,

No, we haven't found any solution. We are still waiting for Victor or someone with much deeper insight to advise or check whether there is a bug.
For now, all we can do after each cleanup is delete those UNSUPPORTED alarms and carry on as usual.
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: Victor Kirhenshtein on February 07, 2019, 11:10:02 AM
This is caused by the housekeeper re-applying templates. This was added to automatically fix issues where not all DCIs were applied or updated correctly, or were accidentally deleted (we had those issues in a few big deployments). The problem here is that template re-apply also resets the DCI status to "active", which on the next data collection run changes to "unsupported" and causes SYS_DCI_UNSUPPORTED event generation. The correct approach would be to leave unsupported DCIs in the unsupported state. We will fix it before the next release.
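
To make the sequence above concrete, here is a small Python model of it. This is only an illustration of the described behaviour, not NetXMS server code: the class `Dci`, the functions `poll`, `reapply_template` and `simulate`, and the flag `keep_unsupported` are all hypothetical names invented for the sketch.

```python
from dataclasses import dataclass

ACTIVE, UNSUPPORTED = "active", "unsupported"


@dataclass
class Dci:
    name: str
    supported_by_node: bool
    status: str = ACTIVE


def poll(dci, events):
    """One data collection run: an event is emitted only on a status change."""
    new_status = ACTIVE if dci.supported_by_node else UNSUPPORTED
    if new_status != dci.status:
        if new_status == UNSUPPORTED:
            events.append("SYS_DCI_UNSUPPORTED " + dci.name)
        dci.status = new_status


def reapply_template(dci, keep_unsupported):
    """Housekeeper re-apply.

    keep_unsupported=False models the behaviour described above (status is
    reset to active); keep_unsupported=True models the proposed fix.
    """
    if keep_unsupported and dci.status == UNSUPPORTED:
        return
    dci.status = ACTIVE


def simulate(nights, keep_unsupported):
    """Count events for one DCI that the node does not support."""
    dci = Dci("hypothetical.parameter", supported_by_node=False)
    events = []
    poll(dci, events)  # first poll after the template is applied
    for _ in range(nights):
        reapply_template(dci, keep_unsupported)
        poll(dci, events)
    return len(events)


if __name__ == "__main__":
    print("reset on re-apply:", simulate(3, keep_unsupported=False))
    print("state preserved:  ", simulate(3, keep_unsupported=True))
```

In this model, resetting the status produces one event for the initial detection plus one more after every housekeeper run, while preserving the unsupported state produces only the initial event.
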

Best regards,
Victor
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: pzandvoort on February 07, 2019, 03:48:35 PM
Victor,

That makes perfect sense and matches exactly what we're seeing. For now, we've disabled the generation of the alarm on SYS_DCI_UNSUPPORTED to suppress the result, but if you can make it work like you describe that'd be awesome! Looking forward to the next release.
Thanks!

Peter
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: Michal Hanajik on April 16, 2019, 10:22:24 AM
Hello,

Has this been fixed? I have 2.2.13 and the problem still persists.
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: lweidig on June 17, 2019, 04:50:58 PM
This issue still exists in 2.2.15.
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: Michal Hanajik on October 01, 2019, 01:19:17 PM
Hello, for us this problem persists on an even bigger scale after upgrading to NetXMS 3.

Do you have any suggestion?


Minor Outstanding Status of DCI 442 (Internal: PingTime) changed to UNSUPPORTED     1 0 30.09.19 12:31:09 30.09.19 12:31:09
Minor Outstanding Status of DCI 409 (Internal: PingTime) changed to UNSUPPORTED   1 0 30.09.19 12:31:09 30.09.19 12:31:09
Minor Outstanding Status of DCI 842 (Internal: PingTime) changed to UNSUPPORTED   1 0 30.09.19 12:31:09 30.09.19 12:31:09
Minor Outstanding Status of DCI 1043 (Internal: PingTime) changed to UNSUPPORTED   1 0 30.09.19 12:31:09 30.09.19 12:31:09
Minor Outstanding Status of DCI 846 (Internal: PingTime) changed to UNSUPPORTED   1 0 30.09.19 12:31:09 30.09.19 12:31:09
Minor Outstanding Status of DCI 834 (Internal: PingTime) changed to UNSUPPORTED   1 0 30.09.19 12:31:09 30.09.19 12:31:09
Minor Outstanding Status of DCI 829 (Internal: PingTime) changed to UNSUPPORTED   1 0 30.09.19 12:31:09 30.09.19 12:31:09
Minor Outstanding Status of DCI 826 (Internal: PingTime) changed to UNSUPPORTED   1 0 30.09.19 12:31:09 30.09.19 12:31:09
Minor Outstanding Status of DCI 831 (Internal: PingTime) changed to UNSUPPORTED   1 0 30.09.19 12:31:09 30.09.19 12:31:09
Minor Outstanding Status of DCI 550 (Internal: PingTime) changed to UNSUPPORTED   1 0 30.09.19 12:31:09 30.09.19 12:31:09
Minor Outstanding Status of DCI 568 (Internal: PingTime) changed to UNSUPPORTED   1 0 30.09.19 12:31:09 30.09.19 12:31:09

Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: lweidig on October 01, 2019, 02:16:21 PM
Yes, we have not seen it resolved either, and with each release it actually seems to grow.
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: StanHubble on October 02, 2019, 09:35:28 PM
In my opinion this is not broken... rather, your templates are including nodes that they shouldn't, or you have DCIs that are too specific for the template.

Nodes can appear in multiple templates and we have some templates that depend on firmware versions and others that depend on application versions.  Each deployment will be different, but in general you should define templates from the general to the specific.
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: lweidig on October 02, 2019, 09:48:24 PM
Stan:

I would completely agree with you if what you were describing was the problem, but it is NOT!  We use many different templates as well to fine tune what is being collected.

The issue is that it is changing items to UNSUPPORTED that are 100% legitimate!  In one of our cases it is the polling of an SNMP value to show the GPS sync state of a device.  We have verified many times that the OID is correct, and the console has no issues polling that OID within the MIB Explorer.  We have also checked that no more than one template has the OID and that we are not looking at the wrong one.

It is simply broken!
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: StanHubble on October 02, 2019, 10:02:40 PM
My bad then... I misread the problem description.
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: Michal Hanajik on October 03, 2019, 12:06:27 PM
In short: we have 3 identical servers. Two of them show unsupported DCIs and the third one is completely OK. All use the same templates.

It's very strange and semi-random I'd say.
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: Filipp Sudanov on October 04, 2019, 01:01:06 PM
There were changes to internal parameters: in version 3, PingTime was deprecated and a number of ICMP.ResponseTime.* parameters were introduced that provide many more options.
Please try replacing PingTime with ICMP.ResponseTime.Last in your template.
Title: Re: After upgrade to 2.2.11 - Status of DCI changed to UNSUPPORTED
Post by: gammy69er on January 22, 2021, 01:59:48 AM
Just a bit of info for anyone working with SNMP monitoring. This may count as a different topic, but this is where I found the idea for my fix.

Just had this issue: a site moved from Ubiquiti Airmax gear to Unifi. The IP range remained, but the devices were changed.  Some of them picked up old IPs, and since those IPs already had a device in XMS, that device was updated with the new info, so templates etc. remained.

The problem for me was that my template was searching for the OID and I had been returning null when it was missing, so it was basically just skipping the check (from what I understand).  Changing that to false has fixed it.

However, the code later has another "if-else" with a false, so I'm not quite sure what's going on here... It certainly picks up the positive, but doesn't appear to drop the negative.  I haven't touched this in months, though, so I'm not the biggest authority on all this.

Here's some code to look over (for Ubiquiti Airmax 2 GHz devices; this is the edited version, with the airMaxFrequency check now returning false).  Hope this helps a fellow forum trawler.


/* OIDs queried on the Ubiquiti Airmax devices */
oid_airMaxFrequency = ".1.3.6.1.4.1.41112.1.4.1.1.4.1";
oid_airMaxSSID = ".1.3.6.1.4.1.41112.1.4.5.1.2.1";

/* No SNMP transport for this node: return null */
querySNMP = CreateSNMPTransport($node);
if (querySNMP == null)
    return null;

/* OID not answered: return false (this was "return null" before the edit) */
airMaxFrequency = SNMPGetValue(querySNMP, oid_airMaxFrequency);
if (airMaxFrequency == null)
    return false;

/* airMaxSSID = SNMPGetValue(querySNMP, oid_airMaxSSID); */
/* if (airMaxSSID ~= "HQBH") */
/* return false; */

/* Match only the 2.4 GHz frequencies 2412..2472 MHz (5 MHz steps) */
if (airMaxFrequency == "2412" || airMaxFrequency == "2417" || airMaxFrequency == "2422" ||
    airMaxFrequency == "2427" || airMaxFrequency == "2432" || airMaxFrequency == "2437" ||
    airMaxFrequency == "2442" || airMaxFrequency == "2447" || airMaxFrequency == "2452" ||
    airMaxFrequency == "2457" || airMaxFrequency == "2462" || airMaxFrequency == "2467" ||
    airMaxFrequency == "2472") {
    return true;
} else {
    return false;
}
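
The long chain of comparisons in the script covers the thirteen values from 2412 to 2472 in steps of 5, which are the centre frequencies (in MHz) of 2.4 GHz Wi-Fi channels 1 to 13. As a rough illustration of the same check, written in Python rather than NXSL and not meant as a drop-in replacement for the script, the list can be generated instead of typed out. The names `FREQS_24GHZ` and `is_24ghz` are made up for this sketch.

```python
# Centre frequencies (MHz) of 2.4 GHz channels 1-13: 2412, 2417, ..., 2472.
# Stored as strings because the script above compares against string values.
FREQS_24GHZ = {str(2412 + 5 * (channel - 1)) for channel in range(1, 14)}


def is_24ghz(frequency):
    """Mirror the script's result: False when no value came back,
    True only when the value is one of the thirteen listed frequencies."""
    if frequency is None:
        return False
    return str(frequency).strip() in FREQS_24GHZ


if __name__ == "__main__":
    for value in ("2437", "5180", None):
        print(value, is_24ghz(value))
```

A missing value and a non-matching value both give False here, which is the same outcome the edited script produces for those two cases.
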