Messages - mk

#1
So far, it looks like it almost works.
One remaining issue is that while the event gets triggered and I get the appropriate notification email, the "Terminate alarm" action on the event does not do anything. The threshold violation remains in the alarm list until cleared manually.
Another (minor) issue is that when a row gets inserted into the table above an existing row, both a deactivation and a violation event are triggered. That could probably be resolved by dropping the thresholds and sending the violation events from the transformation script as well.
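A rough, untested sketch of what sending the violation events from the transformation script could look like. It mirrors the deactivation logic: any row in the new table ($1) whose code is missing from the previously stored table is treated as a new alert. The event code 69 is an assumption for SYS_TABLE_THRESHOLD_ACTIVATED and should be checked in the event configuration; the column names and the DCI description "prtAlert" are specific to my setup, and any filtering on the description (e.g. ignoring sleep messages) would still have to be added.

```nxsl
idxCode = $1->getColumnIndex("Code");
idxDescr = $1->getColumnIndex("Description");
dciId = FindDCIByDescription($node, "prtAlert");
dciObj = GetDCIObject($node, dciId);
tableOld = GetDCIValue($node, dciId);

evtViolation = 69; // assumption: code of SYS_TABLE_THRESHOLD_ACTIVATED, verify before use

for (i = 0; i < $1->rowCount; i++)
{
    // a row counts as new unless its code is already present in the old table
    isNew = true;
    if (tableOld != null)
    {
        for (j = 0; j < tableOld->rowCount; j++)
        {
            if ($1->get(i, idxCode) == tableOld->get(j, idxCode))
            {
                isNew = false;
            }
        }
    }
    if (isNew)
    {
        PostEvent($node, evtViolation, null, dciObj->name, dciObj->description, dciId, i, $1->get(i, idxCode), $1->get(i, idxDescr));
    }
}
```

Because rows are matched by code rather than by row number, inserting a row above an existing one should no longer produce a spurious deactivation/violation pair.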
#2
Thanks, I'll try that. I've come up with the following code; I'll let you know how that goes.
idxCode = $1->getColumnIndex("Code");
idxDescr = $1->getColumnIndex("Description");
dciName = "prtAlert";
dciId = FindDCIByDescription($node, dciName);
dciObj = GetDCIObject($node, dciId);

tableOld = GetDCIValue($node, dciId);

if (tableOld != null)
{
    for (j = 0; j < tableOld->rowCount; j++)
    {
        keepAlarm = false;
        for (i = 0; i < $1->rowCount; i++)
        {
            if ($1->get(i, idxCode) == tableOld->get(j, idxCode))
            {
                keepAlarm = true;
            }
        }
        if (!keepAlarm)
        {
            PostEvent($node, 70, null, dciObj->name, dciObj->description, dciId, j, tableOld->get(j, idxCode), tableOld->get(j, idxDescr));
        }
    }
}
#3
Using ExternalParametersProvider does not help either, according to the thread "Executing Powershell with Tabel result.":
Quote from: Victor Kirhenshtein
There are no external table support in agent.
#4
I have a table data collection item configured which pulls the alert messages from our printers. It consists of two columns: SNMP parameter .1.3.6.1.2.1.43.18.1.1.7 is the integer alert code (and I've set that as instance column), and .1.3.6.1.2.1.43.18.1.1.8 is the textual description of the alert.
I have a threshold set on that table which raises the SYS_TABLE_THRESHOLD_ACTIVATED event when the description is neither "Sleep" nor "Sleep mode on". The alarm gets raised properly when the printer has an alert message, but the SYS_TABLE_THRESHOLD_DEACTIVATED event is never raised. The reason for this is that the alert just gets removed from the table because it is no longer reported via SNMP and this does not trigger any events.

Is there any way to raise an SYS_TABLE_THRESHOLD_DEACTIVATED event when the line that triggered the SYS_TABLE_THRESHOLD_ACTIVATED event is removed from a table DCI?

One thing I've thought about is writing a transformation script that does something like this:

idxCode = $1->getColumnIndex("Code");
nodeId = d2x($node->id, 8);
tableId = d2x(FindDCIByName($node, "printerAlert"), 8);
prefix = "DCTTHR_" . nodeId . "_" . tableId . "_"; // the keys look like DCTTHR_<node ID in hex>_<DCI ID in hex>_<instance ID>

alarms = GetAlarmWithKeyStartingWith(prefix);
foreach(alarm : alarms)
{
    removeAlarm = true;
    for (i = 0; i < $1->rowCount; i++)
    {
        if ($1->get(i, idxCode) == substr(alarm->key, length(prefix))) // instance part of the alarm key matches the code in the current row
        {
            removeAlarm = false;
        }
    }
    if (removeAlarm)
    {
        alarm->terminate();
    }
}

Of course, this doesn't work, for the simple reason that NXSL does not offer any way to access the alarm list (i.e. the list shown in the Alarm Browser in the GUI); at least I couldn't find one.
There would need to be some way to get an alarm object associated with a specific node and DCI (e.g. GetAlarmWithKeyStartingWith as used above), and the NXSL alarm class would need to have at least a key attribute and a terminate() method.
#5
How could I possibly have overlooked that...? Of course, it's working just fine with the -- instead of the ++. Thanks!
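For anyone finding this later, a minimal sketch of the corrected loop, assuming the fragment from my original report with only the loop step changed (the column name "Voltage" is specific to my table). With i++ the index starts at the last row and only grows, so the condition i >= 0 never becomes false and the script never finishes, which would explain the CPU load.

```nxsl
idxVoltage = $1->getColumnIndex("Voltage");
// walk the table from the last row to the first so that deleting a row
// does not shift the rows that still have to be checked
for (i = $1->rowCount - 1; i >= 0; i--)
{
    if ($1->get(i, idxVoltage) == null || $1->get(i, idxVoltage) < 0)
    {
        $1->deleteRow(i);
    }
}
```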
#6
I figured it out. GetDCIValueByName is able to return table objects, so it's quite simple:

sub main()
{
    table = GetDCIValueByName($node, $event->parameters[1]);
    if (table == null)
    {
        return "[GetDCIValueByName(" . $node->name . ", " . $event->parameters[1] . ") failed.]";
    }
    colNum = 1;
    alertMsg = table->get($event->parameters[4], colNum);
    if (alertMsg == null)
    {
        return "[table->get(" . $event->parameters[4] . ", " . colNum . ") failed.]";
    }
    return alertMsg;
}
#7
Right now I get alarm messages like
Quote
Threshold activated on table "prtAlert" row 1 (12)
but I'd prefer to get messages like
Quote
Threshold activated on table "prtAlert": YELLOW CARTRIDGE LOW
i.e. it should pull the message from a cell in row 1 in the table. A screenshot of this table is attached to this post.

I'm trying to put together a script to add more information to the SYS_TABLE_THRESHOLD_ACTIVATED event message. The script is called Table_Alarm_Description, so I added %[Table_Alarm_Description] to the Message field in the Generate alarm action of event number 18 (Generate alarm on table threshold violation). The script gets called when the alarm is triggered and all is well in that regard.

This is the script I'm using:

sub main()
{
    table = AgentReadTable($node, $event->parameters[2]);
    if (table == null)
    {
        return "[AgentReadTable(" . $node->name . ", " . $event->parameters[2] . ") failed.]";
    }
    colNum = 1;
    alertMsg = table->get($event->parameters[4], colNum);
    if (alertMsg == null)
    {
        return "[table->get(" . $event->parameters[4] . ", " . colNum . ") failed.]";
    }
    return alertMsg;
}

I'm seeing that AgentReadTable always fails. I suspect this is due to the fact that my node is monitored via SNMP and not via the agent. How can I fix this? Unfortunately, there is no function named something like SNMPReadTable that could be used instead.

Alternatively, do you know any other way to add more details as in the example above to a table threshold alarm?
Please note that I CANNOT use anything like

transport = CreateSNMPTransport($node);
oid = ".1.3.6.1.2.1.43.18.1.1.8." . $event->parameters[5];
varbind = SNMPGet(transport, oid);
varbind->value

in my script because the OID is different for different tables I have; also the value of $event->parameters[5] is not always the same as the final number in the OID.
I need to have the script read the value from the table.
#8
Well, after waiting a few hours the CPU load seems to have gone down to normal. So I went back to see if I could cause the issue again. This is the script fragment that triggers it:

idxVoltage = $1->getColumnIndex("Voltage");
for (i = $1->rowCount - 1; i >= 0; i++)
{
    if ($1->get(i, idxVoltage) == null || $1->get(i, idxVoltage) < 0)
    {
        $1->deleteRow(i);
    }
}

Here I was trying to remove those rows from the DCI table that had a negative value in the Voltage field. This is an 8-outlet power distribution unit that reports 42 outlets via SNMP, but only the first 8 actually report sensible values.

I see that NXSL:Table has the deleteRow method, but it's barely documented and not used in any examples. Is it broken, or should it not be used? The 1.2.9 release notes mention the introduction of this method:
Quote from: ChangeLog
*
* 1.2.9
*
[...]
- New methods deleteColumn and deleteRow in NXSL class Table
#9
The template is only applied to two nodes, not all ~30. I have another template that's applied to two different nodes which is working fine.
#10
General Support / All requests time out, 100% CPU usage
November 02, 2014, 07:01:14 PM
I just set up NetXMS 1.2.17 on Debian 7 with MySQL and added some hosts and configured some SNMP DCIs. While I was editing a transformation script on a DCI table in a template, the client stopped responding and everything I did would only trigger a "Request Timeout" message. I ended up restarting netxmsd, but that didn't help: whenever I started the client again, I got the same "Request Timeout" messages on everything and the netxmsd process was running at 100% CPU. I somehow managed to close the open tabs in the client and even deleted the device that was associated with the template I was editing. While this did solve the "Request Timeout" issues, I still see 100% CPU usage.
The device eventually came back and became associated to the same template again, and at that time the "Request Timeout" issues came back.

When running netxmsd in debug mode, I saw that it threw several SYS_THREAD_HANG events related to the "Item Poller" and the "Syncer Thread". How do I go about resolving this issue? I'd like to get the CPU usage back down to normal numbers, and I'd of course like to figure out what caused the issue originally.