Show posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.

Messages - johnny

#1
General Support / Re: Question about backup
November 29, 2018, 03:20:48 PM
OK, I see. Great, thanks for the info.
I've just edited the existing event processing policy rule "Generate an alarm when one of the system threads hangs or stops unexpectedly".
I've added the following to its filtering script:
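now = localtime();
// True only between 01:00 and 01:10
return (now->hour == 1 && (now->min >= 0 && now->min <= 10));
// Not reached: the return above ends the script before the node check runs
if ($node->name == "NODE_NAME") {
   // Action: Stop processing
}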

I only want to stop processing for the NetXMS server itself, so would it work if I replace NODE_NAME with the NetXMS server's node name?

Edit:
It didn't work, and I understand why: the rule tells the whole event to stop processing, not just the event for that node.

To make it specific to that node, I should create a new rule and add that node as a source object, right?
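
For reference, the intended condition (this node only, and only between 01:00 and 01:10) can be sketched outside NetXMS. The Python below only illustrates the logic and is not NXSL; the function name `should_suppress` and the node name "netxms-server" are hypothetical.

```python
from datetime import datetime

def should_suppress(now: datetime, node_name: str,
                    server_name: str = "netxms-server") -> bool:
    """Return True only for the server node during the 01:00-01:10 window."""
    in_window = now.hour == 1 and 0 <= now.minute <= 10
    # Both conditions must hold; checking the time alone would match every node.
    return in_window and node_name == server_name

print(should_suppress(datetime(2018, 11, 29, 1, 5), "netxms-server"))  # True
print(should_suppress(datetime(2018, 11, 29, 1, 5), "some-other-node"))  # False
```

The point is that the time check and the node check belong in one combined condition (or the node restriction goes into the rule's source objects), otherwise the rule matches every node inside the time window.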
#2
General Support / Question about backup
November 29, 2018, 09:25:44 AM
Hello guys,
I have a cron job every night that backs up everything: MySQL and configuration files. The mysqldump takes some time though, around 6 minutes or more.
NetXMS reports a critical event 3 minutes after the dump starts: Thread "Syncer Thread" is not responding
I assume that this is because of the mysqldump.
Is there another approach to doing daily backups?
Should I stop or pause the NetXMS service and then back up?
Or is there a way to disable that check?
#3
I finally did it with Hook::ConfigurationPoll, where I check both the node name and the interface name.
If both match, I set the interface's expected state to IGNORE:

interfaces = GetNodeInterfaces($node);
foreach (interface : interfaces)
{
   if ($node->name ~= "(?i)Node1Name" && interface->name ~= "(?i)tun.*") {
      SetInterfaceExpectedState(interface, "IGNORE");
   } else if ($node->name ~= "(?i)Node2Name" && interface->name ~= "(?i)tun.*|vmnet.*|docker.*|docker_gwbridge|vboxnet.*") {
      SetInterfaceExpectedState(interface, "IGNORE");
   }
}
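
The matching rules in the hook above can be tried out offline before putting them on a live server. The following Python sketch mirrors the two rules; it assumes unanchored, case-insensitive regex matching, which may or may not be exactly how the NXSL `~=` operator behaves. The names `IGNORE_RULES` and `should_ignore` are hypothetical.

```python
import re

# Hypothetical rule table: node-name pattern -> interface-name pattern.
IGNORE_RULES = {
    r"Node1Name": r"tun.*",
    r"Node2Name": r"tun.*|vmnet.*|docker.*|docker_gwbridge|vboxnet.*",
}

def should_ignore(node_name: str, if_name: str) -> bool:
    """Return True if the interface on this node should get expected state IGNORE."""
    for node_pattern, if_pattern in IGNORE_RULES.items():
        if (re.search(node_pattern, node_name, re.IGNORECASE)
                and re.search(if_pattern, if_name, re.IGNORECASE)):
            return True
    return False

print(should_ignore("node1name", "TUN1"))     # True
print(should_ignore("Node1Name", "docker0"))  # False
print(should_ignore("Node2Name", "docker0"))  # True
```

Under this assumption the patterns match anywhere in the name, so anchoring them (for example `^tun`) may be worth considering if other interface names could contain the same substring.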
#4
Great, thank you guys.
I got it working; very helpful information.
I also filter the nodes with the $node variable.

But why is it that when an interface is down and I do a status poll, it says it cannot poll data from that interface, and when the interface is back up it still says it cannot poll data from that interface?
Also, if I do a configuration poll while the interface is down (or even restart the interface), NetXMS deletes that interface.
#5
Hello all,
I'm facing a problem on a PC that has tunnel interfaces for VPN connections. After the interface goes up or down, NetXMS deletes the interface and recreates it, and it also gives it an expected state of UP, so I get notifications and email warnings.
I've tried deleting the interface, which obviously didn't help. I've also set the interface to unmanaged and excluded it from network topology, with expected state IGNORE. None of that helps.
Is there another way to disable all interfaces except the Ethernet interface, or at least to make the default expected state for new interfaces on that specific node (PC) IGNORE?
Here is part of the event log of that node:

ID | Time | Source | DCI | Event | Severity | Message | Root ID
591861 | 06.11.2018 12:25:32 | PC | 0 | SYS_IF_UP | Normal | Interface "tun1" changed state to UP (IP Addr: 10.0.1.6/32) | 0
591895 | 06.11.2018 13:15:34 | PC | 0 | SYS_IF_DOWN | Minor | Interface "tun1" changed state to DOWN (IP Addr: 10.0.1.6/32) | 0
591916 | 06.11.2018 13:24:17 | PC | 0 | SYS_IF_ADDED | Normal | Interface "tun1" added (IP Addr: 10.0.1.6/32, IfIndex: 36) | 0
591915 | 06.11.2018 13:24:17 | PC | 0 | SYS_IF_DELETED | Normal | Interface "tun1" deleted (IP Addr: 10.0.1.6/32, IfIndex: 35) | 0
591917 | 06.11.2018 13:24:47 | PC | 0 | SYS_IF_EXPECTED_STATE_UP | Normal | Expected state for interface "tun1" set to UP | 0
591918 | 06.11.2018 13:25:52 | PC | 0 | SYS_IF_UP | Normal | Interface "tun1" changed state to UP (IP Addr: 10.0.1.6/32) | 0
592005 | 06.11.2018 14:54:23 | PC | 0 | SYS_IF_EXPECTED_STATE_IGNORE | Normal | Expected state for interface "tun1" set to IGNORE | 0
592078 | 06.11.2018 16:24:32 | PC | 0 | SYS_IF_ADDED | Normal | Interface "tun1" added (IP Addr: 10.0.1.6/32, IfIndex: 37) | 0
592079 | 06.11.2018 16:24:42 | PC | 0 | SYS_IF_EXPECTED_STATE_UP | Normal | Expected state for interface "tun1" set to UP | 0
592080 | 06.11.2018 16:25:47 | PC | 0 | SYS_IF_UP | Normal | Interface "tun1" changed state to UP (IP Addr: 10.0.1.6/32) | 0
#6
I've done it a little differently.
What I was trying to do was create a new event policy with the specific interfaces I wanted as source objects, but for some reason it was not working; it executed the general policy I had instead. What I mean is, the specific policy should send an email to one account when it matches, while the general policy for interface down has another email account as its action.
When the interface changed state, only the general policy's email was sent.
What I then did was remove the source objects from the event policy and add a filtering script.
What I wrote is:

IF ((($node ==Node-Name) && ($object == Interface-Name)) || (($node == Node-Name) && ($object == Interface-Name)))

Well, this works the way I want: it sends email to both addresses, but NetXMS gives me a Minor alarm.
Event: SYS_SCRIPT_ERROR
message :Script(EPP::6) execution error: Error 11 in line 1:Function not found


Do you have any idea why this gives a "Function not found" error, despite the fact that the policy works fine?
Also, why does the policy not work when I select the source objects in the Source object field?

Thank you in advance


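1st problem found:
the if keyword must be lowercase (the uppercase IF was presumably parsed as a function call, which would explain the "Function not found" error):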
if ((($node ==Node-Name) && ($object == Interface-Name)) || (($node == Node-Name) && ($object == Interface-Name)))

Well, I spoke too soon. The problem now is that whenever any interface goes down, the second email address gets all the alerts, so it does not work as expected.

Found the solution:
if (($1 == Interface-ID)||($1 == Interface-ID)||($1 == Interface-ID)) return true;
return false;

So now it works as it should.
For those interfaces it sends to both email addresses; for every other interface it sends only to the general email address.
#7
Hello all,
I have a question: is there a way to have some specific interfaces of a node send email alerts to an additional email address when the interface changes to a state other than the expected one?
Thank you in advance.
#8
Dear Tursiops,
thank you for your answer. I will check the global poll count and see how it behaves. It confused me because the per-interface setting showed poll count 0.
Do you have a reference on how to write the filtering script? I would like to get email notifications for unexpected interface status changes, but if it's too complicated I will set it to unexpected up only :\
#9
Bump.
Any idea about the issue?
Or, if it's not possible, is there a way to configure it to avoid false alarms? What I mean is, could I configure 2 or 3 polls of the interface state, and trigger the alarm only if the state still does not match the expected state? For some reason I get a lot of false alarms (the device may even be giving broken data to NetXMS), and I get a lot of emails about the interface state returning to normal.
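
The behaviour asked for here (raise the alarm only after several consecutive polls disagree with the expected state) is a simple debounce. Below is a minimal Python sketch of that idea; it is not NetXMS code, and the class name `StateDebouncer` and the default of 3 polls are assumptions made for illustration.

```python
class StateDebouncer:
    """Signal an alarm only after `required` consecutive mismatching polls."""

    def __init__(self, expected: str, required: int = 3):
        self.expected = expected
        self.required = required
        self.mismatches = 0

    def poll(self, observed: str) -> bool:
        """Feed one poll result; return True exactly when the alarm should fire."""
        if observed == self.expected:
            self.mismatches = 0  # any matching poll resets the counter
            return False
        self.mismatches += 1
        return self.mismatches == self.required

debouncer = StateDebouncer(expected="UP", required=3)
for state in ["DOWN", "UP", "DOWN", "DOWN", "DOWN"]:
    print(state, debouncer.poll(state))
```

A single DOWN reading followed by UP never reaches the threshold, so one-off glitches produce neither an alarm nor a "back to normal" email.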
#10
General Support / Re: Agent polling issues
March 20, 2018, 12:48:31 PM
The problem appeared again when someone restarted the PC.
The agent is up and I can telnet to the port, but it immediately closes the connection, while on another PC that I can poll data from, the connection stays open when I telnet.
I did a full configuration poll and the problem still remains.

[20.03.2018 12:45:52] **** Poll request sent to server ****
[20.03.2018 12:45:52] Poll request accepted
[20.03.2018 12:45:52] Starting configuration poll for node XXX PC
[20.03.2018 12:45:52] Capability reset
[20.03.2018 12:45:52] Checking node's capabilities...
[20.03.2018 12:45:52]    Checking NetXMS agent...
[20.03.2018 12:45:52] Capability check finished
[20.03.2018 12:45:52] Checking interface configuration...
[20.03.2018 12:45:52] Unable to get interface list from node
[20.03.2018 12:45:52]    Interface "unknown" is no longer exist
[20.03.2018 12:45:53] Interface configuration check finished
[20.03.2018 12:45:53] Checking node name
[20.03.2018 12:45:53] Node name is OK
[20.03.2018 12:45:53] Finished configuration poll for node XXX PC
[20.03.2018 12:45:53] Node configuration was not changed after poll
[20.03.2018 12:45:53] **** Poll completed successfully ****


EDIT:
The agent debug log says:
connection from x.x.x.x rejected

2nd edit:
The problem was DNS resolution.
In the NetXMS agent configuration I had the DNS name as the master server, and the PC could not resolve it.
I added the IP address as well and the issue is resolved for now.
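
A quick way to confirm this kind of failure on the agent machine is to check whether the configured server name resolves at all. The Python sketch below does only that; the helper name `resolve_or_none` and the host name "netxms.example" are placeholders, not values from the actual setup.

```python
import socket

def resolve_or_none(hostname: str):
    """Return the resolved IPv4 address for hostname, or None if resolution fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None

# Placeholder name; substitute the name configured as master server on the agent.
address = resolve_or_none("netxms.example")
print(address if address else "cannot resolve server name")
```

If this prints the failure message on the agent host, listing the server's IP address alongside the name, as done above, sidesteps the lookup.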
#11
General Support / Re: Agent polling issues
March 16, 2018, 01:57:12 PM
Dear Tursiops,
thank you for your answer. I haven't tried a full configuration poll on the problematic node yet.
I will try to recreate the problem and see what happens.

The node's agent is working fine. I can telnet from the NetXMS server to the node and see the connection in the agent's debug output,
and also, when I changed that node's IP address to an invalid one and created a new node with the correct IP, I could poll the data.
#12
General Support / Re: Agent polling issues
March 15, 2018, 12:29:35 PM
Sorry,
the problem with the new device was that its IP address was wrong.
But the problem still remains on the previous device, the one I couldn't poll data from before the restart.

ADDED:
On a CentOS machine that I cannot poll data from, I started the agent with debug.
The server still cannot poll data, but when I telnet to the NetXMS port it is open, and in the client debug output I get:
[15-Mar-2018 13:08:27.419] [DEBUG] DataCollector: sleeping for 60 seconds
[15-Mar-2018 13:09:27.419] [DEBUG] DataCollector: sleeping for 60 seconds
[15-Mar-2018 13:10:06.673] [DEBUG] Incoming connection from X.X.X.X
[15-Mar-2018 13:10:06.673] [DEBUG] Connection from X.X.X.X accepted
[15-Mar-2018 13:10:06.673] [DEBUG] Session registered for X.X.X.X
[15-Mar-2018 13:10:09.537] [DEBUG] [CS-0(1)] Communication channel closed by peer
[15-Mar-2018 13:10:09.537] [DEBUG] [CS-0(1)] writer thread stopped
[15-Mar-2018 13:10:09.537] [DEBUG] [CS-0(1)] Session with X.X.X.X closed
[15-Mar-2018 13:10:09.537] [DEBUG] [CS-0(1)] Session unregistered
[15-Mar-2018 13:10:09.537] [DEBUG] [CS-0(1)] Receiver thread stopped

So as you can see, when I poll data, the agent debug output doesn't show any connections.
When I did the same test some months ago, I remember seeing the connections in the debug output during polling.

2nd Edit:
How I kind of solved the problem:
In the console, I changed the IP of the problematic node on the NetXMS server to something else.
Then I created a new node with the correct IP and it polled the data normally.
I also saw that in the agent debug output.
Then I deleted the test (temp) node with the correct IP, changed the problematic node back to the correct IP, and after 2 seconds everything went back to normal.
Do you have any idea why that is happening?
#13
General Support / Re: Agent polling issues
March 15, 2018, 11:03:56 AM
Dear all,
an update on the case.
I haven't restarted the NetXMS server yet, so I still have the problem I described above.
Today I tried to add a new network device with SNMP, and it seems that the NetXMS server is not able to connect to the device and collect data from it.

I then killed the NetXMS process and started it at debug level.
The previous devices still cannot be polled, and neither can the new network device.
At debug level 6 I checked that the messages look the same for the working and non-working devices,
for example:
Sending message CMD_POLLING_INFO (128 bytes)
Sending compressed message CMD_POLLING_INFO (120 bytes)
Sending compressed message CMD_POLLING_INFO (104 bytes)
Sending compressed message CMD_POLLING_INFO (112 bytes)


Also, after a system restart I still cannot poll data from the devices.
The firewall is off on the devices that the NetXMS server cannot poll.
Any ideas?

Also, is there a way to start NetXMS with more logging in the log file?
I'm currently getting:
2018.03.15 09:45:40.503 *I* NetXMS Server started
2018.03.15 09:45:40.503 *I* SocketListener/Clients: listening on 0.0.0.0:4701
2018.03.15 09:45:40.503 *I* SocketListener/MobileDevices: listening on 0.0.0.0:4747
2018.03.15 09:45:40.503 *I* SocketListener/Clients: listening on [0.0.0.0]:4701
2018.03.15 09:45:40.503 *I* SocketListener/AgentTunnels: listening on [0.0.0.0]:4703
2018.03.15 09:45:40.503 *I* SocketListener/MobileDevices: listening on [0.0.0.0]:4747
#14
Hello again,
I was wondering if there is a way to group email alerts.
The reason I'm asking is that on a device, e.g. a switch, I may have set up a lot of data collection items with thresholds.
When a critical problem happens on the switch, e.g. a power failure, the switch becomes unavailable and I get alerts from the thresholds, port state and device status, which results in around 7-10 warning/critical emails per device, and the same again when it gets back to normal.
Is there a way to group alerts, for example in that situation to group everything and tell NetXMS to send only 1 email with all the problems or a status problem?
Thank you in advance.
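
To make the request concrete, the grouping described above amounts to collecting all alerts for one device and emitting a single digest per device. The Python sketch below shows only that grouping step; it is not a NetXMS feature, and the function name `group_alerts`, the tuple layout, and the sample alerts are invented for illustration.

```python
from collections import defaultdict

def group_alerts(alerts):
    """Group (device, severity, message) tuples into one digest text per device."""
    by_device = defaultdict(list)
    for device, severity, message in alerts:
        by_device[device].append(f"[{severity}] {message}")
    # One email body per device instead of one email per alert.
    return {device: "\n".join(lines) for device, lines in by_device.items()}

sample = [
    ("switch1", "CRITICAL", "Node down"),
    ("switch1", "MINOR", "Port 3 changed state to DOWN"),
    ("router1", "WARNING", "Threshold exceeded"),
]
for device, body in group_alerts(sample).items():
    print(f"--- {device} ---\n{body}")
```

With the sample data this yields two digests instead of three separate emails; in practice the alerts would also need to be collected over a short time window before sending.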
#15
General Support / Agent polling issues
March 05, 2018, 12:46:36 PM
Hello all,
I've been using NetXMS as an SNMP server for more than a year. I'm quite pleased with the service, but I'm having an issue with agent polling.
I've configured several devices. On the servers (CentOS and Ubuntu) I use the NetXMS agent for polling.
What sometimes happens is that after a NetXMS server reboot (maybe when the "clients" or agents are not yet ready) I have problems polling data.
When I do a status poll on the agent "client" I get this:
[05.03.2018 12:28:43] **** Poll request sent to server ****
[05.03.2018 12:28:43] Poll request accepted
[05.03.2018 12:28:43] Starting status poll for node sftptest
[05.03.2018 12:28:43]    Starting status poll on interface lo
[05.03.2018 12:28:43]       Current interface status is UNKNOWN
[05.03.2018 12:28:43]       Interface status cannot be determined
[05.03.2018 12:28:43]       Interface is UNKNOWN for 21239 polls (1 poll required for status change)
[05.03.2018 12:28:43]       Interface status after poll is UNKNOWN
[05.03.2018 12:28:43]    Finished status poll on interface lo
[05.03.2018 12:28:43]    Starting status poll on interface eth0
[05.03.2018 12:28:43]       Current interface status is NORMAL
[05.03.2018 12:28:43]       Starting ICMP ping
[05.03.2018 12:28:43]       Interface is NORMAL for 21238 polls (1 poll required for status change)
[05.03.2018 12:28:43]       Interface status after poll is NORMAL
[05.03.2018 12:28:43]    Finished status poll on interface eth0
[05.03.2018 12:28:43] Node is connected
[05.03.2018 12:28:43] Finished status poll for node sftptest
[05.03.2018 12:28:43] Node status after poll is NORMAL
[05.03.2018 12:28:43] **** Poll completed successfully ****

If I disable usage of ICMP pings for status polling I get this:
[05.03.2018 12:37:25] **** Poll request sent to server ****
[05.03.2018 12:37:25] Poll request accepted
[05.03.2018 12:37:25] Starting status poll for node sftptest
[05.03.2018 12:37:25]    Starting status poll on interface lo
[05.03.2018 12:37:25]       Current interface status is UNKNOWN
[05.03.2018 12:37:25]       Interface status cannot be determined
[05.03.2018 12:37:25]       Interface is UNKNOWN for 21248 polls (1 poll required for status change)
[05.03.2018 12:37:25]       Interface status after poll is UNKNOWN
[05.03.2018 12:37:25]    Finished status poll on interface lo
[05.03.2018 12:37:25]    Starting status poll on interface eth0
[05.03.2018 12:37:25]       Current interface status is UNKNOWN
[05.03.2018 12:37:25]       Interface status cannot be determined
[05.03.2018 12:37:25]       Interface is UNKNOWN for 1 poll (1 poll required for status change)
[05.03.2018 12:37:25]       Interface status after poll is UNKNOWN
[05.03.2018 12:37:25]    Finished status poll on interface eth0
[05.03.2018 12:37:25] Node is still unreachable
[05.03.2018 12:37:25] Finished status poll for node sftptest
[05.03.2018 12:37:25] Node status after poll is UNKNOWN
[05.03.2018 12:37:25] **** Poll completed successfully ****


From what I see, it's as if the server is not even trying to poll the NetXMS agent on that machine. Also, in the machine's overview, under capabilities, isAgent has switched to No.
Firewall ports are open.
On the switches and routers that I poll via SNMP I've never had that issue.
Normally a reboot of the NetXMS server solves the issue, but is there any way to track down the problem so that it doesn't happen again? It could happen and I might not notice it for a few days.

On the NetXMS server and the other machines' agents I don't get any logs.
The current NetXMS version is 2.2.1, but I've had this issue with previous versions as well.
The NetXMS server runs on CentOS.