Messages - Borgso

#46
General Support / Re: Upgrade Failed
February 19, 2018, 11:26:13 PM
Having the same problem with SQLite as the database.

It started with the upgrade to the 2.2.x series.
After "nxdbmgr upgrade -X" I get the server up and running, but without any nodes and with this in the log:

Quote
SQL query failed (Query = "INSERT INTO nodes (primary_ip,primary_name,snmp_port,node_flags,snmp_version,community,status_poll_type,agent_port,auth_method,secret,snmp_oid,uname,agent_version,platform_name,poller_node_id,zone_guid,proxy_node,snmp_proxy,icmp_proxy,required_polls,use_ifxtable,usm_auth_password,usm_priv_password,usm_methods,snmp_sys_name,bridge_base_addr,down_since,driver_name,rack_image_front,rack_position,rack_height,rack_id,boot_time,agent_cache_mode,snmp_sys_contact,snmp_sys_location,last_agent_comm_time,syslog_msg_count,snmp_trap_count,node_type,node_subtype,ssh_login,ssh_password,ssh_proxy,chassis_id,port_rows,port_numbering_scheme,agent_comp_mode,tunnel_id,lldp_id,fail_time_snmp,fail_time_agent,runtime_flags,rack_orientation,rack_image_rear,id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): table nodes has no column named rack_image_front
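
A quick way to confirm that the schema upgrade stopped short is to check whether the column exists at all. Below is a minimal sketch using the SQLite C API; the database path is just a placeholder for wherever your netxms.db lives:

#include <sqlite3.h>
#include <cstring>
#include <cstdio>

// List the columns of the "nodes" table and report whether
// rack_image_front is present (the path is a placeholder).
int main()
{
   sqlite3 *db;
   if (sqlite3_open("/var/lib/netxms/netxms.db", &db) != SQLITE_OK)
   {
      fprintf(stderr, "cannot open database: %s\n", sqlite3_errmsg(db));
      return 1;
   }

   sqlite3_stmt *stmt;
   bool found = false;
   sqlite3_prepare_v2(db, "PRAGMA table_info(nodes)", -1, &stmt, NULL);
   while (sqlite3_step(stmt) == SQLITE_ROW)
   {
      // table_info rows: cid, name, type, notnull, dflt_value, pk
      const char *name = (const char *)sqlite3_column_text(stmt, 1);
      if (strcmp(name, "rack_image_front") == 0)
         found = true;
   }
   sqlite3_finalize(stmt);
   sqlite3_close(db);

   printf("rack_image_front %s\n", found ? "exists" : "is missing");
   return 0;
}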
#47
General Support / Re: RabbitMQ Monitoring
September 15, 2017, 02:15:12 PM
RabbitMQ is a more functional MQTT broker, as I understand it.
What do you want to monitor?
Could you use a process count?
You can also write an external script to sniff traffic or to test sending/acknowledging messages; see the sketch below.
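
If the management plugin is enabled, another option is to pull the broker's own statistics over HTTP and feed them into a DCI. A minimal sketch with libcurl, assuming the default management port 15672 and placeholder guest credentials:

#include <curl/curl.h>
#include <iostream>
#include <string>

// Collect the raw HTTP response body into a std::string
static size_t collect(char *ptr, size_t size, size_t nmemb, void *userdata)
{
   static_cast<std::string *>(userdata)->append(ptr, size * nmemb);
   return size * nmemb;
}

int main()
{
   curl_global_init(CURL_GLOBAL_DEFAULT);
   CURL *curl = curl_easy_init();

   std::string response;
   // /api/overview returns JSON with queue totals and message rates
   curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:15672/api/overview");
   curl_easy_setopt(curl, CURLOPT_USERPWD, "guest:guest");   // placeholder credentials
   curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
   curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

   CURLcode rc = curl_easy_perform(curl);
   if (rc == CURLE_OK)
      std::cout << response << std::endl;   // parse and hand off to a DCI from here
   else
      std::cerr << curl_easy_strerror(rc) << std::endl;

   curl_easy_cleanup(curl);
   curl_global_cleanup();
   return (rc == CURLE_OK) ? 0 : 1;
}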
#48
Announcements / Re: NetXMS 2.0.8 released
January 13, 2017, 09:00:23 AM
http://askubuntu.com/questions/601/the-following-packages-have-been-kept-back-why-and-how-do-i-solve-it
If the dependencies have changed on one of the packages you have installed so that a new package must be installed to perform the upgrade then that will be listed as "kept-back".

Use:
apt-get dist-upgrade

And make sure you back up your current server DB first ;)
#49
General Support / NetXMS crashes, out of sockets
November 13, 2016, 09:05:51 AM
Our server has been unstable since upgrading to the 2.0.x branch.
We have also been adding more nodes at the same time, so the problem could exist in older versions too.

Server Setup:
OS: Ubuntu 14.04.5 LTS (ESXi)
CPU: 4x E5-2690 @ 2.90GHz
Mem: 8GB

Server stats:
Total number of objects:     10490
Number of monitored nodes:   3737
Number of collectable DCIs:  33514

Server config:
PollerThreadPoolBaseSize: 300
PollerThreadPoolMaxSize: 800
NumberOfDataCollectors: 800


We've been talking on Telegram about this, and last night one of our NOC engineers had some time to debug and found this:

-- Quote --
It seems that NetXMS doesn't handle more than 1024 sockets very well and crashes when it attempts to retransmit data, with the send buffers full, on an fd equal to or larger than 1024.

_opt_netxms206_bin_netxmsd.0.crash

(gdb) bt
#0  0x00007ff8ffe79c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ff8ffe7d028 in __GI_abort () at abort.c:89
#2  0x00007ff8ffeb62a4 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7ff8fffc2113 "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007ff8fff4dbbc in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7ff8fffc20aa "buffer overflow detected") at fortify_fail.c:38
#4  0x00007ff8fff4ca90 in __GI___chk_fail () at chk_fail.c:28
#5  0x00007ff8fff4db07 in __fdelt_chk (d=<optimized out>) at fdelt_chk.c:25
#6  0x00007ff9004603bb in SendEx (hSocket=1149, data=data@entry=0x7ff8b226e580, len=1016, flags=flags@entry=0, mutex=0x7ff8b4172160) at tools.cpp:1084
#7  0x00007ff90097dd1f in ClientSession::sendMessage (this=0x7ff8b418a910, msg=<optimized out>) at session.cpp:1588
#8  0x00007ff900980060 in ClientSession::sendAllObjects (this=this@entry=0x7ff8b418a910, pRequest=pRequest@entry=0x7ff8b02cbcf0) at session.cpp:2294
#9  0x00007ff90099f08d in ClientSession::processingThread (this=0x7ff8b418a910) at session.cpp:798
#10 0x00007ff90099f219 in ClientSession::processingThreadStarter (pArg=<optimized out>) at session.cpp:215
#11 0x00007ff900210184 in start_thread (arg=0x7ff7c5359700) at pthread_create.c:312
#12 0x00007ff8fff3d37d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111


2016-11-12_22-34

(gdb) bt
#0  0x00007f47494aac37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f47494ae028 in __GI_abort () at abort.c:89
#2  0x00007f47494e72a4 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f47495f3113 "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f474957ebbc in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7f47495f30aa "buffer overflow detected") at fortify_fail.c:38
#4  0x00007f474957da90 in __GI___chk_fail () at chk_fail.c:28
#5  0x00007f474957eb07 in __fdelt_chk (d=<optimized out>) at fdelt_chk.c:25
#6  0x00007f4749a913bb in SendEx (hSocket=1180, data=data@entry=0x7f470036ab00, len=424, flags=flags@entry=0, mutex=0x7f470c262ff0) at tools.cpp:1084
#7  0x00007f4749faed1f in ClientSession::sendMessage (this=0x7f470c17dce0, msg=<optimized out>) at session.cpp:1588
#8  0x00007f4749faf0a5 in ClientSession::updateThread (this=0x7f470c17dce0) at session.cpp:658
#9  0x00007f4749faf2b9 in ClientSession::updateThreadStarter (pArg=<optimized out>) at session.cpp:224
#10 0x00007f4749841184 in start_thread (arg=0x7f4627cc1700) at pthread_create.c:312
#11 0x00007f474956e37d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111


2016-11-12_22-46.crash

(gdb) bt
#0  0x00007f77edd68c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f77edd6c028 in __GI_abort () at abort.c:89
#2  0x00007f77edda52a4 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f77edeb1113 "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f77ede3cbbc in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7f77edeb10aa "buffer overflow detected") at fortify_fail.c:38
#4  0x00007f77ede3ba90 in __GI___chk_fail () at chk_fail.c:28
#5  0x00007f77ede3cb07 in __fdelt_chk (d=<optimized out>) at fdelt_chk.c:25
#6  0x00007f77ee34f3bb in SendEx (hSocket=1125, data=data@entry=0x7f77980f3380, len=424, flags=flags@entry=0, mutex=0x7f77a004f3e0) at tools.cpp:1084
#7  0x00007f77ee86cd1f in ClientSession::sendMessage (this=0x7f77a0239b70, msg=<optimized out>) at session.cpp:1588
#8  0x00007f77ee86d0a5 in ClientSession::updateThread (this=0x7f77a0239b70) at session.cpp:658
#9  0x00007f77ee86d2b9 in ClientSession::updateThreadStarter (pArg=<optimized out>) at session.cpp:224
#10 0x00007f77ee0ff184 in start_thread (arg=0x7f76c3891700) at pthread_create.c:312
#11 0x00007f77ede2c37d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111



excerpt of SendEx(SOCKET hSocket, const void *data, size_t len, int flags, MUTEX mutex) in tools.cpp:

do
{
retry:
#ifdef MSG_NOSIGNAL
   nRet = send(hSocket, ((char *)data) + (len - nLeft), nLeft, flags | MSG_NOSIGNAL);
#else
   nRet = send(hSocket, ((char *)data) + (len - nLeft), nLeft, flags);
#endif
   if (nRet <= 0)
   {
      if ((WSAGetLastError() == WSAEWOULDBLOCK)
#ifndef _WIN32
          || (errno == EAGAIN)
#endif
         )
      {
         // Wait until socket becomes available for writing
         struct timeval tv;
         fd_set wfds;

         tv.tv_sec = 60;
         tv.tv_usec = 0;
         FD_ZERO(&wfds);
         FD_SET(hSocket, &wfds);
         nRet = select(SELECT_NFDS(hSocket + 1), NULL, &wfds, NULL, &tv);
         if ((nRet > 0) || ((nRet == -1) && (errno == EINTR)))
            goto retry;
      }
      break;
   }
   nLeft -= nRet;
} while (nLeft > 0);


line 1084 is FD_SET(hSocket, &wfds);


To quote "man select":
Quote
       An  fd_set is a fixed size buffer.  Executing FD_CLR() or FD_SET() with
       a value of fd that is negative or is equal to or larger than FD_SETSIZE
       will result in undefined behavior.


hSocket is 1149, 1180 and 1125 in our crash dumps.


FD_SETSIZE on Linux is 1024:
Quote
    /usr/include/sys/select.h:#define   FD_SETSIZE      __FD_SETSIZE
    /usr/include/bits/typesizes.h:#define   __FD_SETSIZE        1024


Also consider the conditions for a crash: send() must fail with WSAEWOULDBLOCK, meaning that the send buffers are full. This can happen if the network is saturated or if the other side simply doesn't acknowledge the received data. Only then, and only if the socket fd is equal to or larger than 1024, does this crash occur. That would explain the inconsistent behaviour and the perceived correlation with external factors.
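
For reference, one way to lift the FD_SETSIZE limit in that wait loop would be to use poll() instead of select(), since struct pollfd carries the descriptor value itself and has no 1024 cap. A sketch of that replacement for the select() block above (an illustration only, not necessarily the fix that went into NetXMS):

#include <poll.h>
#include <cerrno>

// Wait until the socket becomes writable; poll() takes the descriptor
// value directly, so fds >= 1024 are handled safely (unlike fd_set)
struct pollfd pfd;
pfd.fd = hSocket;
pfd.events = POLLOUT;
nRet = poll(&pfd, 1, 60000);   // timeout in milliseconds (60 s, as before)
if ((nRet > 0) || ((nRet == -1) && (errno == EINTR)))
   goto retry;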
#51
Announcements / Re: NetXMS 2.0.6 released
September 16, 2016, 08:23:48 AM
.deb packages are not in the repo; could you create and upload them?

Thanks :-)
#52
General Support / Re: Agent Cache
June 22, 2016, 01:12:41 PM
Yes, we are using this node to poll external nodes that run an in-house web app presenting a simple web page, which the proxy node parses to extract the value we want for each DCI.

I have monitored the log from the proxy node script and can see that it starts at around 0.1-0.5 seconds per poll, but when the agent starts polling more and more nodes at the same time, each poll takes around 4-5 seconds.

Could this have been fixed if there were an agent cache max poll thread setting?
#53
General Support / Re: Agent Cache
June 16, 2016, 11:31:18 PM
PM'ed you log+db
#54
General Support / Re: Agent Cache
June 14, 2016, 10:53:33 AM
Hi, thanks for the fast reply.

I turned debug level 7 on and restarted. The proxy node has no problem with load right after a restart, so I will wait until tomorrow before sending you the log + SQLite file.
#55
General Support / Agent Cache
June 14, 2016, 09:02:23 AM
I'm running a proxy/source node against multiple nodes that have a web page displaying values/statuses (in-house applications/systems).

Each system is added to NetXMS as an individual node, using an external DCI call on the proxy node.
This is a Python script that downloads the web page and parses out the values for each DCI.
The script caches the download, so when I want more than one value from the same page it is not downloaded once per DCI as long as the cached copy is not older than X; this makes the second and later polls faster.
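
For illustration, the idea behind that cache looks roughly like this (a C++ sketch with libcurl; the real script is Python, and the URL and cache path would be placeholders):

#include <curl/curl.h>
#include <sys/stat.h>
#include <ctime>
#include <fstream>
#include <iterator>
#include <string>

// Stream the downloaded page straight into the cache file
static size_t toFile(char *ptr, size_t size, size_t nmemb, void *userdata)
{
   static_cast<std::ofstream *>(userdata)->write(ptr, size * nmemb);
   return size * nmemb;
}

// Return the page body, re-downloading only when the cached copy on disk
// is older than maxAge seconds, so several DCIs can share one download
std::string fetchCached(const char *url, const char *cacheFile, time_t maxAge)
{
   struct stat st;
   bool fresh = (stat(cacheFile, &st) == 0) && (time(NULL) - st.st_mtime < maxAge);
   if (!fresh)
   {
      std::ofstream out(cacheFile, std::ios::binary);
      CURL *curl = curl_easy_init();
      curl_easy_setopt(curl, CURLOPT_URL, url);
      curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, toFile);
      curl_easy_setopt(curl, CURLOPT_WRITEDATA, &out);
      curl_easy_perform(curl);
      curl_easy_cleanup(curl);
   }
   std::ifstream in(cacheFile, std::ios::binary);
   return std::string((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());
}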

When AgentCache is off, this slows down DCI polling, because polling through the node used as proxy is serial(?).
With AgentCache on, the proxy node goes bananas on load, and some DCIs get overpopulated with records every minute even though the interval is 5 minutes.
Looking at the agent log, I can see it really is polling that many times.

It looks like the agent just spawns threads until it breaks. Is there a config for a max poll thread count on the agent?
And are there any
#56
Sorry for necroing this topic: we have the same issue, and having just upgraded to 2.0.3 we are now able to use this AgentCache function.

On the source node there is a script polling custom services for values.
Without the agent cache this gives one poll per second; with the cache on, it polls much more often within one second.
But it does not save any data to the DCI history. Does "Origin" need to be "Push" rather than "NetXMS agent" when the AgentCache function is on?

Is this function the reason for the SQLite requirement on the agent, and is any agent-side configuration needed (i.e. DB location)?
#57
General Support / Re: Disable DCI in script?
May 23, 2016, 11:42:07 AM
I have the same question. I managed to break various DCIs on 1000 agentless nodes when upgrading the nxagent I use as a proxy (specific requests against an in-house web service/HTTP on the equipment).
The DCIs were auto-disabled as unsupported, and I would love to bulk-enable them all. I have a list of all the DCI IDs.

EDIT: never mind, they were auto-activated on the next poll.
#58
General Support / "Maintenance" mode permissions
May 11, 2016, 09:49:04 AM
I'm able to use Maintenance Enter/Leave/Schedule with all "System Rights" granted to a user, but I can't manage to set this up for a user with only specific rights.

I did find the "Schedule object maintenance time interval" system right, but it only allows a user to schedule maintenance, not to put an object directly into maintenance or take it out.
#59
Is this a bug or should it be like this?

#60
We just upgraded from 1.2.17 to 2.0.3 and discovered that Object Tools are no longer found in the drop-down menu from the Alarm Browser (right-clicking an alarm).

Do we need to enable an option, or has it been removed?

I still get Object Tools when right-clicking a node in Infrastructure.