Hi all!
NetXMS version 1.2.12 is out. Changes since previous release:
- Support for MetaSystem UPS in UPS subagent
- Timed (temporary) alarm acknowledgement
- New subagent DBQuery - replacement for ODBCQuery
- DCI access functions in NXSL work correctly with table DCIs
- Fixed bugs with instance discovery DCIs created from templates
- New property "runtimeFlags" in NXSL class "Node"
- New event SYS_IF_PEER_CHANGED (sent when peer change detected in interface)
- New system permission: Manage Image Library
- Object level access control can be enabled for logs
- New NXSL function FindAllDCIs
- Driver for Allied Telesis switches improved
- Management console:
    - Fixed bug with red zone display in "last value" dashboard element
    - Edit and delete for alarm comments now work
    - Fixed Y axis range can be set for line and bar charts
    - Statuses incompatible with the selected alarm are no longer shown in the alarm menu
    - Alarm status flow can be changed to strict (terminate status can be set only after alarm is resolved). To change the flow, set the "StrictAlarmStatusFlow" parameter to 1.
    - SNMP MIB loaded into memory on first access
- Android Agent:
    - Implemented "Connection notification" in status bar (feature #481)
    - Fixed bug in resetting switch preference (settings)
- Android Console:
    - Fixed bug in resetting switch preference (settings)
    - Implemented "Entire network" root (feature #482)
    - Manage last alarm from status bar: acknowledge, resolve, terminate (only for Android >= 4.1)
- Fixed issues: #79, #88, #280, #285, #393, #415, #470, #475, #481, #482, #483, #484, #486, #487, #490, #497, #500, #502, #504
Best regards,
Victor
Thanks.
I have trouble building from source on "CentOS release 6.3 (Final)":
Making all in netxmsd
make[4]: Entering directory `/usr/local/src/netxms-1.2.12/src/server/netxmsd'
CXXLD netxmsd
../core/.libs/libnxcore.so: undefined reference to `xmpp_stanza_add_child_ex'
collect2: ld returned 1 exit status
make[4]: *** [netxmsd] Error 1
ldconfig didn't help. Can you please tell me what I'm missing?
Hi!
Try adding --disable-xmpp to configure as a workaround. I'll check what the problem is in the meantime.
Best regards,
Victor
Installation from deb on debian squeeze:
Setting up netxms-server (1.2.12) ...
NetXMS: compiling MIB files
/usr/bin/nxmibc: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory
NetXMS: upgrading database
/usr/bin/nxdbmgr: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory
/usr/bin/netxmsd: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory
Setting up netxms-server-oracle (1.2.12) ...
Looks like you migrated to wheezy? :)
--disable-xmpp solved the build error.
After starting, netxmsd sat at 100% CPU for quite some time (hours!); now it's usable.
Thanks, Peter.
Hi Victor,
Does this version support H3C and ShoreTel? It was requested previously :)
https://www.netxms.org/forum/announcements/netxms-1-2-7-released/msg10340/#msg10340
Thanks.
Yes, NetXMS 1.2.12 is built on Debian 7 (wheezy). I'll publish a Debian 6 build shortly, as it seems that there are still many Debian 6 installations.
Best regards,
Victor
Debian 6 binaries are available at https://www.netxms.org/apt/dists/squeeze/main/
Best regards,
Victor
As always, it's good to have a new version full of new features, and I can't wait for the next release.
What about the "help" issue? For me, configuration is going too slowly due to the lack of good documentation for everything.
Quote from: lindeamon on February 19, 2014, 11:34:08 AM
As always, it's good to have a new version full of new features, and I can't wait for the next release.
What about the "help" issue? For me, configuration is going too slowly due to the lack of good documentation for everything.
+1 to that.
A help button that points to certain places in the online documentation would be nice, maybe to pages in a wiki.
You could make it a community project so NetXMS enthusiasts can add to the documentation. That would certainly save you some time.
It's a free product, so there's no harm in making a bit of use of the community. I would love to help out with that, and I'm probably not the only one.
Any help with documentation is very welcome :) You can edit the manual in the wiki (I'll move it to PDF then), or just create separate articles about how to configure things.
Best regards,
Victor
I just updated to V1.2.12 (on 64-bit Linux, from sources). When I try to upload a new image to the Image Library, I get the error 'Access denied'. I then checked the user group for the new permission 'Manage Image Library', but there is no such item in the list (in the Java console). See the attached screenshot.
Thank you Victor,
Personally, I would have helped out more, but I believe that I have a lot of missing knowledge to begin with.
An idea for a rapid solution would be to create some sort of list of all the abilities and possibilities of the system (for example, a list of all the possible subagents), so that at least we will know how to better use the program, and with the help of us, the users, fill out every section of the list along the way.
Awesome work, awesome software, and great excitement with every new release for me. Keep up the good work.
Lindeamon
Quote from: Dani@M3T on February 19, 2014, 10:18:56 PM
I just updated to V1.2.12 (on 64-bit Linux, from sources). When I try to upload a new image to the Image Library, I get the error 'Access denied'. I then checked the user group for the new permission 'Manage Image Library', but there is no such item in the list (in the Java console). See the attached screenshot.
Yes, it seems that the GUI update was forgotten. We will fix it in the next patch release.
Best regards,
Victor
Hi Victor
Is there some kind of workaround to upload images in the meantime?
thanks, Dani
There are two workarounds:
1. Use the built-in "admin" account - it always has all access rights.
2. Assign access rights directly in the database. To do so, stop the NetXMS server, then execute a query like this:
UPDATE users SET system_access=system_access+134217728 WHERE id=insert_user_id_here
and start the server again.
Best regards,
Victor
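A note on the constant in the query above: 134217728 is 2^27, that is, a single bit in the system_access bitmask. The sketch below only illustrates the arithmetic. The name MANAGE_IMAGE_LIBRARY_BIT is a label made up here (not a NetXMS identifier), and treating this bit as the "Manage Image Library" right is inferred from the context of the thread. It shows why adding the value twice would corrupt the mask, while a bitwise OR can be repeated safely.

```python
# Hypothetical name for illustration; the value is taken from the query above.
MANAGE_IMAGE_LIBRARY_BIT = 134217728  # equals 1 << 27


def grant(system_access: int, bit: int = MANAGE_IMAGE_LIBRARY_BIT) -> int:
    """Return system_access with the bit set; repeating the call changes nothing."""
    return system_access | bit


def has_right(system_access: int, bit: int = MANAGE_IMAGE_LIBRARY_BIT) -> bool:
    """Check whether the bit is already set in system_access."""
    return (system_access & bit) != 0


current = 5                          # example mask without the new bit
updated = grant(current)             # same result as current + 134217728 here
assert grant(updated) == updated     # OR is idempotent
assert updated + MANAGE_IMAGE_LIBRARY_BIT != updated  # '+' applied twice would change the mask
```

So with the `+` form of the UPDATE, it is worth checking that the bit is not already set for that user before running it.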
That works. Thank you.
I have tried to install NetXMS on clean Debian 6 and Debian 7 systems from https://www.netxms.org/apt/dists/
I found the following errors:
Debian 7:
https://www.netxms.org/apt/dists/wheezy/main/binary-i386/netxms-base_1.2.12_i386.deb
Needs libssl 0.9, which is only available in Debian 6.
https://www.netxms.org/apt/dists/wheezy/main/binary-i386/netxms-server-oracle_1.2.12_i386.deb
Needs libio1 to start; it is not listed in the dependencies.
The contrib folder is not included in the packages.
Debian 6:
https://www.netxms.org/apt/dists/squeeze/main/binary-i386/netxms-server_1.2.12_i386.deb
contains netxmsd and nxdbmgr with 11 versions.
The daemon does not start after the SQL database is initialised.
On both:
The packages do not contain the images folder.
Debian 7 + Oracle
Errors in event monitor
2014-02-24 10:58:13 i0debnxms Critical SYS_DB_QUERY_FAILED Database query failed (Query: INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?); Error: ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME"))
See attachment.
It looks like an object with an empty name was somehow created. Could you send the output of the command
nxadm -c "show objects"
?
After that you can try to find this object and give it a name manually.
Quote from: Victor Kirhenshtein on February 24, 2014, 11:05:12 AM
It looks like an object with an empty name was somehow created. Could you send the output of the command
nxadm -c "show objects"
?
After that you can try to find this object and give it a name manually.
That is rather problematic...
nxadm -c "show stat"
Total number of objects: 22425
Number of monitored nodes: 898
Number of collectable DCIs: 913
POLL ERROR: Software caused connection abort: socket write error **** Poll failed ****
The console stops working after 5-10 minutes.
This started after upgrading to 1.2.12; server and console are the same version.
Anyone else with the same issue?
Regards
Quote from: d-ray on February 26, 2014, 05:17:26 PM
POLL ERROR: Software caused connection abort: socket write error **** Poll failed ****
The console stops working after 5-10 minutes.
This started after upgrading to 1.2.12; server and console are the same version.
Anyone else with the same issue?
Regards
It's a known issue. It is already fixed in the current development version; we will publish the next patch release soon.
Best regards,
Victor
Quote from: Victor Kirhenshtein on February 26, 2014, 11:44:49 PM
Quote from: d-ray on February 26, 2014, 05:17:26 PM
POLL ERROR: Software caused connection abort: socket write error **** Poll failed ****
The console stops working after 5-10 minutes.
This started after upgrading to 1.2.12; server and console are the same version.
Anyone else with the same issue?
Regards
It's a known issue. It is already fixed in the current development version; we will publish the next patch release soon.
Best regards,
Victor
ooh that's nice.
Where can I find the patch release when it is available?
Best Regards,
Denis
It will be published on the web site and announced on the forum as usual.
Best regards,
Victor
Will this version allow me to do what is in the following post and if so, how do I set it up?
https://www.netxms.org/forum/general-support/dci-thresold-on-string/
Quote from: Rabid on February 28, 2014, 10:31:21 PM
Will this version allow me to do what is in the following post and if so, how do I set it up?
https://www.netxms.org/forum/general-support/dci-thresold-on-string/
Sorry, I forgot to do this in 1.2.12. I've added a service list and a service table to the Windows agent in the upcoming 1.2.13 release.
Best regards,
Victor
Debian 7 + mysql
In the Debian client:
Object Details -> Last values
The window is empty even though data is being collected.
Everything works in the Windows client.
See screenshots.
Debian 7 + oracle 11g
This morning I found netxms not working.
The last log entries:
$ tail /var/log/netxms
[05-Mar-2014 04:39:53.636] [ERROR] SQL query failed (Query = "INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME")
[05-Mar-2014 04:39:53.638] [ERROR] SQL query failed (Query = "INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME")
[05-Mar-2014 04:39:53.639] [ERROR] SQL query failed (Query = "INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME")
[05-Mar-2014 04:39:53.641] [ERROR] SQL query failed (Query = "INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME")
[05-Mar-2014 04:39:53.642] [ERROR] SQL query failed (Query = "INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME")
[05-Mar-2014 04:39:53.644] [ERROR] SQL query failed (Query = "INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME")
[05-Mar-2014 04:39:53.646] [ERROR] SQL query failed (Query = "INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME")
[05-Mar-2014 04:39:53.647] [ERROR] SQL query failed (Query = "INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME")
[05-Mar-2014 04:39:53.649] [ERROR] SQL query failed (Query = "INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME")
[05-Mar-2014 04:39:53.650] [ERROR] SQL query failed (Query = "INSERT INTO object_properties (name,status,is_deleted,inherit_access_rights,last_modified,status_calc_alg,status_prop_alg,status_fixed_val,status_shift,status_translation,status_single_threshold,status_thresholds,comments,is_system,location_type,latitude,longitude,location_accuracy,location_timestamp,guid,image,submap_id,object_id) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"): ORA-01400: cannot insert NULL into ("NETXMS"."OBJECT_PROPERTIES"."NAME")
The errors above tell almost nothing.
I ran an additional query:
$ grep -v 'SQL query failed (Query = "INSERT INTO object_properties' /var/log/netxms
[05-Mar-2014 03:11:49.955] Log file truncated.
[05-Mar-2014 03:13:50.710] [ERROR] SQL query failed (Query = "INSERT INTO raw_dci_values (item_id,raw_value,last_poll_time) VALUES (30893,'CF E0 EA E5 F2 20 E4 F0 E0 E9 E2 E5 F0 EE E2 20 57 69 6E 64 6F 77 73 20 2D 20 41 76 74 6F 72 20 4C 74 64 2E 20 28 52 45 41 44 45 52 33 37 38 30 29 20 53 6D 61 72 74 43 61 72 64 5????"): ORA-01756: quoted string not properly terminated
[05-Mar-2014 03:13:50.818] [ERROR] SQL query failed (Query = "INSERT INTO raw_dci_values (item_id,raw_value,last_poll_time) VALUES (30914,'C1 E0 E7 EE E2 FB E9 20 EF E0 EA E5 F2 20 EF EE F1 F2 E0 E2 F9 E8 EA E0 20 F1 EB F3 E6 E1 FB 20 EA F0 E8 EF F2 EE E3 F0 E0 F4 E8 E8 20 F1 EC E0 F0 F2 2D EA E0 F0 F2 20 28 4D 69 6????"): ORA-01756: quoted string not properly terminated
[05-Mar-2014 03:13:50.822] [ERROR] SQL query failed (Query = "INSERT INTO raw_dci_values (item_id,raw_value,last_poll_time) VALUES (30915,'CE E1 ED EE E2 EB E5 ED E8 E5 20 E1 E5 E7 EE EF E0 F1 ED EE F1 F2 E8 20 E4 EB FF 20 EF F0 EE E8 E3 F0 FB E2 E0 F2 E5 EB FF 20 57 69 6E 64 6F 77 73 20 4D 65 64 69 61 20 2D 20 28 4????"): ORA-01756: quoted string not properly terminated
[05-Mar-2014 03:13:50.830] [ERROR] SQL query failed (Query = "INSERT INTO raw_dci_values (item_id,raw_value,last_poll_time) VALUES (30917,'CE E1 ED EE E2 EB E5 ED E8 E5 20 E1 E5 E7 EE EF E0 F1 ED EE F1 F2 E8 20 E4 EB FF 20 EF F0 EE E8 E3 F0 FB E2 E0 F2 E5 EB FF 20 57 69 6E 64 6F 77 73 20 4D 65 64 69 61 20 36 2E 34 2????"): ORA-01756: quoted string not properly terminated
[05-Mar-2014 03:13:50.839] [ERROR] SQL query failed (Query = "INSERT INTO raw_dci_values (item_id,raw_value,last_poll_time) VALUES (30919,'CE E1 ED EE E2 EB E5 ED E8 E5 20 E1 E5 E7 EE EF E0 F1 ED EE F1 F2 E8 20 E4 EB FF 20 EF F0 EE E8 E3 F0 FB E2 E0 F2 E5 EB FF 20 57 69 6E 64 6F 77 73 20 4D 65 64 69 61 20 31 31 20 2????"): ORA-01756: quoted string not properly terminated
[05-Mar-2014 03:13:50.863] [ERROR] SQL query failed (Query = "INSERT INTO raw_dci_values (item_id,raw_value,last_poll_time) VALUES (30920,'C8 F1 EF F0 E0 E2 EB E5 ED E8 E5 20 E4 EB FF 20 EF F0 EE E8 E3 F0 FB E2 E0 F2 E5 EB FF 20 57 69 6E 64 6F 77 73 20 4D 65 64 69 61 20 31 31 20 2D 20 28 4B 42 39 33 39 36 38 33 29 '????"): ORA-00917: missing comma
[05-Mar-2014 03:13:50.894] [ERROR] SQL query failed (Query = "INSERT INTO raw_dci_values (item_id,raw_value,last_poll_time) VALUES (30925,'CE E1 ED EE E2 EB E5 ED E8 E5 20 E1 E5 E7 EE EF E0 F1 ED EE F1 F2 E8 20 E4 EB FF 20 57 69 6E 64 6F 77 73 20 49 6E 74 65 72 6E 65 74 20 45 78 70 6C 6F 72 65 72 20 37 20 28 4B 42 3????"): ORA-01756: quoted string not properly terminated
[05-Mar-2014 04:06:45.258] [ERROR] SQL query failed (Query = "UPDATE alarms SET alarm_state=0,ack_by=0,term_by=0,last_change_time=1393992405,current_severity=2,repeat_count=4,hd_state=0,hd_ref='',timeout=0,timeout_event=43,message='Invalid network mask 255.255.248.0 on interface "Realtek RTL8139 Family PCI Fast Ethernet NIC - ???????? ???????", should be 255.255.240.0',resolved_by=0, ack_timeout=0 WHERE alarm_id=25392"): ORA-24373: invalid length specified for statement
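A side note on the raw_dci_values errors above: the raw_value fields are logged as space-separated hex bytes. A minimal sketch for making them readable, assuming the bytes are Windows-1251 (cp1251) text, which is a guess based on the byte values and not something stated in the log. The function name is made up for this example. Only complete byte pairs should be passed in, since the log lines themselves are truncated.

```python
def decode_raw_value(hex_dump: str, encoding: str = "cp1251") -> str:
    """Decode a space-separated hex dump such as 'CF E0 EA' into text.

    The default encoding is an assumption (Windows-1251).
    """
    return bytes.fromhex(hex_dump).decode(encoding)


# First bytes of the value logged for item_id 30893.
sample = "CF E0 EA E5 F2 20 E4 F0 E0 E9 E2 E5 F0 EE E2 20 57 69 6E 64 6F 77 73"
print(decode_raw_value(sample))  # prints "Пакет драйверов Windows"
```

If that assumption holds, the values are names of installed software containing Cyrillic text, which fits the "installed software" DCIs mentioned later in the thread.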
Syslog
Mar 5 04:39:55 i0debnxms kernel: [66247.765920] netxmsd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Mar 5 04:39:55 i0debnxms kernel: [66247.765924] netxmsd cpuset=/ mems_allowed=0
Mar 5 04:39:55 i0debnxms kernel: [66247.765929] Pid: 3468, comm: netxmsd Not tainted 3.2.0-4-amd64 #1 Debian 3.2.54-2
Mar 5 04:39:55 i0debnxms kernel: [66247.765931] Call Trace:
Mar 5 04:39:55 i0debnxms kernel: [66247.765939] [<ffffffff810b742a>] ? dump_header+0x78/0x1bd
Mar 5 04:39:55 i0debnxms kernel: [66247.765943] [<ffffffff8134fdc7>] ? _raw_spin_unlock_irqrestore+0xe/0xf
Mar 5 04:39:55 i0debnxms kernel: [66247.765947] [<ffffffff81097dce>] ? delayacct_end+0x72/0x7d
Mar 5 04:39:55 i0debnxms kernel: [66247.765951] [<ffffffff81164a56>] ? security_real_capable_noaudit+0x40/0x4f
Mar 5 04:39:55 i0debnxms kernel: [66247.765953] [<ffffffff8134fdc7>] ? _raw_spin_unlock_irqrestore+0xe/0xf
Mar 5 04:39:55 i0debnxms kernel: [66247.765956] [<ffffffff810b77f3>] ? oom_kill_process+0x49/0x271
Mar 5 04:39:55 i0debnxms kernel: [66247.765958] [<ffffffff810b7eee>] ? out_of_memory+0x2ea/0x337
Mar 5 04:39:55 i0debnxms kernel: [66247.765962] [<ffffffff810bbb85>] ? __alloc_pages_nodemask+0x629/0x7aa
Mar 5 04:39:55 i0debnxms kernel: [66247.765967] [<ffffffff810e5246>] ? alloc_pages_current+0xc7/0xe4
Mar 5 04:39:55 i0debnxms kernel: [66247.765970] [<ffffffff810b6b2f>] ? filemap_fault+0x24f/0x33e
Mar 5 04:39:55 i0debnxms kernel: [66247.765974] [<ffffffff8106e102>] ? futex_wait+0x1fe/0x236
Mar 5 04:39:55 i0debnxms kernel: [66247.765978] [<ffffffff810ced10>] ? __do_fault+0xc8/0x3ac
Mar 5 04:39:55 i0debnxms kernel: [66247.765981] [<ffffffff810d12c7>] ? handle_pte_fault+0x298/0x79f
Mar 5 04:39:55 i0debnxms kernel: [66247.765984] [<ffffffff810ce865>] ? pte_offset_kernel+0x16/0x35
Mar 5 04:39:55 i0debnxms kernel: [66247.765988] [<ffffffff81352d2e>] ? do_page_fault+0x320/0x345
Mar 5 04:39:55 i0debnxms kernel: [66247.765992] [<ffffffff81048cd2>] ? release_task+0x31b/0x331
Mar 5 04:39:55 i0debnxms kernel: [66247.765995] [<ffffffff81036628>] ? should_resched+0x5/0x23
Mar 5 04:39:55 i0debnxms kernel: [66247.765998] [<ffffffff8106fb91>] ? sys_futex+0x120/0x153
Mar 5 04:39:55 i0debnxms kernel: [66247.766001] [<ffffffff81350335>] ? page_fault+0x25/0x30
Mar 5 04:39:55 i0debnxms kernel: [66247.766004] Mem-Info:
Mar 5 04:39:55 i0debnxms kernel: [66247.766005] Node 0 DMA per-cpu:
Mar 5 04:39:55 i0debnxms kernel: [66247.766007] CPU 0: hi: 0, btch: 1 usd: 0
Mar 5 04:39:55 i0debnxms kernel: [66247.766008] Node 0 DMA32 per-cpu:
Mar 5 04:39:55 i0debnxms kernel: [66247.766010] CPU 0: hi: 186, btch: 31 usd: 0
Mar 5 04:39:55 i0debnxms kernel: [66247.766014] active_anon:118067 inactive_anon:118119 isolated_anon:0
Mar 5 04:39:55 i0debnxms kernel: [66247.766015] active_file:9 inactive_file:24 isolated_file:0
Mar 5 04:39:55 i0debnxms kernel: [66247.766016] unevictable:0 dirty:0 writeback:0 unstable:0
Mar 5 04:39:55 i0debnxms kernel: [66247.766016] free:12260 slab_reclaimable:792 slab_unreclaimable:2812
Mar 5 04:39:55 i0debnxms kernel: [66247.766017] mapped:1 shmem:0 pagetables:1187 bounce:0
Mar 5 04:39:55 i0debnxms kernel: [66247.766019] Node 0 DMA free:4660kB min:680kB low:848kB high:1020kB active_anon:5572kB inactive_anon:5660kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:20kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:51 all_unreclaimable? yes
Mar 5 04:39:55 i0debnxms kernel: [66247.766028] lowmem_reserve[]: 0 994 994 994
Mar 5 04:39:55 i0debnxms kernel: [66247.766031] Node 0 DMA32 free:44380kB min:44372kB low:55464kB high:66556kB active_anon:466696kB inactive_anon:466816kB active_file:36kB inactive_file:96kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1018016kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:3168kB slab_unreclaimable:11248kB kernel_stack:1648kB pagetables:4728kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:7801 all_unreclaimable? yes
Mar 5 04:39:55 i0debnxms kernel: [66247.766040] lowmem_reserve[]: 0 0 0 0
Mar 5 04:39:55 i0debnxms kernel: [66247.766042] Node 0 DMA: 1*4kB 2*8kB 2*16kB 2*32kB 3*64kB 2*128kB 2*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4660kB
Mar 5 04:39:55 i0debnxms kernel: [66247.766049] Node 0 DMA32: 133*4kB 79*8kB 45*16kB 20*32kB 20*64kB 17*128kB 4*256kB 3*512kB 1*1024kB 1*2048kB 8*4096kB = 44380kB
Mar 5 04:39:55 i0debnxms kernel: [66247.766055] 1145 total pagecache pages
Mar 5 04:39:55 i0debnxms kernel: [66247.766056] 1104 pages in swap cache
Mar 5 04:39:55 i0debnxms kernel: [66247.766058] Swap cache stats: add 324461, delete 323357, find 89314/108284
Mar 5 04:39:55 i0debnxms kernel: [66247.766059] Free swap = 0kB
Mar 5 04:39:55 i0debnxms kernel: [66247.766060] Total swap = 392188kB
Mar 5 04:39:55 i0debnxms kernel: [66247.768594] 262128 pages RAM
Mar 5 04:39:55 i0debnxms kernel: [66247.768596] 5436 pages reserved
Mar 5 04:39:55 i0debnxms kernel: [66247.768597] 13 pages shared
Mar 5 04:39:55 i0debnxms kernel: [66247.768598] 243869 pages non-shared
Mar 5 04:39:55 i0debnxms kernel: [66247.768599] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Mar 5 04:39:55 i0debnxms kernel: [66247.768607] [ 339] 0 339 5378 1 0 -17 -1000 udevd
Mar 5 04:39:55 i0debnxms kernel: [66247.768610] [ 461] 0 461 5360 1 0 -17 -1000 udevd
Mar 5 04:39:55 i0debnxms kernel: [66247.768613] [ 462] 0 462 5360 1 0 -17 -1000 udevd
Mar 5 04:39:55 i0debnxms kernel: [66247.768616] [ 1744] 0 1744 4743 13 0 0 0 rpcbind
Mar 5 04:39:55 i0debnxms kernel: [66247.768619] [ 1775] 102 1775 5836 1 0 0 0 rpc.statd
Mar 5 04:39:55 i0debnxms kernel: [66247.768622] [ 1793] 0 1793 6323 0 0 0 0 rpc.idmapd
Mar 5 04:39:55 i0debnxms kernel: [66247.768625] [ 2075] 0 2075 30415 198 0 0 0 nxagentd
Mar 5 04:39:55 i0debnxms kernel: [66247.768628] [ 2093] 0 2093 13293 0 0 0 0 rsyslogd
Mar 5 04:39:55 i0debnxms kernel: [66247.768631] [ 2137] 0 2137 1029 0 0 0 0 acpid
Mar 5 04:39:55 i0debnxms kernel: [66247.768634] [ 2162] 0 2162 4168 0 0 0 0 atd
Mar 5 04:39:55 i0debnxms kernel: [66247.768636] [ 2207] 0 2207 5726 17 0 0 0 cron
Mar 5 04:39:55 i0debnxms kernel: [66247.768639] [ 2237] 106 2237 7451 0 0 0 0 dbus-daemon
Mar 5 04:39:55 i0debnxms kernel: [66247.768642] [ 2516] 101 2516 11702 4 0 0 0 exim4
Mar 5 04:39:55 i0debnxms kernel: [66247.768645] [ 2557] 0 2557 12482 0 0 -17 -1000 sshd
Mar 5 04:39:55 i0debnxms kernel: [66247.768648] [ 2590] 0 2590 4695 1 0 0 0 getty
Mar 5 04:39:55 i0debnxms kernel: [66247.768650] [ 2591] 0 2591 4695 1 0 0 0 getty
Mar 5 04:39:55 i0debnxms kernel: [66247.768653] [ 2592] 0 2592 4695 1 0 0 0 getty
Mar 5 04:39:55 i0debnxms kernel: [66247.768655] [ 2593] 0 2593 4695 1 0 0 0 getty
Mar 5 04:39:55 i0debnxms kernel: [66247.768658] [ 2594] 0 2594 4695 1 0 0 0 getty
Mar 5 04:39:55 i0debnxms kernel: [66247.768660] [ 2595] 0 2595 4695 1 0 0 0 getty
Mar 5 04:39:55 i0debnxms kernel: [66247.768663] [27084] 0 27084 469821 233310 0 0 0 netxmsd
Mar 5 04:39:55 i0debnxms kernel: [66247.768665] Out of memory: Kill process 27084 (netxmsd) score 902 or sacrifice child
Mar 5 04:39:55 i0debnxms kernel: [66247.768764] Killed process 27084 (netxmsd) total-vm:1879284kB, anon-rss:933240kB, file-rss:0kB
Looks like a memory leak.
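To put the OOM report above into numbers: the rss and total_vm columns of the process table are counted in pages, not kilobytes. The sketch below assumes a 4 KiB page size (typical for x86-64, but not shown in the log) and uses only the figures from the netxmsd row and the "pages RAM" line.

```python
PAGE_KB = 4  # assumed page size in KiB; not stated in the log


def pages_to_kb(pages: int, page_kb: int = PAGE_KB) -> int:
    """Convert a page count from the OOM report into kilobytes."""
    return pages * page_kb


rss_kb = pages_to_kb(233310)       # rss column of the netxmsd row
total_vm_kb = pages_to_kb(469821)  # total_vm column of the netxmsd row
ram_kb = pages_to_kb(262128)       # "262128 pages RAM"

# Both values match the "Killed process" line: anon-rss:933240kB, total-vm:1879284kB.
assert rss_kb == 933240
assert total_vm_kb == 1879284

share = rss_kb / ram_kb
print(f"netxmsd resident memory: {rss_kb} kB, about {share:.0%} of RAM")
```

Under that assumption, netxmsd alone held most of the roughly 1 GiB of RAM when it was killed, and swap was already exhausted ("Free swap = 0kB").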
The output of
$ ps ax -F
shows nothing suspicious.
$ free -l -b
                     total        used        free      shared     buffers      cached
Mem:            1051672576    96010240   955662336           0    14950400    41709568
Low:            1051672576    96010240   955662336
High:                    0           0           0
-/+ buffers/cache:            39350272  1012322304
Swap:            401600512     9871360   391729152
Nothing except netxms is installed on the server.
Netxms was installed from packages:
$ ls -lah
total 9.4M
drwxr-xr-x 2 andrey andrey 4.0K Feb 19 10:16 .
drwxr-xr-x 6 andrey andrey 4.0K Feb 19 12:41 ..
-rw-r--r-- 1 andrey andrey 1.1M Feb 18 08:50 netxms-agent_1.2.12_amd64.deb
-rw-r--r-- 1 andrey andrey 1.4M Feb 18 08:50 netxms-base_1.2.12_amd64.deb
-rw-r--r-- 1 andrey andrey 6.9M Feb 18 08:50 netxms-server_1.2.12_amd64.deb
-rw-r--r-- 1 andrey andrey 27K Feb 18 08:50 netxms-server-oracle_1.2.12_amd64.deb
My guess is that the culprit is the DB module (oracle).
Additionally installed:
instantclient-basic-linux.x64-11.2.0.4.0
After a restart, errors pour into the log:
[05-Mar-2014 08:31:11.443] [ERROR] SQL query failed (Query = "INSERT INTO raw_dci_values (item_id,raw_value,last_poll_time) VALUES (30920,'C8 F1 EF F0 E0 E2 EB E5 ED E8 E5 20 E4 EB FF 20 EF F0 EE E8 E3 F0 FB E2 E0 F2 E5 EB FF 20 57 69 6E 64 6F 77 73 20 4D 65 64 69 61 20 31 31 20 2D 20 28 4B 42 39 33 39 36 38 33 29 '????"): ORA-00917: missing comma
[05-Mar-2014 08:31:11.469] [ERROR] SQL query failed (Query = "INSERT INTO raw_dci_values (item_id,raw_value,last_poll_time) VALUES (30925,'CE E1 ED EE E2 EB E5 ED E8 E5 20 E1 E5 E7 EE EF E0 F1 ED EE F1 F2 E8 20 E4 EB FF 20 57 69 6E 64 6F 77 73 20 49 6E 74 65 72 6E 65 74 20 45 78 70 6C 6F 72 65 72 20 37 20 28 4B 42 3????"): ORA-01756: quoted string not properly terminated
[05-Mar-2014 08:31:57.546] [ERROR] Thread "Syncer Thread" does not respond to watchdog thread
Debian7 + oracle 11g
netxms ate all the memory and crashed again.
$ uptime
09:19:17 up 55 min, 2 users, load average: 0.03, 0.25, 0.23
syslog:
Mar 5 09:15:21 i0debnxms kernel: [ 3105.416859] Out of memory: Kill process 2116 (netxmsd) score 888 or sacrifice child
Mar 5 09:15:21 i0debnxms kernel: [ 3105.416958] Killed process 2116 (netxmsd) total-vm:1928652kB, anon-rss:922256kB, file-rss:0kB
I deleted 75000 DCIs.
Memory consumption after startup dropped from 960 MB to 215 MB.
CPU load dropped from 50-60% to 0.5-2%.
Language switching does not work in the Windows console.
Quote from: andrey--k on March 05, 2014, 11:25:51 AM
I deleted 75000 DCIs.
Memory consumption after startup dropped from 960 MB to 215 MB.
CPU load dropped from 50-60% to 0.5-2%.
What type were those DCIs - SNMP or agent? What was the collection interval?
Best regards,
Victor
They were SNMP-based dynamic DCIs (installed software).
Collection interval: 20 000.
They were assigned to 200+ nodes.
Quote from: andrey--k on March 05, 2014, 05:09:29 PM
Language switching does not work in the Windows console.
In the morning I found the console in Russian.
I ran a test:
I switch to English - the console asks for a restart.
I agree, the console restarts - still Russian.
I close it and open it again - now it's English...
I have run out of memory again.
This morning netxms was killed by the system because of low memory. The syslog dump is attached.
The system has 3 GiB of RAM.
nxadm -c " sh sta"
Total number of objects: 36245
Number of monitored nodes: 2533
Number of collectable DCIs: 4172
DCI:
snmp-table.
After almost an hour of running, netxms used:
$ ps all x | grep netx
F   UID   PID  PPID PRI  NI     VSZ     RSS WCHAN  STAT TTY   TIME COMMAND
1     0  4826     1  20   0 2527320 1943124 ?      Ssl  ?    53:00 /usr/local/bin/netxmsd -d -D 5
$ free -l
                     total        used        free      shared     buffers      cached
Mem:               3095372     2216484      878888           0       37024      180472
Low:               3095372     2216484      878888
High:                    0           0           0
-/+ buffers/cache:             1998988     1096384
Swap:               731132           0      731132
Debian 7 + mysql
After an element is deleted, error messages appear until the daemon is restarted:
[13-Mar-2014 21:11:17.846] [DEBUG] * Syncer * Unable to delete object with id 149 because it is being referenced 2 time(s)
[13-Mar-2014 21:11:17.846] [DEBUG] * Syncer * Unable to delete object with id 197 because it is being referenced 1 time(s)
[13-Mar-2014 21:11:17.846] [DEBUG] * Syncer * Unable to delete object with id 199 because it is being referenced 1 time(s)
[13-Mar-2014 21:12:17.866] [DEBUG] * Syncer * Unable to delete object with id 149 because it is being referenced 2 time(s)
[13-Mar-2014 21:12:17.866] [DEBUG] * Syncer * Unable to delete object with id 197 because it is being referenced 1 time(s)
[13-Mar-2014 21:12:17.866] [DEBUG] * Syncer * Unable to delete object with id 199 because it is being referenced 1 time(s)
netxms-snapshot-develop (2014-03-11) + Debian 7 + Oracle 11
This morning I found netxmsd down.
There were no warnings or errors in /var/log/netxms or syslog,
but dmesg shows a segfault at the same time as the last message in /var/log/netxms:
[260980.857523] conftest[20746]: segfault at 0 ip 00007fb7e577384b sp 00007fffdab78470 error 4 in libc-2.13.so[7fb7e5702000+182000]
netxms-snapshot-develop (2014-03-17) + Debian 7 + Oracle 11
netxms crashed again because of memory...
The syslog is attached.
DCI collection is disabled.
$ nxadm -c " show stats"
Total number of objects: 18144
Number of monitored nodes: 1023
Number of collectable DCIs: 1030
What could it be?
How do I deal with the crashes?
Edit:
I re-read the syslog messages.
This time the crash was caused by swap running out. I added space to the swap partition... let's watch...
What does the graph of memory usage by the netxmsd process look like? Maybe it is simply a memory leak...
Quote from: Victor Kirhenshtein on March 18, 2014, 05:10:43 PM
What does the graph of memory usage by the netxmsd process look like? Maybe it is simply a memory leak...
It grows constantly.
I can give exact figures tomorrow morning.
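One way to collect exact figures for such a graph on Linux is to sample the VmRSS line of /proc/<pid>/status at regular intervals. A minimal sketch, with function names made up for this example; this is not a NetXMS tool, and the same number can also be read from ps or gathered with an agent DCI.

```python
import re
from typing import Optional


def parse_vmrss_kb(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vmrss_kb(fh.read())


# Example with a hand-written status fragment (values are illustrative).
example = "Name:\tnetxmsd\nVmRSS:\t 1943124 kB\nThreads:\t40\n"
print(parse_vmrss_kb(example))
```

Calling read_rss_kb with the netxmsd PID once a minute and logging the result with a timestamp would be enough to tell steady growth from a plateau.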
Would it be possible to run it under valgrind?
Quote from: Victor Kirhenshtein on March 18, 2014, 06:43:18 PM
Would it be possible to run it under valgrind?
I think so.
Are any specific steps required before launching (recompiling with extra flags, environment settings)?
No, just run it. The recommended valgrind parameters are --leak-check=full --undef-value-errors=no
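As an illustration, the recommended parameters can be assembled into a full command line like this. The binary path and the -D 5 debug level are taken from the ps output earlier in this thread. Leaving out -d so that the server stays in the foreground is an assumption, and the log file name is made up; --log-file itself is a standard valgrind option.

```python
import shlex


def valgrind_command(binary, binary_args=(), log_file=None):
    """Build a valgrind command line with the parameters recommended above."""
    cmd = ["valgrind", "--leak-check=full", "--undef-value-errors=no"]
    if log_file:
        # Write the valgrind report to a file instead of stderr.
        cmd.append(f"--log-file={log_file}")
    cmd.append(binary)
    cmd.extend(binary_args)
    return cmd


cmd = valgrind_command(
    "/usr/local/bin/netxmsd",          # path as seen in the ps output in this thread
    ["-D", "5"],                       # debug level from the same ps output
    log_file="valgrind-netxmsd.log",   # hypothetical file name
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run directly.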
As a test, I ran it on a "test" configuration.
Debian7 + mysql
The result is attached.
I opened the valgrind Quick Start Guide.
Among the recommendations:
Compile your program with -g to include debugging information so that Memcheck's error messages include exact line numbers. Using -O0 is also a good idea
How do I correctly enable these options with ./configure?
As far as I understand, --enable-debug is used to add "-g".
How can I enable -O0 instead of -O2 without rewriting configure?
Update:
I manually changed -O2 to -O0 in configure,
then recompiled, started and stopped the daemon.
The console output is at this link:
https://drive.google.com/file/d/0B0SS9KVzB6egT1BNSm9qUW0zQ0k/edit?usp=sharing
Results after 40+ minutes of running...
https://drive.google.com/file/d/0B0SS9KVzB6egbG9hSmE0MXZwZnM/edit?usp=sharing
The same for the agent:
https://drive.google.com/file/d/0B0SS9KVzB6egQWtqaEMwWUtiYUk/edit?usp=sharing
Quote from: andrey--k on March 18, 2014, 10:48:40 AM
This time the crash was caused by swap running out. I added space to the swap partition... let's watch...
Statistics for today:
$ nxadm -c " show stats"
Total number of objects: 21784
Number of monitored nodes: 1306
Number of collectable DCIs: 1322
$ free
                     total        used        free      shared     buffers      cached
Mem:               3095372     3016412       78960           0        1312       23180
-/+ buffers/cache:             2991920      103452
Swap:              6288380     3289264     2999116
I'll try running it under valgrind.
Quote from: andrey--k on March 18, 2014, 10:48:40 AM
netxms-snapshot-develop (2014-03-17) + Debian 7 + Oracle 11
I ran it under valgrind.
The result is attached.
Note:
CPU load is high under valgrind. I suspect netxms could not keep up with processing all its queues.
I didn't find anything very suspicious in the log. There was one small leak when opening a session, but it should not have led to such results.
Could you try running the system for longer? And second, could you try massif (http://valgrind.org/docs/manual/ms-manual.html)?
It seems to me that this is not a leak but normal behaviour...
Today, when I deleted several devices (several routers), memory consumption dropped (on the graph). Given that the server has 3 GB of memory and deleting a few devices is noticeable, a natural question arises about memory consumption per device (a multi-port Cisco). And there are many of them...
I'm thinking of monitoring it for two or three days to see whether the memory percentage stops growing (in theory the system should not crash, since the swap partition is currently large).
I will definitely try running it under massif.
Off-topic:
When starting nxagentd I read the errors carefully.
make install did not create the directory:
/usr/local/var/netxms/
The promised valgrind --tool=massif output:
see attachment.
In the 2 days after the DB was reset, for some reason most of the equipment did not come under monitoring, so the load on the server is low.
$ nxadm -c " sh sta"
Total number of objects: 7437
Number of monitored nodes: 566
Number of collectable DCIs: 574
Since this version has segmentation faults, which could be devastating for mission-critical environments, perhaps it's not a good idea to keep 1.2.12 available for download.
Hi!
I suppose you mean 1.2.13 - 1.2.12 is already archived and not referenced directly on the download page. I'll prepare the 1.2.14 release in the next few days - it should fix the crashes discovered in 1.2.12 and 1.2.13. Then 1.2.13 will be moved to the archive.
Best regards,
Victor
Yes, you're correct. Sorry about the confusion. Looking forward to your new release. Cheers.