Messages - millerpaint

#31
Announcements / Re: NetXMS 1.2.6 released
February 27, 2013, 01:42:11 AM
Quote: Syslog should work. And I do some fixes related to memory corruption issues.

This is great news, thanks!


-Kevin C.
#32
Hi Testos,

This is something that we can use for sure.  Were you successful using the simplified 2 step implementation?


-Kevin C.
#33
General Support / Re: Out of memory NetXMS v1.2.5
February 12, 2013, 07:20:49 PM
OK thanks - all log files you have requested from valgrind have been sent to that email address.


-Kevin C.
#34
General Support / Re: Out of memory NetXMS v1.2.5
February 12, 2013, 06:49:59 PM
Hi Victor,

I am unable to attach the valgrind log from Step 1.  It is about 600k in size, and your forum will not allow me to post it.

Can you please raise the limit of your attachment size on this forum, or else let me know your email address?


Thanks,

-Kevin C.
#35
General Support / Re: Out of memory NetXMS v1.2.5
February 10, 2013, 05:04:02 AM
Hi Victor,

OK, I can run step 1 with the additional options with no problem.

With Step 2, you are asking me to run valgrind's heap profiler.  I have questions on that:

1) Should I run the step 2 option after the step 1 test has completed and run out of memory?
2) Do I need to start Alex's script before running step 2?


-Kevin C.
#36
General Support / Re: Out of memory NetXMS v1.2.5
February 08, 2013, 11:46:39 PM
FYI, I have completely disabled auto-discovery, and it is still running out of memory.


-Kevin C.
#37
General Support / Re: Out of memory NetXMS v1.2.5
February 08, 2013, 08:57:40 PM
Hi Alex,

I modified the script for a 2GB threshold and re-ran it; the new log file is attached.  Hopefully this will provide some clues as to what is going on.

When monitoring with top, it seems to start consuming RAM when the timer reaches 14:41:

netxmsd starts out using .5% of the server's available RAM.  Then:
14:41 - .6%
15:50 - .7%
16:53 - .8%
17:69 - .9%
18:60 - 1%
19:20 - 1.1%
20:05 - 1.2%
etc.


Thanks for your help!

-Kevin C.
#38
General Support / Re: Out of memory NetXMS v1.2.5
February 07, 2013, 09:31:06 PM
Quote: This is normal; when running under valgrind, a program takes tens of times more memory than when it runs normally. Valgrind allocates extra memory around each dynamically allocated block to detect boundary violations, etc.

OK, that makes sense Victor.

I have attached a screenshot of my Network Discovery panel, so you can see more about the details of my configuration.  I also have (2) SNMP community strings listed, but they are not visible in the screenshot image.

I am using top to monitor the memory consumption of netxmsd - when running in normal mode, it seems to consume an additional 1/10th of 1% of available RAM every few seconds.
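For anyone who would rather log these numbers than watch top, here is a minimal sketch. It assumes a Linux system where /proc/<pid>/status exposes a VmRSS line in kB; the function names and the sampling interval are made up for illustration and are not part of NetXMS.

```python
import re
import time
from typing import Optional


def parse_vm_rss_kb(status_text: str) -> Optional[int]:
    """Return the VmRSS value (in kB) found in the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident set size of a process, in kB."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())


def log_rss(pid: int, interval_s: float = 10.0, samples: int = 6) -> None:
    """Print a timestamped RSS reading every interval_s seconds."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), read_rss_kb(pid), "kB")
        time.sleep(interval_s)
```

Calling log_rss with the PID of netxmsd and redirecting the output to a file would give a growth curve that can be attached to a post instead of hand-copied percentages.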


-Kevin C.
#39
General Support / Re: Out of memory NetXMS v1.2.5
February 07, 2013, 06:57:38 PM
OK, good news Alex.  The script you provided shut down NetXMS gracefully after it reached the 1GB of RAM threshold.  It took 12-15 minutes running under valgrind before it crashed.  I have attached the valgrind log.
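Alex's script itself is not reproduced in this thread, so the following is only a rough sketch of the same idea, not his code: compare the process's resident memory against a threshold and send SIGTERM once it is reached. That SIGTERM triggers a clean shutdown of netxmsd is an assumption here, and the function names are hypothetical.

```python
import os
import signal


def over_threshold(rss_kb: int, limit_gb: float) -> bool:
    """True when resident memory (in kB) has reached the limit (in GB)."""
    return rss_kb >= limit_gb * 1024 * 1024


def stop_if_over(pid: int, rss_kb: int, limit_gb: float = 1.0) -> bool:
    """Send SIGTERM to pid once rss_kb reaches the limit; report whether it was sent."""
    if over_threshold(rss_kb, limit_gb):
        # Assumption: the daemon treats SIGTERM as a request to shut down cleanly.
        os.kill(pid, signal.SIGTERM)
        return True
    return False
```

The RSS value would come from whatever is already sampling the process, for example top or /proc; calling stop_if_over in a loop with a short sleep gives the watchdog behaviour.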


Thanks,

-Kevin C.
#40
General Support / Re: Out of memory NetXMS v1.2.5
February 07, 2013, 06:33:43 PM
Hi,

One thing seems strange: it crashes with out of memory right away when running under valgrind.  If I run netxmsd normally, it can run for hours before crashing.


-Kevin C.
#41
General Support / Re: Out of memory NetXMS v1.2.5
February 07, 2013, 05:55:16 PM
Hi,

Thanks for the feedback guys!

Testos, I do not believe this is hardware related, as this is a virtual server running on an IBM x3550 M4 (ESXi 5.0) alongside 12 other production VMs, and they are having no issues.

Victor, discovery is limited to 50 subnets on our MPLS network, which is pretty much all that we have.  I do specify each subnet; I guess I could eliminate that list and just discover all subnets.  In the beginning, I started out adding 10 subnets at a time, because I didn't want to risk overloading our network.

After that, I am filtering the discovery results for a specific IP address range on each subnet (IP .1 through .100).  I'm really not using much scripting yet, just changing the names of nodes to match SNMP host names, a couple of email alerts, and that's about it.  The routing tables should not be huge on any of the routers that NetXMS discovers.  That being said, our network provider may be doing things on their Cisco routers (which they own) that I am unaware of.

Alex, I will edit and then run the attached script per your recommendation.


-Kevin C.
#42
General Support / Re: Out of memory NetXMS v1.2.5
February 06, 2013, 11:29:33 PM
Does anyone have any ideas on what may be causing this out of memory condition?

Any help would be greatly appreciated.


-Kevin C.
#43
General Support / Re: Out of memory NetXMS v1.2.5
February 06, 2013, 06:03:46 PM
The version of MySQL I am running is 5.1.61.

I'm not sure exactly what to look for in the log files.  There is some detailed information in messages:

=====================================================
Feb  5 10:02:58 netmgmt abrtd: Init complete, entering main loop
Feb  5 14:57:25 netmgmt kernel: mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Feb  5 14:57:25 netmgmt kernel: mysqld cpuset=/ mems_allowed=0
Feb  5 14:57:25 netmgmt kernel: Pid: 1604, comm: mysqld Not tainted 2.6.32-279.9.1.el6.x86_64 #1
Feb  5 14:57:25 netmgmt kernel: Call Trace:
Feb  5 14:57:25 netmgmt kernel: [<ffffffff810c4c71>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Feb  5 14:57:25 netmgmt kernel: [<ffffffff811173e0>] ? dump_header+0x90/0x1b0
Feb  5 14:57:25 netmgmt kernel: [<ffffffff81214a0c>] ? security_real_capable_noaudit+0x3c/0x70
Feb  5 14:57:25 netmgmt kernel: [<ffffffff81117862>] ? oom_kill_process+0x82/0x2a0
Feb  5 14:57:25 netmgmt kernel: [<ffffffff811177a1>] ? select_bad_process+0xe1/0x120
Feb  5 14:57:25 netmgmt kernel: [<ffffffff81117ca0>] ? out_of_memory+0x220/0x3c0
Feb  5 14:57:25 netmgmt kernel: [<ffffffff811279be>] ? __alloc_pages_nodemask+0x89e/0x940
Feb  5 14:57:25 netmgmt kernel: [<ffffffff8115c51a>] ? alloc_pages_current+0xaa/0x110
Feb  5 14:57:25 netmgmt kernel: [<ffffffff811147e7>] ? __page_cache_alloc+0x87/0x90
Feb  5 14:57:25 netmgmt kernel: [<ffffffff8112a40b>] ? __do_page_cache_readahead+0xdb/0x210
Feb  5 14:57:25 netmgmt kernel: [<ffffffff8112a561>] ? ra_submit+0x21/0x30
Feb  5 14:57:25 netmgmt kernel: [<ffffffff81115b13>] ? filemap_fault+0x4c3/0x500
Feb  5 14:57:25 netmgmt kernel: [<ffffffff81136b6f>] ? __inc_zone_state+0x1f/0x70
Feb  5 14:57:25 netmgmt kernel: [<ffffffff8113ef14>] ? __do_fault+0x54/0x510
Feb  5 14:57:25 netmgmt kernel: [<ffffffff8113f4c7>] ? handle_pte_fault+0xf7/0xb50
Feb  5 14:57:25 netmgmt kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Feb  5 14:57:25 netmgmt kernel: [<ffffffff811913fc>] ? core_sys_select+0x1ec/0x2c0
Feb  5 14:57:25 netmgmt kernel: [<ffffffff81140104>] ? handle_mm_fault+0x1e4/0x2b0
Feb  5 14:57:25 netmgmt kernel: [<ffffffff810444c9>] ? __do_page_fault+0x139/0x480
Feb  5 14:57:25 netmgmt kernel: [<ffffffff81278bec>] ? rb_erase+0x1bc/0x310
Feb  5 14:57:25 netmgmt kernel: [<ffffffff81012bd9>] ? read_tsc+0x9/0x20
Feb  5 14:57:25 netmgmt kernel: [<ffffffff8109cea9>] ? ktime_get_ts+0xa9/0xe0
Feb  5 14:57:25 netmgmt kernel: [<ffffffff8118fe58>] ? poll_select_copy_remaining+0xf8/0x150
Feb  5 14:57:25 netmgmt kernel: [<ffffffff810d6ad3>] ? audit_syscall_entry+0x63/0x2a0
Feb  5 14:57:25 netmgmt kernel: [<ffffffff8150380e>] ? do_page_fault+0x3e/0xa0
Feb  5 14:57:25 netmgmt kernel: [<ffffffff81500bc5>] ? page_fault+0x25/0x30
Feb  5 14:57:25 netmgmt kernel: Mem-Info:
Feb  5 14:57:25 netmgmt kernel: Node 0 DMA per-cpu:
Feb  5 14:57:25 netmgmt kernel: CPU    0: hi:    0, btch:   1 usd:   0
Feb  5 14:57:25 netmgmt kernel: Node 0 DMA32 per-cpu:
Feb  5 14:57:25 netmgmt kernel: CPU    0: hi:  186, btch:  31 usd:  61
Feb  5 14:57:25 netmgmt kernel: Node 0 Normal per-cpu:
Feb  5 14:57:25 netmgmt kernel: CPU    0: hi:  186, btch:  31 usd:  74
Feb  5 14:57:25 netmgmt kernel: active_anon:671184 inactive_anon:251894 isolated_anon:0
Feb  5 14:57:25 netmgmt kernel: active_file:74 inactive_file:923 isolated_file:0
Feb  5 14:57:25 netmgmt kernel: unevictable:0 dirty:0 writeback:0 unstable:0
Feb  5 14:57:25 netmgmt kernel: free:21743 slab_reclaimable:2109 slab_unreclaimable:13036
Feb  5 14:57:25 netmgmt kernel: mapped:125 shmem:0 pagetables:4921 bounce:0
Feb  5 14:57:25 netmgmt kernel: Node 0 DMA free:15684kB min:248kB low:308kB high:372kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15292kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Feb  5 14:57:25 netmgmt kernel: lowmem_reserve[]: 0 3000 4010 4010
Feb  5 14:57:25 netmgmt kernel: Node 0 DMA32 free:54324kB min:50372kB low:62964kB high:75556kB active_anon:2236200kB inactive_anon:559016kB active_file:260kB inactive_file:3692kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072160kB mlocked:0kB dirty:0kB writeback:0kB mapped:492kB shmem:0kB slab_reclaimable:212kB slab_unreclaimable:260kB kernel_stack:0kB pagetables:6660kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:401 all_unreclaimable? yes
Feb  5 14:57:25 netmgmt kernel: lowmem_reserve[]: 0 0 1010 1010
Feb  5 14:57:25 netmgmt kernel: Node 0 Normal free:16964kB min:16956kB low:21192kB high:25432kB active_anon:448536kB inactive_anon:448560kB active_file:36kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1034240kB mlocked:0kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:8224kB slab_unreclaimable:51884kB kernel_stack:3744kB pagetables:13024kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:73 all_unreclaimable? yes
Feb  5 14:57:25 netmgmt kernel: lowmem_reserve[]: 0 0 0 0
Feb  5 14:57:25 netmgmt kernel: Node 0 DMA: 1*4kB 4*8kB 2*16kB 2*32kB 3*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15684kB
Feb  5 14:57:25 netmgmt kernel: Node 0 DMA32: 123*4kB 123*8kB 35*16kB 12*32kB 5*64kB 1*128kB 3*256kB 3*512kB 36*1024kB 6*2048kB 0*4096kB = 54324kB
Feb  5 14:57:25 netmgmt kernel: Node 0 Normal: 433*4kB 227*8kB 105*16kB 45*32kB 25*64kB 10*128kB 7*256kB 1*512kB 3*1024kB 1*2048kB 0*4096kB = 16972kB
Feb  5 14:57:25 netmgmt kernel: 1745 total pagecache pages
Feb  5 14:57:25 netmgmt kernel: 733 pages in swap cache
Feb  5 14:57:25 netmgmt kernel: Swap cache stats: add 1035805, delete 1035072, find 1381/1742
Feb  5 14:57:25 netmgmt kernel: Free swap  = 0kB
Feb  5 14:57:25 netmgmt kernel: Total swap = 4128760kB
Feb  5 14:57:25 netmgmt kernel: 1048560 pages RAM
Feb  5 14:57:25 netmgmt kernel: 67324 pages reserved
Feb  5 14:57:25 netmgmt kernel: 267 pages shared
Feb  5 14:57:25 netmgmt kernel: 955440 pages non-shared
Feb  5 14:57:25 netmgmt kernel: [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
Feb  5 14:57:25 netmgmt kernel: [  498]     0   498     2795        0   0     -17         -1000 udevd
Feb  5 14:57:25 netmgmt kernel: [ 1111]     0  1111     6909       28   0     -17         -1000 auditd
Feb  5 14:57:25 netmgmt kernel: [ 1136]     0  1136    62271       44   0       0             0 rsyslogd
Feb  5 14:57:25 netmgmt kernel: [ 1178]    32  1178     4743       15   0       0             0 rpcbind
Feb  5 14:57:25 netmgmt kernel: [ 1196]    29  1196     5836        1   0       0             0 rpc.statd
Feb  5 14:57:25 netmgmt kernel: [ 1208]     0  1208     1143       10   0       0             0 mdadm
Feb  5 14:57:25 netmgmt kernel: [ 1234]     0  1234     6290        1   0       0             0 rpc.idmapd
Feb  5 14:57:25 netmgmt kernel: [ 1328]    81  1328     7944        1   0       0             0 dbus-daemon
Feb  5 14:57:25 netmgmt kernel: [ 1340]     0  1340    47289        1   0       0             0 cupsd
Feb  5 14:57:25 netmgmt kernel: [ 1365]     0  1365     1019        0   0       0             0 acpid
Feb  5 14:57:25 netmgmt kernel: [ 1374]    68  1374     6323      111   0       0             0 hald
Feb  5 14:57:25 netmgmt kernel: [ 1375]     0  1375     4526        1   0       0             0 hald-runner
Feb  5 14:57:25 netmgmt kernel: [ 1403]     0  1403     5055        1   0       0             0 hald-addon-inpu
Feb  5 14:57:25 netmgmt kernel: [ 1414]    68  1414     4451        1   0       0             0 hald-addon-acpi
Feb  5 14:57:25 netmgmt kernel: [ 1435]     0  1435    96427       31   0       0             0 automount
Feb  5 14:57:25 netmgmt kernel: [ 1451]     0  1451     1564        0   0       0             0 mcelog
Feb  5 14:57:25 netmgmt kernel: [ 1463]     0  1463    16018        0   0     -17         -1000 sshd
Feb  5 14:57:25 netmgmt kernel: [ 1471]    38  1471     7540       30   0       0             0 ntpd
Feb  5 14:57:25 netmgmt kernel: [ 1507]     0  1507    27050        1   0       0             0 mysqld_safe
Feb  5 14:57:25 netmgmt kernel: [ 1596]    27  1596   176877     2710   0       0             0 mysqld
Feb  5 14:57:25 netmgmt kernel: [ 1687]     0  1687    19669       22   0       0             0 master
Feb  5 14:57:25 netmgmt kernel: [ 1697]    89  1697    19732       22   0       0             0 qmgr
Feb  5 14:57:25 netmgmt kernel: [ 1711]     0  1711    27543        1   0       0             0 abrtd
Feb  5 14:57:25 netmgmt kernel: [ 1719]     0  1719    27016        1   0       0             0 abrt-dump-oops
Feb  5 14:57:25 netmgmt kernel: [ 1727]     0  1727    29301       23   0       0             0 crond
Feb  5 14:57:25 netmgmt kernel: [ 1738]     0  1738     5363        5   0       0             0 atd
Feb  5 14:57:25 netmgmt kernel: [ 1751]     0  1751    19274        1   0       0             0 login
Feb  5 14:57:25 netmgmt kernel: [ 1753]     0  1753     1015        1   0       0             0 mingetty
Feb  5 14:57:25 netmgmt kernel: [ 1756]     0  1756     1015        1   0       0             0 mingetty
Feb  5 14:57:25 netmgmt kernel: [ 1757]     0  1757     3091        0   0     -17         -1000 udevd
Feb  5 14:57:25 netmgmt kernel: [ 1759]     0  1759     1015        1   0       0             0 mingetty
Feb  5 14:57:25 netmgmt kernel: [ 1760]     0  1760     3091        0   0     -17         -1000 udevd
Feb  5 14:57:25 netmgmt kernel: [ 1762]     0  1762     1015        1   0       0             0 mingetty
Feb  5 14:57:25 netmgmt kernel: [ 1764]     0  1764     1015        1   0       0             0 mingetty
Feb  5 14:57:25 netmgmt kernel: [ 1771]     0  1771   143729        1   0       0             0 console-kit-dae
Feb  5 14:57:25 netmgmt kernel: [ 1837]     0  1837    27083        1   0       0             0 bash
Feb  5 14:57:25 netmgmt kernel: [ 1857]     0  1857    92804      612   0       0             0 nxagentd
Feb  5 14:57:25 netmgmt kernel: [ 1871]     0  1871  2179534   918835   0       0             0 netxmsd
Feb  5 14:57:25 netmgmt kernel: [ 4627]    89  4627    19689       16   0       0             0 pickup
Feb  5 14:57:25 netmgmt kernel: Out of memory: Kill process 1871 (netxmsd) score 937 or sacrifice child
Feb  5 14:57:25 netmgmt kernel: Killed process 1871, UID 0, (netxmsd) total-vm:8718136kB, anon-rss:3674960kB, file-rss:380kB

==============================================
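For reading a dump like this, a small sketch that pulls the per-process table out of the messages file and ranks it by rss may help. The column order is taken from the header line above; the 4 kB page size is an assumption (the usual x86_64 value), and the helper names are invented for illustration.

```python
import re
from typing import List, NamedTuple

PAGE_KB = 4  # assumption: 4 kB pages, the usual size on x86_64


class ProcRow(NamedTuple):
    pid: int
    total_vm_pages: int
    rss_pages: int
    name: str


# Columns after "kernel:": [pid] uid tgid total_vm rss cpu oom_adj oom_score_adj name
ROW_RE = re.compile(
    r"kernel: \[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(-?\d+)\s+(-?\d+)\s+(.+?)\s*$"
)


def parse_oom_table(log_text: str) -> List[ProcRow]:
    """Collect the process rows of an oom-killer dump; other lines are skipped."""
    rows = []
    for line in log_text.splitlines():
        m = ROW_RE.search(line)
        if m:
            rows.append(
                ProcRow(int(m.group(1)), int(m.group(4)), int(m.group(5)), m.group(9))
            )
    return rows


def top_by_rss(rows: List[ProcRow], n: int = 3) -> List[ProcRow]:
    """Return the n rows with the largest resident set size."""
    return sorted(rows, key=lambda r: r.rss_pages, reverse=True)[:n]
```

With the 4 kB assumption, the netxmsd row works out to 918835 x 4 = 3675340 kB resident, which equals the anon-rss plus file-rss figures on the "Killed process" line (3674960 + 380), and 2179534 x 4 = 8718136 kB matches its total-vm.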
#44
General Support / Re: Out of memory NetXMS v1.2.5
February 05, 2013, 08:02:02 PM
OK, I stopped it after running for ~ 7 minutes - it was crashing in 10 minutes.  The log file is attached.


-Kevin C.
#45
General Support / Re: Out of memory NetXMS v1.2.5
February 05, 2013, 06:45:50 PM
It crashed again, within 25 minutes and before I had a chance to stop it   :(

I will start it again, and stop it after 15 minutes.


-Kevin C.