NetXMS Support Forum

English Support => General Support => Topic started by: Nikk on November 01, 2013, 12:17:15 PM

Title: Server problems
Post by: Nikk on November 01, 2013, 12:17:15 PM
Hi,

I'm experiencing problems with two NetXMS servers.

Problem 1.

NetXMS is running on Windows Server 2012 under VMware.

Starting the server takes about 20-30 minutes. It first occurred when I raised the number of data collectors by 5. I lowered it back to the default, but nothing changed. Then I thought it might be because of the SNMP tables or logwatch scripts I had added, but after removing all of them the problem is still the same.
I just get: Error 1053: The service did not respond to the start or control request in a timely fashion.
After a long time it does start. There is nothing in the logs, but I got this in the crash dump:

Quote:
NETXMSD CRASH DUMP

EXCEPTION: C0000005 (Access violation) at 0000000000000000

NetXMS Version: 1.2.9
OS Version: Windows NT 6.2 Build 9200
Processor architecture: AMD64 (Intel EM64T)

Call stack:
  [libnxdb:00000000005F4A3C]: class String __cdecl DBPrepareString(struct db_handle_t * __ptr64,wchar_t const * __ptr64,int)
  [nxcore:0000000180011E8C]: void __cdecl WriteAuditLog(wchar_t const * __ptr64,int,unsigned int,wchar_t const * __ptr64,unsigned int,wchar_t const * __ptr64,...)
  [nxcore:000000018008D9BD]: private: void __cdecl ClientSession::readThread(void) __ptr64
  [nxcore:000000018008C5F2]: private: static unsigned int __cdecl ClientSession::readThreadStarter(void * __ptr64)
  [libnetxms:000000000020452F]: unsigned int __cdecl SEHThreadStarter(void * __ptr64)
  [MSVCR80:000000005D0937D7]: _endthreadex
  [MSVCR80:000000005D093894]: _endthreadex
  [KERNEL32:000007FA941A1832]: BaseThreadInitThunk
  [ntdll:000007FA9662D609]: RtlUserThreadStart

Problem 2.

NetXMS is running on Ubuntu Server 12.04 x64.

I'm getting a segmentation fault. Here is the gdb backtrace:
Quote:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb48f5b40 (LWP 19381)]
0xb7d1cf66 in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) bt
#0  0xb7d1cf66 in ?? () from /lib/i386-linux-gnu/libc.so.6
#1  0xb7effca0 in DCTable::processNewValue (this=0x80b4b70,
    nTimeStamp=1383298291, value=0xb59199c0) at dctable.cpp:389
#2  0xb7f0292c in DataCollectionTarget::processNewDCValue (this=0x80cd218, dco=
    0x80b4b70, currTime=1383298291, value=0xb59199c0) at dctarget.cpp:366
#3  0xb7eefbb7 in DataCollector (pArg=0x0) at datacoll.cpp:254
#4  0xb7e48d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#5  0xb7d87bae in clone () from /lib/i386-linux-gnu/libc.so.6


Also, in the management console I quite often get "Software caused connection abort: socket write error", and then I must restart the console to be able to do anything again.

Also, I once exported a template from the server, and now, when I try to import it back, the import times out, and after that everything else times out too. Here is the trace:
Quote:
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x75ee2)[0xb742dee2]
/usr/local/lib/libnxcore.so.1(_Z14ValidateConfigP6ConfigjPci+0x628)[0xb76306b8]
/usr/local/lib/libnxcore.so.1(_ZN13ClientSession19importConfigurationEP11CSCPMessage+0x24e)[0xb76879ce]
/usr/local/lib/libnxcore.so.1(_ZN13ClientSession16processingThreadEv+0xd59)[0xb7692fb9]
/usr/local/lib/libnxcore.so.1(_ZN13ClientSession23processingThreadStarterEPv+0x1b)[0xb769387b]
/lib/i386-linux-gnu/libpthread.so.0(+0x6d4c)[0xb7568d4c]
/lib/i386-linux-gnu/libc.so.6(clone+0x5e)[0xb74a7bae]


Program received signal SIGABRT, Aborted.
[Switching to Thread 0xadc89b40 (LWP 23803)]
0xb7fdd424 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7cc61df in raise () from /lib/i386-linux-gnu/libc.so.6
#2  0xb7cc9825 in abort () from /lib/i386-linux-gnu/libc.so.6
#3  0xb7d0339a in ?? () from /lib/i386-linux-gnu/libc.so.6
#4  0xb7d0dee2 in ?? () from /lib/i386-linux-gnu/libc.so.6
#5  0xb7f106b8 in ~ConfigEntryList (this=<optimized out>,
    __in_chrg=<optimized out>) at ../../../include/nxconfig.h:111
#6  ValidateTemplate (errorTextLen=1024, errorText=0xadc8897c "@",
    root=<optimized out>, config=0xb560a428) at import.cpp:119
#7  ValidateConfig (config=0xb560a428, flags=0, errorText=0xadc8897c "@",
    errorTextLen=1024) at import.cpp:209
#8  0xb7f679ce in ClientSession::importConfiguration (this=0xb6206f28,
    pRequest=0xb530aff0) at session.cpp:8918
#9  0xb7f72fb9 in ClientSession::processingThread (this=0xb6206f28)
    at session.cpp:1133
#10 0xb7f7387b in ClientSession::processingThreadStarter (pArg=0xb6206f28)
    at session.cpp:203
#11 0xb7e48d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#12 0xb7d87bae in clone () from /lib/i386-linux-gnu/libc.so.6
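
The "======= Backtrace: =========" lines above appear to follow the glibc layout module(symbol+offset)[address]. As a rough sketch (not part of NetXMS), the Python below splits such a line into its parts so the frames can be lined up with the gdb output. The function name parse_frame and the regular expression are my own assumptions, checked only against the lines in this post.

```python
import re

# Assumed layout of one glibc backtrace line:
#   /path/to/lib.so(symbol+0xOFFSET)[0xADDRESS]
# The symbol part may be empty, e.g. libc.so.6(+0x75ee2)[0xb742dee2].
FRAME_RE = re.compile(
    r"^(?P<module>[^(\s]+)"
    r"\((?P<symbol>[^+)]*)\+?(?P<offset>0x[0-9a-fA-F]+)?\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if the line does not match."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    offset = match.group("offset")
    return {
        "module": match.group("module"),
        "symbol": match.group("symbol") or None,  # empty symbol becomes None
        "offset": int(offset, 16) if offset else None,
        "address": int(match.group("address"), 16),
    }


# Two lines copied from the trace in this post.
for raw in (
    "/usr/local/lib/libnxcore.so.1(_Z14ValidateConfigP6ConfigjPci+0x628)[0xb76306b8]",
    "/lib/i386-linux-gnu/libc.so.6(clone+0x5e)[0xb74a7bae]",
):
    print(parse_frame(raw))
```

The symbols are mangled C++ names. Running them through c++filt should give readable signatures; for example, _Z14ValidateConfigP6ConfigjPci corresponds to ValidateConfig(Config*, unsigned int, char*, int), which matches frame #7 of the gdb backtrace.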

Here is the template:
Quote:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
   <formatVersion>3</formatVersion>
   <description>Ethernet Statistics</description>
   <events>
   </events>
   <templates>
      <template id="1824">
         <name>Ethernet Statistics</name>
         <flags>0</flags>
         <dataCollection>
            <dctable id="213">
               <name>.1.3.6.1.2.1.16.1.1.1.1</name>
               <description>Ethernet Statistics</description>
               <origin>2</origin>
               <interval>60</interval>
               <retention>30</retention>
               <systemTag></systemTag>
               <advancedSchedule>0</advancedSchedule>
               <rawValueInOctetString>0</rawValueInOctetString>
               <snmpPort>0</snmpPort>
               <transformation></transformation>
               <columns>
                  <column id="1">
                     <name>Index</name>
                     <displayName>Index</displayName>
                     <snmpOid>.1.3.6.1.2.1.16.1.1.1.1</snmpOid>
                     <flags>256</flags>
                  </column>
                  <column id="2">
                     <name>Broadcast Packets</name>
                     <displayName>Broadcast Packets</displayName>
                     <snmpOid>.1.3.6.1.2.1.16.1.1.1.6</snmpOid>
                     <flags>33</flags>
                  </column>
                  <column id="3">
                     <name>Multicast Packets</name>
                     <displayName>Multicast Packets</displayName>
                     <snmpOid>.1.3.6.1.2.1.16.1.1.1.7</snmpOid>
                     <flags>1</flags>
                  </column>
                  <column id="4">
                     <name>Packets</name>
                     <displayName>Packets</displayName>
                     <snmpOid>.1.3.6.1.2.1.16.1.1.1.5</snmpOid>
                     <flags>1</flags>
                  </column>
                  <column id="5">
                     <name>Collisions</name>
                     <displayName>Collisions</displayName>
                     <snmpOid>.1.3.6.1.2.1.16.1.1.1.13</snmpOid>
                     <flags>0</flags>
                  </column>
                  <column id="6">
                     <name>CRC Align Errors</name>
                     <displayName>CRC Align Errors</displayName>
                     <snmpOid>.1.3.6.1.2.1.16.1.1.1.8</snmpOid>
                     <flags>0</flags>
                  </column>
                  <column id="7">
                     <name>Drop Events</name>
                     <displayName>Drop Events</displayName>
                     <snmpOid>.1.3.6.1.2.1.16.1.1.1.3</snmpOid>
                     <flags>0</flags>
                  </column>
               </columns>
               <thresholds>
               </thresholds>
               <perfTabSettings></perfTabSettings>
            </dctable>
         </dataCollection>
      </template>
   </templates>
   <traps>
   </traps>
</configuration>
Is the template wrong, or is there something else?
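
One quick way to rule out a malformed file is to parse it outside the server. The sketch below uses Python's standard xml.etree.ElementTree to parse a shortened copy of the template and list each table column with its OID and flags. The function name list_columns and the shortened SAMPLE string are only for illustration. A successful parse shows only that the XML is well-formed, not that the server's import code accepts it.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the exported template (two columns only), for illustration.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<configuration>
   <formatVersion>3</formatVersion>
   <templates>
      <template id="1824">
         <name>Ethernet Statistics</name>
         <dataCollection>
            <dctable id="213">
               <name>.1.3.6.1.2.1.16.1.1.1.1</name>
               <columns>
                  <column id="1">
                     <name>Index</name>
                     <snmpOid>.1.3.6.1.2.1.16.1.1.1.1</snmpOid>
                     <flags>256</flags>
                  </column>
                  <column id="2">
                     <name>Broadcast Packets</name>
                     <snmpOid>.1.3.6.1.2.1.16.1.1.1.6</snmpOid>
                     <flags>33</flags>
                  </column>
               </columns>
            </dctable>
         </dataCollection>
      </template>
   </templates>
</configuration>
"""


def list_columns(xml_text):
    """Parse the template and return (table id, column name, OID, flags) tuples.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    rows = []
    for table in root.iter("dctable"):
        for column in table.iter("column"):
            rows.append((
                table.get("id"),
                column.findtext("name"),
                column.findtext("snmpOid"),
                int(column.findtext("flags") or 0),  # missing or empty flags count as 0
            ))
    return rows


for row in list_columns(SAMPLE):
    print(row)
```

To check the real export, read the file contents into a string and pass it to list_columns instead of SAMPLE.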

Thanks in advance,
Nikk
Title: Re: Server problems
Post by: Victor Kirhenshtein on November 01, 2013, 08:01:05 PM
Hi!

Problem #2 seems to be the same as in https://www.netxms.org/forum/general-support/segfault-2648. You can try applying the suggested patch and recompiling, if you built from source. I'll take a look at the other two problems.

I plan to release 1.2.10 sometime next week, so hopefully all those crashes will be fixed.

Best regards,
Victor
Title: Re: Server problems
Post by: Nikk on November 02, 2013, 12:47:44 AM
Okay, I'll try that patch and let you know about the progress!

Nice to hear that, big thanks!

Nikk
Title: Re: Server problems
Post by: Nikk on November 04, 2013, 12:28:03 PM
Hi,
I tried this:
Quote from: Victor Kirhenshtein on November 01, 2013, 08:01:05 PM
Problem #2 seems to be the same as in https://www.netxms.org/forum/general-support/segfault-2648. You can try applying the suggested patch and recompiling, if you built from source.
and it worked, thanks :)

Nikk
Title: Re: Server problems
Post by: Alex Kirhenshtein on November 04, 2013, 01:55:07 PM
Template import is fixed in the current trunk, and the fix will be released in 1.2.10.
Title: Re: Server problems
Post by: Nikk on November 04, 2013, 01:56:30 PM
Thank you  :)

Nikk
Title: Re: Server problems
Post by: Nikk on November 21, 2013, 01:18:59 PM
Hi,

Any news regarding problem #1? It is annoying that each time I want to restart the server, I have to wait 20-30 minutes :/.

Thanks in advance,
Nikk
Title: Re: Server problems
Post by: ericq on December 16, 2013, 11:58:33 AM
I solved problem #1 by changing the startup command of the service.
The NetXMSAgentdW32 service starts the following:
"C:\NetXMS\bin\nxagentd.exe" -d -c "C:\NetXMS\etc\nxagentd.conf" -n "NetXMSAgentdW32" -e "NetXMS Win32 Agent" -D -1 -M "192.168.0.41"
The debug level is set to -D -1; it has to be -D 1.
Edit the service so that it starts this:
"C:\NetXMS\bin\nxagentd.exe" -d -c "C:\NetXMS\etc\nxagentd.conf" -n "NetXMSAgentdW32" -e "NetXMS Win32 Agent" -D 1 -M "192.168.0.41"

Run regedt32.exe, then navigate to the key for the service under
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services
and edit the REG_EXPAND_SZ value named 'ImagePath'.
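
If you would rather not edit the string by hand, the same change can be sketched in Python. The helper below only rewrites the command-line text and does not touch the registry. The name fix_debug_level is made up for this example, and it assumes the debug level appears exactly as "-D -1" as a separate argument.

```python
import re


def fix_debug_level(image_path):
    """Return the ImagePath string with a '-D -1' argument replaced by '-D 1'.

    Only an upper-case -D followed by -1 as a whole argument is changed;
    the lower-case -d option and everything else are left alone.
    """
    return re.sub(r"(?<!\S)-D\s+-1(?!\S)", "-D 1", image_path)


# The command line from this post, before the change.
before = (
    r'"C:\NetXMS\bin\nxagentd.exe" -d -c "C:\NetXMS\etc\nxagentd.conf" '
    r'-n "NetXMSAgentdW32" -e "NetXMS Win32 Agent" -D -1 -M "192.168.0.41"'
)
print(fix_debug_level(before))
```

On Windows, the resulting string could then be written back to the ImagePath value with regedt32.exe as described above, or with Python's winreg module. I have not tested the winreg route myself, so treat it as an assumption.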
Title: Re: Server problems
Post by: Nikk on December 16, 2013, 01:00:25 PM
Hi ericq,

In my case the agent starts fine; the core is the guilty one.
The core service starts this:
"C:\NetXMS\bin\netxmsd.exe" --config "C:\NetXMS\etc\netxmsd.conf" -d

Thank you anyway :)

Nikk