3.3 does not reconnect to postgres

Started by jermudgeon, May 05, 2020, 06:56:29 PM


jermudgeon

The Postgres backend is functioning, but NetXMS reports 'Server has lost connection with backend database'.

sh dbcp
netxmsd: sh dbcp
0x7f8745e44780 04.May.2020 17:42:14 dbwrite.cpp:457
0x7f873cbd5d80 04.May.2020 17:42:15 dcitem.cpp:1309
0x7f873cbd5960 04.May.2020 17:42:05 syncer.cpp:238
3 database connections in use

These connections have been stuck since yesterday.

sh q

netxmsd: sh q
Data collector                   : 0
DCI cache loader                 : 46673
Template updates                 : 0
Database writer                  : 5
Database writer (IData)          : 3354870
Database writer (raw DCI values) : 28133
Event processor                  : 1334
Event log writer                 : 0
Poller                           : 1319
Node discovery poller            : 1187
Syslog processing                : 0
Syslog writer                    : 0
Scheduler                        : 0


dbcp reset doesn't do much:

netxmsd: dbcp reset
Resetting database connection pool
Database connection pool reset completed
netxmsd: sh dbcp
0x7f8745e44780 04.May.2020 17:42:14 dbwrite.cpp:457
0x7f873cbd5d80 04.May.2020 17:42:15 dcitem.cpp:1309
0x7f873cbd5960 04.May.2020 17:42:05 syncer.cpp:238
3 database connections in use


Filipp Sudanov

If it's still in that condition, can you run this script to get thread statuses:
https://github.com/netxms/netxms/blob/master/tools/capture_netxmsd_threads.sh

Please run it twice, about one minute apart, and attach the files it creates in /tmp.
The script requires gdb to be installed.

jermudgeon

The grep in the script could be more specific, for example matching 'bin/netxmsd'. As it stands, any process that has a netxms log file open or tailed also matches the grep, which causes the script to fail later on.
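
Something along these lines is what I have in mind. This is only a rough sketch, not the script's actual code: the function name filter_netxmsd_pids is made up for illustration, and it assumes the process list comes from `ps -eo pid=,args=`. It keeps only the PIDs whose executable path ends in bin/netxmsd, so a `tail -f` on the log or the grep itself no longer matches.

```shell
#!/bin/sh
# Hypothetical helper (not part of capture_netxmsd_threads.sh).
# Reads "PID ARGS..." lines on stdin, as printed by `ps -eo pid=,args=`,
# and prints only the PIDs whose executable (second field) ends in bin/netxmsd.
filter_netxmsd_pids() {
    awk '$2 ~ /bin\/netxmsd$/ { print $1 }'
}

# Example usage (commented out so that sourcing this file has no side effects):
# ps -eo pid=,args= | filter_netxmsd_pids
```

Matching on the executable field rather than the whole command line is what keeps commands such as `tail -f /var/log/netxmsd` out of the result.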