nxdbmgr failes to repair

Started by Borgso, April 06, 2021, 02:05:39 PM

Previous topic - Next topic

Borgso

Have been running a server with timescaledb on Ubuntu 20.04 for 2months now.
Its been abit unstable compared to have running default PostGresql over years on other setups.

Could i have a currupt database?

nxdbmgr will give this "in loop" when running "nxdbmgr check -d -f" as it fails to commit the fix:

Found collected data for non-existing DCI [6152]. Delete invalid records? (Yes/No/All/Skip) Y
* DCI history records                                                  [  66% ]

SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM idata_sc_30                                                                                                                                                                                                                                                       75SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM idata_sc_90                                                                                                                                                                                                                                                       83SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM idata_sc_180                                                                                                                                                                                                                                                      91SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM idata_sc_other                                                                                                                                                                                                                                                 [FIXED ]
* Table DCI history records                                            [   0SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_default WHERE tdata_timestamp > to_timestamp(1617706540)                                                                                                                                                                                                           8SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_7 WHERE tdata_timestamp > to_timestamp(1617706540)                                                                                                                                                                                                                16SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_30 WHERE tdata_timestamp > to_timestamp(1617706540)                                                                                                                                                                                                               25SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_90 WHERE tdata_timestamp > to_timestamp(1617706540)                                                                                                                                                                                                               33SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_180 WHERE tdata_timestamp > to_timestamp(1617706540)                                                                                                                                                                                                              41SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_other WHERE tdata_timestamp > to_timestamp(1617706540)                                                                                                                                                                                                            50SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_default                                                                                                                                                                                                                                                  58SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_7                                                                                                                                                                                                                                                        66SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_30                                                                                                                                                                                                                                                       75SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_90                                                                                                                                                                                                                                                       83SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_180                                                                                                                                                                                                                                                      91SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_other                                                                                                                                                                                                                                                 [PASSED]
106 errors was found, 106 errors was corrected
All errors in database was fixed
Commit changes? (Yes/No) Y

Victor Kirhenshtein

Hi,

looks like bug in nxdbmgr. Somebody from dev team will check.

Best regards,
Victor

Borgso

Thanks.

Also got this when doing "Clear collected data" on 5 DCIs

06.04.2021 15:14:59 netxms01 Normal SYS_NODE_NORMAL Node status changed to NORMAL
06.04.2021 15:14:59 netxms01 Critical SYS_DB_QUERY_FAILED Database query failed (Query: SELECT avg(idata_value::double precision) FROM idata_sc_7 WHERE item_id=? AND idata_timestamp BETWEEN to_timestamp(?) AND to_timestamp(?) AND idata_value~E'^\\d+(\\.\\d+)*$'; Error: Internal error (pResult is NULL in UnsafeDrvSelect))
06.04.2021 15:14:59 netxms01 Normal SYS_DB_CONN_RESTORED Connection with backend database engine restored
06.04.2021 15:14:58 netxms01 Critical SYS_DB_QUERY_FAILED Database query failed (Query: SELECT avg(idata_value::double precision) FROM idata_sc_7 WHERE item_id=? AND idata_timestamp BETWEEN to_timestamp(?) AND to_timestamp(?) AND idata_value~E'^\\d+(\\.\\d+)*$'; Error: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.)
06.04.2021 15:14:58 netxms01 Critical SYS_DB_QUERY_FAILED Database query failed (Query: SELECT avg(idata_value::double precision) FROM idata_sc_7 WHERE item_id=? AND idata_timestamp BETWEEN to_timestamp(?) AND to_timestamp(?) AND idata_value~E'^\\d+(\\.\\d+)*$'; Error: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.)
06.04.2021 15:14:58 netxms01 Critical SYS_DB_QUERY_FAILED Database query failed (Query: SELECT avg(idata_value::double precision) FROM idata_sc_7 WHERE item_id=? AND idata_timestamp BETWEEN to_timestamp(?) AND to_timestamp(?) AND idata_value~E'^\\d+(\\.\\d+)*$'; Error: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.)
06.04.2021 15:14:58 netxms01 Critical SYS_NODE_CRITICAL Node status changed to CRITICAL
06.04.2021 15:14:58 netxms01 Critical SYS_DB_QUERY_FAILED Database query failed (Query: INSERT INTO alarms (alarm_id,parent_alarm_id,creation_time,last_change_time,source_object_id,zone_uin,source_event_code,message,original_severity,current_severity,alarm_key,alarm_state,ack_by,resolved_by,hd_state,hd_ref,repeat_count,term_by,timeout,timeout_event,source_event_id,ack_timeout,dci_id,alarm_category_ids,rule_guid,event_tags,rca_script_name,impact) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?); Error: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.)
06.04.2021 15:14:58 netxms01 Critical SYS_DB_QUERY_FAILED Database query failed (Query: SELECT port FROM snmp_ports WHERE zone=? OR zone=-1 ORDER BY zone DESC, id ASC; Error: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.)
06.04.2021 15:14:58 netxms01 Critical SYS_DB_QUERY_FAILED Database query failed (Query: SELECT avg(idata_value::double precision) FROM idata_sc_7 WHERE item_id=? AND idata_timestamp BETWEEN to_timestamp(?) AND to_timestamp(?) AND idata_value~E'^\\d+(\\.\\d+)*$'; Error: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.)
06.04.2021 15:14:58 netxms01 Critical SYS_DB_QUERY_FAILED Database query failed (Query: SELECT avg(idata_value::double precision) FROM idata_sc_7 WHERE item_id=? AND idata_timestamp BETWEEN to_timestamp(?) AND to_timestamp(?) AND idata_value~E'^\\d+(\\.\\d+)*$'; Error: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.)
06.04.2021 15:14:55 netxms01 Critical SYS_DB_CONN_LOST Lost connection with backend database engine

Victor Kirhenshtein

Hi,

that one looks like PostgreSQL problem - server drops connection for some reason. Check in Postgres or system log if it was restarted.

Best regards,
Victor

Borgso

I have manually deleted the problems, but still get the 25P02 error that can be solved with "savepoint": https://www.postgresql.org/docs/current/sql-savepoint.html

SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM idata_sc_30                                                                                                                                                                                                                                                                                                                                                                              75>>> SELECT distinct(item_id) FROM idata_sc_90
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM idata_sc_90                                                                                                                                                                                                                                                                                                                                                                              83>>> SELECT distinct(item_id) FROM idata_sc_180
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM idata_sc_180                                                                                                                                                                                                                                                                                                                                                                             91>>> SELECT distinct(item_id) FROM idata_sc_other
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM idata_sc_other                                                                                                                                                                                                                                                                                                                                                                        [PASSED]
* Table DCI history records                                            [   0>>> SELECT count(*) FROM tdata_sc_default WHERE tdata_timestamp > to_timestamp(1618041778)
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_default WHERE tdata_timestamp > to_timestamp(1618041778)                                                                                                                                                                                                                                                                                                                                  8>>> SELECT count(*) FROM tdata_sc_7 WHERE tdata_timestamp > to_timestamp(1618041778)
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_7 WHERE tdata_timestamp > to_timestamp(1618041778)                                                                                                                                                                                                                                                                                                                                       16>>> SELECT count(*) FROM tdata_sc_30 WHERE tdata_timestamp > to_timestamp(1618041778)
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_30 WHERE tdata_timestamp > to_timestamp(1618041778)                                                                                                                                                                                                                                                                                                                                      25>>> SELECT count(*) FROM tdata_sc_90 WHERE tdata_timestamp > to_timestamp(1618041778)
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_90 WHERE tdata_timestamp > to_timestamp(1618041778)                                                                                                                                                                                                                                                                                                                                      33>>> SELECT count(*) FROM tdata_sc_180 WHERE tdata_timestamp > to_timestamp(1618041778)
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_180 WHERE tdata_timestamp > to_timestamp(1618041778)                                                                                                                                                                                                                                                                                                                                     41>>> SELECT count(*) FROM tdata_sc_other WHERE tdata_timestamp > to_timestamp(1618041778)
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT count(*) FROM tdata_sc_other WHERE tdata_timestamp > to_timestamp(1618041778)                                                                                                                                                                                                                                                                                                                                   50>>> SELECT distinct(item_id) FROM tdata_sc_default
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_default                                                                                                                                                                                                                                                                                                                                                                         58>>> SELECT distinct(item_id) FROM tdata_sc_7
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_7                                                                                                                                                                                                                                                                                                                                                                               66>>> SELECT distinct(item_id) FROM tdata_sc_30
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_30                                                                                                                                                                                                                                                                                                                                                                              75>>> SELECT distinct(item_id) FROM tdata_sc_90
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_90                                                                                                                                                                                                                                                                                                                                                                              83>>> SELECT distinct(item_id) FROM tdata_sc_180
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_180                                                                                                                                                                                                                                                                                                                                                                             91>>> SELECT distinct(item_id) FROM tdata_sc_other
SQL query failed (25P02 ERROR:  current transaction is aborted, commands ignored until end of transaction block):
SELECT distinct(item_id) FROM tdata_sc_other                                                                                                                                                                                                                                                                                                                                                                        [PASSED]
Database doesn't contain any errors
>>> UPDATE config SET var_value='UNLOCKED' WHERE var_name='DBLockStatus'
>>> UPDATE config SET var_value='' WHERE var_name='DBLockInfo'
>>> UPDATE config SET var_value='0' WHERE var_name='DBLockPID'
Database check completed
>>> SELECT var_value FROM metadata WHERE var_name='PendingOnlineUpgrades'

Filipp Sudanov

Can you have a look into log of Postgres DB. There should be a ERROR: line prior to the moment when
"current transaction is aborted, commands ignored until end of transaction block"
started to occur. This should give us some understanding of what's happened.

Borgso

I increased max_locks_per_transaction from 64 to 512 in postgresql.conf after finding this errors in postgresql.log as you pointed me too Filip.
After this change, nxdbmgr completed without problems - Thanks!

Logs showed this:
Quote2021-04-13 13:59:31.878 CEST [834154] usr_netxms@db_netxms ERROR:  out of shared memory
2021-04-13 13:59:31.878 CEST [834154] usr_netxms@db_netxms HINT:  You might need to increase max_locks_per_transaction.
2021-04-13 13:59:31.878 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM idata_sc_30

Quote2021-04-13 14:07:51.986 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.986 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM idata_sc_30
2021-04-13 14:07:51.986 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.986 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM idata_sc_90
2021-04-13 14:07:51.986 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.986 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM idata_sc_180
2021-04-13 14:07:51.986 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.986 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM idata_sc_other
2021-04-13 14:07:51.986 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.986 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT count(*) FROM tdata_sc_default WHERE tdata_timestamp > to_timestamp(1618315671)
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT count(*) FROM tdata_sc_7 WHERE tdata_timestamp > to_timestamp(1618315671)
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT count(*) FROM tdata_sc_30 WHERE tdata_timestamp > to_timestamp(1618315671)
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT count(*) FROM tdata_sc_90 WHERE tdata_timestamp > to_timestamp(1618315671)
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT count(*) FROM tdata_sc_180 WHERE tdata_timestamp > to_timestamp(1618315671)
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT count(*) FROM tdata_sc_other WHERE tdata_timestamp > to_timestamp(1618315671)
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM tdata_sc_default
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.987 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM tdata_sc_7
2021-04-13 14:07:51.988 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.988 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM tdata_sc_30
2021-04-13 14:07:51.988 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.988 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM tdata_sc_90
2021-04-13 14:07:51.988 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.988 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM tdata_sc_180
2021-04-13 14:07:51.988 CEST [834154] usr_netxms@db_netxms ERROR:  current transaction is aborted, commands ignored until end of transaction block
2021-04-13 14:07:51.988 CEST [834154] usr_netxms@db_netxms STATEMENT:  SELECT distinct(item_id) FROM tdata_sc_other