NetXMS 2.2.6 DB Upgrade Fails

Started by Staj, June 06, 2018, 01:11:13 PM

Staj

I'm trying to upgrade from NetXMS 2.2.3 to 2.2.6. I've reached the stage of upgrading the database (MariaDB/MySQL), but it fails:
Quote:
C:\NetXMS\bin>nxdbmgr.exe upgrade
NetXMS Database Manager Version 2.2.6 Build 9513 (2.2.6) (UNICODE)

Upgrading database...
Upgrading from version 22.20 to 22.21
SQL query failed (Duplicate column name 'zone_uin'):
ALTER TABLE alarms ADD zone_uin integer
Rolling back last stage due to upgrade errors...
Database upgrade failed

Tursiops

That looks like it already tried the upgrade once before. The zone_uin column didn't exist in the alarms table until 2.2.6, so there shouldn't be a duplicate column.
Not sure if it's safe to simply drop that column from the database manually, so the upgrade can re-add it - or if you need to restore your database to a stage prior to the first upgrade.
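
Before touching anything by hand, it may help to check which tables already have the column. Below is a minimal sketch of a guarded "add column only if missing" step. It is not NetXMS code: it uses an in-memory SQLite database as a stand-in so it runs anywhere, and the helper names column_exists and add_column_if_missing are made up for illustration. On MariaDB/MySQL the equivalent lookup would go against information_schema.columns.

```python
import sqlite3


def column_exists(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def add_column_if_missing(conn, table, column, sql_type):
    """Return True if the column was added, False if it was already present."""
    if column_exists(conn, table, column):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}")
    return True


# Stand-in schema: alarms was already altered by an earlier attempt, syslog was not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alarms (alarm_id integer, zone_uin integer)")
conn.execute("CREATE TABLE syslog (msg_id integer)")

for table in ("alarms", "syslog"):
    added = add_column_if_missing(conn, table, "zone_uin", "integer")
    print(table, "column added" if added else "column already present")
```

A check like this only tells you what state a half-finished upgrade left behind. Whether it is safe to continue from that state is still a question for the developers.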

Staj

You were right. I tried again with the -X flag and it finished (more or less), but it says an online upgrade is still required, and that fails:
Quote:
C:\NetXMS\bin>nxdbmgr.exe online-upgrade -t
NetXMS Database Manager Version 2.2.6 Build 9513 (2.2.6) (UNICODE)

>>> SELECT var_value FROM metadata WHERE var_name='PendingOnlineUpgrades'
Running online upgrade procedure for version 22.21
Updating zone UIN in table alarms...
Updating zone UIN in table event_log...
Updating zone UIN in table snmp_trap_log...
Updating zone UIN in table syslog...
Online upgrade procedure for version 22.21 failed
>>> SELECT var_value FROM metadata WHERE var_name='PendingOnlineUpgrades'

WARNING: Online upgrades pending. Please run nxdbmgr online-upgrade when possible.

Tursiops

Not sure about that one.
After the db upgrade I started our server and triggered the online upgrade. It ran in the background for about three days (we have tables with 100+ million rows), but finished OK.
I am assuming the database connection didn't drop out during the upgrade?
From what I understood, it should be safe to re-run the online-upgrade. It does not alter tables; it only populates the new zone_uin fields with the correct data, so it should just pick up where it left off when it failed.
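
To illustrate what I mean by "pick up where it left off", here is a rough sketch of a backfill that can be re-run safely. This is only my assumption about the general technique, not what nxdbmgr actually does. It uses in-memory SQLite, and the nodes table, the source_id column and the backfill_zone_uin function are hypothetical. The idea is that only rows where zone_uin is still NULL get touched, and each batch is committed separately, so an interrupted run loses at most one batch.

```python
import sqlite3


def backfill_zone_uin(conn, table, batch_size=1000, max_batches=None):
    """Fill NULL zone_uin values batch by batch; return the number of rows updated."""
    total = 0
    batches = 0
    while max_batches is None or batches < max_batches:
        cur = conn.execute(
            f"UPDATE {table} SET zone_uin = COALESCE("
            f"(SELECT n.zone_uin FROM nodes n WHERE n.id = {table}.source_id), 0) "
            f"WHERE rowid IN ("
            f"SELECT rowid FROM {table} WHERE zone_uin IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # each batch is durable on its own
        if cur.rowcount == 0:
            break  # nothing left to fill
        total += cur.rowcount
        batches += 1
    return total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id integer, zone_uin integer)")
conn.execute("CREATE TABLE syslog (msg_id integer, source_id integer, zone_uin integer)")
conn.executemany("INSERT INTO nodes VALUES (?, ?)", [(1, 10), (2, 20)])
conn.executemany(
    "INSERT INTO syslog (msg_id, source_id) VALUES (?, ?)",
    [(i, 1 + i % 2) for i in range(5)],
)

# Simulate a run that stops after one batch, then a second run that finishes the job.
first_run = backfill_zone_uin(conn, "syslog", batch_size=2, max_batches=1)
second_run = backfill_zone_uin(conn, "syslog", batch_size=2)
print(first_run, second_run)
```

The COALESCE to 0 is there so that rows without a matching node do not stay NULL and get retried forever. Whether 0 is the right default is an assumption of the sketch.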

Staj

Unfortunately it doesn't continue from where it left off, and the system remains down. The message stays the same on reruns. I'm running nxdbmgr.exe online-upgrade -t -X now.

Staj

After looking at upgrade_v22.cpp I think I know what happened.
When the installer automatically tried to upgrade the database, it got stuck on ALTER TABLE syslog ADD zone_uin integer. That caused the installer to exit abnormally, and the subsequent errors followed from that.
I believe this happened because the syslog table is so large that ALTER TABLE syslog ADD zone_uin integer took a long time, and the installer did not cope with that. Is there an installer option to skip the database upgrade during installation? I think we'd prefer to run it manually ourselves in the future.

Tursiops

Hi,

Not sure what database system you are using. Just adding the column to the table did not take any time at all in our case despite the 100+ million rows (we're using Postgres). It was the subsequent updating of the column with actual data that took forever. Probably more of a developer question at this point, as other db systems might behave differently?

Cheers

Staj

We're using MariaDB. The ALTER TABLE eventually failed, so we just cleared the table in the end.
It would be useful if nxdbmgr check [check.cpp CheckDatabase() ?] did a schema sanity check to see whether the database matches the version it claims to be. I don't think it does this?
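
Something along these lines is what I have in mind. This is only a sketch of the idea, not a proposal for the actual check.cpp code. It runs against in-memory SQLite as a stand-in, and EXPECTED_COLUMNS and missing_columns are names I made up. The expected set only contains the four tables and the one column mentioned in this thread for 22.21.

```python
import sqlite3

# Hypothetical expectation for schema version 22.21, taken from the tables named in this thread.
EXPECTED_COLUMNS = {
    "alarms": {"zone_uin"},
    "event_log": {"zone_uin"},
    "snmp_trap_log": {"zone_uin"},
    "syslog": {"zone_uin"},
}


def missing_columns(conn, expected):
    """Map each table to the sorted list of expected columns it lacks."""
    problems = {}
    for table, columns in expected.items():
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        present = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
        missing = columns - present
        if missing:
            problems[table] = sorted(missing)
    return problems


# Stand-in for a half-upgraded database: syslog never got its new column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alarms (alarm_id integer, zone_uin integer)")
conn.execute("CREATE TABLE event_log (event_id integer, zone_uin integer)")
conn.execute("CREATE TABLE snmp_trap_log (trap_id integer, zone_uin integer)")
conn.execute("CREATE TABLE syslog (msg_id integer)")

problems = missing_columns(conn, EXPECTED_COLUMNS)
print(problems)  # {'syslog': ['zone_uin']}
```

On MariaDB the same comparison could be made against information_schema.columns instead of PRAGMA table_info.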