One of my recent activities has been upgrading our monitoring database (we are using Zabbix for this) from postgres 13 to postgres 15. To achieve high performance on time series data we are using the timescaledb extension on postgres.
The database holds around 3TB of data and is therefore running on a dedicated machine. To avoid installing packages directly on the host we are running it in a container environment - in this case we use docker to manage it.
Failing with my own guides
So - I actually followed my own guide to upgrade postgres (https://blog.nuvotex.de/upgrade-timescaledbs-postgres/) and I failed. But why?
The problem is quite nasty: The timescaledb container image builds on top of the postgres container image which in turn builds on top of alpine.
Using my guide, the migration will run on ubuntu. And running pg_upgrade on ubuntu seemingly records the collation version provided by the operating system (which in turn relates to glibc).
Long story short: Using another (non-musl) libc during the upgrade introduces collation version information in the database which will then cause issues when starting the upgraded database. The error message shown is:
database "zabbix" has no actual collation version, but a version was recorded
I will make a dedicated blog post on this.
How to upgrade using containers - approach number two
I will try to give a compact guide of what the upgrade procedure looks like in general.
- Run an ephemeral container using the current version of your database. Attach a volume (to /pgbin) to it where you can store some files
- Copy over the bin directory of the current postgres installation to the attached volume (on alpine just take: /usr/local/bin)
- Stop the ephemeral container
- Stop the container running the production server
- Start a new container using the target version of your database. Attach the volume from above (/pgbin) and the existing data volume (/pgdata)
- Follow this procedure to upgrade postgres (shown below)
- Start new database server
- Upgrade extensions
- Rebuild statistics on the new server
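The container steps above could look roughly like this. This is a sketch, not my exact commands: the image tags, the container name and the volume names (pgbin, pgdata) are example assumptions - adjust them to your setup.

```shell
# run an ephemeral container on the current version and copy its
# binaries (/usr/local/bin on alpine) to the attached volume
docker run --rm \
    -v pgbin:/pgbin \
    timescale/timescaledb:2.9.1-pg13 \
    sh -c 'mkdir -p /pgbin/pg13 && cp -a /usr/local/bin /pgbin/pg13/'

# stop the container running the production server
docker stop zabbix-db

# start a container on the target version with both the binary volume
# and the existing data volume attached, dropping into a shell so the
# upgrade can be run manually (also copy the new binaries to /pgbin/pg15
# so the paths used by pg_upgrade below exist)
docker run -it --name pg15-upgrade \
    -v pgbin:/pgbin \
    -v pgdata:/pgdata \
    --entrypoint sh \
    timescale/timescaledb:2.9.1-pg15
```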
Postgres upgrade procedure in more detail
The steps to perform the upgrade are shown in the script. Basically it prepares the new database and runs pg_upgrade - if possible use the --link option because this avoids copying vast amounts of data.
The output in my case looked like this.
First step is to init the new database directory:
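Using the paths from the pg_upgrade invocation further below, the init call would look roughly like this (run as the postgres user inside the container):

```shell
# initialize the new (empty) pg15 cluster in its own data directory;
# locale/encoding options may need to match the old cluster - when in
# doubt, compare with /pgbin/pg13/bin/pg_controldata /pgdata/pg13
/pgbin/pg15/bin/initdb -D /pgdata/pg15
```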
Then do the dry run of the upgrade
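pg_upgrade supports a dry run via the --check flag; with the same paths as the real upgrade this would be:

```shell
# dry run: validates old and new cluster compatibility
# without modifying any data
/pgbin/pg15/bin/pg_upgrade --check \
    -b /pgbin/pg13/bin -B /pgbin/pg15/bin \
    -d /pgdata/pg13/ -D /pgdata/pg15 --link
```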
And finally the upgrade itself
/pgbin/pg15/bin/pg_upgrade -b /pgbin/pg13/bin -B /pgbin/pg15/bin -d /pgdata/pg13/ -D /pgdata/pg15 --link
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for system-defined composite types in user tables  ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for user-defined encoding conversions              ok
Checking for user-defined postfix operators                 ok
Checking for incompatible polymorphic functions             ok
Creating dump of global objects                             ok
Creating dump of database schemas                           ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok
Checking for new cluster tablespace directories             ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows in the new cluster                        ok
Deleting files from new pg_xact                             ok
Copying old pg_xact to new server                           ok
Setting oldest XID for new cluster                          ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      ok
Setting frozenxid and minmxid counters in new cluster       ok
Restoring global objects in the new cluster                 ok
Restoring database schemas in the new cluster               ok
Adding ".old" suffix to old global/pg_control               ok

If you want to start the old cluster, you will need to remove
the ".old" suffix from /pgdata/pg13/global/pg_control.old.
Because "link" mode was used, the old cluster cannot be safely
started once the new cluster has been started.

Linking user relation files                                 ok
Setting next OID for new cluster                            ok
Sync data directory to disk                                 ok
Creating script to delete old cluster                       ok
Checking for extension updates                              notice

Your installation contains extensions that should be updated
with the ALTER EXTENSION command. The file
    update_extensions.sql
when executed by psql by the database superuser will update
these extensions.

Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade.
Once you start the new server, consider running:
    /pgbin/pg15/bin/vacuumdb --all --analyze-in-stages

Running this script will delete the old cluster's data files:
    ./delete_old_cluster.sh
If you are using extensions, you should (or might even need to) upgrade them as well. The commands in my case (shown in update_extensions.sql, see output above) are:
\connect template1
ALTER EXTENSION "timescaledb" UPDATE;
\connect postgres
ALTER EXTENSION "timescaledb" UPDATE;
\connect zabbix
ALTER EXTENSION "timescaledb" UPDATE;
Rebuilding the statistics
When you've reached this point and everything is fine - you've done the upgrade.
Now you need to rebuild the statistics because they are not copied during the upgrade - unless you want to wait for autovacuum to kick in (which I strongly advise against, the database should not be used without statistics).
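This is exactly what the pg_upgrade output suggested above:

```shell
# rebuild optimizer statistics for all databases in stages -
# a coarse pass first so the planner quickly gets usable estimates,
# refined in later passes
/pgbin/pg15/bin/vacuumdb --all --analyze-in-stages
```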
That's it - you have successfully upgraded your server!