postgres - database has no actual collation version, but a version was recorded
Upgrading a major postgres version using containers with different C libraries caused me some headaches because I go an error "database has no actual collation version, but a version was recorded" - and I did not fix it. At least I can give a hint on why it happend and how you could avoid it.
In my other post postgres - upgrading postgres with timescaledb running in a container I described the way to upgrade postgres within containers. This article will show only the dead-end I've found, so there will not even be a solution. So - don't expect anything :-)
The error
Everything started with a seemingly successfull postgres upgrade. I did start the statistics rebuild an got this:
database "postgres" has no actual collation version, but a version was recorded
This does not sound really bad in the first place - I thought. Let's fix this - I thought.
The most common approach you will find on the web is probably this one https://dba.stackexchange.com/questions/324649/collation-version-mismatch - it recommends that you refresh the collation version and everything should be fine.
So I gave it a try and got
invalid collation version change
And that's where it ends. It just has not been able to perform the version change and I did not find a way to fix it.
So - rollback and try again.
The root cause
Despite the fact I did not find a fix for the situation I found out why it went wrong. It relates to the operating system and architecture in use.
In rough terms it's like:
- Old runtime image (container) is based on alpine
- Migration is done using an ubuntu based image
- New runtime is based on alpine
The main difference between those is that alpine uses musl and ubuntu uses glibc as C library - and that's the library (correct me if I'm wrong) on which the collation is somewhat related.
This means that the pg_upgrade on ubuntu rewrote the collation to collversion 2.31 while the alpine based postgres shows collversion 153.120.
When trying to run a refresh on the collversion postgres is not able to either validate the migration path or just cannot change the collation.
The solution is described in my upgrade post in detail - the approach in a few words: don't switch archicture during upgrade :-)
That's it again - just that this post has no happy end. If you have some approach to fix the collation error, let me know!