Recently, I needed to roll back a Kubernetes control-plane node to an older snapshot. As a result, the etcd member on that node (obviously) could no longer operate in the etcd cluster. The approach in this scenario is to remove the etcd member from the cluster and add it again. The removal of it is rather
On some of our systems we recently upgraded Ubuntu from 20.04 LTS to 22.04 LTS. One thing that has been problematic since then is that containers kept crashing - or, to be more exact, the containers were killed because their probes failed. This affected cilium as well
Most of you will be aware that runc is (currently) one of the most commonly used container runtimes. In this image you can see that runc is available alongside other runtimes like kata or firecracker. If you need stronger isolation you might also be aware of gVisor. However - there's another
Our software engineers are working more and more with AI, which sometimes raises entirely new requirements for the environment. One of those is that we wanted to pin larger language models (in this case platypus2 with 70B parameters, or falcon 40B) in memory on a development host that runs microk8s. Long
On one of our systems we had the issue that - once a week - the I/O subsystem stalled and caused problems for database operations.
Upgrading postgres to a new major version using containers with different C libraries caused me some headaches because I got the error "database has no actual collation version, but a version was recorded" - and I did not fix it. At least I can give a hint on why it happened and how you could avoid it.
Upgrading postgres with timescaledb caused me some issues related to the collation. After some retries I've found a reliable way to do the upgrade. This post describes the required steps.
Sometimes you still face standalone systems that need to provide storage-based services, such as backup targets. Running these on Windows Server offers you the option to use StorageBusCache to provide awesome speed.