Many, if not most, Kubernetes installations out there run on cloud providers. And that's fine: everything is managed, and you don't need to think too much about how it all magically works.
In my experience, the most challenging part of running Kubernetes (depending on cluster size) is not necessarily keeping the control plane running, but rather integrating a well-suited storage solution.
On our internal systems we use Rook as an operator for Ceph. We dedicate some nodes in the Kubernetes cluster to run the storage cluster as an application workload, which then provides storage to the other workloads.
I don't want to talk too much about why we use Rook (the answer is simple: it's simple, it's reliable, and we can use "hyperconverged" nodes). For more information about Rook, see https://rook.io/
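To give a rough idea of what that setup looks like, here is a stripped-down sketch of a Rook CephCluster resource. The image tag, node names, and device selection are placeholders and assumptions, not recommendations:

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v18   # placeholder tag, pin a real release
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3                       # three monitors for quorum
  storage:
    useAllNodes: false
    nodes:
      - name: storage-node-1       # hypothetical node names; these are
      - name: storage-node-2       # the "hyperconverged" storage nodes
      - name: storage-node-3
    useAllDevices: true            # let Rook consume all empty disks
```

The operator watches this resource and deploys the Ceph daemons as regular pods on the listed nodes.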
What I actually want to answer is:
Why do we use Ceph?
From time to time, people ask why we are using Ceph.
There are a few major reasons why.
- Ceph is reliable and field-proven
- Ceph is incredibly robust
- Ceph scales horizontally by simply adding more nodes
But many storage solutions can do this. Can Ceph do more?
- Ceph CSI can provide ReadWriteOnce volumes (based on RBDs)
- Ceph can provide ReadWriteMany volumes (based on CephFS, the Ceph filesystem)
- Ceph can provide object storage with an S3-compatible endpoint
In this regard, Ceph covers both of the main volume access modes for Kubernetes workloads, and in most cases you will need both!
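To illustrate the two access modes, here is a sketch of two PersistentVolumeClaims. The storage class names `rook-ceph-block` and `rook-cephfs` follow the Rook examples; your cluster may name them differently:

```yaml
# RWO volume backed by an RBD image (block storage),
# attachable by a single node at a time.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 10Gi
---
# RWX volume backed by CephFS, mountable by many pods at once.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: rook-cephfs
  resources:
    requests:
      storage: 50Gi
```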
Plus, we get object storage right from the actual storage solution, with no additional layer of abstraction (like MinIO, which is still great!).
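With Rook, an S3 bucket can be requested declaratively via an ObjectBucketClaim; Rook then creates the bucket and exposes the endpoint and credentials in a ConfigMap and Secret of the same name. The storage class name below is an assumption and depends on how the object store was set up:

```yaml
# Request an S3 bucket from the Rook-managed object store.
# "rook-ceph-bucket" is the storage class name from the Rook
# examples; substitute your own.
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: my-bucket
spec:
  generateBucketName: my-bucket
  storageClassName: rook-ceph-bucket
```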
Myths about Ceph
Doesn't Ceph create a lot of overhead?
Sure: on a very small cluster, Ceph will consume quite some CPU and memory. There is metadata handling, caching, and a baseline CPU requirement.
If you have filesystems (RWX volumes) with millions of files, metadata grows even further.
But in the end, this is not a Ceph issue. It's an issue of the storage layer that every application/implementation needs to handle one way or another. Other solutions mostly have similar requirements.
Isn't Ceph hard to manage?
This depends on your own experience with storage solutions and, perhaps unexpectedly, with Ceph itself. Ceph is a quite logical piece of software and behaves as expected in most scenarios. As with any application, there are some quirks and things you just need to know. Generally speaking, it's very intuitive and by default (especially with Rook) holds you back from doing anything too dumb :-)
Isn't Ceph storage slow?
Using Ceph as an abstraction layer on top of your physical disks is slower, that's true. And you should expect a serious reduction in IOPS, especially if you have very fast devices (like NVMe) and very small I/O (for example, 4K operations). This setup is not Ceph's strength; in our experience, Ceph shines much brighter with larger request sizes.
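If you want to see that effect yourself, a simple fio job file comparing small random I/O with large sequential I/O is enough. The test file path on a mounted Ceph-backed volume is of course an assumption:

```ini
; fio job: compare 4K random reads with 1M sequential reads
; /mnt/ceph-volume is an assumed mount point of a Ceph-backed volume
[global]
ioengine=libaio
direct=1
filename=/mnt/ceph-volume/fio-testfile
size=4g
runtime=30
time_based

[randread-4k]
bs=4k
rw=randread
iodepth=16

[seqread-1m]
bs=1m
rw=read
iodepth=16
stonewall
```

The `stonewall` option makes the second job wait for the first, so the two workloads don't run concurrently and skew each other's numbers.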
There might be a future post with detailed performance testing of Ceph and some options you might want to adjust to squeeze more out of it. But the gains are mostly in the range of < 20%, so don't expect magic.
But other abstraction layers have the same issue: Mayastor or Longhorn show overhead similar to Ceph's.
This means that if your Kubernetes cluster runs database workloads that are mostly I/O bound, you might need to consider a storage solution that runs outside of Kubernetes entirely (maybe a Microsoft S2D-based storage mapped via SMB3).
So, when you have some spare time and want to dive deeper into Kubernetes and the infrastructure side of such workloads, check out Rook and play with it: it's worth it!