Setting up a ceph cluster

Ceph is an open-source storage platform that is built around an object store and provides different ways to build application storage on top of it. Today, three types of storage are provided directly by ceph: object-, block- and file-level storage. This article will focus on how to deploy a ceph cluster from zero.

Environment

To keep this article compact, we will skip some basic steps that you'll have to do on your own as preparation.

  • Three nodes running debian 10 (buster)
  • Each node has a system disk (/dev/sda) and a data disk available for ceph (/dev/sdb)
  • All nodes already have docker up and running
  • All nodes have proper time sync
  • All nodes are located in the same subnet

Basically you'll need the machines forming the cluster up and running, with each system providing an additional disk (later used for a ceph storage daemon). The systems should already have docker installed. To keep things simple, place them in the same subnet (which will mostly be the case anyway, as latency is critical).
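If you want a quick sanity check before you start, commands along these lines (adjust the disk path to your layout) should confirm the basics on each node:

hostname             # should print the short hostname
timedatectl status   # time sync should be active
docker --version     # docker must be installed and running
lsblk /dev/sdb       # the data disk should be empty (no partitions, no filesystem)
verify prerequisites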

A word about hostnames

Ceph requires the hostname of your nodes to be a short hostname - not an FQDN. Make sure that the hostnames of the ceph nodes are set to the short name (host part only) and configure the FQDN of each node in /etc/hosts. If your DNS doesn't resolve the domain correctly for hostnames, you'll want to list all members in /etc/hosts.
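If a node is currently set to its FQDN, you can switch it to the short name like this (hostname taken from the example below):

hostnamectl set-hostname nuv-dc-apphost1
hostname   # should now print the short name only
set short hostname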

10.56.34.16     nuv-dc-apphost1.dc.nuv.ntxzone.local nuv-dc-apphost1
10.56.34.17     nuv-dc-apphost2.dc.nuv.ntxzone.local nuv-dc-apphost2
10.56.34.18     nuv-dc-apphost3.dc.nuv.ntxzone.local nuv-dc-apphost3
/etc/hosts

This works for smaller and static setups with only a few hosts. If you're setting up a larger environment, make sure you can rely on your DNS infrastructure to resolve hostnames correctly.

Create ceph configuration folder

Make sure to create the folder /etc/ceph on all nodes participating in the installation.
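On debian this is just a one-liner per node:

mkdir -p /etc/ceph
create configuration folder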

Set firewall rules

I'll assume that you've already set up your system firewalls. Because ceph requires quite a few ports for communication, you'll want to create the appropriate rules.

If you're working on debian/ubuntu and you're using ufw, you may use the following snippet to create all required rules.

ufw allow 3300/tcp comment "ceph monitor v2"
ufw allow 6789/tcp comment "ceph monitor v1"
ufw allow 6800:7300/tcp comment "ceph communication"
ufw allow 9093/tcp comment "ceph alertmanager"
ufw allow 9095/tcp comment "ceph prometheus"
ufw allow 9100/tcp comment "ceph communication"
ufw allow 9283/tcp comment "ceph prometheus"
create firewall rules

Depending on the exact ceph version and setup you're running, not all ports might be required.

Bootstrap the cluster

The first part of installing the cluster is to get the latest cephadm build and then bootstrap the cluster.

Get cephadm

On each node that should be used to manage ceph (probably each node running a ceph manager role), I would recommend installing cephadm. You'll need it on at least one node to bootstrap the cluster.

Installing cephadm is done by downloading the build from the GitHub repo. There's no need to move the binary to a bin-path - this will be done by the tool itself.

cd /tmp
 
wget -q -O- 'https://download.ceph.com/keys/release.asc' | apt-key add -
echo deb https://download.ceph.com/debian-octopus/ $(lsb_release -sc) main | tee /etc/apt/sources.list.d/ceph.list
apt-get update
 
curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
chmod +x cephadm
 
./cephadm install cephadm ceph-common
Install cephadm

This will add the ceph repository to apt, download the cephadm build and install all required components to get ceph up and running.
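To confirm that everything ended up where it should, you can check that the cephadm binary and the ceph CLI (from ceph-common) are now in your PATH:

which cephadm
ceph --version
verify installation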

Bootstrap the cluster

After cephadm is in place, the cluster can be created using the following command:

cephadm bootstrap --mon-ip 10.56.34.16
Bootstrap the cluster

The parameter --mon-ip contains the IP address that should be associated with the first monitor node (normally an IP address of the node you're working on right now).

Running this command will start several docker containers that host a monitor, manager, dashboard, prometheus, grafana and crash instance.
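To get a first impression of what has been deployed, you can list the daemons cephadm started on this host and query the cluster status (health will most likely show warnings until more nodes and storage are added):

# list the daemons running on this host
cephadm ls
 
# show the overall cluster status
ceph -s
verify bootstrap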

Adjusting management ports

Depending on your setup you may want to adjust the management ports of the ceph dashboard.

# adjust dashboard ports
ceph config set mgr mgr/dashboard/server_port 10080
ceph config set mgr mgr/dashboard/ssl_server_port 10443
ufw allow 10443/tcp comment "ceph dashboard"
 
# reload dashboard to apply changes
ceph mgr module disable dashboard
ceph mgr module enable dashboard
set dashboard ports

This will set the HTTP/HTTPS management ports to 10080 / 10443 and open the system firewall for HTTPS management. You'll probably not want to open HTTP in your firewall at all.
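To verify that the dashboard is now listening on the new port, you can query the active manager's service endpoints:

# shows the URLs of the dashboard and prometheus modules
ceph mgr services
check dashboard url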

Set TLS certificate for dashboard

As we don't want to accept certificate trust errors, we're creating certificates on our internal CA and assigning them to the dashboard. This can be done easily using the CLI.

# assign certificate
ceph dashboard set-ssl-certificate -i dashboard.crt
ceph dashboard set-ssl-certificate-key -i dashboard.key
 
# apply changes
ceph mgr module disable dashboard
ceph mgr module enable dashboard
set dashboard tls certificate
Use of elliptic curves: In my first installation (ceph version 15.2.6) I tried to assign certificates using elliptic curves - I can tell for sure that ceph didn't like this and the dashboard wouldn't start anymore. Therefore the use of RSA certificates is currently mandatory.

Manually set certificates in config store

If you need to swap certificates and your dashboard module is not working anymore, you cannot use the ceph dashboard commands because they will fail. In this case you may directly access the ceph config store (ceph config-key) and remove the keys associated with the certificate (the keys for the dashboard certificate are mgr/dashboard/crt and mgr/dashboard/key).
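As a sketch - assuming the key names mentioned above and new certificate files named dashboard.crt / dashboard.key - the swap could look like this:

# remove the stored certificate and key
ceph config-key rm mgr/dashboard/crt
ceph config-key rm mgr/dashboard/key
 
# store the new certificate material directly
ceph config-key set mgr/dashboard/crt -i dashboard.crt
ceph config-key set mgr/dashboard/key -i dashboard.key
 
# restart the dashboard module to pick up the change
ceph mgr module disable dashboard
ceph mgr module enable dashboard
swap certificates via config store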

That's all we need to set up the first node.

Add nodes to the cluster

Until now our cluster is built on a single node. Obviously we want to add more nodes. This will be our next step.

Enable SSH key auth

The next step is to enable SSH key auth so that the ceph user on our current node can access the other nodes and run commands there. During the bootstrap, a public key has been placed at /etc/ceph/ceph.pub for this purpose. Just use ssh-copy-id to copy the key to the other upcoming ceph nodes.

ssh-copy-id -f -i /etc/ceph/ceph.pub root@nuv-dc-apphost2
ssh-copy-id -f -i /etc/ceph/ceph.pub root@nuv-dc-apphost3
enable ssh key auth

Add nodes to ceph

As our primary node is now able to communicate with the other nodes we're going to add the remaining nodes to our cluster. This can be done easily using ceph orch host add.

# add nodes
ceph orch host add nuv-dc-apphost2
ceph orch host add nuv-dc-apphost3
 
# check cluster hosts
ceph orch host ls
 
# ceph orch host ls
HOST             ADDR             LABELS  STATUS  
nuv-dc-apphost1  nuv-dc-apphost1  mon             
nuv-dc-apphost2  nuv-dc-apphost2  mon             
nuv-dc-apphost3  nuv-dc-apphost3  mon       
add and verify cluster nodes

As you can see we just needed two commands to add two more nodes. ceph orch host ls will list the current state of our nodes.

Management from all cluster nodes

If you want to enable management of the cluster from all nodes, you may enable this using the following command:

ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf true

This will make ceph maintain the configuration under /etc/ceph for you on all cluster nodes. In a smaller setup this may be exactly what you want.
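To verify, check after a short while that the configuration shows up on one of the other nodes (hostname taken from the example above):

ssh root@nuv-dc-apphost2 ls -l /etc/ceph/
verify config distribution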

Ceph monitors

Running ceph monitors is essential for the whole cluster to work.

Number of monitors

Generally ceph scales horizontally with the number of nodes. On medium to larger clusters, having five monitors is recommended. In our setup we're only running three nodes, therefore we cannot run five monitors.

In case you want to adjust the number of monitors in your cluster, use the following command.

ceph orch apply mon 3
set number of monitors

Monitor placement

If you have a larger cluster and want to declare some nodes that should host monitors, there are in general two ways to achieve this.

  • Static assignment: Assign monitor hosting nodes by name
  • Labeling nodes: Assign labels to nodes that may host monitors

Static assignment means that you explicitly name the nodes the monitors should be placed on.

Static placement

This will place the monitors on the specified hosts.

ceph orch apply mon nuv-dc-apphost1,nuv-dc-apphost2,nuv-dc-apphost3
static placement

Placement by label

I personally prefer assigning monitors using labels - it's more transparent when viewing the hosts in the cluster.

# assign hosts that should be able to host monitor
ceph orch host label add nuv-dc-apphost1 mon
ceph orch host label add nuv-dc-apphost2 mon
ceph orch host label add nuv-dc-apphost3 mon
 
# set monitor placement strategy
ceph orch apply mon --placement="label:mon"
 
# ceph orch host ls
HOST             ADDR             LABELS  STATUS  
nuv-dc-apphost1  nuv-dc-apphost1  mon             
nuv-dc-apphost2  nuv-dc-apphost2  mon             
nuv-dc-apphost3  nuv-dc-apphost3  mon             
 
# ceph orch ls mon
NAME  RUNNING  REFRESHED  AGE  PLACEMENT  IMAGE NAME               IMAGE ID      
mon       3/3  11m ago    5m   label:mon  docker.io/ceph/ceph:v15  f16a759354cc  
place monitors using labels

If you don't supply a placement strategy, every node will be taken into consideration when placing monitors.

Use of placement strategies: Placement strategies may also be applied to other daemons like mgr, crash, grafana, prometheus, etc.
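For example, the same label-based strategy from above could be reused to pin the manager daemons (reusing the mon label here purely for illustration):

ceph orch apply mgr --placement="label:mon"
placement for other daemons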

Verify cluster

Having completed the steps above, your cluster is up and running. You may check the health using the dashboard or the CLI (ceph health). Your cluster should be reporting HEALTH_OK right now.
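From the CLI, a quick look works like this (the exact output depends on your setup):

# short health summary
ceph health
 
# detailed overview of monitors, managers and services
ceph -s
check cluster health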

In an upcoming article I'll cover how to create a ceph filesystem (cephfs) and mount it on a client.