Setting up a ceph cluster
Ceph is an open-source storage platform built on top of an object store, providing different ways to build application storage on top of it. Today, three types of storage are provided directly by ceph: object-, block- and file-level storage. This article will focus on how to deploy a ceph cluster from scratch.
Environment
To keep this article compact, we will skip some basic steps that you'll have to do on your own as preparation.
- Three nodes running debian 10 (buster)
- Each node has a system disk (/dev/sda) and a data disk available for ceph (/dev/sdb)
- All nodes already have docker up and running
- All nodes have proper time sync
- All nodes are located in the same subnet
Basically you'll need the machines forming the cluster up and running, with each system providing an additional disk (later used for a ceph storage daemon). The systems should already have docker installed. To keep things simple, place them in the same subnet (which will mostly be the case anyway, as latency is critical).
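As a quick sanity check before proceeding, the following commands (assuming the disk layout from the list above) should succeed on every node:
# docker must be installed and running
docker --version
# time synchronization should be active
timedatectl status
# the data disk must be present and unused
lsblk /dev/sdb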
A word about hostnames
Ceph requires the hostname of your nodes to be a bare hostname - not an FQDN. Make sure that the hostnames of the ceph nodes are set to the shortname (host part only) and configure the FQDN of each node in /etc/hosts. If your DNS doesn't resolve the domain correctly for hostnames, you'll want to list all members in /etc/hosts.
10.56.34.16 nuv-dc-apphost1.dc.nuv.ntxzone.local nuv-dc-apphost1
10.56.34.17 nuv-dc-apphost2.dc.nuv.ntxzone.local nuv-dc-apphost2
10.56.34.18 nuv-dc-apphost3.dc.nuv.ntxzone.local nuv-dc-apphost3
This works for smaller and static setups with only a few hosts. If you're setting up a larger environment, make sure you can rely on your DNS infrastructure to resolve hostnames correctly.
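As a quick check on each node (using the example hostnames from above), the short name and the FQDN should resolve as expected:
# set the short hostname on the first node
hostnamectl set-hostname nuv-dc-apphost1
# should print the short name only
hostname -s
# should print the FQDN via DNS or /etc/hosts
hostname -f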
Create ceph configuration folder
Make sure to create the folder /etc/ceph on all nodes participating in the installation.
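For example, on each node:
# create the ceph configuration folder
mkdir -p /etc/ceph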
Set firewall rules
I'll assume that you've already set up your system firewalls. Because ceph requires quite a few ports for communication, you'll want to create the appropriate rules.
If you're working on debian/ubuntu and you're using ufw, you may use the following snippet to create all required rules.
ufw allow 3300/tcp comment "ceph monitor v2"
ufw allow 6789/tcp comment "ceph monitor v1"
ufw allow 6800:7300/tcp comment "ceph communication"
ufw allow 9093/tcp comment "ceph alertmanager"
ufw allow 9095/tcp comment "ceph prometheus"
ufw allow 9100/tcp comment "ceph node exporter"
ufw allow 9283/tcp comment "ceph prometheus exporter"
Depending on the exact ceph version and setup you're running, not all ports might be required.
Bootstrap the cluster
The first part of installing the cluster is to get the latest cephadm build and then bootstrap the cluster.
Get cephadm
On each node that should be used to manage ceph (probably each node running a ceph manager role) I would recommend installing cephadm. You'll need it on at least one node to bootstrap the cluster.
Installing cephadm is done by downloading the build from the GitHub repo. There's no need to move the binary to a bin-path - this will be done by the tool itself.
cd /tmp
wget -q -O- 'https://download.ceph.com/keys/release.asc' | apt-key add -
echo deb https://download.ceph.com/debian-octopus/ $(lsb_release -sc) main | tee /etc/apt/sources.list.d/ceph.list
apt-get update
curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
chmod +x cephadm
./cephadm install cephadm ceph-common
This will add the ceph repository to apt, download the cephadm build and install all required components to get ceph up and running.
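To verify that everything has been installed correctly, you can check the installed versions:
# verify the installation
cephadm version
ceph --version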
Bootstrap the cluster
After cephadm is in place, the cluster can be created using the following command:
cephadm bootstrap --mon-ip 10.56.34.16
The --mon-ip parameter is the IP address that should be associated with the first monitor node (normally an IP address of the node you're working on right now).
Running this command will start several docker containers hosting a monitor, a manager, the dashboard, prometheus, grafana and a crash agent.
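After the bootstrap has finished you can verify that these daemons are actually running:
# list the daemons deployed by cephadm
ceph orch ps
# or inspect the containers directly
docker ps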
Adjusting management ports
Depending on your setup you may want to adjust the management ports of the ceph dashboard.
# adjust dashboard ports
ceph config set mgr mgr/dashboard/server_port 10080
ceph config set mgr mgr/dashboard/ssl_server_port 10443
ufw allow 10443/tcp comment "ceph dashboard"
# reload dashboard to apply changes
ceph mgr module disable dashboard
ceph mgr module enable dashboard
This will set the HTTP/HTTPS management ports to 10080 / 10443 and open the system firewall for HTTPS management. You'll probably not want to open HTTP in your firewall at all.
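To double-check which URL the dashboard is currently served on, you can ask the manager:
# show the endpoints of the enabled manager modules
ceph mgr services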
Set TLS certificate for dashboard
As we don't want to accept certificate trust errors, we create certificates on our internal CA and assign them to the dashboard. This can be done easily using the CLI.
# assign certificate
ceph dashboard set-ssl-certificate -i dashboard.crt
ceph dashboard set-ssl-certificate-key -i dashboard.key
# apply changes
ceph mgr module disable dashboard
ceph mgr module enable dashboard
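If you don't have an internal CA at hand, the dashboard can also generate a self-signed certificate for you (you'll have to accept the trust warning in the browser in that case):
# alternative: let the dashboard generate a self-signed certificate
ceph dashboard create-self-signed-cert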
Manually set certificates in config store
If you need to swap certificates and your dashboard module is not working anymore, the ceph dashboard commands are of no use because they will fail. In this case you can access the ceph config store directly (ceph config-key) and remove the keys associated with the certificate (the keys for the dashboard certificate are mgr/dashboard/crt and mgr/dashboard/key).
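A sketch of what that could look like - the exact keys may differ depending on your setup:
# remove the stored certificate and key directly from the config store
ceph config-key rm mgr/dashboard/crt
ceph config-key rm mgr/dashboard/key
# restart the dashboard module afterwards
ceph mgr module disable dashboard
ceph mgr module enable dashboard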
That's all we need to set up the first node.
Add nodes to the cluster
Until now our cluster consists of a single node only. Obviously we want to add more nodes - this will be our next step.
Enable SSH key auth
The next step is to set up SSH key auth so that the ceph user on our current node can access the other nodes and run commands there. During the bootstrap a public key has been placed at /etc/ceph/ceph.pub for this purpose. Just use ssh-copy-id to copy the key to the other upcoming ceph nodes.
ssh-copy-id -f -i /etc/ceph/ceph.pub root@nuv-dc-apphost2
ssh-copy-id -f -i /etc/ceph/ceph.pub root@nuv-dc-apphost3
Add nodes to ceph
As our primary node is now able to communicate with the other nodes we're going to add the remaining nodes to our cluster. This can be done easily using ceph orch host add.
# add nodes
ceph orch host add nuv-dc-apphost2
ceph orch host add nuv-dc-apphost3
# check cluster hosts
ceph orch host ls
# ceph orch host ls
HOST             ADDR             LABELS  STATUS
nuv-dc-apphost1  nuv-dc-apphost1  mon
nuv-dc-apphost2  nuv-dc-apphost2  mon
nuv-dc-apphost3  nuv-dc-apphost3  mon
As you can see we just needed two commands to add two more nodes. ceph orch host ls will list the current state of our nodes.
Management from all cluster nodes
If you want to enable management of the cluster from all nodes, you may enable this using the following command:
ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf true
This will make cephadm maintain the ceph.conf in /etc/ceph for you on all cluster nodes. Running a smaller setup, this may be exactly what you want.
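Note that this only distributes the configuration file - to run ceph commands on another node you'll also need the admin keyring there. One way to do that (assuming you're fine with copying it manually) looks like this:
# copy the admin keyring to another node
scp /etc/ceph/ceph.client.admin.keyring root@nuv-dc-apphost2:/etc/ceph/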
Ceph monitors
Running ceph monitors is quite important for the whole cluster to work: they maintain the cluster map and need a quorum to keep the cluster available.
Number of monitors
Generally ceph scales horizontally with the number of nodes. On medium to large clusters, having five monitors is recommended. In our setup we're only running three nodes, therefore we cannot run five monitors.
In case you want to adjust the number of monitors in your cluster, use the following command.
ceph orch apply mon 3
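Afterwards you can verify that the expected number of monitors is up and in quorum:
# check monitor status and quorum
ceph mon stat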
Monitor placement
If you have a larger cluster and want to declare some nodes that should host monitors there are in general two ways to achieve this.
- Static assignment: Assign monitor hosting nodes by name
- Labeling nodes: Assign labels to nodes that may host monitors
Static assignment means that you explicitly name the nodes that monitors should be placed on.
Static placement
This will place the monitors on the specified hosts.
ceph orch apply mon nuv-dc-apphost1,nuv-dc-apphost2,nuv-dc-apphost3
Placement by label
I personally prefer assigning monitors using labels - labels are more transparent when viewing the hosts in the cluster.
# assign hosts that should be able to host monitors
ceph orch host label add nuv-dc-apphost1 mon
ceph orch host label add nuv-dc-apphost2 mon
ceph orch host label add nuv-dc-apphost3 mon
# set monitor placement strategy
ceph orch apply mon --placement="label:mon"
# ceph orch host ls
HOST             ADDR             LABELS  STATUS
nuv-dc-apphost1  nuv-dc-apphost1  mon
nuv-dc-apphost2  nuv-dc-apphost2  mon
nuv-dc-apphost3  nuv-dc-apphost3  mon
# ceph orch ls mon
NAME  RUNNING  REFRESHED  AGE  PLACEMENT  IMAGE NAME               IMAGE ID
mon   3/3      11m ago    5m   label:mon  docker.io/ceph/ceph:v15  f16a759354cc
If you don't supply a placement strategy, every node will be taken into consideration when placing monitors.
Verify cluster
Having completed the steps above, your cluster is up and running. You may check the health using the dashboard or the CLI (ceph health). Your cluster should report HEALTH_OK right now.
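The following commands give a quick overview; if the status is not HEALTH_OK, the detailed output usually tells you why:
# overall cluster status
ceph -s
# detailed health information in case something is off
ceph health detail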
In an upcoming article I'll cover how to create a ceph filesystem (cephfs) and mount it on a client.