Using CephFS with Kubernetes
Introduction
When you run a Kubernetes cluster, as I do, it is nice to also have a scalable and resilient filesystem where your nodes can share configuration files or other read-only data.
Previously my cluster pods mounted an NFS share, but since that is a single point of failure I wanted to move to a distributed filesystem. Having recently installed a Ceph cluster for VM storage, I decided to leverage the CephFS filesystem instead of NFS.
This allows me to mount the CephFS filesystem directly on the Kubernetes nodes and mount parts of it into the pods, ensuring the pods always have access to their filesystem unless my entire Ceph cluster dies.
This guide shows how to enable Ceph on your Kubernetes nodes, mount the filesystem, and have it ready for your pods.
Assumptions
For the purposes of this blog article, assume the following values are used:
- Client name: kubedata
- CephFS filesystem name: kubedata
- Ceph monitor IP: 192.168.210.10
- Ceph version: Quincy (17.x)
- The Kubernetes node runs a RHEL-variant Linux, e.g. RHEL, CentOS, or Rocky Linux.
Prerequisites
Packages
To help with installing Ceph, you can download a Python script, cephadm, that automates many installation tasks.
The script requires Python 3; if that is not installed, install it by entering the following command in a shell:
sudo dnf -y install python3
To install the Ceph helper script, enter the following commands in a shell:
curl --silent --remote-name --location https://github.com/ceph/ceph/raw/quincy/src/cephadm/cephadm
chmod +x ./cephadm
sudo ./cephadm add-repo --release quincy
These three commands download the cephadm
script, make it executable, and finally add the Ceph repository.
To install the minimum required packages for CephFS, enter the following command in a shell:
sudo dnf -y install ceph-common
Now the prerequisite software is installed and we can move on to gathering the data we need to successfully connect to the Ceph cluster.
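As a quick sanity check (just a sketch), you can confirm that the ceph client binary shipped with ceph-common is now available:

```shell
# Print the installed Ceph client version if the ceph binary is on the
# PATH; otherwise report that ceph-common does not appear to be installed.
if command -v ceph >/dev/null 2>&1; then
    ceph --version
else
    echo "ceph-common does not appear to be installed"
fi
```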
Data
To be able to mount a cephfs on your kubernetes nodes you require the following information:
- The cluster id
- The CephFS filesystem name (look it up on the cluster if you do not know it)
- A client credential
All of this information can be retrieved automatically by simply connecting to the Ceph cluster from the Kubernetes node.
Actions
Grab required cluster information
Run the following commands to retrieve the information you need to connect successfully to the cluster.
To copy a Ceph config you can do the following (replace <user> with an account on the monitor host that is allowed to run ceph commands):
ssh <user>@192.168.210.10 "sudo ceph config generate-minimal-conf" | sudo tee /etc/ceph/ceph.conf
This connects to the monitor host, generates a minimal configuration file, and writes it to /etc/ceph/ceph.conf on the local machine.
This configuration file contains the list of monitor hosts for the Ceph cluster and the cluster id. An example might look like this:
# minimal ceph.conf for da5cbdc2-5c9b-48ab-908a-f03d6b2e6024
[global]
fsid = da5cbdc2-5c9b-48ab-908a-f03d6b2e6024
mon_host = [v2:192.168.210.10:3300/0,v1:192.168.210.10:6789/0] [v2:192.168.210.11:3300/0,v1:192.168.210.11:6789/0] [v2:192.168.210.12:3300/0,v1:192.168.210.12:6789/0]
The last bit we require is authorization, so we need a client secret. This is done by connecting to the Ceph monitor again, authorizing the user kubedata, and writing the secret to local disk (replace <user> with an account on the monitor that can run ceph commands):
ssh <user>@192.168.210.10 "sudo ceph fs authorize kubedata client.kubedata / rw" | sudo tee /etc/ceph/ceph.client.kubedata.keyring
This command connects to the monitor and authorizes the kubedata
client with read/write permissions on the root of the kubedata
filesystem. It then writes the secret to /etc/ceph/ceph.client.kubedata.keyring
The contents of the file might look like this:
[client.kubedata]
key = ACDCeedjqx9JMxAABrEmXNxQkWaKfyEAO/AqcQ==
Now all the data required to connect to the cluster is on the local machine, so it is simply a matter of reading the required information from either /etc/ceph/ceph.conf
or /etc/ceph/ceph.client.kubedata.keyring
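As a sketch of that extraction step, awk can pull both values out. To keep the snippet self-contained it recreates the example files from above in a temporary directory; on a real node you would read the files under /etc/ceph instead:

```shell
# Recreate the example config and keyring from above in a temp dir
# (assumption: on a real node you would use /etc/ceph/ceph.conf and
# /etc/ceph/ceph.client.kubedata.keyring directly).
dir=$(mktemp -d)
cat > "$dir/ceph.conf" <<'EOF'
[global]
fsid = da5cbdc2-5c9b-48ab-908a-f03d6b2e6024
mon_host = [v2:192.168.210.10:3300/0,v1:192.168.210.10:6789/0]
EOF
cat > "$dir/ceph.client.kubedata.keyring" <<'EOF'
[client.kubedata]
key = ACDCeedjqx9JMxAABrEmXNxQkWaKfyEAO/AqcQ==
EOF

# Pull out the cluster id and the client secret.
fsid=$(awk -F' = ' '$1 == "fsid" {print $2}' "$dir/ceph.conf")
key=$(awk -F' = ' '$1 == "key" {print $2}' "$dir/ceph.client.kubedata.keyring")
echo "cluster id: $fsid"
echo "client key: $key"
rm -rf "$dir"
```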
Finishing up
Verification
To verify that you have done everything correctly, you can mount the CephFS filesystem.
First create a mount point, e.g. /mnt/kubedata:
sudo mkdir /mnt/kubedata
Then mount the filesystem at the mount point (the cluster id between @ and .kubedata is the fsid from /etc/ceph/ceph.conf; the example value from above is used here):
sudo mount.ceph kubedata@da5cbdc2-5c9b-48ab-908a-f03d6b2e6024.kubedata=/ /mnt/kubedata
If everything was done correctly, you should see your CephFS filesystem in /mnt/kubedata
and be able to read and write data.
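A simple round-trip sketch for this check: it writes a file, reads it back, and cleans up. Pass your mount point as the first argument; without one it falls back to a throwaway temporary directory so it can be tried on any machine:

```shell
# Round-trip test: write a file, read it back, clean up. Assumption:
# $1 is the cephfs mount point (e.g. /mnt/kubedata); when no argument
# is given, a temporary directory is used instead.
dir="${1:-$(mktemp -d)}"
echo "hello cephfs" > "$dir/.write-test"
cat "$dir/.write-test"
rm -f "$dir/.write-test"
```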
If everything works, you can move on to auto mounting the filesystem when the machine boots. If it is not working, retrace your steps to find out whether you skipped something, or whether your cluster requires extra setup not covered by this guide.
Auto mount
Auto mounting is easiest to set up in /etc/fstab
Open the file in your favorite editor and add a line similar to the one below:
kubedata@da5cbdc2-5c9b-48ab-908a-f03d6b2e6024.kubedata=/ /mnt/kubedata ceph mon_addr=192.168.210.10:6789,rw,noatime,_netdev 0 0
The important bits to notice here are the mount point /mnt/kubedata
which should match the mount point where you want to mount the filesystem.
The kubedata@da5cbdc2-5c9b-48ab-908a-f03d6b2e6024.kubedata=/ part, which is the name of the client, the cluster id (the fsid from ceph.conf) and the name of the Ceph filesystem, i.e. client@clusterid.filesystem=/
You do not need to add the cluster id; you can just leave it as kubedata@.kubedata=/
- but sometimes it is good to be explicit.
The mon_addr=192.168.210.10:6789
which is the IP/port of the Ceph monitor. If you have multiple monitors, you separate them with a / (a comma would clash with the fstab option separator)
i.e. mon_addr=192.168.210.10:6789/192.168.210.11:6789/192.168.210.12:6789
When the line has been added to the /etc/fstab
file, it is time to test that it works by simply calling sudo mount -a
If you get no errors, your filesystem should be mounted at the mount point and will mount automatically when the server boots.
Last words
All that needs to happen now is to change the pods inside my Kubernetes cluster so that, instead of mounting their configuration and static data from an NFS server, they simply mount /mnt/kubedata from the Kubernetes node.
This happens in the PersistentVolume/PersistentVolumeClaim definitions, i.e. something similar to the below:
#
# PersistentVolume
#
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kubedata
  labels:
    type: local
spec:
  storageClassName: hostpath
  capacity:
    storage: 256Mi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: /mnt/kubedata
  persistentVolumeReclaimPolicy: Retain
---
#
# PersistentVolumeClaim
#
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kubedata
spec:
  storageClassName: hostpath
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 256Mi
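A pod can then reference the claim through a volume. The pod name and the nginx image below are purely illustrative assumptions:

```yaml
#
# Example pod mounting the claim (names and image are illustrative)
#
apiVersion: v1
kind: Pod
metadata:
  name: kubedata-demo
spec:
  containers:
    - name: web
      image: nginx
      volumeMounts:
        - name: kubedata
          mountPath: /usr/share/nginx/html
          readOnly: true
  volumes:
    - name: kubedata
      persistentVolumeClaim:
        claimName: kubedata
```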
I hope you enjoyed this post. If you spot errors, please let me know in the comments below or by email.