This article covers in detail the installation and configuration of Rook and how to integrate a highly available Ceph storage cluster into an existing Kubernetes cluster. I’m performing this process on a recent deployment of Kubernetes on Rocky Linux 8 servers, but it can be used with any other Kubernetes cluster deployed with Kubeadm or automation tools such as Kubespray and Rancher.
In the early days of Kubernetes, most applications deployed were stateless, meaning there was no need for data persistence. However, as Kubernetes became more popular, there was growing concern around reliability when scheduling stateful services. Currently, you can use many types of storage volumes, including vSphere Volumes, Ceph, AWS Elastic Block Store, Glusterfs, NFS and GCE Persistent Disk, among many others. This gives us the comfort of running stateful services that require a robust storage backend.
What is Rook / Ceph?
Rook is a free-to-use and powerful cloud-native open source storage orchestrator for Kubernetes. It provides support for a diverse set of storage solutions that natively integrate with cloud-native environments. More details about the storage solutions currently supported by Rook are captured in the project status section.
Ceph is a distributed storage system that provides file, block and object storage and is deployed in large-scale production clusters. Rook enables us to automate the deployment, bootstrapping, configuration, scaling and upgrading of a Ceph cluster within a Kubernetes environment. Ceph is widely used in in-house infrastructure, where a managed storage solution is rarely an option.
Rook uses Kubernetes primitives to run and manage software-defined storage on Kubernetes.
Key components of Rook Storage Orchestrator:
- Custom resource definitions (CRDs) – Used to create and customize storage clusters. The CRDs are applied to Kubernetes during the deployment process (see the example after this list).
- Rook Operator for Ceph – Automates the whole configuration of the storage components and monitors the cluster to ensure it is healthy and available.
- DaemonSet called rook-discover – Starts a pod running a discovery agent on every node of your Kubernetes cluster to discover any raw disk devices / partitions that can be used as Ceph OSD disks.
- Monitoring – Rook enables Ceph Dashboard and provides metrics collectors/exporters and monitoring dashboards
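As a quick illustration of the CRD-driven approach, once the Rook operator has been deployed (covered later in this article) you can list the custom resource definitions it registers. This is only a sanity check, not a required step:
kubectl get crds | grep -E 'ceph.rook.io|objectbucket.io'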
Features of Rook
- Rook enables you to provision block, file, and object storage with multiple storage providers
- Capability to efficiently distribute and replicate data to minimize potential loss
- Rook is designed to manage open-source storage technologies – NFS, Ceph, Cassandra
- Rook is an open source software released under the Apache 2.0 license
- With Rook you can hyper-scale or hyper-converge your storage clusters within a Kubernetes environment
- Rook allows system administrators to easily enable elastic storage in the datacenter
- By adopting Rook as your storage orchestrator you are able to optimize workloads on commodity hardware
Deploy Rook & Ceph Storage on Kubernetes Cluster
These are the minimal setup requirements for the deployment of Rook and Ceph storage on a Kubernetes cluster:
- A Kubernetes cluster with a minimum of three nodes
- Available raw disk devices (with no partitions or formatted filesystems)
- Or Raw partitions (without formatted filesystem)
- Or Persistent Volumes available from a storage class in block mode
Step 1: Add Raw devices/partitions to nodes that will be used by Rook
List all the nodes in your Kubernetes cluster and decide which ones will be used to build the Ceph storage cluster. I recommend you use worker nodes and not the control plane machines.
[[email protected] ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8smaster01.hirebestengineers.com Ready control-plane,master 28m v1.22.2
k8smaster02.hirebestengineers.com Ready control-plane,master 24m v1.22.2
k8smaster03.hirebestengineers.com Ready control-plane,master 23m v1.22.2
k8snode01.hirebestengineers.com Ready <none> 22m v1.22.2
k8snode02.hirebestengineers.com Ready <none> 21m v1.22.2
k8snode03.hirebestengineers.com Ready <none> 21m v1.22.2
k8snode04.hirebestengineers.com Ready <none> 21m v1.22.2
In my lab environment, each of the worker nodes will have one raw device, /dev/vdb, which we’ll add shortly.
[[email protected] ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 40G 0 disk
├─vda1 253:1 0 1M 0 part
├─vda2 253:2 0 1G 0 part /boot
├─vda3 253:3 0 615M 0 part
└─vda4 253:4 0 38.4G 0 part /
[[email protected] ~]# free -h
total used free shared buff/cache available
Mem: 15Gi 209Mi 14Gi 8.0Mi 427Mi 14Gi
Swap: 614Mi 0B 614Mi
The following list of nodes will be used to build the storage cluster.
[[email protected] ~]# virsh list | grep k8s-worker
31 k8s-worker-01-server running
36 k8s-worker-02-server running
38 k8s-worker-03-server running
41 k8s-worker-04-server running
Add secondary storage to each node
If using KVM hypervisor, start by listing storage pools:
$ sudo virsh pool-list
Name State Autostart
------------------------------
images active yes
I’ll add a 40GB volume for each worker node in the images storage pool listed above. This can be done with a for loop:
for domain in k8s-worker-0{1..4}-server; do
sudo virsh vol-create-as images ${domain}-disk-2.qcow2 40G
done
Command execution output:
Vol k8s-worker-01-server-disk-2.qcow2 created
Vol k8s-worker-02-server-disk-2.qcow2 created
Vol k8s-worker-03-server-disk-2.qcow2 created
Vol k8s-worker-04-server-disk-2.qcow2 created
You can check image details, including size, using the qemu-img command:
$ qemu-img info /var/lib/libvirt/images/k8s-worker-01-server-disk-2.qcow2
image: /var/lib/libvirt/images/k8s-worker-01-server-disk-2.qcow2
file format: raw
virtual size: 40 GiB (42949672960 bytes)
disk size: 40 GiB
To attach the created volumes to the virtual machines, run:
for domain in k8s-worker-0{1..4}-server; do
sudo virsh attach-disk --domain ${domain} \
--source /var/lib/libvirt/images/${domain}-disk-2.qcow2 \
--persistent --target vdb
done
- --persistent: Make live change persistent
- --target vdb: Target of a disk device
Confirm the disks were attached successfully:
Disk attached successfully
Disk attached successfully
Disk attached successfully
Disk attached successfully
You can confirm that the volume was added to each VM as the block device /dev/vdb:
[[email protected] ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 40G 0 disk
├─vda1 253:1 0 1M 0 part
├─vda2 253:2 0 1G 0 part /boot
├─vda3 253:3 0 615M 0 part
└─vda4 253:4 0 38.4G 0 part /
vdb 253:16 0 40G 0 disk
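Before handing the device over to Rook, it is worth confirming on each worker node that /dev/vdb carries no filesystem or partition-table signatures. Note that wipefs without options only lists signatures, it does not erase anything:
lsblk -f /dev/vdb
sudo wipefs /dev/vdb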
Step 2: Deploy Rook Storage Orchestrator
Clone the Rook project from GitHub using the git command. This should be done on a machine with kubeconfig configured and confirmed to be working.
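A quick sanity check that kubectl can reach the cluster before proceeding (optional, but it saves surprises later):
kubectl cluster-info
kubectl get nodes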
You can clone a specific Rook release branch, for example the release-1.8 branch used in this article:
cd ~/
git clone --single-branch --branch release-1.8 https://github.com/rook/rook.git
All nodes with available raw devices will be used for the Ceph cluster. As stated earlier, at least three nodes are required. Change into the examples directory:
cd rook/deploy/examples/
Deploy the Rook Operator
The first step in deploying the Rook operator is to create the required CRDs, as specified in the crds.yaml manifest:
[[email protected] ceph]# kubectl create -f crds.yaml
customresourcedefinition.apiextensions.k8s.io/cephblockpools.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephclients.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephclusters.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystemmirrors.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystems.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephnfses.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectrealms.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstores.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstoreusers.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectzonegroups.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectzones.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephrbdmirrors.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/objectbucketclaims.objectbucket.io created
customresourcedefinition.apiextensions.k8s.io/objectbuckets.objectbucket.io created
customresourcedefinition.apiextensions.k8s.io/volumereplicationclasses.replication.storage.openshift.io created
customresourcedefinition.apiextensions.k8s.io/volumereplications.replication.storage.openshift.io created
customresourcedefinition.apiextensions.k8s.io/volumes.rook.io created
Create the common resources defined in the common.yaml file:
[[email protected] ceph]# kubectl create -f common.yaml
namespace/rook-ceph created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-object-bucket created
serviceaccount/rook-ceph-admission-controller created
clusterrole.rbac.authorization.k8s.io/rook-ceph-admission-controller-role created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-admission-controller-rolebinding created
clusterrole.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
clusterrole.rbac.authorization.k8s.io/rook-ceph-system created
role.rbac.authorization.k8s.io/rook-ceph-system created
clusterrole.rbac.authorization.k8s.io/rook-ceph-global created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrole.rbac.authorization.k8s.io/rook-ceph-object-bucket created
serviceaccount/rook-ceph-system created
rolebinding.rbac.authorization.k8s.io/rook-ceph-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-global created
serviceaccount/rook-ceph-osd created
serviceaccount/rook-ceph-mgr created
serviceaccount/rook-ceph-cmd-reporter created
role.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrole.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-system created
role.rbac.authorization.k8s.io/rook-ceph-mgr created
role.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/00-rook-privileged created
clusterrole.rbac.authorization.k8s.io/psp:rook created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-system-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-default-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter-psp created
serviceaccount/rook-csi-cephfs-plugin-sa created
serviceaccount/rook-csi-cephfs-provisioner-sa created
role.rbac.authorization.k8s.io/cephfs-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role-cfg created
clusterrole.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/cephfs-external-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-cephfs-plugin-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-cephfs-provisioner-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role created
serviceaccount/rook-csi-rbd-plugin-sa created
serviceaccount/rook-csi-rbd-provisioner-sa created
role.rbac.authorization.k8s.io/rbd-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role-cfg created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-rbd-plugin-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-rbd-provisioner-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role created
role.rbac.authorization.k8s.io/rook-ceph-purge-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-purge-osd created
serviceaccount/rook-ceph-purge-osd created
Finally, deploy the Rook Ceph operator from the operator.yaml manifest file:
[[email protected] ceph]# kubectl create -f operator.yaml
configmap/rook-ceph-operator-config created
deployment.apps/rook-ceph-operator created
After a few seconds the Rook components should be up and running, as seen below:
[[email protected] ceph]# kubectl get all -n rook-ceph
NAME READY STATUS RESTARTS AGE
pod/rook-ceph-operator-9bf8b5959-nz6hd 1/1 Running 0 45s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/rook-ceph-operator 1/1 1 1 45s
NAME DESIRED CURRENT READY AGE
replicaset.apps/rook-ceph-operator-9bf8b5959 1 1 1 45s
Verify that the rook-ceph-operator pod is in the Running state before proceeding:
[[email protected] ceph]# kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-76dc868c4b-zk2tj 1/1 Running 0 69s
Step 3: Create a Ceph Storage Cluster on Kubernetes using Rook
Now that we have prepared the worker nodes by adding raw disk devices and deployed the Rook operator, it is time to deploy the Ceph storage cluster.
Let’s set default namespace to rook-ceph:
# kubectl config set-context --current --namespace rook-ceph
Context "[email protected]" modified.
Since the Rook Ceph cluster can discover raw devices and partitions by itself, it is okay to use the default cluster deployment manifest file without any modifications.
[[email protected] ceph]# kubectl create -f cluster.yaml
cephcluster.ceph.rook.io/rook-ceph created
For any further customization of the Ceph cluster, check the Ceph Cluster CRD documentation.
When not using all the nodes, you can explicitly define the nodes and raw devices to be used, as seen in the example below:
storage: # cluster level storage configuration and selection
useAllNodes: false
useAllDevices: false
nodes:
- name: "k8snode01.hirebestengineers.com"
devices: # specific devices to use for storage can be specified for each node
- name: "sdb"
- name: "k8snode03.hirebestengineers.com"
devices:
- name: "sdb"
To view all resources created run the following command:
kubectl get all -n rook-ceph
Watch the creation of Pods in the rook-ceph namespace:
[[email protected] ceph]# kubectl get pods -n rook-ceph -w
This is a list of Pods running in the namespace after a successful deployment:
[[email protected] ceph]# kubectl get pods -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-8vrgj 3/3 Running 0 5m39s
csi-cephfsplugin-9csbp 3/3 Running 0 5m39s
csi-cephfsplugin-lh42b 3/3 Running 0 5m39s
csi-cephfsplugin-provisioner-b54db7d9b-kh89q 6/6 Running 0 5m39s
csi-cephfsplugin-provisioner-b54db7d9b-l92gm 6/6 Running 0 5m39s
csi-cephfsplugin-xc8tk 3/3 Running 0 5m39s
csi-rbdplugin-28th4 3/3 Running 0 5m41s
csi-rbdplugin-76bhw 3/3 Running 0 5m41s
csi-rbdplugin-7ll7w 3/3 Running 0 5m41s
csi-rbdplugin-provisioner-5845579d68-5rt4x 6/6 Running 0 5m40s
csi-rbdplugin-provisioner-5845579d68-p6m7r 6/6 Running 0 5m40s
csi-rbdplugin-tjlsk 3/3 Running 0 5m41s
rook-ceph-crashcollector-k8snode01.hirebestengineers.com-7ll2x6 1/1 Running 0 3m3s
rook-ceph-crashcollector-k8snode02.hirebestengineers.com-8ghnq9 1/1 Running 0 2m40s
rook-ceph-crashcollector-k8snode03.hirebestengineers.com-7t88qp 1/1 Running 0 3m14s
rook-ceph-crashcollector-k8snode04.hirebestengineers.com-62n95v 1/1 Running 0 3m14s
rook-ceph-mgr-a-7cf9865b64-nbcxs 1/1 Running 0 3m17s
rook-ceph-mon-a-555c899765-84t2n 1/1 Running 0 5m47s
rook-ceph-mon-b-6bbd666b56-lj44v 1/1 Running 0 4m2s
rook-ceph-mon-c-854c6d56-dpzgc 1/1 Running 0 3m28s
rook-ceph-operator-9bf8b5959-nz6hd 1/1 Running 0 13m
rook-ceph-osd-0-5b7875db98-t5mdv 1/1 Running 0 3m6s
rook-ceph-osd-1-677c4cd89-b5rq2 1/1 Running 0 3m5s
rook-ceph-osd-2-6665bc998f-9ck2f 1/1 Running 0 3m3s
rook-ceph-osd-3-75d7b47647-7vfm4 1/1 Running 0 2m40s
rook-ceph-osd-prepare-k8snode01.hirebestengineers.com--1-6kbkn 0/1 Completed 0 3m14s
rook-ceph-osd-prepare-k8snode02.hirebestengineers.com--1-5hz49 0/1 Completed 0 3m14s
rook-ceph-osd-prepare-k8snode03.hirebestengineers.com--1-4b45z 0/1 Completed 0 3m14s
rook-ceph-osd-prepare-k8snode04.hirebestengineers.com--1-4q8cs 0/1 Completed 0 3m14s
Each worker node will have a Job to add OSDs into Ceph Cluster:
[[email protected] ceph]# kubectl get -n rook-ceph jobs.batch
NAME COMPLETIONS DURATION AGE
rook-ceph-osd-prepare-k8snode01.hirebestengineers.com 1/1 11s 3m46s
rook-ceph-osd-prepare-k8snode02.hirebestengineers.com 1/1 34s 3m46s
rook-ceph-osd-prepare-k8snode03.hirebestengineers.com 1/1 10s 3m46s
rook-ceph-osd-prepare-k8snode04.hirebestengineers.com 1/1 9s 3m46s
[[email protected] ceph]# kubectl describe jobs.batch rook-ceph-osd-prepare-k8snode01.hirebestengineers.com
Verify that the cluster CR has been created and is active:
[[email protected] ceph]# kubectl -n rook-ceph get cephcluster
NAME DATADIRHOSTPATH MONCOUNT AGE PHASE MESSAGE HEALTH EXTERNAL
rook-ceph /var/lib/rook 3 3m50s Ready Cluster created successfully HEALTH_OK
Step 4: Deploy Rook Ceph toolbox in Kubernetes
The Rook Ceph toolbox is a container with common tools used for Rook debugging and testing. The toolbox is based on CentOS, and any additional tools can be easily installed via yum.
We will start a toolbox pod in interactive mode so we can connect and execute Ceph commands from a shell. Change to the examples directory:
cd ~/
cd rook/deploy/examples
Apply the toolbox.yaml manifest file to create toolbox pod:
[[email protected] ceph]# kubectl apply -f toolbox.yaml
deployment.apps/rook-ceph-tools created
Connect to the pod using kubectl command with exec option:
[[email protected] ~]# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
[[email protected] /]#
Check the Ceph storage cluster status. Pay close attention to the value of cluster.health; it should be HEALTH_OK.
[[email protected] /]# ceph status
cluster:
id: 470b7cde-7355-4550-bdd2-0b79d736b8ac
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 5m)
mgr: a(active, since 4m)
osd: 4 osds: 4 up (since 4m), 4 in (since 5m)
data:
pools: 1 pools, 128 pgs
objects: 0 objects, 0 B
usage: 25 MiB used, 160 GiB / 160 GiB avail
pgs: 128 active+clean
List all OSDs to check their current status. They should exist and be up.
[[email protected] /]# ceph osd status
ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 k8snode04.hirebestengineers.com 6776k 39.9G 0 0 0 0 exists,up
1 k8snode03.hirebestengineers.com 6264k 39.9G 0 0 0 0 exists,up
2 k8snode01.hirebestengineers.com 6836k 39.9G 0 0 0 0 exists,up
3 k8snode02.hirebestengineers.com 6708k 39.9G 0 0 0 0 exists,up
Check raw storage and pools:
[[email protected] /]# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 160 GiB 160 GiB 271 MiB 271 MiB 0.17
TOTAL 160 GiB 160 GiB 271 MiB 271 MiB 0.17
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
device_health_metrics 1 32 0 B 0 0 B 0 51 GiB
replicapool 3 32 35 B 8 24 KiB 0 51 GiB
k8fs-metadata 8 128 91 KiB 24 372 KiB 0 51 GiB
k8fs-data0 9 32 0 B 0 0 B 0 51 GiB
[[email protected] /]# rados df
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR
device_health_metrics 0 B 0 0 0 0 0 0 0 0 B 0 0 B 0 B 0 B
k8fs-data0 0 B 0 0 0 0 0 0 1 1 KiB 2 1 KiB 0 B 0 B
k8fs-metadata 372 KiB 24 0 72 0 0 0 351347 172 MiB 17 26 KiB 0 B 0 B
replicapool 24 KiB 8 0 24 0 0 0 999 6.9 MiB 1270 167 MiB 0 B 0 B
total_objects 32
total_used 271 MiB
total_avail 160 GiB
total_space 160 GiB
Step 5: Working with Ceph Cluster Storage Modes
Rook exposes three types of storage:
- Shared Filesystem: Create a filesystem to be shared across multiple pods (RWX)
- Block: Create block storage to be consumed by a pod (RWO)
- Object: Create an object store that is accessible inside or outside the Kubernetes cluster
All the necessary files for each storage mode are available in the rook/deploy/examples/ directory.
cd ~/
cd rook/deploy/examples
1. Cephfs
CephFS enables a shared filesystem which can be mounted with read/write permission from multiple pods.
Update the filesystem.yaml file by setting the data pool name, replication size, etc.
[[email protected] ceph]# vim filesystem.yaml
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
name: k8sfs
namespace: rook-ceph # namespace:cluster
Once done with the modifications, let the Rook operator create all the pools and other resources necessary to start the service:
[[email protected] ceph]# kubectl create -f filesystem.yaml
cephfilesystem.ceph.rook.io/k8sfs created
Access the Rook toolbox pod and check if the metadata and data pools were created.
[[email protected] ceph]# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
[[email protected] /]# ceph fs ls
name: k8sfs, metadata pool: k8sfs-metadata, data pools: [k8sfs-data0 ]
[[email protected] /]# ceph osd lspools
1 device_health_metrics
3 replicapool
8 k8fs-metadata
9 k8fs-data0
[[email protected] /]# exit
Update the fsName and pool name in the CephFS StorageClass configuration file:
$ vim csi/cephfs/storageclass.yaml
parameters:
clusterID: rook-ceph # namespace:cluster
fsName: k8sfs
pool: k8fs-data0
Create StorageClass using the command:
[[email protected] csi]# kubectl create -f csi/cephfs/storageclass.yaml
storageclass.storage.k8s.io/rook-cephfs created
List available storage classes in your Kubernetes Cluster:
[[email protected] csi]# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
rook-cephfs rook-ceph.cephfs.csi.ceph.com Delete Immediate true 97s
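Optionally, if you want rook-cephfs to be the default StorageClass for PVCs that do not specify one, you can annotate it. This is not required for the rest of this guide:
kubectl patch storageclass rook-cephfs -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'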
Create test PVC and Pod to test usage of Persistent Volume.
[[email protected] csi]# kubectl create -f csi/cephfs/pvc.yaml
persistentvolumeclaim/cephfs-pvc created
[[email protected] ceph]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
cephfs-pvc Bound pvc-fd024cc0-dcc3-4a1d-978b-a166a2f65cdb 1Gi RWO rook-cephfs 4m42s
[[email protected] csi]# kubectl create -f csi/cephfs/pod.yaml
pod/csicephfs-demo-pod created
PVC creation manifest file contents:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: cephfs-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: rook-cephfs
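Once the demo pod is Running, you can confirm that the CephFS volume is actually mounted inside it. The pod name comes from the pod.yaml example created above:
kubectl exec -it csicephfs-demo-pod -- df -h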
Checking PV creation logs as captured by the provisioner pod:
[[email protected] csi]# kubectl logs deploy/csi-cephfsplugin-provisioner -f -c csi-provisioner
[[email protected] ceph]# kubectl get pods | grep csi-cephfsplugin-provision
csi-cephfsplugin-provisioner-b54db7d9b-5dpt6 6/6 Running 0 4m30s
csi-cephfsplugin-provisioner-b54db7d9b-wrbxh 6/6 Running 0 4m30s
If you made an update and the provisioner didn’t pick it up, you can always restart the CephFS provisioner pods:
# Gracefully
$ kubectl delete pod -l app=csi-cephfsplugin-provisioner
# Forcefully
$ kubectl delete pod -l app=csi-cephfsplugin-provisioner --grace-period=0 --force
2. RBD
Block storage allows a single pod to mount storage (RWO mode). Before Rook can provision block storage, a StorageClass and a CephBlockPool need to be created:
[[email protected] ~]# cd
[[email protected] ~]# cd rook/deploy/examples
[[email protected] csi]# kubectl create -f csi/rbd/storageclass.yaml
cephblockpool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created
[[email protected] csi]# kubectl create -f csi/rbd/pvc.yaml
persistentvolumeclaim/rbd-pvc created
List StorageClasses and PVCs:
[[email protected] csi]# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
rook-ceph-block rook-ceph.rbd.csi.ceph.com Delete Immediate true 49s
rook-cephfs rook-ceph.cephfs.csi.ceph.com Delete Immediate true 6h17m
[[email protected] csi]# kubectl get pvc rbd-pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rbd-pvc Bound pvc-c093e6f7-bb4e-48df-84a7-5fa99fe81138 1Gi RWO rook-ceph-block 43s
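Optionally, you can also create the example pod that consumes the block PVC; this is the same example pod referenced in the RBD cleanup step later on:
kubectl create -f csi/rbd/pod.yaml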
Deploying multiple apps
We will create a sample application to consume the block storage provisioned by Rook, using the classic WordPress and MySQL apps. Both of these apps will make use of block volumes provisioned by Rook.
[[email protected] ~]# cd
[[email protected] ~]# cd rook/deploy/examples
[[email protected] kubernetes]# kubectl create -f mysql.yaml
service/wordpress-mysql created
persistentvolumeclaim/mysql-pv-claim created
deployment.apps/wordpress-mysql created
[[email protected] kubernetes]# kubectl create -f wordpress.yaml
service/wordpress created
persistentvolumeclaim/wp-pv-claim created
deployment.apps/wordpress created
Both of these apps create a block volume and mount it to their respective pod. You can see the Kubernetes volume claims by running the following:
[[email protected] kubernetes]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
cephfs-pvc Bound pvc-aa972f9d-ab53-45f6-84c1-35a192339d2e 1Gi RWO rook-cephfs 2m59s
mysql-pv-claim Bound pvc-4f1e541a-1d7c-49b3-93ef-f50e74145057 20Gi RWO rook-ceph-block 10s
rbd-pvc Bound pvc-68e680c1-762e-4435-bbfe-964a4057094a 1Gi RWO rook-ceph-block 47s
wp-pv-claim Bound pvc-fe2239a5-26c0-4ebc-be50-79dc8e33dc6b 20Gi RWO rook-ceph-block 5s
Check deployment of MySQL and WordPress Services:
[[email protected] kubernetes]# kubectl get deploy wordpress wordpress-mysql
NAME READY UP-TO-DATE AVAILABLE AGE
wordpress 1/1 1 1 2m46s
wordpress-mysql 1/1 1 1 3m8s
[[email protected] kubernetes]# kubectl get svc wordpress wordpress-mysql
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
wordpress LoadBalancer 10.98.120.112 <pending> 80:32046/TCP 3m39s
wordpress-mysql ClusterIP None <none> 3306/TCP 4m1s
Retrieve the WordPress NodePort and test the URL using any cluster node IP address and the port.
NodePort=$(kubectl get service wordpress -o jsonpath='{.spec.ports[0].nodePort}')
echo $NodePort
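You can then hit the WordPress service on that port from outside the cluster. The node IP below is a placeholder; replace it with the IP address of any of your cluster nodes:
NodeIP="<node-ip>"   # replace with the IP of one of your nodes
curl -sI http://${NodeIP}:${NodePort}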
Cleanup Storage test PVC and pods
[[email protected] kubernetes]# kubectl delete -f mysql.yaml
service "wordpress-mysql" deleted
persistentvolumeclaim "mysql-pv-claim" deleted
deployment.apps "wordpress-mysql" deleted
[[email protected] kubernetes]# kubectl delete -f wordpress.yaml
service "wordpress" deleted
persistentvolumeclaim "wp-pv-claim" deleted
deployment.apps "wordpress" deleted
# Cephfs cleanup
[[email protected] kubernetes]# kubectl delete -f csi/cephfs/pod.yaml
[[email protected] kubernetes]# kubectl delete -f csi/cephfs/pvc.yaml
# RBD Cleanup
[[email protected] kubernetes]# kubectl delete -f csi/rbd/pod.yaml
[[email protected] kubernetes]# kubectl delete -f csi/rbd/pvc.yaml
Step 6: Accessing Ceph Dashboard
The Ceph dashboard gives you an overview of the status of your Ceph cluster:
- The overall health
- The status of the mon quorum
- The status of the mgr and osds
- Status of other Ceph daemons
- View pools and PG status
- Logs for the daemons, and much more.
List services in rook-ceph namespace:
[[email protected] ceph]# kubectl get svc -n rook-ceph
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
csi-cephfsplugin-metrics ClusterIP 10.105.10.255 <none> 8080/TCP,8081/TCP 9m56s
csi-rbdplugin-metrics ClusterIP 10.96.5.0 <none> 8080/TCP,8081/TCP 9m57s
rook-ceph-mgr ClusterIP 10.103.171.189 <none> 9283/TCP 7m31s
rook-ceph-mgr-dashboard ClusterIP 10.102.140.148 <none> 8443/TCP 7m31s
rook-ceph-mon-a ClusterIP 10.102.120.254 <none> 6789/TCP,3300/TCP 10m
rook-ceph-mon-b ClusterIP 10.97.249.82 <none> 6789/TCP,3300/TCP 8m19s
rook-ceph-mon-c ClusterIP 10.99.131.50 <none> 6789/TCP,3300/TCP 7m46s
From the output we can confirm port 8443 was configured.
Use port forwarding to access the dashboard:
$ kubectl port-forward service/rook-ceph-mgr-dashboard 8443:8443 -n rook-ceph
Forwarding from 127.0.0.1:8443 -> 8443
Forwarding from [::1]:8443 -> 8443
The dashboard should now be accessible at https://localhost:8443.
The login username is admin and the password can be extracted using the following command:
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
Access Dashboard with Node Port
To create a service exposed on a NodePort, save this YAML as dashboard-external-https.yaml:
# cd
# vim dashboard-external-https.yaml
apiVersion: v1
kind: Service
metadata:
name: rook-ceph-mgr-dashboard-external-https
namespace: rook-ceph
labels:
app: rook-ceph-mgr
rook_cluster: rook-ceph
spec:
ports:
- name: dashboard
port: 8443
protocol: TCP
targetPort: 8443
selector:
app: rook-ceph-mgr
rook_cluster: rook-ceph
sessionAffinity: None
type: NodePort
Create a service that listens on Node Port:
[[email protected] ~]# kubectl create -f dashboard-external-https.yaml
service/rook-ceph-mgr-dashboard-external-https created
Check new service created:
[[email protected] ~]# kubectl -n rook-ceph get service rook-ceph-mgr-dashboard-external-https
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-mgr-dashboard-external-https NodePort 10.103.91.41 <none> 8443:32573/TCP 2m43s
In this example, port 32573 will be opened to expose port 8443 from the ceph-mgr pod. Now you can enter the URL in your browser, such as https://[clusternodeip]:32573, and the dashboard will appear.
Log in with the admin username and the password decoded from the rook-ceph-dashboard-password secret.
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
Ceph dashboard view:
Hosts list:
Bonus: Tearing Down the Ceph Cluster
If you want to tear down the cluster and bring up a new one, be aware of the following resources that will need to be cleaned up:
- rook-ceph namespace: The Rook operator and cluster created by operator.yaml and cluster.yaml (the cluster CRD)
- /var/lib/rook: Path on each host in the cluster where configuration is cached by the Ceph mons and osds
- All CRDs in the cluster
[[email protected] ~]# kubectl get crds
NAME CREATED AT
apiservers.operator.tigera.io 2021-09-24T18:09:12Z
bgpconfigurations.crd.projectcalico.org 2021-09-24T18:09:12Z
bgppeers.crd.projectcalico.org 2021-09-24T18:09:12Z
blockaffinities.crd.projectcalico.org 2021-09-24T18:09:12Z
cephclusters.ceph.rook.io 2021-09-30T20:32:10Z
clusterinformations.crd.projectcalico.org 2021-09-24T18:09:12Z
felixconfigurations.crd.projectcalico.org 2021-09-24T18:09:12Z
globalnetworkpolicies.crd.projectcalico.org 2021-09-24T18:09:12Z
globalnetworksets.crd.projectcalico.org 2021-09-24T18:09:12Z
hostendpoints.crd.projectcalico.org 2021-09-24T18:09:12Z
imagesets.operator.tigera.io 2021-09-24T18:09:12Z
installations.operator.tigera.io 2021-09-24T18:09:12Z
ipamblocks.crd.projectcalico.org 2021-09-24T18:09:12Z
ipamconfigs.crd.projectcalico.org 2021-09-24T18:09:12Z
ipamhandles.crd.projectcalico.org 2021-09-24T18:09:12Z
ippools.crd.projectcalico.org 2021-09-24T18:09:12Z
kubecontrollersconfigurations.crd.projectcalico.org 2021-09-24T18:09:12Z
networkpolicies.crd.projectcalico.org 2021-09-24T18:09:12Z
networksets.crd.projectcalico.org 2021-09-24T18:09:12Z
tigerastatuses.operator.tigera.io 2021-09-24T18:09:12Z
Edit the CephCluster resource and add the cleanupPolicy:
kubectl -n rook-ceph patch cephcluster rook-ceph --type merge -p '{"spec":{"cleanupPolicy":{"confirmation":"yes-really-destroy-data"}}}'
Delete block storage and file storage:
cd ~/
cd rook/deploy/examples
kubectl delete -n rook-ceph cephblockpool replicapool
kubectl delete -f csi/rbd/storageclass.yaml
kubectl delete -f filesystem.yaml
kubectl delete -f csi/cephfs/storageclass.yaml
Delete the CephCluster Custom Resource:
[[email protected] ~]# kubectl -n rook-ceph delete cephcluster rook-ceph
cephcluster.ceph.rook.io "rook-ceph" deleted
Verify that the cluster CR has been deleted before continuing to the next step.
kubectl -n rook-ceph get cephcluster
Delete the Operator and related Resources
kubectl delete -f operator.yaml
kubectl delete -f common.yaml
kubectl delete -f crds.yaml
Zapping Devices
# Set the raw disk / raw partition path
DISK="/dev/vdb"
# Zap the disk to a fresh, usable state (zap-all is important, b/c MBR has to be clean)
# Install: yum install gdisk -y Or apt install gdisk
sgdisk --zap-all $DISK
# Clean hdds with dd
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
# Clean disks such as ssd with blkdiscard instead of dd
blkdiscard $DISK
# These steps only have to be run once on each node
# If rook sets up osds using ceph-volume, teardown leaves some devices mapped that lock the disks.
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
# ceph-volume setup can leave ceph-<UUID> directories in /dev and /dev/mapper (unnecessary clutter)
rm -rf /dev/ceph-*
rm -rf /dev/mapper/ceph--*
# Inform the OS of partition table changes
partprobe $DISK
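In addition to zapping the disks, the /var/lib/rook data directory has to be removed on each storage node once the cluster is deleted. A simple way to do this over SSH is sketched below; the node hostnames are from my lab and root SSH access is an assumption:
for node in k8snode0{1..4}.hirebestengineers.com; do
  ssh root@${node} "rm -rf /var/lib/rook"
done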
Removing the Cluster CRD Finalizer:
for CRD in $(kubectl get crd -n rook-ceph | awk '/ceph.rook.io/ {print $1}'); do
kubectl get -n rook-ceph "$CRD" -o name | \
xargs -I {} kubectl patch -n rook-ceph {} --type merge -p '{"metadata":{"finalizers": [null]}}'
done
If the namespace is still stuck in the Terminating state, as seen below:
$ kubectl get ns rook-ceph
NAME STATUS AGE
rook-ceph Terminating 23h
You can check which resources are holding up the deletion and remove the finalizers and delete those resources.
kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n rook-ceph
From my output, the resource holding up deletion is a ConfigMap named rook-ceph-mon-endpoints:
NAME DATA AGE
configmap/rook-ceph-mon-endpoints 4 23h
Delete the resource manually:
# kubectl delete configmap/rook-ceph-mon-endpoints -n rook-ceph
configmap "rook-ceph-mon-endpoints" deleted