Deploy Ceph Cluster with ceph-ansible

Merouane Agar
4 min read · Jan 25, 2021

ceph-ansible is widely deployed, but it is not integrated with the orchestrator APIs introduced in Nautilus and Octopus, so newer management features and dashboard integration are not available.

1. Prepare your ansible environment

  • [ansible-server] : Add the centos-release-ansible repository
[root@ansible ~]# yum install -y centos-release-ansible-29
-----
-----
Installed:
centos-release-ansible-29-1-2.el8.noarch centos-release-configmanagement-1-1.el8.noarch
Complete!
  • [ansible-server] : Install ansible, python3, git and sudo
[root@ansible ~]# yum install -y ansible python3 git sudo
  • [ansible-server] : Create an ansible account and grant it full sudo privileges
[root@ansible ~]# useradd ansible
[root@ansible ~]# echo "ansible ALL=(ALL:ALL) NOPASSWD: ALL" > /etc/sudoers.d/ansible
  • [ansible-server] : Generate an ssh key
[root@ansible ~]# su - ansible
[ansible@ansible ~]$ ssh-keygen -t rsa
  • [ceph-server] : Prepare your ceph server
[root@ceph-server ~]# useradd ansible
[root@ceph-server ~]# echo "ansible ALL=(ALL:ALL) NOPASSWD: ALL" > /etc/sudoers.d/ansible
  • [ansible-server] : Copy ssh key
[ansible@ansible ~]$ ssh-copy-id ceph-server
  • [ansible-server] : Check connectivity and privilege escalation
[ansible@ansible ~]$ sudo bash -c 'cat > /etc/ansible/hosts' << EOF
[mons]
ceph-server
EOF
[ansible@ansible ~]$ ansible mons -m ping
The authenticity of host 'ceph-server (10.9.0.15)' can't be established.
ECDSA key fingerprint is SHA256:lZsTEnWFROpBZSimqPIWxFFR50JOsAM+TzZpVqnpeYo.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
ceph-server | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/libexec/platform-python"
},
"changed": false,
"ping": "pong"
}
[ansible@ansible ~]$ ansible mons -m command -a id
ceph-server | CHANGED | rc=0 >>
uid=1001(ansible) gid=1001(ansible) groups=1001(ansible) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[ansible@ansible ~]$ ansible mons -m command -a id -b
ceph-server | CHANGED | rc=0 >>
uid=0(root) gid=0(root) groups=0(root) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
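
The sudoers rule above is written with a plain echo on both machines. As a minimal sketch, the same rule can be generated by a function, checked with visudo, and installed with the conventional 0440 mode. The names sudoers_line and install_sudoers_rule are hypothetical helpers for illustration, not part of sudo or Ansible, and the install step assumes it runs as root.

```shell
#!/bin/sh
# sudoers_line (hypothetical helper): print the passwordless sudo rule
# used in this article for the account given as $1.
sudoers_line() {
    printf '%s ALL=(ALL:ALL) NOPASSWD: ALL\n' "$1"
}

# install_sudoers_rule (hypothetical helper): write the rule to a temporary
# file, validate its syntax, then install it under /etc/sudoers.d.
# Assumption: run as root. visudo -c -f only checks the file, it edits nothing.
install_sudoers_rule() {
    user=$1
    tmp=$(mktemp) || return 1
    sudoers_line "$user" > "$tmp"
    if visudo -c -f "$tmp" >/dev/null; then
        install -m 0440 -o root -g root "$tmp" "/etc/sudoers.d/$user"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Calling install_sudoers_rule ansible as root on the ansible server and on the ceph server would replace the two echo commands. Validating first avoids installing a file that sudo cannot parse.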

2. Prepare ceph-ansible & configuration

  • [ansible-server] : Clone the ceph-ansible repo and check out the latest stable branch
[ansible@ansible ~]$ git clone https://github.com/ceph/ceph-ansible.git
[ansible@ansible ~]$ cd ceph-ansible/
[ansible@ansible ceph-ansible]$ git branch -r | grep stable- | tail -1
origin/stable-5.0
[ansible@ansible ceph-ansible]$ git checkout --track origin/stable-5.0
  • [ansible-server] : Install the other required Python libraries with pip:
[ansible@ansible ceph-ansible]$ sudo python3 -m pip install -r requirements.txt
  • [ceph-server] : Check the names of the block devices available for Ceph, as well as the IP configuration
[root@ceph-server ~]# lsblk 
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 20G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 19G 0 part
├─cs-root 253:0 0 17G 0 lvm /
└─cs-swap 253:1 0 2G 0 lvm [SWAP]
sdb 8:16 0 32G 0 disk
sdc 8:32 0 32G 0 disk
sdd 8:48 0 32G 0 disk
sr0 11:0 1 9G 0 rom
[root@ceph-server ~]# ip a show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel
link/ether 6e:ae:60:81:63:d6 brd ff:ff:ff:ff:ff:ff
inet 10.9.0.15/24 brd 10.9.0.255 scope global dynamic noprefixroute ens18
valid_lft 7155sec preferred_lft 7155sec
inet6 fe80::d343:89f2:3f30:791d/64 scope link noprefixroute
valid_lft forever preferred_lft forever
  • [ansible-server] : Ceph cluster configuration

For more information, see the group_vars/all.yml.sample file.

[ansible@ansible ceph-ansible]$ cp site.yml.sample site.yml
[ansible@ansible ceph-ansible]$ cat > group_vars/all.yml << EOF
---
dummy:
ceph_origin: repository
ceph_repository: community
ceph_stable_release: octopus
public_network: "10.9.0.0/24"
cluster_network: "{{ public_network }}"
journal_size: 1024
monitor_interface: ens18
dashboard_enabled: False

EOF
  • [ansible-server] : OSD configuration

For more information, see the group_vars/osds.yml.sample file.

[ansible@ansible ceph-ansible]$ cat > group_vars/osds.yml << EOF
---
dummy:
devices:
- /dev/sdb
- /dev/sdc
- /dev/sdd
osd_scenario: "collocated"
EOF
  • [ansible-server] : Build your Ansible inventory
[ansible@ansible ceph-ansible]$ sudo bash -c 'cat > /etc/ansible/hosts' << EOF
[mons]
ceph-server
[mgrs]
ceph-server
[osds]
ceph-server
EOF
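
The public_network value in all.yml is the network that contains the address reported by ip a show on the ceph server (10.9.0.15/24 gives 10.9.0.0/24). As a rough sketch, that derivation can be scripted. The function cidr_network is a hypothetical helper written for this article; it handles IPv4 only and does not validate its input.

```shell
#!/bin/sh
# cidr_network (hypothetical helper): given ADDRESS/PREFIX (IPv4 only),
# print the enclosing network in CIDR notation.
cidr_network() {
    prefix=${1#*/}          # part after the slash, e.g. 24
    ip=${1%/*}              # part before the slash, e.g. 10.9.0.15
    old_ifs=$IFS
    IFS=.
    set -- $ip              # split the dotted quad into $1..$4
    IFS=$old_ifs
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    if [ "$prefix" -eq 0 ]; then
        mask=0
    else
        mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))
    fi
    net=$(( $addr & $mask ))    # clear the host bits
    printf '%d.%d.%d.%d/%d\n' \
        $(( ($net >> 24) & 255 )) $(( ($net >> 16) & 255 )) \
        $(( ($net >> 8) & 255 )) $(( $net & 255 )) "$prefix"
}
```

For example, cidr_network 10.9.0.15/24 prints 10.9.0.0/24, which matches the public_network used in this article.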

3. Deploy Ceph Cluster

[ansible@ansible ceph-ansible]$ ansible-playbook site.yml
...
...
INSTALLER STATUS ********************************************************************
Install Ceph Monitor : Complete (0:00:33)
Install Ceph Manager : Complete (0:00:36)
Install Ceph OSD : Complete (0:00:49)

4. Inspection

[root@ceph-server ~]# ceph -s
cluster:
id: 4c7f2124-53cc-4dfe-9827-ed02ac68fdd2
health: HEALTH_WARN
Reduced data availability: 1 pg inactive
Degraded data redundancy: 1 pg undersized

services:
mon: 1 daemons, quorum ceph-server (age 2d)
mgr: ceph-server(active, since 2d)
osd: 3 osds: 3 up (since 2d), 3 in (since 2d)

data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 93 GiB / 96 GiB avail
pgs: 100.000% pgs not active
1 undersized+peered
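
When the inspection has to be scripted rather than read by eye, the health field can be pulled out of the ceph -s text. The sketch below assumes the plain-text layout shown above. The names health_from_status and is_healthy are hypothetical helpers, not Ceph commands.

```shell
#!/bin/sh
# health_from_status (hypothetical helper): read `ceph -s` text on stdin
# and print the value of the first "health:" line, e.g. HEALTH_WARN.
health_from_status() {
    awk '$1 == "health:" { print $2; exit }'
}

# is_healthy (hypothetical helper): succeed only if the status read from
# stdin reports HEALTH_OK.
is_healthy() {
    [ "$(health_from_status)" = "HEALTH_OK" ]
}
```

On the ceph server this would be used as ceph -s | health_from_status, or as ceph -s | is_healthy in a script that should stop on a warning. The ceph health command prints the same status directly, so the parser is mainly useful when only saved ceph -s output is available.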

5. Purge Cluster

[ansible@ansible ceph-ansible]$ ansible-playbook infrastructure-playbooks/purge-cluster.yml
[WARNING]: Could not match supplied host pattern, ignoring: grafana-server
Are you sure you want to purge the cluster? [no]: yes
