
Reading Notes on the Ceph Cookbook (Second Edition): Working with Ceph Block Device

Working with Ceph Block Device

In this chapter, we will cover the following recipes:

  • Configuring Ceph client
  • Creating Ceph Block Device
  • Mapping Ceph Block Device
  • Resizing Ceph RBD
  • Working with RBD snapshots
  • Working with RBD clones
  • Disaster recovery replication using RBD mirroring
  • Configuring pools for RBD mirroring with one way replication
  • Configuring image mirroring
  • Configuring two-way mirroring
  • Recovering from a disaster!

Introduction

After installing and configuring your Ceph storage cluster, the next task is storage provisioning. Storage provisioning is the process of allocating storage space or capacity to physical or virtual servers, in the form of block, file, or object storage. A typical computer system or server has limited local storage capacity that may not be sufficient for your data storage needs.

Storage solutions such as Ceph provide these servers with virtually unlimited storage capacity, allowing them to store all your data and ensuring that you never run out of space. Using a dedicated storage system instead of local storage gives you much-needed flexibility in terms of scalability, reliability, and performance.

Ceph can provision storage capacity in a unified way, covering block, filesystem, and object storage. The following diagram shows the storage formats supported by Ceph; depending on your use case, you can choose one or more of these storage options:

[Figure: storage formats supported by Ceph - block, file, and object storage]

We will discuss each of these options in detail in this book, but in this chapter, we will focus on Ceph block storage.

The Ceph Block Device, formerly known as the RADOS Block Device (RBD), provides reliable, distributed, high-performance block storage disks to clients. The RADOS Block Device uses the librbd library and stores blocks of data in a sequential, striped fashion across multiple OSDs in the Ceph cluster. RBD is backed by Ceph's RADOS layer, so every block device is spread over multiple Ceph nodes, delivering high performance and excellent reliability. RBD has native support in the Linux kernel, meaning the RBD driver has been well integrated with the Linux kernel for the past few years. In addition to reliability and performance, RBD provides enterprise features such as full and incremental snapshots, thin provisioning, copy-on-write cloning, dynamic resizing, and so on. RBD also supports in-memory caching, which greatly improves its performance.

Industry-leading open source hypervisors such as KVM and Xen provide full support for RBD and leverage its features in their guest virtual machines. Other proprietary hypervisors, such as VMware and Microsoft Hyper-V, are expected to be supported soon; a lot of work is going on in the community to support these hypervisors. The Ceph Block Device provides full support for cloud platforms such as OpenStack and CloudStack, where it has proven to be successful and feature-rich. In OpenStack, you can use the Ceph Block Device with the Cinder (block) and Glance (image) components. Doing so allows you to spin up thousands of virtual machines (VMs) in very little time, taking advantage of the copy-on-write feature of Ceph block storage.

All of these features make RBD an ideal candidate for cloud platforms such as OpenStack and CloudStack. We will now learn how to create a Ceph Block Device and make use of it.

Configuring Ceph client

Any regular Linux host (RHEL- or Debian-based) can act as a Ceph client. The client interacts with the Ceph storage cluster over the network to store or retrieve user data. Ceph RBD support has been added to the Linux mainline kernel, starting with version 2.6.34 and later.

How to do it...

As we have done earlier, we will set up a Ceph client machine using Vagrant and VirtualBox. We will use the same Vagrantfile that we cloned in the last chapter. Vagrant will launch a CentOS 7.3 virtual machine that we will configure as a Ceph client:

  1. From the directory where we cloned the Ceph-Cookbook-Second-Edition GitHub repository, launch the client virtual machine using Vagrant:
         $ vagrant status client-node1
         $ vagrant up client-node1
  2. Log in to client-node1 and update the node:
      $ vagrant ssh client-node1
      $ sudo yum update -y     
The username and password that Vagrant uses to configure virtual machines is vagrant, and the vagrant user has sudo rights. The default password for the root user is vagrant.
  3. Check OS and kernel release (this is optional):
        # cat /etc/centos-release
        # uname -r
  4. Check for RBD support in the kernel:
        # sudo modprobe rbd
  5. Allow the ceph-node1 monitor machine to access client-node1 over SSH. To do this, copy the root SSH keys from ceph-node1 to the client-node1 Vagrant user. Execute the following commands from the ceph-node1 machine until otherwise specified:
        ## Log in to the ceph-node1 machine
        $ vagrant ssh ceph-node1
        $ sudo su -
        # ssh-copy-id vagrant@client-node1

Provide the one-time Vagrant user password, which is vagrant, when prompted for client-node1. Once the root SSH keys have been copied from ceph-node1 to client-node1, you should be able to log in to client-node1 without a password.

  6. Using Ansible, we will create the ceph-client role, which will copy the Ceph configuration file and administration keyring to the client node. On our Ansible administration node, ceph-node1, add a new section [clients] (containing client-node1) to the /etc/ansible/hosts file.
  7. Go to the /etc/ansible/group_vars directory on ceph-node1 and create a copy of clients.yml from the clients.yml.sample:
        # cp clients.yml.sample clients.yml
You can define the client pools and client names with cephx capabilities by updating the clients.yml file: uncomment user_config and set it to true.
  8. Run the Ansible playbook from ceph-node1:
        root@ceph-node1 ceph-ansible # ansible-playbook site.yml
  9. On client-node1, check and validate that the keyring and ceph.conf file were populated into the /etc/ceph directory by Ansible.
  10. On client-node1, you can validate that the Ceph client packages were installed by Ansible.
  11. The client machine will require Ceph keys to access the Ceph cluster. Ceph creates a default user, client.admin, which has full access to the Ceph cluster, and Ansible copies the client.admin key to client nodes. It's not recommended to share client.admin keys with client nodes. A better approach is to create a new Ceph user with separate keys and allow access to specific Ceph pools.
    In our case, we will create a Ceph user, client.rbd, with access to the RBD pool. By default, Ceph Block Devices are created on the RBD pool (an example of the command is sketched after this list):
  12. Add the key to the client-node1 machine for the client.rbd user (see the sketch after this list):
  13. By this step, client-node1 should be ready to act as a Ceph client. Check the cluster status from the client-node1 machine by providing the username and secret key:

        # cat /etc/ceph/ceph.client.rbd.keyring >> /etc/ceph/keyring
        ### Since we are not using the default user client.admin we
        ### need to supply a username that will connect to the Ceph cluster
        # ceph -s --name client.rbd

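A hedged sketch of the commands behind steps 11 and 12, run from ceph-node1 (the capability string follows a common pattern for RBD clients of the Jewel era and may differ from the book's exact flags):

        ## Create the client.rbd user, restricted to the rbd pool
        # ceph auth get-or-create client.rbd mon 'allow r' \
          osd 'allow class-read object_prefix rbd_children, allow rwx pool=rbd'

        ## Push the new user's key to client-node1 (the vagrant user has sudo rights)
        # ceph auth get-or-create client.rbd | ssh vagrant@client-node1 \
          sudo tee /etc/ceph/ceph.client.rbd.keyring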

Creating Ceph Block Device

So far, we have configured a Ceph client. Now, we will demonstrate creating a Ceph Block Device from the client-node1 machine.

How to do it...

  1. Create a RADOS Block Device named rbd1 of size 10240 MB:
        # rbd create rbd1 --size 10240 --name client.rbd
  2. There are multiple options that you can use to list RBD images:
        ## The default pool to store block device images is "rbd";
        ## you can also specify the pool name with the rbd
        ## command using the -p option:
        # rbd ls --name client.rbd
        # rbd ls -p rbd --name client.rbd
        # rbd list --name client.rbd
  3. Check the details of the RBD image:
        # rbd --image rbd1 info --name client.rbd

Mapping Ceph Block Device

Now that we have created a block device on the Ceph cluster, in order to use it, we need to map it to the client machine. To do this, execute the following commands from the client-node1 machine.

How to do it...

  1. Map the block device to the client-node1:
        # rbd map --image rbd1 --name client.rbd
Note that the image mapping fails due to a feature-set mismatch!
  2. With Ceph Jewel, the new default format for RBD images is format 2, and the Ceph Jewel default configuration includes the following default Ceph Block Device features:
    • layering: layering support
    • exclusive-lock: exclusive locking support
    • object-map: object map support (requires exclusive-lock)
    • deep-flatten: snapshot flatten support
    • fast-diff: fast diff calculations (requires object-map)

      Using the krbd (kernel RBD) client on client-node1, we will be unable to map the block device image on CentOS kernel 3.10, as this kernel does not support object-map, deep-flatten, and fast-diff (support was introduced in kernel 4.9). In order to work around this, we will disable the unsupported features; there are several options for doing this:
      • Disable the unsupported features dynamically (this is the option we will be using):
                               # rbd feature disable rbd1 
                                 exclusive-lock object-map 
                                 deep-flatten fast-diff
      • When initially creating the RBD image, utilize the --image-feature layering option with the rbd create command, which will enable only the layering feature:
                               # rbd create rbd1 --size 10240 
                                 --image-feature layering 
                                 --name client.rbd
      • Disable the feature in the Ceph configuration file (1 is the bit value of the layering feature, so only layering will be enabled by default):
                           rbd_default_features = 1
All of these features work with the userspace RBD client, librbd.
  3. Retry mapping the block device with the unsupported features now disabled:
        # rbd map --image rbd1 --name client.rbd
  4. Check the mapped block device:
        # rbd showmapped --name client.rbd
  5. To make use of this block device, we should create a filesystem on it and mount it:
      # fdisk -l /dev/rbd0
      # mkfs.xfs /dev/rbd0
      # mkdir /mnt/ceph-disk1
      # mount /dev/rbd0 /mnt/ceph-disk1
      # df -h /mnt/ceph-disk1
  6. Test the block device by writing data to it:
        # dd if=/dev/zero of=/mnt/ceph-disk1/file1 count=100 bs=1M
  7. To map the block device across reboots, we will need to create and configure a systemd service file:
    1. Create a new file in the /usr/local/bin directory for mounting and unmounting and include the following (a sketch of this script is shown after this list):
                    # cd /usr/local/bin
                    # vim rbd-mount
    2. Save the file and make it executable:
              # sudo chmod +x rbd-mount

This can be done automatically by fetching the rbd-mount script from the Ceph-Cookbook-Second-Edition repository and making it executable:

              # wget https://raw.githubusercontent.com/PacktPublishing/
                Ceph-Cookbook-Second-Edition/master/
                rbdmap -O /usr/local/bin/rbd-mount
              # chmod +x /usr/local/bin/rbd-mount
    3. Go to the systemd directory and create the service file, including the following in the file rbd-mount.service (a sketch of this unit file is also shown after this list):
               # cd /etc/systemd/system/
               # vim rbd-mount.service

This can be done automatically by fetching the service file from the Ceph-Cookbook-Second-Edition repository:

              # wget https://raw.githubusercontent.com/PacktPublishing/
                Ceph-Cookbook-Second-Edition/master/rbd-mount.service
    4. After saving the file and exiting Vim, reload the systemd files and enable the rbd-mount.service to start at boot time:
              # systemctl daemon-reload
              # systemctl enable rbd-mount.service
  8. Reboot client-node1 and verify that block device rbd0 is mounted to /mnt/ceph-disk1 after the reboot:
         root@client-node1 # reboot -f
         # df -h
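A rough sketch of what the rbd-mount script from step 7 might contain, using the pool (rbd), image (rbd1), user (client.rbd), and mount point (/mnt/ceph-disk1) from this chapter; the script in the Ceph-Cookbook-Second-Edition repository may differ in detail:

        #!/bin/bash
        # rbd-mount: map and mount the RBD image ("m"), or unmount and unmap it ("u")
        poolname=rbd
        rbdimage=rbd1
        mountpoint=/mnt/ceph-disk1

        if [ "$1" == "m" ]; then
            modprobe rbd
            rbd map $rbdimage --pool $poolname --name client.rbd
            mkdir -p $mountpoint
            mount /dev/rbd/$poolname/$rbdimage $mountpoint
        elif [ "$1" == "u" ]; then
            umount $mountpoint
            rbd unmap /dev/rbd/$poolname/$rbdimage
        fi

A matching rbd-mount.service unit could be written along these lines (again, an illustrative sketch rather than the repository's exact service file):

        # cat > /etc/systemd/system/rbd-mount.service << EOF
        [Unit]
        Description=Map and mount the rbd/rbd1 block device
        After=network-online.target
        Wants=network-online.target

        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=/usr/local/bin/rbd-mount m
        ExecStop=/usr/local/bin/rbd-mount u

        [Install]
        WantedBy=multi-user.target
        EOF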

Resizing Ceph RBD

Ceph supports thin-provisioned block devices, which means that physical storage space is not occupied until you begin storing data on the block device. Ceph Block Devices are very flexible; you can increase or decrease the size of an RBD on the fly from the Ceph storage end. However, the underlying filesystem must support resizing. Advanced filesystems such as XFS, Btrfs, EXT, and ZFS support filesystem resizing to a certain extent. Please follow the filesystem-specific documentation to learn more about resizing.

XFS does not currently support shrinking; Btrfs and ext4 support shrinking, but it should be done with caution!
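As a side note (not part of the recipe that follows, which uses XFS), if the RBD image had been formatted with ext4 instead, the grow step after an rbd resize would typically be resize2fs rather than xfs_growfs:

        # rbd resize --image rbd1 --size 20480 --name client.rbd
        # resize2fs /dev/rbd0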

How to do it...

To increase or decrease the size of a Ceph RBD image, use the --size <New_Size_in_MB> option with the rbd resize command; this will set the new size for the RBD image:

  1. The original size of the RBD image that we created earlier was 10 GB. We will now increase its size to 20 GB:
        # rbd resize --image rbd1 --size 20480 --name client.rbd
        # rbd info --image rbd1 --name client.rbd
  2. Grow the filesystem so that we can make use of the increased storage space. It's worth knowing that filesystem resizing is a feature of the OS as well as of the device filesystem. You should read the filesystem documentation before resizing any partition. The XFS filesystem supports online resizing. Check the system messages to see the filesystem size change (you will notice that df -h still shows the original 10G size even though we resized the image, as the filesystem still sees the original size):
         # df -h
         # lsblk
         # dmesg | grep -i capacity
         # xfs_growfs -d /mnt/ceph-disk1

Working with RBD snapshots

Ceph extends full support for snapshots, which are point-in-time, read-only copies of an RBD image. You can preserve the state of a Ceph RBD image by creating snapshots and restoring a snapshot to get the original data back.

If you take a snapshot of an RBD image while I/O is in progress on the image, the snapshot might be inconsistent. If this happens, you will need to clone the snapshot to a new image to make it mountable. When taking snapshots, it is recommended to stop I/O from the application to the image before taking the snapshot. This can be done by customizing the application to issue a freeze before the snapshot, or it can be done manually with the fsfreeze command (the man page for fsfreeze explains this command in further detail).
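As a minimal sketch of that manual approach (the snapshot name here is only an example, not one used later in the recipe), the filesystem from the earlier recipes could be quiesced around the snapshot like this:

        # fsfreeze --freeze /mnt/ceph-disk1
        # rbd snap create rbd/rbd1@consistent_snap --name client.rbd
        # fsfreeze --unfreeze /mnt/ceph-disk1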

How to do it...

Let's see how snapshots work with Ceph:

  1. To test the snapshot functionality of Ceph, let's create a file on the block device that we created earlier:

         # echo "Hello Ceph This is snapshot test" 
           > /mnt/ceph-disk1/snapshot_test_file
  2. Create a snapshot for the Ceph Block Device. Syntax for the same is as follows:
        # rbd snap create <pool name>/<image name>@<snap name>
        # rbd snap create rbd/rbd1@snapshot1
  3. To list the snapshots of an image, use the following syntax:
        # rbd snap ls <pool name>/<image name>
        # rbd snap ls rbd/rbd1
  4. To test the snapshot restore functionality of Ceph RBD, let's delete files from the filesystem:
        # rm -f /mnt/ceph-disk1/*
  5. We will now restore the Ceph RBD snapshot to get back the files that we deleted in the last step. Please note that a rollback operation will overwrite the current version of the RBD image and its data with the snapshot version. You should perform this operation carefully. The syntax is as follows:
        # rbd snap rollback <pool-name>/<image-name>@<snap-name>
        # umount /mnt/ceph-disk1
        # rbd snap rollback rbd/rbd1@snapshot1 --name client.rbd
The filesystem is unmounted before the rollback operation so that the refreshed filesystem state can be verified after the rollback.
  6. Once the snapshot rollback operation is completed, remount the Ceph RBD filesystem to refresh the filesystem state. You should be able to get your deleted files back:
         # mount /dev/rbd0 /mnt/ceph-disk1
         # ls -l /mnt/ceph-disk1
  7. You are also able to rename snapshots if you so choose. The syntax is as follows:
        # rbd snap rename <pool-name>/<image-name>@<original-snapshot-name> <pool-name>/<image-name>@<new-snapshot-name>
        # rbd snap rename rbd/rbd1@snapshot1 rbd/rbd1@snapshot1_new
  8. When you no longer need snapshots, you can remove a specific snapshot using the following syntax. Deleting the snapshot will not delete your current data on the Ceph RBD image:
        # rbd snap rm <pool-name>/<image-name>@<snap-name>
        # rbd snap rm rbd/rbd1@snapshot1 --name client.rbd
  9. If you have multiple snapshots of an RBD image and you wish to delete all the snapshots with a single command, then use the purge subcommand. The syntax is as follows:
         # rbd snap purge <pool-name>/<image-name>
         # rbd snap purge rbd/rbd1 --name client.rbd

Working with RBD clones

Ceph supports a very nice feature for creating Copy-On-Write (COW) clones from RBD snapshots. This is also known as snapshot layering in Ceph. Layering allows clients to create multiple instant clones of a Ceph RBD. This feature is extremely useful for cloud and virtualization platforms such as OpenStack, CloudStack, and Qemu/KVM. These platforms usually protect Ceph RBD images containing OS/VM images in the form of a snapshot. Later, this snapshot is cloned multiple times to spawn new virtual machines/instances. Snapshots are read-only, but COW clones are fully writable; this feature of Ceph provides a greater level of flexibility and is extremely useful in cloud platforms. In later chapters, we will discover more about COW clones for spawning OpenStack instances:

[Figure: RBD snapshot layering and COW clones]

Every cloned image (child image) stores a reference to its parent snapshot in order to read the image data. Because of this, the parent snapshot should be protected before it is used for cloning. When data is written to a COW-cloned image, it stores new data references to itself. COW-cloned images are as good as RBDs. They are quite flexible, like RBDs, which means that they are writable and resizable, and they support snapshots and further cloning.

In Ceph RBD, images are of two types: format-1 and format-2. The RBD snapshot feature is available on both format-1 and format-2 RBD images. However, the layering feature (the COW cloning feature) is available only for format-2 RBD images. The default RBD image format in Jewel is format-2.
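If you want to confirm the format and enabled features of an existing image before attempting to clone it, the rbd info output from the earlier recipes can simply be filtered:

        # rbd info --image rbd1 --name client.rbd | grep -E 'format|features'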

How to do it...

To demonstrate RBD cloning, we will intentionally create an RBD image (specifying the layering feature), then create and protect its snapshot, and finally create a COW clone from it:

  1. Create an RBD image with the layering feature specified and check its details:
        # rbd create rbd2 --size 10240 
          --image-feature layering --name client.rbd
        # rbd info --image rbd2 --name client.rbd
  2. Create a snapshot of this RBD image:
        # rbd snap create rbd/rbd2@snapshot_for_cloning 
          --name client.rbd
  3. To create a COW clone, protect the snapshot. This is an important step; we should protect the snapshot because if the snapshot gets deleted, all the attached COW clones will be destroyed:
        # rbd snap protect rbd/rbd2@snapshot_for_cloning 
          --name client.rbd
  4. Next, we will create a cloned RBD image, specifying the layering feature, using this snapshot. The syntax is as follows:
         # rbd clone <pool-name>/<parent-image-name>@<snap-name> 
           <pool-name>/<child_image-name> 
           --image-feature <feature-name>
         # rbd clone rbd/rbd2@snapshot_for_cloning rbd/clone_rbd2 
           --image-feature layering --name client.rbd
  5. Creating a clone is a quick process. Once it's completed, check the new image information. You will notice that its parent pool, image, and snapshot information will be displayed:
        # rbd info rbd/clone_rbd2 --name client.rbd

Clients do not always provide equivalent functionality; for example, the fuse client supports client-enforced quotas, while the kernel client does not.
  6. You also have the ability to list children of a snapshot. To list the children of a snapshot execute the following:
        # rbd children rbd/rbd2@snapshot_for_cloning

We now have a cloned RBD image that is dependent on its parent snapshot. To split this cloned image from its parent snapshot, we need to flatten the image, which requires copying all the data from the parent snapshot to the clone. Flattening may take a while to complete, depending on the size of the parent snapshot image. Once a cloned image is flattened, there is no longer any relationship between the parent snapshot and the RBD clone. Note that a flattened image will contain all the information from the snapshot and will use more space than a clone.

  7. To initiate the flattening process, use the following:
        # rbd flatten rbd/clone_rbd2 --name client.rbd
        # rbd info --image clone_rbd2 --name client.rbd

Once the flattening process is completed, if you check the image information, you will notice that the parent image/snapshot name is no longer present and the clone is independent.


If the deep-flatten feature is enabled on an image, the image clone is dissociated from its parent by default.

  8. You can also remove the parent image snapshot if you no longer require it. Before removing the snapshot, you first have to unprotect it:
        # rbd snap unprotect rbd/rbd2@snapshot_for_cloning 
          --name client.rbd
  9. Once the snapshot is unprotected, you can remove it:
         # rbd snap rm rbd/rbd2@snapshot_for_cloning --name client.rbd 

Disaster recovery replication using RBD mirroring

RBD mirroring is the asynchronous replication of RBD images between multiple Ceph clusters. RBD mirroring validates a point-in-time consistent replica of all changes to an RBD image, including snapshots, clones, read and write IOPS, and block device resizing. RBD mirroring can run in an active+active setup or an active+passive setup. RBD mirroring uses the RBD journaling and exclusive-lock features, which enable the RBD image to record all changes to the image in the order in which they occur. These features validate that a crash-consistent copy of the remote image is available locally. The journaling feature must be enabled on the RBD image before mirroring can be enabled on a Ceph cluster.

The rbd-mirror daemon is responsible for ensuring point-in-time consistency from one Ceph cluster to the other. Depending on the type of replication you choose, the rbd-mirror daemon runs either on a single cluster or on each cluster participating in mirroring:

  • One-way replication: Data is mirrored from the primary site to the secondary site. The rbd-mirror daemon runs only on the secondary site.
  • Two-way replication: An active-active configuration. Data is mirrored from the primary site to the secondary site, and from the secondary site back to the primary site. The rbd-mirror daemon runs on both the primary and secondary sites.

Since we will be storing RBD images in both a local and a remote pool, it is a best practice to ensure that the CRUSH hierarchies backing the mirrored pools use media of the same speed, size, and type. Also, appropriate bandwidth should be allocated between the sites to handle the mirroring traffic.

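The recipes below create new images with the required features at creation time. For an existing image, exclusive-lock and journaling can also be enabled after the fact; a hedged example (the pool and image names are the ones used later in this chapter) would be:

        # rbd feature enable data/image-1 exclusive-lock
        # rbd feature enable data/image-1 journaling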

How to do it...

In this recipe, we will configure a second Ceph cluster using the Ceph nodes ceph-node5, ceph-node6, and ceph-node7. Chapter 1, Ceph – Introduction and Beyond, can be referenced for setting up your second Ceph cluster with Ansible, with nodes 5, 6, and 7 taking the place of nodes 1, 2, and 3 in that recipe; some highlights and changes that must be made before running the playbook on the secondary cluster are as follows:

  1. Your /etc/ansible/hosts file from each of your Ansible configuration nodes (ceph-node1 and ceph-node5) should look as follows:
      #Primary site (ceph-node1):
       [mons]
       ceph-node1
       ceph-node2
       ceph-node3
       [osds]
       ceph-node1
       ceph-node2
       ceph-node3
       #Secondary site (ceph-node5):
       [mons]
        ceph-node5
        ceph-node6
        ceph-node7
        [osds]
        ceph-node5
        ceph-node6
        ceph-node7
  2. Your cluster will require a distinct name; the default cluster name is ceph. Since our primary cluster is named ceph, our secondary cluster must be named something different. For this recipe, we will name the secondary cluster backup. We will need to edit the all.yml file on ceph-node5 to reflect this change prior to deploying, by uncommenting the cluster variable and setting it to backup:
        root@ceph-node5 group_vars # vim all.yml

It is possible to mirror RBD images between two clusters that have the same name; this requires changing the name of one of the clusters to something other than ceph in the /etc/sysconfig/ceph file and then creating a symlink to the ceph.conf file.

  3. Run Ansible to install the second Ceph cluster with the distinct name of backup:
        root@ceph-node5 ceph-ansible # ansible-playbook site.yml
  4. When the playbook completes, set the Ceph environment variable to use the cluster name backup:
        # export CEPH_ARGS="--cluster backup"
  5. In each of the clusters, create a pool called data; this pool will be mirrored between the sites:
        root@ceph-node1 # ceph osd pool create data 64
        root@ceph-node5 # ceph osd pool create data 64
  6. Create the user client.local on the ceph cluster and give it rwx access to the data pool:
        root@ceph-node1 # ceph auth get-or-create client.local 
        mon 'allow r' osd 'allow class-read object_prefix rbd_children, 
        allow rwx pool=data' -o /etc/ceph/ceph.client.local.keyring 
        --cluster ceph
  7. Create the user client.remote on the backup cluster and give it rwx access to the data pool:
        root@ceph-node5 # ceph auth get-or-create client.remote 
        mon 'allow r' osd 'allow class-read object_prefix rbd_children, 
        allow rwx pool=data' -o /etc/ceph/backup.client.remote.keyring 
        --cluster backup
  8. Copy the Ceph configuration file from each of the clusters into the /etc/ceph directory of the corresponding peer cluster:
        root@ceph-node1 # scp /etc/ceph/ceph.conf 
        root@ceph-node5:/etc/ceph/ceph.conf
        root@ceph-node5 # scp /etc/ceph/backup.conf
        root@ceph-node1:/etc/ceph/backup.conf
  9. Copy the keyrings for the users client.local and client.remote from each of the clusters into the /etc/ceph directory of the corresponding peer cluster:

       root@ceph-node1 # scp /etc/ceph/ceph.client.local.keyring 
       root@ceph-node5:/etc/ceph/ceph.client.local.keyring
       root@ceph-node5 # scp /etc/ceph/backup.client.remote.keyring 
       root@ceph-node1:/etc/ceph/backup.client.remote.keyring

We now have two Ceph clusters, a client.local and a client.remote user, copies of their peer's ceph.conf file in the /etc/ceph directory, and the keyring for the corresponding user on each peer cluster. In the next recipe, we will configure mirroring on the data pool.

Configuring pools for RBD mirroring with one way replication

RBD mirroring is configured by enabling it on a per-pool basis on the primary and secondary Ceph clusters. Depending on the level of data you choose to mirror, there are two modes that can be configured for RBD mirroring. Note that the RBD mirroring configuration that is enabled must be the same for each pool on the primary and secondary clusters:

  • Pool mode: Any image in a pool with journaling enabled is mirrored to the secondary cluster.
  • Image mode: Only specifically chosen images with mirroring enabled will be mirrored to the secondary cluster. This requires the image to have mirroring enabled for it.

How to do it...

Before any data can be mirrored from the ceph cluster to the backup cluster, we first need to install the rbd-mirror daemon on the backup cluster, enable mirroring on the data pool, and then add the peer cluster to the pool:

  1. On ceph-node5 in the backup cluster, install and configure the rbd-mirror daemon. The client ID remote (our user) is what the rbd-mirror daemon will use:
        root@ceph-node5 # yum install -y rbd-mirror
                        # systemctl enable ceph-rbd-mirror.target
                        # systemctl enable ceph-rbd-mirror@remote
                        # systemctl start ceph-rbd-mirror@remote
  2. Enable mirroring of the whole pool named data in the cluster ceph. The syntax is as follows:
                        # rbd mirror pool enable <pool> <mode>
        root@ceph-node1 # rbd mirror pool enable data pool
  3. Enable mirroring of the whole pool named data in the cluster backup:
        root@ceph-node5 # rbd mirror pool enable data pool
  4. For the rbd-mirror daemon to discover its peer cluster, we now must register the peer to the pool. We will need to add the ceph cluster as a peer to the backup cluster. The syntax is as follows:
                       # rbd mirror pool peer add <pool> 
                         <client-name@cluster-name>
       root@ceph-node5 # rbd mirror pool peer add 
                         data client.local@ceph
  5. Next, we will validate the peer relationship between the pools and the cluster. The syntax is as follows:
                        # rbd mirror pool info <pool>
        root@ceph-node5 # rbd mirror pool info data

Mirroring is now enabled at the pool level for the data pool in the ceph and backup clusters, and a pool peer is configured for the data pool in the backup cluster.

  6. Review the data pool in each cluster and see that there are currently no RBD images at either site. Once verified, we will create three new RBD images in the data pool with exclusive-lock and journaling enabled and watch them sync to the secondary backup cluster:
        # rbd ls data
        # rbd create image-1 --size 1024 --pool data 
          --image-feature exclusive-lock,journaling
    1. Pool mirrors can be polled for the status of the images as they sync to the backup cluster:
               # rbd mirror pool status data
    2. View the remote site's data pool before and after the images are created.
    3. View the image sync status on the remote site.
    4. View the healthy state of the journal replaying the image on the remote site.

Replaying is the state we want to see for the image, as it means the rbd-mirror daemon sees the synced image and is replaying the journal for any changes made to the image that need syncing.

  7. We will now delete the three images from the ceph cluster and watch as they are removed from the backup cluster:
        # rbd rm -p data image-<num>
  8. View the pool status to validate that the images are removed from the remote site:
        # rbd mirror pool status data

If at any point you choose to disable mirroring on a pool, this can be done with the rbd mirror pool peer remove and rbd mirror pool disable commands for the chosen pool. Note that when you disable mirroring on a pool, you also disable mirroring for any images in the pool.

The following is the syntax for removing mirroring:

    1. To remove the peer:
              # rbd mirror pool peer remove <pool-name> <peer-uuid>
    2. To remove mirroring on the pool:
              # rbd mirror pool disable <pool>

Configuring image mirroring

Image mirroring is used when you choose to mirror only a specific subset of images rather than the whole pool. In the next recipe, we will enable mirroring on a single image in the data pool, and not on the other two images in the pool. This recipe requires that you have completed step 1 to step 9 of the Disaster recovery replication using RBD mirroring recipe and have the rbd-mirror daemon running on the backup site.

How to do it...

  1. Create three images in the ceph cluster as we did in the previous recipe:
       # rbd create image-1 --size 1024 --pool data 
         --image-feature exclusive-lock,journaling
  2. Enable image mirroring on the data pool on the ceph and backup clusters:
        # rbd mirror pool enable data image
  3. Add ceph cluster as a peer to backup cluster:
         root@ceph-node5 # rbd mirror pool peer add 
                           data client.local@ceph
  4. Validate that the peer is successfully added:
        # rbd mirror pool info data
  5. In the ceph cluster, enable image mirroring on image-1; image-2 and image-3 will not be mirrored:
        root@ceph-node1 # rbd mirror image enable data/image-1
  6. Check the mirror status in the backup cluster to verify that a single image is being mirrored:
          # rbd mirror pool status data
  7. Check the image status in the backup cluster to validate the status of this image and that the image being mirrored is image-1:
         # rbd mirror image status data/image-1
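If you later decide that an image should no longer be mirrored, image mirroring can be disabled on the primary cluster in much the same way it was enabled; this is an aside rather than one of the book's steps:

        root@ceph-node1 # rbd mirror image disable data/image-1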

Configuring two-way mirroring

Two-way mirroring requires an rbd-mirror daemon running on both clusters (the primary and the secondary). With two-way mirroring, data or images are mirrored from the primary site to the secondary site, and the secondary site can mirror data or images back to the primary site.

We will not demonstrate this configuration in this book, as it is very similar to the one-way configuration, but we will highlight the changes required for two-way replication at the pool level; these steps are covered in the one-way replication recipe.

How to do it...

  1. Both clusters must have rbd-mirror installed and running:
        # yum install rbd-mirror
        # systemctl enable ceph-rbd-mirror.target
        # systemctl enable ceph-rbd-mirror@<client-id>
        # systemctl start ceph-rbd-mirror@<client-id>
  2. As with one-way mirroring, both clients must have copies of the respective cluster configuration files and keyrings for client users for the mirrored pools.
  3. The pools to be replicated must have mirroring enabled at the pool or image level in both clusters:
        # rbd mirror pool enable <pool> <replication type>
        # rbd mirror pool enable <pool> <replication type>
  4. Validate that mirroring has been successfully enabled:
        # rbd mirror pool status data
  5. The pools to be replicated must have a peer registered for mirroring in both clusters:
       # rbd mirror pool peer add <pool> 
         client.<user>@<primary cluster name>
       # rbd mirror pool peer add <pool> 
         client.<user>@<secondary cluster name>
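To make those placeholders concrete with the cluster and user names used earlier in this chapter (ceph/backup and client.local/client.remote), the additional pieces for two-way mirroring of the data pool might look roughly like this; treat it as a sketch, not a verified configuration:

        ## On the ceph cluster (ceph-node1), run an rbd-mirror daemon as well
        root@ceph-node1 # yum install -y rbd-mirror
        root@ceph-node1 # systemctl enable ceph-rbd-mirror.target
        root@ceph-node1 # systemctl enable ceph-rbd-mirror@local
        root@ceph-node1 # systemctl start ceph-rbd-mirror@local

        ## Register each cluster as a peer of the other for the data pool
        root@ceph-node1 # rbd mirror pool peer add data client.remote@backup
        root@ceph-node5 # rbd mirror pool peer add data client.local@ceph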


Recovering from a disaster!

The following recipe will show how to fail over to the mirrored data on the backup cluster after the primary cluster ceph suffers a disaster, and how to fail back once the ceph cluster has recovered. There are two methods of failover when dealing with a disaster:

  • Orderly: Failover after an orderly shutdown. This would be a proper shutdown of the cluster and demotion and promotion of the image.
  • Non-orderly: Failover after a non-orderly shutdown. This would be a complete loss of the primary cluster. In this case, the failback would require resynchronizing the image.

How to do it...

  1. How to properly fail over after an orderly shutdown:
    • Stop all clients that are writing to the primary image
    • Demote the primary image located on the ceph cluster:
               # rbd mirror image demote data/image-1
    • Promote the non-primary image located on the backup cluster:
                # rbd mirror image promote data/image-1
    • Validate image has become primary on the backup cluster:
                # rbd mirror image status data/image-1
    • Resume client access to the image:
  2. How to properly fail over after a non-orderly shutdown:
    • Validate that the primary cluster is in a down state
    • Stop all client access to the ceph cluster that accesses the primary image
    • Promote the non-primary image using the --force option on the backup cluster, as the demotion cannot be propagated to the down ceph cluster:
                 # rbd mirror image promote data/image-1 --force
    • Resume client access to the peer image
  3. How to fail back from a disaster:
    • If there was a non-orderly shutdown on the ceph cluster then demote the old primary image on the ceph cluster once it returns:
                 # rbd mirror image demote data/image-1
    • Resynchronize the image only if there was a non-orderly shutdown:
                # rbd mirror image resync data/image-1
    • Validate that the re-synchronization has completed and image is in up+replaying state:
                # rbd mirror image status data/image-1
    • Demote the secondary image on the backup cluster:
                 # rbd mirror image demote data/image-1
    • Promote the formerly primary image on the ceph cluster:
                  # rbd mirror image promote data/image-1