
Reading notes on Elasticsearch 7.0 Cookbook, Fourth Edition: Backups and Restoring Data

Backups and Restoring Data

Elasticsearch is often used as a data store for logs and other kinds of data. So, if you store valuable data, then you will also need tools to back up and restore that data to support disaster recovery.

In the earlier versions of Elasticsearch, the only viable solution was to dump your data with a complete scan and then reindex it. As Elasticsearch has matured into a complete product, it now supports native functionality for backing up and restoring data.

In this chapter, we will explore how to configure shared storage using the Network File System (NFS) for storing backups, and how to execute and restore a backup.

In the last recipe of this chapter, we will demonstrate how to use the reindex functionality to clone data between different Elasticsearch clusters. This approach is very useful if the standard backup or restore functionality cannot be used, for example when migrating from an old Elasticsearch version to a newer one.

In this chapter, we will cover the following recipes:

  • Managing repositories
  • Executing a snapshot
  • Restoring a snapshot
  • Setting up an NFS share for backups
  • Reindexing from a remote cluster

Managing repositories

Elasticsearch provides a built-in system to quickly back up and restore your data. When working with live data, keeping a backup is complicated by the large number of concurrency problems involved.

An Elasticsearch snapshot allows you to create snapshots of individual indices (or aliases), or of an entire cluster, in a remote repository.

Before starting to execute a snapshot, a repository must be created - this is where your backups or snapshots will be stored.

Getting ready

You will need an up-and-running Elasticsearch installation - similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute the commands, any HTTP client can be used, such as curl (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use the Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

We need to edit config/elasticsearch.yml and add the directory for the backup repository - path.repo: /backup/

Generally, in a production cluster, the /backup directory should be a shared repository.

If you are using the Docker Compose setup provided in Chapter 1, Getting Started, then everything is already configured for you.

How to do it...

To manage repositories, we will perform the following steps:

  1. To create a repository called my_repository, the HTTP method that we use is PUT, and the command will be as follows:
PUT /_snapshot/my_repository
{
  "type": "fs",
  "settings": {
    "location": "/backup/my_repository",
    "compress": true
  }
}

The result will be as follows:

{"acknowledged":true}

If you check your filesystem, the /backup/my_repository directory will have been created.

  2. To retrieve repository information, the HTTP method that we use is GET, and the command is as follows:
GET /_snapshot/my_repository

The result will be as follows:

{
  "my_repository" : {
    "type" : "fs",
    "settings" : {
      "compress" : "true",
      "location" : "/backup/my_repository"
    }
  }
}
  3. To delete a repository, the HTTP method that we use is DELETE, and the command is as follows:
DELETE /_snapshot/my_repository

The result will be as follows:

{
  "acknowledged" : true
}

How it works...

Before snapshotting our data, we must create a repository - that is, the place where we will store our backup data. The parameters that can be used to create a repository are as follows:

  • type: This is used to define the type of shared filesystem repository (it is generally fs).
  • settings: These are options that we can use to set up the shared filesystem repository.

In the case of the fs type, the settings are as follows:

  • location: This is the location on the filesystem for storing snapshots.
  • compress: This turns on the compression for the snapshot files. Compression is applied only to metadata files (that is, index mapping and settings); data files are not compressed (the default is true). 
  • chunk_size: This defines the size of the chunks of files during snapshotting. The chunk size can be specified in bytes or by using size value notation (that is, 1g, 10m, or 5k; the default is disabled).
  • max_restore_bytes_per_sec: This controls the throttle per node restore rate (the default is 20mb).
  • max_snapshot_bytes_per_sec: This controls the throttle per node snapshot rate (the default is 20mb).
  • readonly: This flag defines the repository as read-only (the default is false).

It is possible to return all the defined repositories by executing GET without providing a repository name:

GET /_snapshot

The default values for max_restore_bytes_per_sec and max_snapshot_bytes_per_sec are too low for production environments. Production systems usually use SSDs or more efficient storage, so it's better to tune these values to match your real network and storage performance.
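The size value notation used by chunk_size and the two throttle settings can be illustrated with a small helper; this is a simplified parser written for illustration only (Elasticsearch has its own, stricter parser):

```python
def parse_size(value):
    """Parse size notation such as '1g', '10m', '5k', or '20mb' into bytes.

    Simplified sketch: accepts an optional trailing 'b' and units up to
    terabytes; Elasticsearch's real parser validates input more strictly.
    """
    units = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}
    value = value.strip().lower().rstrip("b")
    number, unit = value[:-1], value[-1]
    if unit.isdigit():          # a plain byte count, for example "1024"
        return int(value)
    return int(float(number) * units[unit])

print(parse_size("20mb"))  # → 20971520, the default throttle rate in bytes
```

This makes it easy to see, for example, that the default 20mb throttle corresponds to roughly 20 MiB per second per node.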

There's more...

The most common type of repository backend is the filesystem, but there are also other official repository backends, available as plugins, such as repository-s3, repository-azure, repository-gcs, and repository-hdfs.

As soon as a repository is created, it is verified on all the data nodes to make sure that it is functional.

Elasticsearch also provides a manual way to verify the state of a repository on the nodes, which is very useful for checking the status of cloud-backed repository storage. The command to manually verify a repository is as follows:

POST /_snapshot/my_repository/_verify

See also

Executing a snapshot

In the previous recipe, we defined a repository - that is, the place where we will store our backups. Now we can create snapshots of indices (full backups of an index) at the exact moment that the command is invoked.

For every repository, it's possible to define multiple snapshots.

Getting ready

You will need an up-and-running Elasticsearch installation - similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute the commands, any HTTP client can be used, such as curl (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use the Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

To execute the following commands correctly, the repository created in the previous recipe is required.

How to do it...

To manage snapshots, we will perform the following steps:

  1. To create a snapshot called snap_1 of the index*, mybooks*, and mygeo* indices, the HTTP method that we use is PUT, and the command is as follows:
PUT /_snapshot/my_repository/snap_1?wait_for_completion=true
{
  "indices": "index*,mybooks*,mygeo*",
  "ignore_unavailable": "true",
  "include_global_state": false
}

The result will be as follows:

{
  "snapshot" : {
    "snapshot" : "snap_1",
    "uuid" : "9-LLrAHAT_KmmxLTmtF38w",
    "version_id" : 7000099,
    "version" : "7.0.0",
    "indices" : [
      "mybooks-join",
      "mybooks",
      "index-agg",
      "mygeo-index"
    ],
    "include_global_state" : false,
    "state" : "SUCCESS",
    "start_time" : "2019-01-06T13:00:24.328Z",
    "start_time_in_millis" : 1546779624328,
    "end_time" : "2019-01-06T13:00:24.441Z",
    "end_time_in_millis" : 1546779624441,
    "duration_in_millis" : 113,
    "failures" : [ ],
    "shards" : {
      "total" : 4,
      "failed" : 0,
      "successful" : 4
    }
  }
}
  2. If you check your filesystem, then the /backup/my_repository directory will be populated with a number of files, such as index (a directory that contains our data), metadata-snap_1, and snapshot-snap_1.
  3. To retrieve the snapshot information, the HTTP method that we use is GET, and the command is as follows:
GET /_snapshot/my_repository/snap_1

The result will be the same as in the previous step.

  4. To delete a snapshot, the HTTP method that we use is DELETE, and the command is as follows:
DELETE /_snapshot/my_repository/snap_1

The result will be as follows:

{
  "acknowledged" : true
}

How it works...

The minimum configuration required to create a snapshot is the name of the repository and the name of the snapshot (that is, snap_1).

If no other parameters are given, then the snapshot command will dump all the cluster data. To control the snapshot process, the following parameters can be used:

  • indices (a comma-delimited list of indices; wildcards are accepted): This controls the indices that must be dumped.
  • ignore_unavailable (the default is false): This prevents the snapshot from failing if some indices are missing.
  • include_global_state (this defaults to true; the available values are true, false, and partial): This controls the storing of the global state in the snapshot. If a primary shard is not available, then the snapshot fails.

The wait_for_completion query parameter allows you to wait for the end of the snapshot before the call returns. This is very useful if you want to automate your snapshot script to back up indices sequentially.

If the wait_for_completion parameter is not set, then in order to check the snapshot status, the user must monitor it with the snapshot GET call.
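As a sketch of such an automation, a helper can build a dated snapshot name and the corresponding PUT path; the naming scheme used here is an assumption for illustration, not an Elasticsearch convention:

```python
from datetime import datetime, timezone


def snapshot_url(repository, prefix="snap", when=None):
    """Build the PUT path for a dated snapshot (naming scheme is illustrative)."""
    when = when or datetime.now(timezone.utc)
    name = f"{prefix}_{when:%Y%m%d_%H%M%S}"
    # wait_for_completion=true lets a script run backups sequentially
    return f"/_snapshot/{repository}/{name}?wait_for_completion=true"


print(snapshot_url("my_repository", when=datetime(2019, 1, 6, 13, 0, 24)))
# → /_snapshot/my_repository/snap_20190106_130024?wait_for_completion=true
```

A cron job could then issue one such PUT per index group, moving on only after each call returns.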

Snapshots are incremental; this means that only the files that have changed between two snapshots of the same index are copied. This approach reduces both the time and the disk usage during a snapshot.
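The incremental behavior can be pictured as a set difference over segment files; this is a deliberate simplification (the real implementation tracks segment metadata inside the repository), and the file names below are hypothetical:

```python
def files_to_copy(repository_files, index_files):
    """Incremental snapshot idea: only segment files that are not already
    in the repository are copied; unchanged segments are reused."""
    return sorted(set(index_files) - set(repository_files))


# after snap_1, the repository already holds the segments of segment 0
in_repo = {"_0.cfs", "_0.cfe", "_0.si"}
# since then, the index has gained one new segment (segment 1)
current = {"_0.cfs", "_0.cfe", "_0.si", "_1.cfs", "_1.cfe", "_1.si"}

print(files_to_copy(in_repo, current))
# → ['_1.cfe', '_1.cfs', '_1.si']
```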

The snapshot process is designed to be as fast as possible, so it implements a direct copy of the Lucene index segments into the repository. To prevent changes and index corruption during the copy, all the segments that need to be copied are blocked from changing until the end of the snapshot.

The segment copy is done at the shard level, so if you have a cluster composed of several nodes and you have a local repository, then the snapshot is spread across all the nodes. For this reason, in a production cluster, the repository must be shared, so that all the backup fragments can be collected easily.

Elasticsearch takes care of everything during the snapshot, including preventing data from being written to the files involved in the snapshot process, and managing cluster events (such as shard relocation, failures, and so on).

To retrieve all the available snapshots of a repository, the command is as follows:

GET /_snapshot/my_repository/_all

There's more...

The snapshot process can be monitored using the _status endpoint, which provides a complete overview of the snapshot's state.

For the current example, the snapshot _status API call is as follows:

GET /_snapshot/my_repository/snap_1/_status

The result is quite long, and it is composed of the following sections:

  • Here is the information about the snapshot:
  "snapshots" : [
    {
      "snapshot" : "snap_1",
      "repository" : "my_repository",
      "uuid" : "h50pswT-Qw642VUi4aandQ",
      "state" : "SUCCESS",
      "include_global_state" : false,
  • Here are the global shard's statistics:
"shards_stats" : {
        "initializing" : 0,
        "started" : 0,
        "finalizing" : 0,
        "done" : 4,
        "failed" : 0,
        "total" : 4
      },
  • Here are the snapshot's global statistics:
      "stats" : {
        "incremental" : {
          "file_count" : 16,
          "size_in_bytes" : 837344
        },
        "total" : {
          "file_count" : 16,
          "size_in_bytes" : 837344
        },
        "start_time_in_millis" : 1546779914447,
        "time_in_millis" : 52
      },
  • Here is a drill down of the snapshot index statistics:
 "indices" : {
        "mybooks-join" : {
          "shards_stats" : {
            "initializing" : 0,
            "started" : 0,
            "finalizing" : 0,
            "done" : 1,
            "failed" : 0,
            "total" : 1
          },
          "stats" : {
            "incremental" : {
              "file_count" : 4,
              "size_in_bytes" : 10409
            },
            "total" : {
              "file_count" : 4,
              "size_in_bytes" : 10409
            },
            "start_time_in_millis" : 1546779914449,
            "time_in_millis" : 15
          },
  • Here are the statistics for each index and shard:
"shards" : {
            "0" : {
              "stage" : "DONE",
              "stats" : {
                "incremental" : {
                  "file_count" : 4,
                  "size_in_bytes" : 10409
                },
                "total" : {
                  "file_count" : 4,
                  "size_in_bytes" : 10409
                },
                "start_time_in_millis" : 1546779914449,
                "time_in_millis" : 15
              }
            }
          }
        }, ... truncated ...

The status response is very rich, and it can also be used to estimate snapshot performance and the size that incremental backups will require over time.
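As a rough illustration of such an estimate, the figures from the stats section above (837,344 bytes copied in 52 ms) can be turned into a throughput value; this is just illustrative arithmetic, not an API:

```python
def snapshot_throughput_mb_s(size_in_bytes, time_in_millis):
    """Estimate snapshot throughput (MiB/s) from a _status stats section."""
    return (size_in_bytes / (1024 * 1024)) / (time_in_millis / 1000.0)


# values taken from the stats section shown above
rate = snapshot_throughput_mb_s(837344, 52)
print(round(rate, 1))  # → 15.4
```

Comparing this figure against the max_snapshot_bytes_per_sec throttle (20mb by default) shows whether the snapshot was limited by throttling or by storage speed.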

Restoring a snapshot

Once you have snapshots of your data, they can be restored. The restore process is very fast - the indexed shard data is simply copied onto the nodes and activated.

Getting ready

You will need an up-and-running Elasticsearch installation - similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute the commands, any HTTP client can be used, such as curl (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use the Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

To execute the following commands correctly, the backup created in the previous recipe is required.

How to do it...

To restore a snapshot, we will perform the following steps:

  1. To restore a snapshot called snap_1 for the mybooks-* indices, the HTTP method that we use is POST, and the command is as follows:
POST /_snapshot/my_repository/snap_1/_restore
{
  "indices": "mybooks-*",
  "ignore_unavailable": "true",
  "include_global_state": false,
  "rename_pattern": "mybooks-(.+)",
  "rename_replacement": "copy_$1"
}

The result will be as follows:

{
  "accepted" : true
}
  2. The restore is finished when the cluster state changes from red to yellow or green.

In this example, the "mybooks-*" index pattern matches mybooks-join; the rename_pattern parameter captures "join", and the new index name produced by rename_replacement is copy_join.
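The renaming can be reproduced with a regular expression. Note that Elasticsearch uses Java-style $1 group references in rename_replacement, while the equivalent Python replacement syntax is \1:

```python
import re

# rename_pattern captures everything after "mybooks-";
# rename_replacement prepends "copy_" to the captured group
pattern = r"mybooks-(.+)"
replacement = r"copy_\1"   # the console example writes this as copy_$1

print(re.sub(pattern, replacement, "mybooks-join"))  # → copy_join
```

Testing the pattern this way before running _restore helps avoid accidentally restoring over a live index with the same name.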

How it works...

The restore process is very fast; it consists of the following steps:

  1. The data is copied on the primary shard of the restored index (during this step, the cluster is in the red state).
  2. The primary shards are recovered (during this step, the cluster turns from red to yellow or green).
  3. If a replica is set, then the primary shards are copied onto other nodes.

Several parameters can be used to control the restore process, as follows:

  • indices: This controls the indices that must be restored. If not defined, then all indices in the snapshot are restored (a comma-delimited list of indices; wildcards are accepted).
  • ignore_unavailable: This stops the restore from failing if some indices are missing (the default is false).
  • include_global_state: This allows the restoration of the global state from the snapshot (this defaults to true; the available values are true and false).
  • rename_pattern and rename_replacement: The first one is a pattern that must be matched, and the second one uses regular expression replacement to define a new index name.
  • partial: If set to true, it allows the restoration of indices with missing shards (the default is false).

Setting up an NFS share for backups

Managing the repository (the place where data is stored) is the most critical part of backup management in Elasticsearch. Due to its native distributed architecture, snapshot and restore are designed in a cluster style.

During a snapshot, the shards are copied to the defined repository. If this repository is local to the nodes, then the backup data is spread across all the nodes. For this reason, if you have a multi-node cluster, you must have shared repository storage.

A common approach is to use NFS, as it's very easy to set up and it's a very fast solution (plus, a standard Windows Samba share can also be used).

Getting ready

We have a network with the following nodes:

  • Host server: 192.168.1.30 (where we will store the backup data)
  • Elasticsearch master node 1: 192.168.1.40
  • Elasticsearch data node 1: 192.168.1.50
  • Elasticsearch data node 2: 192.168.1.51

You will need an up-and-running Elasticsearch installation - similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute the commands, any HTTP client can be used, such as curl (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use the Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

The following instructions are for a standard Debian or Ubuntu distribution; they can easily be adapted to other Linux distributions.

How to do it...

To create an NFS shared repository, we need to perform the following steps on the NFS server:

  1. Install the NFS server (using the nfs-kernel-server package) on the host server. On the 192.168.1.30 host server, we will execute the following commands:
sudo apt-get update
sudo apt-get install nfs-kernel-server
  2. Once the package is installed, create a directory to be shared among all the clients:
sudo mkdir /mnt/shared-directory
  3. Give access permissions for this directory to the nobody user and the nogroup group. nobody/nogroup are special user/group values used to grant shared read/write access. Applying them requires root access; execute the following command:
sudo chown -R nobody:nogroup /mnt/shared-directory
  4. Then, we need to configure the NFS exports, where we specify that this directory will be shared with certain machines. Edit the /etc/exports file (sudo nano /etc/exports) and add a line containing the directory that is to be shared and the list of client IPs that are allowed to access the exported directory:
/mnt/shared-directory 192.168.1.40(rw,sync,no_subtree_check) 192.168.1.50(rw,sync,no_subtree_check) 192.168.1.51(rw,sync,no_subtree_check)
  5. To refresh the NFS table that holds the export of the share, the following command must be executed:
sudo exportfs -a
  6. Finally, we can start the NFS service by running the following command:
sudo service nfs-kernel-server start

Once the NFS server is up and running, we need to configure the clients. We will repeat the following steps on every Elasticsearch node:

  1. Install the NFS client on our Elasticsearch node:
sudo apt-get update
sudo apt-get install nfs-common
  2. Now, create a directory on the client machine and try to mount the remote shared directory:
sudo mkdir /mnt/nfs
sudo mount 192.168.1.30:/mnt/shared-directory /mnt/nfs
  3. If everything is fine, we can add the mount directory to the node's /etc/fstab file, so that it will be mounted during the next boot:
sudo nano /etc/fstab
  4. Then, add the following line to this file:
192.168.1.30:/mnt/shared-directory /mnt/nfs/ nfs auto,noatime,nolock,bg,nfsvers=4,sec=krb5p,intr,tcp,actimeo=1800 0 0
  5. Update the path.repo setting in the Elasticsearch node configuration (config/elasticsearch.yml), as follows:
path.repo: /mnt/nfs/
  6. After having restarted all the Elasticsearch nodes, we can create our shared repository on the cluster using a single standard repository creation call:
PUT /_snapshot/my_repository
{
  "type": "fs",
  "settings": {
    "location": "/mnt/nfs/my_repository",
    "compress": true
  }
}

How it works...

NFS is a distributed filesystem protocol that is very common in the Unix world - it allows you to mount remote directories on a server. The mounted directory looks like a local directory, so, by using NFS, several servers can write to the same directory.

This is very handy if you need to make shared backups, because all the nodes will write to, and read from, the same shared directory.

If you need to snapshot an index that will rarely be updated, such as an old time-based index, the best practice is to optimize it before backing it up, cleaning up deleted documents and reducing the number of Lucene segments.
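That optimization can be performed with the force merge API before taking the snapshot; for example (the index name is illustrative):

```
POST /old-logs-2018.12/_forcemerge?max_num_segments=1
```

Merging down to a single segment drops deleted documents and minimizes the number of files the snapshot has to copy, which also benefits subsequent incremental snapshots of that index.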

Reindexing from a remote cluster

The snapshot and restore APIs are very fast, and they are the preferred way to back up data, but they do have some limitations:

  • The backup is a safe Lucene index copy, so it depends on the Elasticsearch version that is used. If you are switching from a version of Elasticsearch that is prior to version 5.x, then it's not possible to restore the old indices.
  • It's not possible to restore the backups of a newer Elasticsearch version in an older version; the restore is only forward-compatible.
  • It's not possible to restore partial data from a backup.

To be able to copy data in these scenarios, the solution is to use the reindex API with a remote server.

Getting ready

You will need an up-and-running Elasticsearch installation - similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute the commands, any HTTP client can be used, such as curl (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use the Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

How to do it...

To copy an index from a remote server, we need to perform the following steps:

  1. We need to add the remote server address to config/elasticsearch.yml using reindex.remote.whitelist, as follows:
reindex.remote.whitelist: ["192.168.1.227:9200"]
  2. After having restarted the Elasticsearch node to pick up the new configuration, we can call the reindex API to copy the data of a test-source index into test-dest using the remote REST endpoint, in this way:
POST /_reindex
{
  "source": {
    "remote": {
      "host": "http://192.168.1.227:9200"
    },
    "index": "test-source"
  },
  "dest": {
    "index": "test-dest"
  }
}

The result will be similar to the one for the local reindex that we have already seen in the Reindexing an index recipe in Chapter 3, Basic Operations.

How it works...

The reindex API allows you to call a remote cluster. Every version of an Elasticsearch server is supported (mainly 1.x or later).

The reindex API executes a scan query on the remote index/cluster and puts the data into the current cluster. This process can take a lot of time, depending on the amount of data that needs to be copied and the time required to index it.

The source section contains important parameters that control how the data is fetched, such as the following:

  • remote: This is a section that contains information on the remote cluster connection.
  • index: This is the remote index that has to be used to fetch the data; it can also be an alias or multiple indices via GLOB patterns.
  • query: This parameter is optional; it's a standard query that can be used to select the document that must be copied.
  • size: This parameter is optional; it is the number of documents to be used for the bulk read and write (the buffer is limited to 200 MB).

The remote section of the configuration is composed of the following parameters:

  • host: The remote REST endpoint of the cluster.
  • username: The username to be used for copying the data (this is an optional parameter).
  • password: The password for the user to access the remote cluster (this is optional).
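Putting the optional source parameters together, a remote reindex that copies only a selection of documents in batches might look like the following; the index names, query, credentials, and batch size are all illustrative:

```
POST /_reindex
{
  "source": {
    "remote": {
      "host": "http://192.168.1.227:9200",
      "username": "backup_user",
      "password": "backup_password"
    },
    "index": "test-source",
    "query": {
      "term": { "status": "published" }
    },
    "size": 500
  },
  "dest": {
    "index": "test-dest"
  }
}
```

The query restricts the copy to matching documents only, while size controls how many documents are fetched per batch from the remote cluster.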

Using this approach over the standard snapshot and restore has many advantages, including the following:

  • The ability to copy data from older clusters (from version 1.x or later).
  • The ability to use a query to copy from a selection of documents. This is very handy for copying data from a production cluster to a development or test one.

See also