
Reading notes on Hands-On Docker for Microservices with Python — Build, Run, and Test Your Service Using Docker

Build, Run, and Test Your Service Using Docker

After designing a working RESTful microservice in the previous chapter, we will see in this chapter how to work with it the Docker way: encapsulating the service into a self-contained container, making it immutable and deployable on its own. This chapter describes very explicitly the dependencies of the service and the ways it can be used. The main way of running the service is as a web server, but other operations are possible, such as running the unit tests, generating reports, and more. We'll also see how to deploy the service on a local computer for testing and how to share it through an image registry.

This chapter will cover the following topics:

  • Building your service with a Dockerfile
  • Operating with an immutable container
  • Configuring your service
  • Deploying the Docker service locally
  • Pushing your Docker image to a remote registry

By the end of this chapter, you'll know how to operate with Docker, create a basic service, build an image, and run it. You'll also know how to share an image to be run on another computer.

Technical requirements

This chapter requires Docker version 18.09 or above. See the official documentation (https://docs.docker.com/install/) for instructions for your platform.

If you install Docker on Linux, you may have to configure the server to run for non-root access. Check the documentation at https://docs.docker.com/install/linux/linux-postinstall/.

Check the version with the following command:

$ docker version
Client: Docker Engine - Community
 Version: 18.09.2
 API version: 1.39
 Go version: go1.10.8
 Git commit: 6247962
 Built: Sun Feb 10 04:12:39 2019
 OS/Arch: darwin/amd64
 Experimental: false

You'll also need to install Docker Compose 1.24.0 or above. Note that, in some installations, such as macOS, it is installed automatically for you. Check the installation instructions in the Docker documentation (https://docs.docker.com/compose/install/):

$ docker-compose version
docker-compose version 1.24.0, build 0aa5906
docker-py version: 3.7.2
CPython version: 3.7.3
OpenSSL version: OpenSSL 1.0.2r 26 Feb 2019

The code is available on GitHub, in this directory: https://github.com/PacktPublishing/Hands-On-Docker-for-Microservices-with-Python/tree/master/Chapter03. There's a copy of the ThoughtsBackend from Chapter 2, Creating a REST Service with Python, but the code is slightly different. We'll look at the differences in this chapter.

Building your service with a Dockerfile

It all starts with a container. As we said in Chapter 1, Making the Move – Design, Plan, and Execute, a container is a packaged bundle of software, encapsulated in a standard way. It's a unit of software that can run independently because it's totally self-contained. To make a container, we need to build it.

Remember our description of a container as a process surrounded by its own filesystem. Building a container constructs this filesystem.

To build a container with Docker, we need a definition of its contents. The filesystem is created by applying layer after layer. Each Dockerfile, the recipe for generating a container, contains the definition of the steps that will generate it.

For example, let's create a very simple Dockerfile. Create a file named example.txt with some example text, and another named Dockerfile.simple with the following content:

# scratch is a special container that is totally empty
FROM scratch
COPY example.txt /example.txt

Now build it with the following command:

$ # docker build -f <dockerfile> --tag <tag> <context>
$   docker build -f Dockerfile.simple --tag simple .
Sending build context to Docker daemon 3.072kB
Step 1/2 : FROM scratch
 --->
Step 2/2 : COPY example.txt /example.txt
 ---> Using cache
 ---> f961aef9f15c
Successfully built f961aef9f15c
Successfully tagged simple:latest

$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
simple latest f961aef9f15c 4 minutes ago 11B

This creates a Docker image that contains only the example.txt file. It's not very useful, but it's very small—only 11 bytes. That's because it inherits from the empty container, scratch. It then copies the example.txt file to the /example.txt location inside the container.

Let's take a look at the docker build command. The Dockerfile is defined with the -f parameter, the tag for the resulting image with --tag, and the context parameter is defined as a dot (.). The context parameter is a reference to where to look for the files defined in the steps of the Dockerfile.

The image also gets an automatically assigned image ID, f961aef9f15c. This is a hash of the contents of the filesystem. We'll see later why that's relevant.
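As a rough illustration of this content-addressing idea (plain Python hashing, a simplified stand-in for Docker's actual layer digest algorithm): identical content always produces the identical digest, which is what lets Docker recognize an unchanged layer and reuse it from the cache:

```python
import hashlib

def layer_id(content: bytes) -> str:
    # Docker derives IDs from content digests; a sha256 of the bytes
    # is a simplified stand-in for the real layer hashing
    return hashlib.sha256(content).hexdigest()[:12]

print(layer_id(b"An example file"))   # stable across rebuilds
print(layer_id(b"An example file"))   # identical content, identical ID
print(layer_id(b"Changed content"))   # any change yields a new ID
```

This is why rebuilding the image with an unchanged example.txt reports `Using cache` and the same ID.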

Executing commands

The previous container was not very exciting. It's entirely possible to create your own container from scratch, but typically, you'll look for a baseline containing some kind of Linux distribution so you can do something useful with the container.

As we saw with the FROM command, we can start with a previous container. We will use the Alpine Linux (https://alpinelinux.org/) distribution throughout the book, though other distributions such as Ubuntu and CentOS are also available. Check out the article at https://sweetcode.io/linux-distributions-optimized-hosting-docker/ for distributions aimed at Docker containers.

Why Alpine Linux? It's arguably the most popular distribution for Docker systems because it has a very small footprint and is aimed at security. It's well maintained and regularly updated and patched. It also has a complete package management system that lets you easily install most of the common tools for web services. The base image is only around 5 MB in size and contains a working Linux operating system.

It has a couple of quirks when working with it, such as using its own package management system, called apk, but it's easy to use and is almost a straight drop-in replacement for the common Linux distributions.
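For instance, installing packages in an Alpine-based Dockerfile typically looks like the following sketch (the package names are illustrative):

```dockerfile
FROM alpine:3.9

# --no-cache fetches the package index on the fly and avoids
# leaving /var/cache/apk contents behind in the layer
RUN apk add --no-cache python3 curl

# Useful apk subcommands while exploring interactively:
#   apk search <name>   - find a package
#   apk info <package>  - show package details
```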

The following Dockerfile will inherit from the base alpine container and add the example.txt file:

FROM alpine

RUN mkdir -p /opt/
COPY example.txt /opt/example.txt

This container allows us to run commands, as the usual command-line utilities are included:

$ docker build -f Dockerfile.run --tag container-run .
Sending build context to Docker daemon 4.096kB
Step 1/3 : FROM alpine
 ---> 055936d39205
Step 2/3 : RUN mkdir -p /opt/
 ---> Using cache
 ---> 4f565debb941
Step 3/3 : COPY example.txt /opt/example.txt
 ---> Using cache
 ---> d67a72454d75
Successfully built d67a72454d75
Successfully tagged container-run:latest

$ # docker run <image name> <command> 
$   docker run container-run cat /opt/example.txt
An example file

Note how the cat /opt/example.txt command line is executed. This actually happens inside the container. The result is printed to stdout in our console. However, if a file is created, it is not saved to our local filesystem when the container stops, but only inside the container:

$ ls
Dockerfile.run example.txt
$ docker run container-run /bin/sh -c 'cat /opt/example.txt > out.txt'
$ ls
Dockerfile.run example.txt

The file is actually saved in the stopped container. Once the container has finished running, it remains stopped by Docker until it is removed. You can see the stopped containers with the docker ps -a command. A stopped container is not very interesting, even though its filesystem is saved on disk.

When running web services, the command being run won't stop; it will keep running until stopped. Remember what we said before about a container being a process with a filesystem attached. The command running is the key to the container.

You can add a default command, which will be executed when no command is given, by adding the following:

CMD cat /opt/example.txt

Now, running the container with no command executes the default one:

$ docker run container-run
An example file

Defining a standard command makes the container very simple: just run it and it will perform whatever it's configured to do. Remember to include a default command in your containers.

We can also execute a shell in the container and interact with it. Remember to add the -it flags to keep the connection properly open: -i to keep stdin open and -t to create a pseudo terminal. You can remember it as interactive terminal:

$ docker run -it container-run /bin/sh
/ # cd opt/
/opt # ls
example.txt
/opt # cat example.txt
An example file
/opt # exit
$

This is extremely useful when discovering problems or performing exploratory testing.

Understanding the Docker cache

One of the main points of confusion when building images is understanding how Docker layers work.

Each of the commands in a Dockerfile is executed consecutively, on top of the previous layer. If you're comfortable with Git, you'll notice that the process is similar: each layer only stores the changes from the previous step:

(Figure: the steps of the Dockerfile and the filesystem layer generated by each of them.)

This allows Docker to cache quite aggressively, as any layer before a change is already computed. For example, in this case, we update the available packages with apk update and then install the python3 package before copying the example.txt file. Any change to the example.txt file will only execute the last two steps over layer be086a75fe23. This speeds up the rebuilding of images.

It also means that you need to structure your Dockerfiles carefully so as not to invalidate the cache. Start with the operations that change rarely, such as installing the project dependencies, and end with the operations that change most often, such as adding your code. The annotated Dockerfile for our example has indications about cache usage.

It also means that images never get smaller in size, even if a layer removes data: a new layer is added, and the previous layer is still stored on disk. If you want to remove cruft from a step, you need to do it in the same step.
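Both ideas can be sketched in a single Dockerfile: dependencies installed first for cache friendliness, and build tools installed, used, and removed within one RUN so they never persist in any layer (this sketch is illustrative, not the book's Dockerfile; package names are assumptions):

```dockerfile
FROM alpine:3.9

# Dependencies first: this layer is rebuilt only when the
# requirements file changes, not on every code change
COPY requirements.txt /opt/

# Build tools are added, used, and deleted in the SAME step, so no
# layer ever stores them; apk's --virtual groups them for removal
RUN apk add --no-cache --virtual build-deps gcc musl-dev python3-dev \
    && apk add --no-cache python3 \
    && pip3 install -r /opt/requirements.txt \
    && apk del build-deps

# Code last: the most frequently changing layer
COPY . /opt/code
```

Splitting the RUN into three separate steps would keep the compilers stored in an intermediate layer forever, even after the apk del.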

Keeping your containers small is quite important. In any Docker system, the tendency is to have a bunch of containers and lots of images. Big images for no reason will fill up repositories quickly. They'll be slow to download and push, and also slow to start, as the container is copied around in your infrastructure.

There's another practical consideration. Containers are a great tool to simplify and reduce your service to the minimum. With a bit of investment, you'll have great results and keep small and to-the-point containers.

There are several practices for keeping your images small. Apart from being careful not to install extra elements, the main ones are creating a single, complex layer that installs and uninstalls, and using multi-stage images. A multi-stage Dockerfile is a way of referring to a previous intermediate layer and copying data from there. Check the Docker documentation (https://docs.docker.com/develop/develop-images/multistage-build/).

Compilers, in particular, tend to get a lot of space. When possible, try to use precompiled binaries. You can use a multi-stage Dockerfile to compile in one container and then copy the binaries to the running one.

You can read more about the differences between these two strategies in this article: https://pythonspeed.com/articles/smaller-python-docker-images/.

A good tool to analyze a particular image and the layers that compose it is dive ( https://github.com/wagoodman/dive). It will also discover ways that an image can be reduced in size.

We'll create a multi-stage container in the next section.

Building a web service container

We have a specific objective: to create a container capable of running our microservice, ThoughtsBackend. To do so, we have a couple of requirements:

  • We need to copy our code to the container.
  • The code needs to be served through a web server.

So, broadly speaking, we need to create a container with a web server, add our code, configure it to run our code, and serve the result when the container starts.

We will store most of the configuration files inside subdirectories in the ./docker directory.

As the web server, we'll use uWSGI (https://uwsgi-docs.readthedocs.io/en/latest/). uWSGI is a web server capable of serving our Flask application through the WSGI protocol. uWSGI is quite configurable, has lots of options, and is capable of serving HTTP directly.

A very common configuration is to have NGINX in front of uWSGI to serve static files, as it's more efficient for that. In our specific use case, we don't serve many static files, as we're running a RESTful API, and, in our main architecture, as described in Chapter 1, Making the Move – Design, Plan, and Execute, there's already a load balancer on the frontend and a dedicated static files server. This means we won't be adding an extra component for simplicity. NGINX usually communicates to uWSGI using the uwsgi protocol, which is a protocol specifically for the uWSGI server, but it can also do it through HTTP. Check the NGINX and uWSGI documentation.

Let's take a look at the docker/app/Dockerfile file. It has two stages; the first one compiles the dependencies:

########
# This image will compile the dependencies
# It will install compilers and other packages, that won't be carried
# over to the runtime image
########
FROM alpine:3.9 AS compile-image

# Add requirements for python and pip
RUN apk add --update python3

RUN mkdir -p /opt/code
WORKDIR /opt/code

# Install dependencies
RUN apk add python3-dev build-base gcc linux-headers postgresql-dev libffi-dev

# Create a virtual environment for all the Python dependencies
RUN python3 -m venv /opt/venv
# Make sure we use the virtualenv:
ENV PATH="/opt/venv/bin:$PATH"
RUN pip3 install --upgrade pip

# Install and compile uwsgi
RUN pip3 install uwsgi==2.0.18
# Install other dependencies
COPY ThoughtsBackend/requirements.txt /opt/
RUN pip3 install -r /opt/requirements.txt

This stage carries out the following steps:

  1. Names the stage compile-image, inheriting from Alpine.
  2. Installs python3.
  3. Installs the build dependencies, including the gcc compiler and Python headers (python3-dev).
  4. Creates a new virtual environment. We will install all the Python dependencies here.
  5. The virtual environment gets activated.
  6. Installs uWSGI. This step compiles it from code.
You can also install the included uWSGI package in the Alpine distribution, but I found the compiled package to be more complete and easier to configure, as the Alpine uwsgi package requires you to install other packages such as uwsgi-python3, uwsgi-http, and so on, then enable the plugin in the uWSGI config. The size difference is minimal. This also allows you to use the latest uWSGI version and not depend on the one in your Alpine distribution.
  7. Copy the requirements.txt file and install all the dependencies. This will compile and copy the dependencies to the virtual environment.

The second stage prepares the running container. Let's take a look:

########
# This image is the runtime, will copy the dependencies from the other
########
FROM alpine:3.9 AS runtime-image

# Install python
RUN apk add --update python3 curl libffi postgresql-libs

# Copy uWSGI configuration
RUN mkdir -p /opt/uwsgi
ADD docker/app/uwsgi.ini /opt/uwsgi/
ADD docker/app/start_server.sh /opt/uwsgi/

# Create a user to run the service
RUN addgroup -S uwsgi
RUN adduser -H -D -S uwsgi
USER uwsgi

# Copy the venv with compile dependencies from the compile-image
COPY --chown=uwsgi:uwsgi --from=compile-image /opt/venv /opt/venv
# Be sure to activate the venv
ENV PATH="/opt/venv/bin:$PATH"

# Copy the code
COPY --chown=uwsgi:uwsgi ThoughtsBackend/ /opt/code/

# Run parameters
WORKDIR /opt/code
EXPOSE 8000
CMD ["/bin/sh", "/opt/uwsgi/start_server.sh"]

It carries out the following actions:

  1. Labels the image as runtime-image and inherits from Alpine, as previously.
  2. Installs Python and other requirements for the runtime.
Note that any runtime counterpart of the compile dependencies needs to be installed. For example, we install libffi at runtime and libffi-dev to compile, both required by the cryptography package. A mismatch will raise a runtime error when trying to access the (not present) library. The dev libraries normally include the runtime libraries.
  3. Copy the uWSGI configuration and script to start the service. We'll take a look at that in a moment.
  4. Create a user to run the service, and set it as the default using the USER command.
This step is not strictly necessary as, by default, the root user will be used. As our containers are isolated, gaining root access in one is inherently more secure than in a real server. In any case, it's good practice not to run our public-facing service as root, and it will remove some understandable warnings.
  5. Copy the virtual environment from the compile-image image. This installs all the compiled Python packages. Note that they are copied with the user to run the service, to have access to them. The virtual environment is activated.
  6. Copy the application code.
  7. Define the run parameters. Note that port 8000 is exposed. This will be the port we will serve the application on.
If running as root, port 80 can be defined. Routing a port in Docker is trivial, though, and other than the front-facing load balancer, there's not really any reason why you need to use the default HTTP port. Use the same one in all your systems, though, which will remove uncertainty.

Note that the application code is copied at the end of the file. The application code is likely to be the code that changes most often, so this structure takes advantage of the Docker cache and recreates only the last few layers, instead of having to start from scratch. Take this into account when designing your Dockerfiles.

Also, keep in mind that there's nothing stopping you from changing the order while developing. If you're trying to find a problem with a dependency, and so on, you can comment out irrelevant layers or add steps later once the code is stable.

Now, let's build our container. Note that two images get created, though only one is named. The other is the compile image, which is bigger, as it contains the compilers, and so on:

$ docker build -f docker/app/Dockerfile --tag thoughts-backend .
...
 ---> 027569681620
Step 12/26 : FROM alpine:3.9 AS runtime-image
...
Successfully built 50efd3830a90
Successfully tagged thoughts-backend:latest
$ docker images | head
REPOSITORY TAG IMAGE ID CREATED SIZE
thoughts-backend latest 50efd3830a90 10 minutes ago 144MB
<none>           <none> 027569681620 12 minutes ago 409MB

We can now run the container. To be able to access the internal port 8000, we need to route it with the -p option:

$ docker run -it  -p 127.0.0.1:8000:8000/tcp thoughts-backend

Pointing our local browser at 127.0.0.1:8000 shows our application. You can see the access logs in the standard output:


You can access a running container from a different terminal and execute a new shell in it using docker exec. Remember to add -it to keep the terminal open. Check the currently running containers with docker ps to find the container ID:

$ docker ps
CONTAINER ID IMAGE            COMMAND ... PORTS ...
ac2659958a68 thoughts-backend ... ...     127.0.0.1:8000->8000/tcp 
$ docker exec -it ac2659958a68 /bin/sh
/opt/code $ ls
README.md __pycache__ db.sqlite3 init_db.py pytest.ini requirements.txt tests thoughts_backend wsgi.py
/opt/code $ exit
$ 

You can stop the container with Ctrl + C or, more gracefully, stop it from another terminal:

$ docker ps
CONTAINER ID IMAGE            COMMAND ... PORTS ...
ac2659958a68 thoughts-backend ... ...     127.0.0.1:8000->8000/tcp 
$ docker stop ac2659958a68
ac2659958a68

The logs will show the graceful stop:

...
spawned uWSGI master process (pid: 6)
spawned uWSGI worker 1 (pid: 7, cores: 1)
spawned uWSGI http 1 (pid: 8)
Caught SIGTERM signal! Sending graceful stop to uWSGI through the master-fifo
Fri May 31 10:29:47 2019 - graceful shutdown triggered...
$ 

Capturing SIGTERM correctly and stopping our services gracefully is important for avoiding abrupt terminations of services. We'll see how to configure this in uWSGI, along with other elements.

Configuring uWSGI

The uwsgi.ini file contains the uWSGI configuration:

[uwsgi]
uid=uwsgi
chdir=/opt/code
wsgi-file=wsgi.py
master=True
pidfile=/tmp/uwsgi.pid
http=:8000
vacuum=True
processes=1
max-requests=5000
# Used to send commands to uWSGI
master-fifo=/tmp/uwsgi-fifo

Most of it is information we already have from the Dockerfile, though it needs to match so that uWSGI knows where to find the application code, the name of the WSGI file to start, the user to start it under, and so on.

Other parameters are specific to uWSGI's behavior:

  • master: Creates a master process that controls the others. Recommended for uWSGI operation as it creates smoother operation.
  • http: Serves in the specified port. The HTTP mode creates a process that load balances the HTTP requests toward the workers, and it's recommended to serve HTTP outside of the container.
  • processes: The number of application workers. Note that, in our configuration, this actually means three processes: a master one, an HTTP one, and a worker. More workers can handle more requests but will use more memory. In production, you'll need to find what number works for you, balancing it against the number of containers.
  • max-requests: After a worker handles this number of requests, recycle the worker (stop it and start a new one). This reduces the probability of memory leaks.
  • vacuum: Clean the environment when exiting.
  • master-fifo: Create a Unix pipe to send commands to uWSGI. We will use this to handle graceful stops.
The uWSGI documentation ( https://uwsgi-docs.readthedocs.io/en/latest/) is quite extensive and comprehensive. It contains a lot of valuable information, both for operating uWSGI itself and understanding details about how web servers operate. I learn something new each time that I read it, but it can be a bit overwhelming at first.

It's worth investing a bit of time in running tests to discover what are the best parameters for your service in areas such as timeouts, the number of workers, and so on. However, remember that some of the options for uWSGI may be better served with your container's configuration, which simplifies things.
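The wsgi-file=wsgi.py setting points uWSGI at a module exposing a WSGI callable. As a minimal, framework-free sketch of what such a file contains (our actual wsgi.py creates the Flask application instead; this illustration is not the book's code):

```python
# wsgi.py - a minimal WSGI callable of the kind uWSGI serves.
# uWSGI looks for a module-level callable named "application".

def application(environ, start_response):
    # environ carries the request data; here every request gets
    # the same small plain-text body
    body = b"Hello from uWSGI\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Flask application objects implement this exact callable interface, which is why uWSGI can serve them directly through the wsgi-file setting.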

To allow a graceful stop, we wrap the execution of uWSGI in the start_server.sh script:

#!/bin/sh

_term() {
  echo "Caught SIGTERM signal! Sending graceful stop to uWSGI through the master-fifo"
  # See details in the uwsgi.ini file and
  # in http://uwsgi-docs.readthedocs.io/en/latest/MasterFIFO.html
  # q means "graceful stop"
  echo q > /tmp/uwsgi-fifo
}

trap _term SIGTERM

uwsgi --ini /opt/uwsgi/uwsgi.ini &

# We need to wait to properly catch the signal, that's why uWSGI is started
# in the background. $! is the PID of uWSGI
wait $!
# The container exits with code 143, which means "exited because SIGTERM"
# 128 + 15 (SIGTERM)
# http://www.tldp.org/LDP/abs/html/exitcodes.html
# http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_12_02.html

The core of the script is the call to uwsgi to start the service. It then waits until it stops.

The SIGTERM signal will be captured, and uWSGI will be stopped gracefully by sending the q command to the master-fifo pipe.

A graceful stop means that a request won't be interrupted when a new container version is available. We'll see later how to make rollout deployments, but one of the key elements is to interrupt existing servers when they are not serving requests, to avoid stopping in the middle of a request and leaving an inconsistent state.

Docker uses the SIGTERM signal to stop the execution of containers. After a timeout, it kills them with SIGKILL.
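The same trap-and-flag pattern used in start_server.sh can be reproduced in plain Python, in case a service wants to manage the signal itself (a standalone sketch on a Unix-like system, not part of the book's code):

```python
import os
import signal

shutting_down = False

def _term(signum, frame):
    # Equivalent of the trap in start_server.sh: record that a stop
    # was requested instead of dying mid-request
    global shutting_down
    shutting_down = True

signal.signal(signal.SIGTERM, _term)

# Simulate Docker sending SIGTERM to the container's main process
os.kill(os.getpid(), signal.SIGTERM)

# The handler ran; a real server loop would now finish in-flight
# requests and exit cleanly
print(shutting_down)
```

A real worker would check the flag between requests and exit before Docker's timeout escalates to SIGKILL.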

Refreshing Docker commands

We've already looked at some of the important Docker commands:

  • docker build: Builds an image
  • docker run: Runs an image
  • docker exec: Executes a command in a running container
  • docker ps: Shows the currently running containers
  • docker images: Displays the existing images

While these are the basics, knowing most of the other available Docker commands is very useful for debugging issues and performing operations such as monitoring, copying and tagging images, creating networks, and so on. These commands will also show you a lot about how Docker works internally.

An important command: be sure to clean up old containers and images with docker system prune from time to time. Docker is quite space-intensive after working with it for a few weeks.

The Docker documentation (https://docs.docker.com/v17.12/engine/reference/commandline/docker/) is quite complete. Be sure to know your way around it.

Operating with an immutable container

The Docker commands mentioned earlier in this chapter are the basics, and everything starts from there. But, when dealing with more than one container, handling them starts to become complicated. You've seen that some of the commands can get quite long.

To operate containers in cluster operations, we'll use docker-compose. This is Docker's own orchestration tool for defining multi-container operations. It's defined by a YAML file containing all the different tasks and services, each with enough context to build and run it.

It lets you store the different services and the parameters for each of them in this configuration file, called docker-compose.yaml by default. This allows you to coordinate them and generate a reproducible cluster of services.

Testing the container

We'll start by creating a service to run the unit tests. Keep in mind that the tests need to be run inside the container. This standardizes their execution and ensures that the dependencies are constant.

Note that, in the creation of our container, we include all the requirements to execute the tests. There's the option to create the running container and inherit from it to add the tests and test dependencies.

This certainly creates a smaller running container but creates a situation where the testing container is not 100% exactly the same as the one in production. If the size is critical and there's a big difference, this may be an option, but be aware of the differentiation if there's a subtle bug.

We need to define a service in the docker-compose.yaml file, like this:

version: '3.7'

services:
    # Development related
    test-sqlite:
        environment:
            - PYTHONDONTWRITEBYTECODE=1
        build:
            dockerfile: docker/app/Dockerfile
            context: .
        entrypoint: pytest
        volumes:
            - ./ThoughtsBackend:/opt/code

This section defines a service called test-sqlite. The build section defines the Dockerfile to use and the context, in the same way we did with the docker build command. docker-compose sets the name automatically.

We can build the container with the following command:

$ docker-compose build test-sqlite
Building test-sqlite
...
Successfully built 8751a4a870d9
Successfully tagged ch3_test-sqlite:latest

The entrypoint specifies the command to run—in this case, running the tests through the pytest command.

There are some differences between the command and the entrypoint, which both execute a command. The most relevant ones are that command is easier to overwrite and entrypoint appends any extra arguments at the end.
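A quick sketch of that difference in compose terms (the service names here are hypothetical, and alpine's echo is used just as a visible placeholder):

```yaml
services:
    with-entrypoint:
        image: alpine
        entrypoint: echo
        # docker-compose run with-entrypoint hello
        #   -> the argument is APPENDED: runs "echo hello"

    with-command:
        image: alpine
        command: echo hello
        # docker-compose run with-command ls
        #   -> the argument REPLACES it: runs "ls"
```

This appending behavior is exactly what lets us pass extra pytest arguments to the test-sqlite service below.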

To run the container, call the run command:

$ docker-compose run test-sqlite
=================== test session starts ===================
platform linux -- Python 3.6.8, pytest-4.5.0, py-1.8.0, pluggy-0.12.0 -- /opt/venv/bin/python3
cachedir: .pytest_cache
rootdir: /opt/code, inifile: pytest.ini
plugins: flask-0.14.0
collected 17 items

tests/test_thoughts.py::test_create_me_thought PASSED [ 5%]
...
tests/test_token_validation.py::test_valid_token_header PASSED [100%]

========== 17 passed, 177 warnings in 1.25 seconds ============
$ 

You can append pytest arguments, which will be passed to the internal entrypoint. For example, to run the tests matching the validation string, run the following command:

$ docker-compose run test-sqlite -k validation
...
===== 9 passed, 8 deselected, 13 warnings in 0.30 seconds =======
$

There are a couple of extra details: the current code is mounted through a volume, overwriting the code in the container. See how the current code in ./ThoughtsBackend is mounted at the position of the code in the container, /opt/code. This is very handy for development, as it avoids having to rebuild the container every time a change is made.

It also means that any write in the mounted directory hierarchy will be saved in your local filesystem. For example, the ./ThoughtsBackend/db.sqlite3 database file allows you to use it for testing. It will also store the generated pyc files.

The generation of the db.sqlite3 file can create permission problems in some operating systems. If that's the case, delete it to be regenerated and/or allow it to read and write to all users with chmod 666 ./ThoughtsBackend/db.sqlite3.

This is the reason we use the environment option to pass the PYTHONDONTWRITEBYTECODE=1 environment variable. This stops Python from creating pyc files.
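The variable maps to the interpreter's sys.dont_write_bytecode flag, which can be verified with a quick standalone check (this sketch is unrelated to the book's code):

```python
import os
import subprocess
import sys

# Launch a fresh interpreter with the variable set and report the flag
env = dict(os.environ, PYTHONDONTWRITEBYTECODE="1")
out = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.dont_write_bytecode)"],
    env=env, capture_output=True, text=True,
)
print(out.stdout.strip())  # True: no .pyc files will be written
```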

While SQLite is good enough for testing, we need to create a better structure that reflects the deployment, and configure access to the database to be able to deploy the server.

Creating a PostgreSQL database container

We need to test our code against a PostgreSQL database. This is the database we'll be deploying the code against in production.

While the abstraction layer in SQLAlchemy aims to reduce the differences, there are some differences in the behavior of the databases.

For example, in /thoughts_backend/api_namespace.py, the following line is case-insensitive, which is the behavior we want:

query = (query.filter(ThoughtModel.text.contains(search_param)))

Translated to PostgreSQL, which is case-sensitive, this requires you to check it. It would be a bug in production if we tested with SQLite and ran in PostgreSQL.

The replaced code, using ilike for the expected behavior, is as follows:

param = f'%{search_param}%'
query = (query.filter(ThoughtModel.text.ilike(param)))

We kept the old code in a comment to show this issue.
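The discrepancy is visible with the standard library's sqlite3 module: SQLite's LIKE (which is what contains() generates in SQLAlchemy) matches ASCII characters case-insensitively by default, while PostgreSQL's LIKE would return no rows for the same query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thought_model (text TEXT)")
conn.execute("INSERT INTO thought_model VALUES ('A great POWER')")

# The same pattern that .contains() produces: LIKE '%power%'
rows = conn.execute(
    "SELECT text FROM thought_model WHERE text LIKE '%power%'"
).fetchall()
print(rows)  # [('A great POWER',)] - SQLite matched despite the case
```

Running the tests against a real PostgreSQL container, as we do next, is what catches this class of bug.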

To create the database container, we need to define the corresponding Dockerfile. We store all the files in the docker/db/ subdirectory. Let's take a look at the Dockerfile and its different parts. The whole file can be found on GitHub (https://github.com/PacktPublishing/Hands-On-Docker-for-Microservices-with-Python/blob/master/Chapter03/docker/db/Dockerfile). This Dockerfile can be divided into the following stages:

  1. Using the ARG keyword, define the basic PostgreSQL configuration such as the name of the database, user, and password. They get set in environment variables so that the PostgreSQL commands can use them.
These values are for local development only. They need to match the environment being set up. The ARG keyword defines parameters of the Dockerfile at build time. We'll see how they get set from the docker-compose.yaml file.

The ARG elements are also defined as ENV variables, so they remain defined as environment variables:

# This Dockerfile is for localdev purposes only, so it won't be
# optimised for size
FROM alpine:3.9

# Add the proper env variables for init the db
ARG POSTGRES_DB
ENV POSTGRES_DB $POSTGRES_DB
ARG POSTGRES_USER
ENV POSTGRES_USER $POSTGRES_USER
ARG POSTGRES_PASSWORD
ENV POSTGRES_PASSWORD $POSTGRES_PASSWORD
ARG POSTGRES_PORT
ENV LANG en_US.utf8
EXPOSE $POSTGRES_PORT

# For usage in startup
ENV POSTGRES_HOST localhost
ENV DATABASE_ENGINE POSTGRESQL
# Store the data inside the container, as we don't care for
# persistence
RUN mkdir -p /opt/data
ENV PGDATA /opt/data
  2. Install the postgresql package and all its dependencies, such as Python 3 and its compilers. We will need them to be able to run the application code:
RUN apk update
RUN apk add bash curl su-exec python3
RUN apk add postgresql postgresql-contrib postgresql-dev
RUN apk add python3-dev build-base linux-headers gcc libffi-dev
  3. Install and run the postgres-setup.sh script:
# Adding our code
WORKDIR /opt/code

RUN mkdir -p /opt/code/db
# Add postgres setup
ADD ./docker/db/postgres-setup.sh /opt/code/db/
RUN /opt/code/db/postgres-setup.sh

This initializes the database, setting up the proper user, password, and so on. Note that this doesn't yet create the specific tables for our application.

As part of our initialization, we create the data files inside the container. This means that the data won't persist after the container stops. This is a good thing for testing, but, if you want to access the data for debug purposes, remember to keep the container up.
  4. Install the requirements for our application and specific commands to run in the database container:
## Install our code to prepare the DB
ADD ./ThoughtsBackend/requirements.txt /opt/code

RUN pip3 install -r requirements.txt
  5. Copy the application code and database commands stored in docker/db. Run the prepare_db.sh script, which creates the application database structure. In our case, it sets up the thoughts table:
## Need to import all the code, due dependencies to initialize the DB
ADD ./ThoughtsBackend/ /opt/code/
# Add all DB commands
ADD ./docker/db/* /opt/code/db/

## get the db ready
RUN /opt/code/db/prepare_db.sh

This script first starts the PostgreSQL database running in the background, then calls init_db.py, and finally stops the database gracefully.

Keep in mind that, in each of the steps of Dockerfile, in order to access the database, it needs to be running, but it will also be stopped at the end of each step. In order to avoid corruption of the data or the abrupt killing of the process, be sure to use the stop_postgres.sh script until the end. Though PostgreSQL will normally recover for an abruptly stopped database, it will slow the startup time.
  6. To start the database in operation, the CMD is just the postgres command. It needs to run with the postgres user:
# Start the database in normal operation
USER postgres
CMD ["postgres"]

To run the database service, we need to set it up as part of the docker-compose file:

    db:
        build:
            context: .
            dockerfile: ./docker/db/Dockerfile
            args:
                # These values should be in sync with environment
                # for development. If you change them, you'll 
                # need to rebuild the container
                - POSTGRES_DB=thoughts
                - POSTGRES_USER=postgres
                - POSTGRES_PASSWORD=somepassword
                - POSTGRES_PORT=5432
        ports:
            - "5432:5432"

Note that the args parameter will set the ARG values during the build. We also route the PostgreSQL port to allow access to the database.

You can now build and start the server:

$ docker-compose build db
$ docker-compose up db
Creating ch3_db_1 ... done
Attaching to ch3_db_1
...
db_1 | 2019-06-02 13:55:38.934 UTC [1] LOG: database system is ready to accept connections

In a different terminal, you can access the database using a PostgreSQL client. I recommend the excellent pgcli. You can check out its documentation (https://www.pgcli.com/).

You can use also the official psql client or any other PostgreSQL client of your preference. The documentation for the default client can be found here: https://www.postgresql.org/docs/current/app-psql.html.

Here, we use the PGPASSWORD environment variable to show that the password is the one configured previously:

$ PGPASSWORD=somepassword pgcli -h localhost -U postgres thoughts
Server: PostgreSQL 11.3
Version: 2.0.2
Chat: https://gitter.im/dbcli/pgcli
Mail: https://groups.google.com/forum/#!forum/pgcli
Home: http://pgcli.com
postgres@localhost:thoughts> select * from thought_model
+------+------------+--------+-------------+
|  id  |  username  |  text  |  timestamp  |
|------+------------+--------+-------------|
+------+------------+--------+-------------+
SELECT 0
Time: 0.016s

Being able to access the database is useful for debugging purposes.

Configuring your service

We can configure our services to use environment variables to change their behavior. For containers, this is a fantastic alternative to using configuration files, as it allows immutable containers that get their configuration injected. This is in line with the Twelve-Factor App principles (https://12factor.net/config) and allows for a good separation between code and configuration, and between the setups of the different deployments that the code might be used for.

One of the advantages that we'll look at later with the use of Kubernetes is creating new environments on-demand, which can be tweaked for testing purposes or tailored for development or demo. Being able to quickly change all the configuration by injecting the proper environment makes this operation very easy and straightforward. It also allows you to enable or disable features, if properly configured, which helps the enablement of features on launch day, with no code rollout.

This allows us to configure the database connection, so we can choose between the SQLite backend and PostgreSQL.

Configuring the system is not limited to open variables, though. Environment variables will be used later in the book for storing secrets. Note that a secret needs to be available inside the container.

We'll configure the tests to access our newly created database container. To do so, we first need the ability to choose between SQLite and PostgreSQL through configuration. Check out the ./ThoughtsBackend/thoughts_backend/db.py file:

import os
from pathlib import Path
from flask_sqlalchemy import SQLAlchemy

DATABASE_ENGINE = os.environ.get('DATABASE_ENGINE', 'SQLITE')

if DATABASE_ENGINE == 'SQLITE':
    dir_path = Path(os.path.dirname(os.path.realpath(__file__)))
    path = dir_path / '..'

    # Database initialisation
    FILE_PATH = f'{path}/db.sqlite3'
    DB_URI = 'sqlite+pysqlite:///{file_path}'
    db_config = {
        'SQLALCHEMY_DATABASE_URI': DB_URI.format(file_path=FILE_PATH),
        'SQLALCHEMY_TRACK_MODIFICATIONS': False,
    }

elif DATABASE_ENGINE == 'POSTGRESQL':
    db_params = {
        'host': os.environ['POSTGRES_HOST'],
        'database': os.environ['POSTGRES_DB'],
        'user': os.environ['POSTGRES_USER'],
        'pwd': os.environ['POSTGRES_PASSWORD'],
        'port': os.environ['POSTGRES_PORT'],
    }
    DB_URI = 'postgresql://{user}:{pwd}@{host}:{port}/{database}'
    db_config = {
        'SQLALCHEMY_DATABASE_URI': DB_URI.format(**db_params),
        'SQLALCHEMY_TRACK_MODIFICATIONS': False,
    }

else:
    raise Exception('Incorrect DATABASE_ENGINE')

db = SQLAlchemy()

When the DATABASE_ENGINE environment variable is set to POSTGRESQL, it will be configured properly. The other environment variables also need to be correct; that is, if the database engine is set to PostgreSQL, the POSTGRES_HOST variable needs to be set.
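With the values mirroring the environment.env file (set by hand here for illustration, while in the container they are injected), the resulting connection URI can be checked in isolation:

```python
import os

# Same values as in environment.env
os.environ.update({
    'POSTGRES_HOST': 'db',
    'POSTGRES_DB': 'thoughts',
    'POSTGRES_USER': 'postgres',
    'POSTGRES_PASSWORD': 'somepassword',
    'POSTGRES_PORT': '5432',
})

# The exact construction used in db.py
db_params = {
    'host': os.environ['POSTGRES_HOST'],
    'database': os.environ['POSTGRES_DB'],
    'user': os.environ['POSTGRES_USER'],
    'pwd': os.environ['POSTGRES_PASSWORD'],
    'port': os.environ['POSTGRES_PORT'],
}
DB_URI = 'postgresql://{user}:{pwd}@{host}:{port}/{database}'
print(DB_URI.format(**db_params))
# postgresql://postgres:somepassword@db:5432/thoughts
```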

Environment variables can be stored individually in the docker-compose.yaml file, but it's more convenient to store multiple of them in a file. Let's take a look at environment.env:

DATABASE_ENGINE=POSTGRESQL
POSTGRES_DB=thoughts
POSTGRES_USER=postgres
POSTGRES_PASSWORD=somepassword
POSTGRES_PORT=5432
POSTGRES_HOST=db

Note that the definition of the user and so on matches the arguments used to create the Dockerfile for testing. POSTGRES_HOST is defined as db, which is the name of the service.

Inside the Docker cluster created by docker-compose, you can refer to services by their names. This will be directed by the internal DNS to the proper container, as a shortcut. This allows easy communication between services, as they can configure their access very easily by name. Note that this connection is only valid inside the cluster, for communication between the containers.

Our testing service using the PostgreSQL container is then defined as follows:

    test-postgresql:
        env_file: environment.env
        environment:
            - PYTHONDONTWRITEBYTECODE=1
        build:
            dockerfile: docker/app/Dockerfile
            context: .
        entrypoint: pytest
        depends_on:
            - db
        volumes:
            - ./ThoughtsBackend:/opt/code

This is very similar to the test-sqlite service, but it adds the environment configuration in environment.env and a dependency on db. This means that docker-compose will start the db service, if not present.

You can now run the tests against the PostgreSQL database:

$ docker-compose run test-postgresql
Starting ch3_db_1 ... done
============== test session starts ====================
platform linux -- Python 3.6.8, pytest-4.6.0, py-1.8.0, pluggy-0.12.0 -- /opt/venv/bin/python3
cachedir: .pytest_cache
rootdir: /opt/code, inifile: pytest.ini
plugins: flask-0.14.0
collected 17 items

tests/test_thoughts.py::test_create_me_thought PASSED [ 5%]
...
tests/test_token_validation.py::test_valid_token_header PASSED [100%]

===== 17 passed, 177 warnings in 2.14 seconds ===
$

This environment file will be useful for any service that needs to connect to the database, such as when deploying the service locally.

Deploying the Docker service locally

With all these elements, we can create the service to deploy the Thoughts service locally:

     server:
        env_file: environment.env
        image: thoughts_server
        build:
            context: .
            dockerfile: docker/app/Dockerfile
        ports:
            - "8000:8000"
        depends_on:
            - db

We need to make sure to add the dependency on the db database service. We also bind the internal port so that we can access it locally.

We start the service with the up command. There are some differences between the up and the run commands, but the main one is that run is for single commands that start and stop, while up is designed for services. For example, run creates an interactive Terminal, which displays colors, and up shows the standard output as logs, including the time when they were generated, accepts the -d flag to run in the background, and so on. Using one instead of the other is normally okay, however, up exposes ports and allows other containers and services to connect, while run does not.

We can now start the service with the following command:

$ docker-compose up server
Creating network "ch3_default" with the default driver
Creating ch3_db_1 ... done
Creating ch3_server_1 ... done
Attaching to ch3_server_1
server_1 | [uWSGI] getting INI configuration from /opt/uwsgi/uwsgi.ini
server_1 | *** Starting uWSGI 2.0.18 (64bit) on [Sun Jun 2 
...
server_1 | spawned uWSGI master process (pid: 6)
server_1 | spawned uWSGI worker 1 (pid: 7, cores: 1)
server_1 | spawned uWSGI http 1 (pid: 8)

Now access the service at localhost:8000 in your browser:


You can see the logs in the terminal. Hitting Ctrl + C will stop the server. The service can also be started with the -d flag, to detach the terminal and run in daemon mode:

$ docker-compose up -d server
Creating network "ch3_default" with the default driver
Creating ch3_db_1 ... done
Creating ch3_server_1 ... done
$

Check the running services, their current state, and the open ports with docker-compose ps:

$ docker-compose ps
    Name Command State Ports
------------------------------------------------------------------------------
ch3_db_1 postgres Up 0.0.0.0:5432->5432/tcp
ch3_server_1 /bin/sh /opt/uwsgi/start_s ... Up 0.0.0.0:8000->8000/tcp

As we saw previously, we can access the database directly and run raw SQL commands in it. This is useful for debugging problems or conducting experiments:

$ PGPASSWORD=somepassword pgcli -h localhost -U postgres thoughts
Server: PostgreSQL 11.3
Version: 2.0.2

postgres@localhost:thoughts> 
INSERT INTO thought_model (username, text, timestamp) 
VALUES ('peterparker', 'A great power carries a great
 responsability', now());

INSERT 0 1
Time: 0.014s
postgres@localhost:thoughts>

This new thought is now available through the API:

$ curl http://localhost:8000/api/thoughts/
[{"id": 1, "username": "peterparker", "text": "A great power carries a great responsability", "timestamp": "2019-06-02T19:44:34.384178"}]

If you need to check the logs while in detached mode, you can use the docker-compose logs <optional: service> command:

$ docker-compose logs server
Attaching to ch3_server_1
server_1 | [uWSGI] getting INI configuration from /opt/uwsgi/uwsgi.ini
server_1 | *** Starting uWSGI 2.0.18 (64bit) on [Sun Jun 2 19:44:15 2019] ***
server_1 | compiled with version: 8.3.0 on 02 June 2019 11:00:48
...
server_1 | [pid: 7|app: 0|req: 2/2] 172.27.0.1 () {28 vars in 321 bytes} [Sun Jun 2 19:44:41 2019] GET /api/thoughts/ => generated 138 bytes in 4 msecs (HTTP/1.1 200) 2 headers in 72 bytes (1 switches on core 0)

To fully stop the cluster, call docker-compose down:

$ docker-compose down
Stopping ch3_server_1 ... done
Stopping ch3_db_1 ... done
Removing ch3_server_1 ... done
Removing ch3_db_1 ... done
Removing network ch3_default

This stops all the containers.

Pushing your Docker image to a remote registry

All the operations we have seen so far work with our local Docker repository. Given the structure of Docker images, and the fact that each of their layers can work independently, they are easy to upload and share. To do so, we need to use a remote repository, or registry in Docker terminology, which will accept images pushed to it and allow images to be pulled from it.

The structure of a Docker image is composed of layers. Each of them can be pushed independently, as long as the registry contains the layers it depends on. This saves space if the previous layers are already present, as they will be stored only once.
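This layer-level deduplication can be sketched in a few lines of Python. This is an illustration of the idea only, not a real registry client, and the digests are made up:

```python
# Sketch of registry-side layer deduplication: only layers the registry
# does not already hold need to be uploaded. Digests are invented.
def layers_to_push(image_layers, registry_layers):
    """Return the layers that must be uploaded, preserving order."""
    present = set(registry_layers)
    return [layer for layer in image_layers if layer not in present]

# The base layer (e.g. from alpine) is already in the registry,
# so only the application layers would be transferred.
image = ["sha256:aaa", "sha256:bbb", "sha256:ccc"]
registry = ["sha256:aaa"]
print(layers_to_push(image, registry))  # ['sha256:bbb', 'sha256:ccc']
```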

Obtaining public images from Docker Hub

The default registry is Docker Hub. This is configured by default, and it serves as the main source of publicly available images. You can access it freely at https://hub.docker.com/ and search the available images to base your own images on:

[Screenshot: searching for images in Docker Hub]

Each image has information about the way to use it and the tags that are available. You don't need to download an image separately; just use the name of the image or run the docker pull command. Docker will automatically pull from Docker Hub if no other registry is specified:

[Screenshot: an image page in Docker Hub]

The name of the image is also the one used in the FROM command in our Dockerfiles.

Docker is a fantastic way of distributing a tool. It's very common right now for an open source tool to have an official image in Docker Hub that can be downloaded and started in standalone mode, standardizing the access.

This can be used either for a quick demo, for something such as Ghost—https://hub.docker.com/_/ghost (a blogging platform), or a Redis (https://hub.docker.com/_/redis) instance to act as a cache with minimal work. Try to run the Ghost example locally.

Using tags

Tags are descriptors used to mark different versions of the same image. There is an image alpine:3.9, and another one, alpine:3.8. There are also official images of Python for different interpreters (3.6, 3.7, 2.7, and so on), but besides the version of the interpreter, tags may refer to the way the image was created.
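How a NAME[:TAG] reference resolves can be sketched as follows. This is a deliberate simplification for illustration; among other things, it ignores references that include a registry host with a port, such as localhost:5000/image:

```python
# Simplified sketch of how a NAME[:TAG] image reference is interpreted:
# when no tag is given, Docker assumes the "latest" tag.
def parse_reference(reference):
    """Split an image reference into (name, tag), defaulting to latest."""
    name, sep, tag = reference.partition(":")
    return name, (tag if sep else "latest")

print(parse_reference("alpine:3.9"))  # ('alpine', '3.9')
print(parse_reference("python"))      # ('python', 'latest')
```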

For example, these images produce the same effect. The first one is a full image that includes a Python 3.7 interpreter:

$ docker run -it python:3.7
Python 3.7.3 (default, May 8 2019, 05:28:42)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

The second one also has a Python 3.7 interpreter. Note the slim change in the name:

$ docker run -it python:3.7-slim
Python 3.7.3 (default, May 8 2019, 05:31:59)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

The sizes of the images, though, are quite different:

$ docker images | grep python
python 3.7-slim ca7f9e245002 4 weeks ago 143MB
python 3.7      a4cc999cf2aa 4 weeks ago 929MB

Any build will automatically use the latest tag if no other tag is specified.

Keep in mind that tags can be overwritten. This may be confusing, given some of the similarities between the way Docker and Git work, as the term "tag" in Git means something that can't change. A tag in Docker is similar to a branch in Git.

A single image can be tagged multiple times with different tags. For example, the latest tag can also be version v1.5:

$ docker tag thoughts-backend:latest thoughts-backend:v1.5
$ docker images
REPOSITORY       TAG    IMAGE ID     CREATED    SIZE
thoughts-backend latest c7a8499623e7 5 min ago 144MB
thoughts-backend v1.5   c7a8499623e7 5 min ago 144MB

Note how the image id is the same. Using tags allows you to label specific images, so we know they are ready to be deployed or to give them some kind of significance.
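Conceptually, a tag behaves like a named pointer to an image ID, so docker tag adds a new name without copying any data. A minimal Python sketch of that idea, with the ID mimicking the output above:

```python
# Tags behave like named pointers to an image ID: tagging adds a new
# entry without duplicating the image. The ID mimics the output above.
images = {"thoughts-backend:latest": "c7a8499623e7"}

def tag(images, source, target):
    """Mimic docker tag SOURCE TARGET: point target at the same ID."""
    images[target] = images[source]

tag(images, "thoughts-backend:latest", "thoughts-backend:v1.5")
print(images["thoughts-backend:v1.5"])  # c7a8499623e7
```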

Pushing into a registry

Once we have tagged our image, we can push it to a shared registry so that other services can use it.

It is possible to deploy your own Docker registry, but, unless strictly necessary, it is better to avoid it. There are cloud providers that let you create your own registry, either public or private, even inside your own private cloud network. If you want to make your image available, the best option is Docker Hub, as it is the standard registry and the most easily accessible. In this chapter, we will create one there, but we will explore other options later in the book.

It's worth saying it again: maintaining your own Docker registry is much more expensive than using one from a provider. Commercial prices for registries, unless you require a lot of repos, will be in the range of tens of dollars per month, and there are options from well-known cloud providers such as AWS, Azure, and Google Cloud.

Avoid using your own registry unless you really need to.

We will create a new repo in the Docker Hub registry. You can create one private repo for free, and as many public repos as you need. You need to create a new user, which likely happened when you first downloaded Docker.

A repo, in Docker terminology, is a set of images with different tags; for example, all the tags of thoughts-backend. This is different from a registry, which is a server that contains several repos.

In more informal terms, it is common to call registries repos and repos images, though, strictly speaking, an image is unique and may be tagged or not.

Then, you can create a new repo, as follows:

[Screenshot: creating a new repo in Docker Hub]

Once the repo is created, we need to tag our image accordingly. This means it should include the username in Docker Hub to identify the repo. An alternative is to name the image with the username already included:

$ docker tag thoughts-backend:latest jaimebuelta/thoughts-backend:latest
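The naming convention applied by the docker tag command above, username/repository:tag, can be expressed as a small helper. The function name here is ours, for illustration only, and is not part of Docker:

```python
# Hypothetical helper composing a fully qualified Docker Hub reference,
# following the username/repository:tag convention used above.
def hub_reference(username, repo, tag="latest"):
    return f"{username}/{repo}:{tag}"

print(hub_reference("jaimebuelta", "thoughts-backend"))
# jaimebuelta/thoughts-backend:latest
```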

To be able to access the repo, we need to log in to Docker with our Docker Hub username and password:

$ docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username: jaimebuelta
Password:
Login Succeeded

Once logged in, you can push your image:

$ docker push jaimebuelta/thoughts-backend:latest
The push refers to repository [docker.io/jaimebuelta/thoughts-backend]
1ebb4000a299: Pushed
669047e32cec: Pushed
6f7246363f55: Pushed
ac1d27280799: Pushed
c43bb774a4bb: Pushed
992e49acee35: Pushed
11c1b6dd59b3: Pushed
7113f6aae2a4: Pushed
5275897866cf: Pushed
bcf2f368fe23: Mounted from library/alpine
latest: digest: sha256:f1463646b5a8dec3531842354d643f3d5d62a15cc658ac4a2bdbc2ecaf6bb145 size: 2404

With the local Docker correctly logged in, you can now share the image and pull it from anywhere. When we deploy a production cluster, we need to make sure that the Docker server executing it is able to access the registry and that it is properly logged in.

Summary

In this chapter, we learned how to use Docker commands to create and operate containers. We covered most of the commonly used Docker commands, such as build, run, exec, ps, images, tag, and push.

We saw how to build a web service container, including the preparation of configuration files, how to structure the Dockerfile, and how to make our images as small as possible. We also covered how to use docker-compose to operate locally, connecting the different containers running in a cluster configuration through a docker-compose.yaml file. This included creating a database container, which allows testing closer to what the production deployment will be, using the same tools.

We saw how to configure our service with environment variables and how to inject them through the docker-compose configuration to allow different modes, such as testing.

Finally, we analyzed how to use a registry to share our images, how to tag them adequately, and how to move them out of local development so they can be used in a deployment.

In the next chapter, we will see how to leverage the containers and actions we have created to automatically run tests, and how to let automated tools do the heavy lifting for us to ensure that our code is always of high quality!

Questions

  1. What does the FROM keyword do in a Dockerfile?
  2. How would you start a container with its predefined command?
  3. Why won't creating a step to remove files in a Dockerfile make a smaller image?
  4. Can you describe how a multistage Docker build works?
  5. What is the difference between the run and exec commands?
  6. When should we use the -it flags when using the run and exec commands?
  7. Do you know any alternatives to uWSGI to serve Python web applications?
  8. What is docker-compose used for?
  9. Can you describe what a Docker tag is?
  10. Why is it necessary to push images to a remote registry?

Further reading