Achieving a Global View and High Availability for Prometheus
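Thanos provides a set of components that together form a highly available metrics system with practically unlimited storage capacity. It can be added on top of an existing Prometheus deployment to provide a global query view, data backup, and access to historical data. These features can be used independently of one another, so you can introduce each Thanos capability only when you need it.
Initial cluster setup
You will deploy Prometheus in a Kubernetes cluster and then simulate the required scenario in it. The kind tool is a good way to start a Kubernetes cluster locally. You will use the following configuration.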
# config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: thanos-demo
nodes:
- role: control-plane
  image: kindest/node:v1.23.0@sha256:2f93d3c7b12a3e93e6c1f34f331415e105979961fcddbe69a4e3ab5a93ccbb35
- role: worker
  image: kindest/node:v1.23.0@sha256:2f93d3c7b12a3e93e6c1f34f331415e105979961fcddbe69a4e3ab5a93ccbb35
- role: worker
  image: kindest/node:v1.23.0@sha256:2f93d3c7b12a3e93e6c1f34f331415e105979961fcddbe69a4e3ab5a93ccbb35
With this configuration in place, you are ready to start the cluster.
~ kind create cluster --config config.yaml
Creating cluster "thanos-demo" ...
 ✓ Ensuring node image (kindest/node:v1.23.0)
 ✓ Preparing nodes
 ✓ Writing configuration
 ✓ Starting control-plane
 ✓ Installing CNI
 ✓ Installing StorageClass
 ✓ Joining worker nodes
Set kubectl context to "kind-thanos-demo"
You can now use your cluster with:
kubectl cluster-info --context kind-thanos-demo
Have a nice day!
Once the cluster is up and running, check the installation to make sure it is ready for Prometheus. You need kubectl to interact with the Kubernetes cluster.
~ kind get clusters
thanos-demo
~ kubectl get nodes
NAME                        STATUS   ROLES                  AGE    VERSION
thanos-demo-control-plane   Ready    control-plane,master   119s   v1.23.0
thanos-demo-worker          Ready    <none>                 88s    v1.23.0
thanos-demo-worker2         Ready    <none>                 88s    v1.23.0
~ kubectl get pods -o name -A
pod/coredns-64897985d-mz8bv
pod/coredns-64897985d-pxzkq
pod/etcd-thanos-demo-control-plane
pod/kindnet-27cdw
pod/kindnet-42kcv
pod/kindnet-5rlcj
pod/kube-apiserver-thanos-demo-control-plane
pod/kube-controller-manager-thanos-demo-control-plane
pod/kube-proxy-49mgg
pod/kube-proxy-nhvkm
pod/kube-proxy-z4fpn
pod/kube-scheduler-thanos-demo-control-plane
pod/local-path-provisioner-5bb5788f44-hj5c4
Initial Prometheus setup
~ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories
~ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "prometheus-community" chart repository
Update Complete. ⎈Happy Helming!⎈
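Because you actually have only one Kubernetes cluster, you will simulate multiple regions by deploying Prometheus in different namespaces. You will create one namespace for europe and another for united-states.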
~ kubectl create namespace europe
namespace/europe created
~ kubectl create namespace united-states
namespace/united-states created
With the regions in place, you are ready to deploy Prometheus.
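# prometheus-europe.yaml
nameOverride: "eu"
namespaceOverride: "europe"
nodeExporter:
  enabled: false
grafana:
  enabled: false
alertmanager:
  enabled: false
kubeStateMetrics:
  enabled: false
prometheus:
  prometheusSpec:
    replicas: 2
    replicaExternalLabelName: "replica"
    prometheusExternalLabelName: "cluster"

# prometheus-united-states.yaml
nameOverride: "us"
namespaceOverride: "united-states"
nodeExporter:
  enabled: false
grafana:
  enabled: false
alertmanager:
  enabled: false
kubeStateMetrics:
  enabled: false
prometheus:
  prometheusSpec:
    replicaExternalLabelName: "replica"
    prometheusExternalLabelName: "cluster"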
Using the configuration above, you will deploy a Prometheus instance in each region.
~ helm -n europe upgrade -i prometheus-europe prometheus-community/kube-prometheus-stack -f prometheus-europe.yaml
Release "prometheus-europe" does not exist. Installing it now.
NAME: prometheus-europe
LAST DEPLOYED: Sat Jan 22 18:26:22 2022
NAMESPACE: europe
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace europe get pods -l "release=prometheus-europe"
~ helm -n united-states upgrade -i prometheus-united-states prometheus-community/kube-prometheus-stack -f prometheus-united-states.yaml
Release "prometheus-united-states" does not exist. Installing it now.
NAME: prometheus-united-states
LAST DEPLOYED: Sat Jan 22 18:26:48 2022
NAMESPACE: united-states
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace united-states get pods -l "release=prometheus-united-states"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
Now confirm that Prometheus is running as expected.
~ kubectl -n europe get pods -l app.kubernetes.io/name=prometheus
NAME                                        READY   STATUS    RESTARTS   AGE
prometheus-prometheus-europe-prometheus-0   2/2     Running   0          18s
prometheus-prometheus-europe-prometheus-1   2/2     Running   0          18s
~ kubectl -n united-states get pods -l app.kubernetes.io/name=prometheus
NAME                                               READY   STATUS    RESTARTS   AGE
prometheus-prometheus-united-states-prometheus-0   2/2     Running   0          39s
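You can now query any metric on each individual instance, but you cannot yet run multi-cluster queries.
Deploying the Thanos Sidecar
kube-prometheus-stack supports deploying Thanos as a sidecar, meaning it is deployed alongside Prometheus itself. The Thanos sidecar exposes Prometheus through the StoreAPI, a generic gRPC API that lets Thanos components fetch metrics from a variety of systems.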
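# prometheus-europe.yaml
nameOverride: "eu"
namespaceOverride: "europe"
nodeExporter:
  enabled: false
grafana:
  enabled: false
alertmanager:
  enabled: false
kubeStateMetrics:
  enabled: false
prometheus:
  prometheusSpec:
    replicas: 2
    replicaExternalLabelName: "replica"
    prometheusExternalLabelName: "cluster"
    thanos:
      baseImage: quay.io/thanos/thanos  # key name assumed; newer operator releases expect a full image: reference
      version: v0.24.0

# prometheus-united-states.yaml
nameOverride: "us"
namespaceOverride: "united-states"
nodeExporter:
  enabled: false
grafana:
  enabled: false
alertmanager:
  enabled: false
kubeStateMetrics:
  enabled: false
prometheus:
  prometheusSpec:
    replicaExternalLabelName: "replica"
    prometheusExternalLabelName: "cluster"
    thanos:
      baseImage: quay.io/thanos/thanos  # key name assumed; newer operator releases expect a full image: reference
      version: v0.24.0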
With the updated configuration, you are ready to upgrade Prometheus.
~ helm -n europe upgrade -i prometheus-europe prometheus-community/kube-prometheus-stack -f 2/prometheus-europe.yaml
Release "prometheus-europe" has been upgraded. Happy Helming!
NAME: prometheus-europe
LAST DEPLOYED: Sat Jan 22 18:42:24 2022
NAMESPACE: europe
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace europe get pods -l "release=prometheus-europe"
~ helm -n united-states upgrade -i prometheus-united-states prometheus-community/kube-prometheus-stack -f 2/prometheus-united-states.yaml
Release "prometheus-united-states" has been upgraded. Happy Helming!
NAME: prometheus-united-states
LAST DEPLOYED: Sat Jan 22 18:43:06 2022
NAMESPACE: united-states
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace united-states get pods -l "release=prometheus-united-states"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
Verify that each Prometheus pod now has an additional container running alongside it.
~ kubectl -n europe get pods -l app.kubernetes.io/name=prometheus
NAME                                        READY   STATUS    RESTARTS   AGE
prometheus-prometheus-europe-prometheus-0   3/3     Running   0          48s
prometheus-prometheus-europe-prometheus-1   3/3     Running   0          65s
~ kubectl -n united-states get pods -l app.kubernetes.io/name=prometheus
NAME                                               READY   STATUS    RESTARTS   AGE
prometheus-prometheus-united-states-prometheus-0   3/3     Running   0          44s
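Deploying Thanos Querier for a global view
Querier implements the Prometheus HTTP v1 API so that data in a Thanos cluster can be queried with PromQL. It lets you fetch metrics from a single endpoint. It first gathers the data needed to evaluate the query from the underlying StoreAPIs, then evaluates the query, and finally returns the result.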
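The three steps above (gather, evaluate, return) can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Store` callables are hypothetical in-memory stand-ins for StoreAPI endpoints, and the only "query" supported is a sum over one metric, so it shows the shape of the fan-out rather than how Thanos is actually implemented.

```python
from typing import Callable, Dict, List, Tuple

# A series is a (labels, value) pair; labels are a tuple of (name, value) pairs.
Labels = Tuple[Tuple[str, str], ...]
Series = Tuple[Labels, float]
# Hypothetical stand-in for a StoreAPI endpoint: returns series for a metric name.
Store = Callable[[str], List[Series]]


def make_store(data: Dict[str, List[Series]]) -> Store:
    """Build an in-memory store that answers lookups from a dict."""
    return lambda metric: list(data.get(metric, []))


def query_sum(stores: List[Store], metric: str) -> float:
    """Collect matching series from every store, then evaluate sum(metric)."""
    collected: List[Series] = []
    for store in stores:  # step 1: gather data from each store
        collected.extend(store(metric))
    return sum(value for _, value in collected)  # step 2: evaluate the query


europe = make_store({"up": [((("cluster", "europe"),), 1.0)]})
united_states = make_store({"up": [((("cluster", "united-states"),), 1.0)]})
print(query_sum([europe, united_states], "up"))  # step 3: return the result
```

The caller sees a single answer and does not need to know how many stores contributed to it, which is the point of having one query endpoint.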
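You used kube-prometheus-stack to deploy the Thanos sidecar. Unfortunately, that chart does not support the other Thanos components. For those, you will use the Banzai Cloud Helm Charts repository. As before, start by adding the repository.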
~ helm repo add banzaicloud https://kubernetes-charts.banzaicloud.com
"banzaicloud" has been added to your repositories
~ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "prometheus-community" chart repository
...Successfully got an update from the "banzaicloud" chart repository
Update Complete. ⎈Happy Helming!⎈
To simulate a centralized monitoring solution, you will create a monitoring namespace.
~ kubectl create namespace monitoring
namespace/monitoring created
The following configuration enables Thanos Querier and points it at the Prometheus instances.
# query.yaml
store:
  enabled: false
compact:
  enabled: false
bucket:
  enabled: false
rule:
  enabled: false
sidecar:
  enabled: false
queryFrontend:
  enabled: false
query:
  enabled: true
  replicaLabels:
    - replica
  stores:
    - "dnssrv+_grpc._tcp.prometheus-operated.europe.svc.cluster.local"
    - "dnssrv+_grpc._tcp.prometheus-operated.united-states.svc.cluster.local"
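The store addresses in this configuration carry a `dnssrv+` prefix, which tells Thanos to discover endpoints through a DNS SRV lookup instead of using a fixed host and port. The name that follows has the Kubernetes SRV form: port name (`_grpc`), protocol (`_tcp`), service (`prometheus-operated`), then namespace. The sketch below only splits the prefix from the name to show how such an address is read. The `parse_store_address` helper and its return strings are illustrative, and no DNS lookup is performed.

```python
from typing import Tuple

# Prefixes documented for Thanos DNS service discovery, mapped to the lookup type.
LOOKUPS = {
    "dnssrvnoa+": "SRV (no A/AAAA follow-up)",
    "dnssrv+": "SRV, then A/AAAA",
    "dns+": "A/AAAA",
}


def parse_store_address(address: str) -> Tuple[str, str]:
    """Split a store address into (lookup type, name to resolve)."""
    for prefix, lookup in LOOKUPS.items():
        if address.startswith(prefix):
            return lookup, address[len(prefix):]
    return "static", address  # no prefix: the address is used as given


print(parse_store_address(
    "dnssrv+_grpc._tcp.prometheus-operated.europe.svc.cluster.local"
))
```

With an SRV lookup the port comes from the DNS record, so the address does not need to include one, and each Prometheus replica behind the service is discovered as its own endpoint.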
With the configuration above, you are ready to deploy Querier.
~ helm -n monitoring upgrade -i thanos banzaicloud/thanos -f query.yaml
Release "thanos" does not exist. Installing it now.
NAME: thanos
LAST DEPLOYED: Sat Jan 22 18:48:03 2022
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
TEST SUITE: None
~ kubectl -n monitoring port-forward svc/thanos-query-http 10902:10902
Forwarding from 127.0.0.1:10902 -> 10902
Forwarding from [::1]:10902 -> 10902
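Using port-forward, you can connect to Querier in the cluster. Make sure you can run multi-cluster queries. When you deployed Prometheus, you set replicaExternalLabelName: "replica" and prometheusExternalLabelName: "cluster". Deduplication relies on these settings. With it enabled, metrics from the europe cluster are deduplicated, because Thanos assumes the two replicas belong to the same high-availability group: their label sets are identical except for the replica label.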
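The grouping rule described above can be sketched in a few lines. This is a simplified illustration, not Thanos's algorithm: the real deduplication merges samples from the replicas over time, whereas this toy only shows why two series that differ solely in the replica label collapse into one. The label values are illustrative and not copied from a real cluster.

```python
from typing import Dict, List

Labels = Dict[str, str]


def deduplicate(series: List[Labels], replica_label: str) -> List[Labels]:
    """Keep one series per label set once the replica label is ignored."""
    seen = set()
    result: List[Labels] = []
    for labels in series:
        rest = {k: v for k, v in labels.items() if k != replica_label}
        key = tuple(sorted(rest.items()))
        if key not in seen:  # same labels apart from replica: same HA group
            seen.add(key)
            result.append(rest)
    return result


# Illustrative label values only.
series = [
    {"__name__": "up", "cluster": "europe", "replica": "prometheus-0"},
    {"__name__": "up", "cluster": "europe", "replica": "prometheus-1"},
    {"__name__": "up", "cluster": "united-states", "replica": "prometheus-0"},
]
for labels in deduplicate(series, "replica"):
    print(labels)
```

The two europe series fold into one, while the united-states series survives because its cluster label differs.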
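Deploying Thanos Query Frontend to improve the read path
The last step is to deploy Query Frontend, a service that can sit in front of Querier to improve the read path. It is based on the Cortex Query Frontend component and supports features such as splitting, retries, caching, and slow query logging.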
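Of the features listed, splitting is the easiest to picture: a long range query is cut into shorter sub-queries that can run in parallel and be cached separately. The sketch below assumes a 24-hour split interval, boundaries aligned to multiples of that interval, and one-second timestamp resolution. The `split_range` helper is hypothetical and ignores details such as step alignment.

```python
from typing import List, Tuple

DAY = 24 * 60 * 60  # assumed split interval, in seconds


def split_range(start: int, end: int, interval: int = DAY) -> List[Tuple[int, int]]:
    """Split [start, end] into sub-ranges that do not cross interval boundaries."""
    parts: List[Tuple[int, int]] = []
    cursor = start
    while cursor <= end:
        boundary = (cursor // interval + 1) * interval  # next aligned boundary
        parts.append((cursor, min(boundary - 1, end)))
        cursor = boundary
    return parts


# A three-day range becomes three one-day sub-queries.
print(split_range(0, 3 * DAY - 1))
```

Aligning sub-ranges to fixed boundaries means that two different dashboards asking for overlapping time windows produce identical sub-queries, which is what lets a results cache be reused.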
# query.yaml
store:
  enabled: false
compact:
  enabled: false
bucket:
  enabled: false
rule:
  enabled: false
sidecar:
  enabled: false
queryFrontend:
  enabled: true
query:
  enabled: true
  replicaLabels:
    - replica
  stores:
    - "dnssrv+_grpc._tcp.prometheus-operated.europe.svc.cluster.local"
    - "dnssrv+_grpc._tcp.prometheus-operated.united-states.svc.cluster.local"
With the earlier configuration updated to enable Query Frontend, you can now upgrade the release.
~ helm -n monitoring upgrade -i thanos banzaicloud/thanos -f query.yaml
Release "thanos" has been upgraded. Happy Helming!
NAME: thanos
LAST DEPLOYED: Sat Jan 22 18:56:29 2022
NAMESPACE: monitoring
STATUS: deployed
REVISION: 2
TEST SUITE: None
~ kubectl -n monitoring port-forward svc/thanos-query-frontend-http 10902:10902
Forwarding from 127.0.0.1:10902 -> 10902
Forwarding from [::1]:10902 -> 10902
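Using port-forward again, you can now access Query Frontend.
Query Frontend is the entry point for sending queries to multiple Prometheus instances. Services that run such queries, such as Grafana, should query through Query Frontend.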
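Any client that speaks the Prometheus HTTP API can use this entry point. As a minimal sketch, the helper below builds a range-query URL against the port-forwarded address from the previous step. The `range_query_url` function, the example PromQL expression, and the timestamps are illustrative assumptions. The code only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Address of the port-forwarded Query Frontend from the previous step.
BASE_URL = "http://localhost:10902"


def range_query_url(query: str, start: int, end: int, step: int) -> str:
    """Build a Prometheus HTTP API v1 range-query URL."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{BASE_URL}/api/v1/query_range?{params}"


# Hypothetical one-hour window with a 60-second step.
print(range_query_url("sum by (cluster) (up)", 1642870000, 1642873600, 60))
```

The resulting URL can be fetched with any HTTP client while the port-forward is active. Inside the cluster, a client would target the thanos-query-frontend-http service in the monitoring namespace instead of localhost.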
Reference:
https://www.kubernetes.org.cn/9877.html
