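Achieving a Global View and High Availability for Prometheus
Implementing a global view and high availability
Thanos provides a set of components that together form a highly available metrics system with practically unlimited storage capacity. It can be added on top of an existing Prometheus deployment to provide a global query view, data backup, and access to historical data. These features can be used independently of one another, so you can introduce Thanos capabilities only as you need them.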
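Initial cluster setup
You will deploy Prometheus in a Kubernetes cluster and simulate the required scenario there. The kind tool is a convenient way to start a Kubernetes cluster locally. You will use the following configuration.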
# config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: thanos-demo
nodes:
  - role: control-plane
    image: kindest/node:v1.23.0@sha256:2f93d3c7b12a3e93e6c1f34f331415e105979961fcddbe69a4e3ab5a93ccbb35
  - role: worker
    image: kindest/node:v1.23.0@sha256:2f93d3c7b12a3e93e6c1f34f331415e105979961fcddbe69a4e3ab5a93ccbb35
  - role: worker
    image: kindest/node:v1.23.0@sha256:2f93d3c7b12a3e93e6c1f34f331415e105979961fcddbe69a4e3ab5a93ccbb35
With this configuration in place, you can start the cluster.
~ kind create cluster --config config.yaml
Creating cluster "thanos-demo" ...
 ✓ Ensuring node image (kindest/node:v1.23.0)
 ✓ Preparing nodes
 ✓ Writing configuration
 ✓ Starting control-plane
 ✓ Installing CNI
 ✓ Installing StorageClass
 ✓ Joining worker nodes
Set kubectl context to "kind-thanos-demo"
You can now use your cluster with:
kubectl cluster-info --context kind-thanos-demo
Have a nice day!
Once the cluster is up and running, check the installation to make sure it is ready for Prometheus. You need kubectl to interact with the Kubernetes cluster.
~ kind get clusters
thanos-demo
~ kubectl get nodes
NAME                        STATUS   ROLES                  AGE    VERSION
thanos-demo-control-plane   Ready    control-plane,master   119s   v1.23.0
thanos-demo-worker          Ready    <none>                 88s    v1.23.0
thanos-demo-worker2         Ready    <none>                 88s    v1.23.0
~ kubectl get pods -o name -A
pod/coredns-64897985d-mz8bv
pod/coredns-64897985d-pxzkq
pod/etcd-thanos-demo-control-plane
pod/kindnet-27cdw
pod/kindnet-42kcv
pod/kindnet-5rlcj
pod/kube-apiserver-thanos-demo-control-plane
pod/kube-controller-manager-thanos-demo-control-plane
pod/kube-proxy-49mgg
pod/kube-proxy-nhvkm
pod/kube-proxy-z4fpn
pod/kube-scheduler-thanos-demo-control-plane
pod/local-path-provisioner-5bb5788f44-hj5c4
Initial Prometheus setup
~ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories
~ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "prometheus-community" chart repository
Update Complete. ⎈Happy Helming!⎈
Because you actually have only one Kubernetes cluster, you will simulate multiple regions by deploying Prometheus in different namespaces. You will create one namespace for europe and another for united-states.
~ kubectl create namespace europe
namespace/europe created
~ kubectl create namespace united-states
namespace/united-states created
With the regions in place, you are ready to deploy Prometheus.
# prometheus-europe.yaml
nameOverride: "eu"
namespaceOverride: "europe"
nodeExporter:
  enabled: false
grafana:
  enabled: false
alertmanager:
  enabled: false
kubeStateMetrics:
  enabled: false
prometheus:
  prometheusSpec:
    replicas: 2
    replicaExternalLabelName: "replica"
    prometheusExternalLabelName: "cluster"

# prometheus-united-states.yaml
nameOverride: "us"
namespaceOverride: "united-states"
nodeExporter:
  enabled: false
grafana:
  enabled: false
alertmanager:
  enabled: false
kubeStateMetrics:
  enabled: false
prometheus:
  prometheusSpec:
    replicaExternalLabelName: "replica"
    prometheusExternalLabelName: "cluster"
Using the configuration above, you will deploy a Prometheus instance in each region.
~ helm -n europe upgrade -i prometheus-europe prometheus-community/kube-prometheus-stack -f prometheus-europe.yaml
Release "prometheus-europe" does not exist. Installing it now.
NAME: prometheus-europe
LAST DEPLOYED: Sat Jan 22 18:26:22 2022
NAMESPACE: europe
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace europe get pods -l "release=prometheus-europe"
~ helm -n united-states upgrade -i prometheus-united-states prometheus-community/kube-prometheus-stack -f prometheus-united-states.yaml
Release "prometheus-united-states" does not exist. Installing it now.
NAME: prometheus-united-states
LAST DEPLOYED: Sat Jan 22 18:26:48 2022
NAMESPACE: united-states
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace united-states get pods -l "release=prometheus-united-states"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
Now verify that your Prometheus instances are running as expected.
~ kubectl -n europe get pods -l app.kubernetes.io/name=prometheus
NAME                                        READY   STATUS    RESTARTS   AGE
prometheus-prometheus-europe-prometheus-0   2/2     Running   0          18s
prometheus-prometheus-europe-prometheus-1   2/2     Running   0          18s
~ kubectl -n united-states get pods -l app.kubernetes.io/name=prometheus
NAME                                               READY   STATUS    RESTARTS   AGE
prometheus-prometheus-united-states-prometheus-0   2/2     Running   0          39s
You can now query any metric on each individual instance, but you cannot yet run multi-cluster queries.
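Deploying the Thanos sidecar
kube-prometheus-stack supports deploying Thanos as a sidecar, meaning it runs alongside Prometheus itself in the same pod. The Thanos sidecar exposes Prometheus through the StoreAPI, a generic gRPC API that lets Thanos components fetch metrics from a variety of systems.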
# prometheus-europe.yaml
nameOverride: "eu"
namespaceOverride: "europe"
nodeExporter:
  enabled: false
grafana:
  enabled: false
alertmanager:
  enabled: false
kubeStateMetrics:
  enabled: false
prometheus:
  prometheusSpec:
    replicas: 2
    replicaExternalLabelName: "replica"
    prometheusExternalLabelName: "cluster"
    thanos:
      baseImage: quay.io/thanos/thanos
      version: v0.24.0

# prometheus-united-states.yaml
nameOverride: "us"
namespaceOverride: "united-states"
nodeExporter:
  enabled: false
grafana:
  enabled: false
alertmanager:
  enabled: false
kubeStateMetrics:
  enabled: false
prometheus:
  prometheusSpec:
    replicaExternalLabelName: "replica"
    prometheusExternalLabelName: "cluster"
    thanos:
      baseImage: quay.io/thanos/thanos
      version: v0.24.0
With the updated configuration, you are ready to upgrade Prometheus.
~ helm -n europe upgrade -i prometheus-europe prometheus-community/kube-prometheus-stack -f 2/prometheus-europe.yaml
Release "prometheus-europe" has been upgraded. Happy Helming!
NAME: prometheus-europe
LAST DEPLOYED: Sat Jan 22 18:42:24 2022
NAMESPACE: europe
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace europe get pods -l "release=prometheus-europe"
~ helm -n united-states upgrade -i prometheus-united-states prometheus-community/kube-prometheus-stack -f 2/prometheus-united-states.yaml
Release "prometheus-united-states" has been upgraded. Happy Helming!
NAME: prometheus-united-states
LAST DEPLOYED: Sat Jan 22 18:43:06 2022
NAMESPACE: united-states
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace united-states get pods -l "release=prometheus-united-states"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
Verify that each Prometheus pod now has one additional container running alongside it.
~ kubectl -n europe get pods -l app.kubernetes.io/name=prometheus
NAME                                        READY   STATUS    RESTARTS   AGE
prometheus-prometheus-europe-prometheus-0   3/3     Running   0          48s
prometheus-prometheus-europe-prometheus-1   3/3     Running   0          65s
~ kubectl -n united-states get pods -l app.kubernetes.io/name=prometheus
NAME                                               READY   STATUS    RESTARTS   AGE
prometheus-prometheus-united-states-prometheus-0   3/3     Running   0          44s
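Deploying Thanos Querier for a global view
The Querier implements the Prometheus HTTP v1 API so that data in a Thanos cluster can be queried with PromQL. It lets you fetch metrics from a single endpoint. It first collects the data needed to evaluate a query from the underlying StoreAPIs, then evaluates the query, and finally returns the result.
You used kube-prometheus-stack to deploy the Thanos sidecar. Unfortunately, that chart does not support the other Thanos components. For those, you will use the Banzai Cloud Helm charts repository. As before, start by adding the repository.
~ helm repo add banzaicloud https://kubernetes-charts.banzaicloud.com
"banzaicloud" has been added to your repositories
~ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "prometheus-community" chart repository
...Successfully got an update from the "banzaicloud" chart repository
Update Complete. ⎈Happy Helming!⎈
To simulate a centralized monitoring solution, you will create a monitoring namespace.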
~ kubectl create namespace monitoring
namespace/monitoring created
The following configuration sets up Thanos Querier and points it at the Prometheus instances.
# query.yaml
store:
  enabled: false
compact:
  enabled: false
bucket:
  enabled: false
rule:
  enabled: false
sidecar:
  enabled: false
queryFrontend:
  enabled: false
query:
  enabled: true
  replicaLabels:
    - replica
  stores:
    - "dnssrv+_grpc._tcp.prometheus-operated.europe.svc.cluster.local"
    - "dnssrv+_grpc._tcp.prometheus-operated.united-states.svc.cluster.local"
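The two store entries follow a regular pattern: the `dnssrv+` prefix asks Thanos to resolve a DNS SRV record, and the record name is built from a port name, a protocol, a service, and a namespace. The Python sketch below only illustrates that composition. The function name `store_address` and its parameters are hypothetical, and the defaults (`grpc` port name, `cluster.local` domain) are assumptions taken from the addresses shown in the configuration.

```python
def store_address(service: str, namespace: str,
                  port_name: str = "grpc",
                  cluster_domain: str = "cluster.local") -> str:
    """Compose a Thanos store address that uses DNS SRV discovery.

    Hypothetical helper: it only mirrors the shape of the addresses used
    in query.yaml and performs no DNS lookup itself.
    """
    # Kubernetes-style SRV name: _<port-name>._<protocol>.<service>.<namespace>.svc.<domain>
    srv_name = f"_{port_name}._tcp.{service}.{namespace}.svc.{cluster_domain}"
    # The "dnssrv+" prefix marks the name for SRV-based discovery.
    return f"dnssrv+{srv_name}"


# One address per simulated region, matching the namespaces created earlier.
for region in ("europe", "united-states"):
    print(store_address("prometheus-operated", region))
```

Adding another simulated region would then mean adding one more entry of the same shape under `stores`.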
With this query.yaml in place, you are ready to deploy the Querier.
~ helm -n monitoring upgrade -i thanos banzaicloud/thanos -f query.yaml
Release "thanos" does not exist. Installing it now.
NAME: thanos
LAST DEPLOYED: Sat Jan 22 18:48:03 2022
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
TEST SUITE: None
~ kubectl -n monitoring port-forward svc/thanos-query-http 10902:10902
Forwarding from 127.0.0.1:10902 -> 10902
Forwarding from [::1]:10902 -> 10902
With port-forward running, you can connect to the Querier and confirm that multi-cluster queries work. When you deployed Prometheus, you set replicaExternalLabelName: "replica" and prometheusExternalLabelName: "cluster". Deduplication relies on these settings. With deduplication enabled, metrics from the europe cluster are deduplicated, because Thanos assumes the two replicas belong to the same high-availability group: their series carry identical labels except for the replica label.
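To make the deduplication idea concrete, here is a minimal Python sketch. It is a simplification and not Thanos's actual algorithm: it drops the replica label, groups series whose remaining labels are identical, and keeps the first sample seen for each timestamp. The label values (`europe`, `replica-0`, and so on) and the sample data are hypothetical.

```python
from collections import defaultdict


def deduplicate(series, replica_label="replica"):
    """Merge series that differ only in the replica label.

    `series` is a list of (labels, samples) pairs, where labels is a dict
    and samples is a list of (timestamp, value) tuples.
    """
    merged = defaultdict(dict)
    for labels, samples in series:
        # Identity of a series once the replica label is ignored.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        for timestamp, value in samples:
            # Keep the first value seen for a timestamp; later replicas only fill gaps.
            merged[key].setdefault(timestamp, value)
    return {key: sorted(points.items()) for key, points in merged.items()}


# Hypothetical data: two europe replicas and one united-states instance.
series = [
    ({"__name__": "up", "cluster": "europe", "replica": "replica-0"},
     [(0, 1.0), (30, 1.0)]),
    ({"__name__": "up", "cluster": "europe", "replica": "replica-1"},
     [(0, 1.0), (30, 1.0), (60, 1.0)]),
    ({"__name__": "up", "cluster": "united-states", "replica": "replica-0"},
     [(0, 1.0), (30, 1.0)]),
]

result = deduplicate(series)
for key, points in result.items():
    print(dict(key), points)
```

The two europe series collapse into one, with the second replica filling in the sample the first one missed, while the united-states series passes through unchanged.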
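Deploying Thanos Query Frontend to improve the read path
The last piece is the Query Frontend, a service that sits in front of the Querier to improve the read path. It is based on the Cortex Query Frontend component and supports features such as query splitting, retries, caching, and slow-query logging.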
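Splitting and caching work together: a long range query is cut into interval-aligned pieces, and pieces that have already been answered are served from the cache. The Python sketch below illustrates the idea only and is not the Query Frontend's implementation. The names `split_range`, `query_range`, and `fetch`, the one-day interval, and the in-memory dict cache are all assumptions made for the example.

```python
DAY = 86400  # assumed split interval, in seconds


def split_range(start, end, interval=DAY):
    """Split [start, end) into sub-ranges aligned to multiples of interval."""
    ranges = []
    cursor = start
    while cursor < end:
        # Next multiple of `interval` strictly after the cursor.
        boundary = (cursor // interval + 1) * interval
        ranges.append((cursor, min(boundary, end)))
        cursor = boundary
    return ranges


cache = {}          # sub-range -> result; stands in for a real results cache
backend_calls = []  # records which sub-ranges actually reached the backend


def fetch(sub_range):
    """Hypothetical stand-in for sending one sub-query to the Querier."""
    backend_calls.append(sub_range)
    return f"result for {sub_range}"


def query_range(start, end, interval=DAY):
    """Answer a range query from cached sub-ranges where possible."""
    results = []
    for sub_range in split_range(start, end, interval):
        if sub_range not in cache:
            cache[sub_range] = fetch(sub_range)
        results.append(cache[sub_range])
    return results


query_range(20 * 3600, 60 * 3600)  # three sub-ranges, all fetched
query_range(24 * 3600, 60 * 3600)  # both sub-ranges already cached
print(len(backend_calls))
```

Because the sub-ranges are aligned to fixed boundaries, overlapping queries issued at different times tend to share pieces, which is what makes the cache effective.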
# query.yaml
store:
  enabled: false
compact:
  enabled: false
bucket:
  enabled: false
rule:
  enabled: false
sidecar:
  enabled: false
queryFrontend:
  enabled: true
query:
  enabled: true
  replicaLabels:
    - replica
  stores:
    - "dnssrv+_grpc._tcp.prometheus-operated.europe.svc.cluster.local"
    - "dnssrv+_grpc._tcp.prometheus-operated.united-states.svc.cluster.local"
With the configuration updated to enable the Query Frontend, you can now upgrade the release.
~ helm -n monitoring upgrade -i thanos banzaicloud/thanos -f query.yaml
Release "thanos" has been upgraded. Happy Helming!
NAME: thanos
LAST DEPLOYED: Sat Jan 22 18:56:29 2022
NAMESPACE: monitoring
STATUS: deployed
REVISION: 2
TEST SUITE: None
~ kubectl -n monitoring port-forward svc/thanos-query-frontend-http 10902:10902
Forwarding from 127.0.0.1:10902 -> 10902
Forwarding from [::1]:10902 -> 10902
Using port-forward again, you can now reach the Query Frontend.
The Query Frontend is the entry point for queries that span multiple Prometheus instances. Services that run such queries, such as Grafana, should send them through the Query Frontend.
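As a rough illustration of what such a client sends, the Python sketch below builds a range-query URL for the Prometheus HTTP v1 API. It assumes the port-forward above is still running on localhost:10902. The helper name `range_query_url`, the PromQL expression, and the timestamps are hypothetical, and the code only constructs the URL without sending a request.

```python
from urllib.parse import urlencode


def range_query_url(base_url, promql, start, end, step):
    """Build a /api/v1/query_range URL (Prometheus HTTP v1 API)."""
    params = {
        "query": promql,  # PromQL expression to evaluate
        "start": start,   # range start, Unix timestamp in seconds
        "end": end,       # range end, Unix timestamp in seconds
        "step": step,     # resolution step, in seconds
    }
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"


# Assumed endpoint: the Query Frontend exposed locally by port-forward.
url = range_query_url(
    "http://localhost:10902",
    "sum by (cluster) (up)",
    start=1642876000,
    end=1642879600,
    step=30,
)
print(url)
```

Grouping by the cluster label is what makes this a multi-cluster query: the label was attached to every series through prometheusExternalLabelName when Prometheus was deployed.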
Reference:
https://www.kubernetes.org.cn/9877.html