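Monitoring is a key concern in any microservice architecture, and especially in any cloud-based one. Whatever the design, your architecture needs a monitoring platform that continuously observes the system's performance, reliability, resource availability and consumption, security, storage, and so on.
However, choosing the right platform can be difficult, because many components come into play. The tasks involved in properly implementing a monitoring platform are as follows: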
- Use one platform: A platform that's capable of discovering and gathering information about the running systems, and of aggregating the results in a comprehensive way using charts.
- Identify metrics and events: The application is responsible for exposing its metrics and events, and the platform should collect only the most relevant ones.
- Split data: Store application-monitoring data separately from infrastructure-monitoring data, but centralize the monitoring view.
- Alert: Provide alerts when limits are reached, both for the application and for the infrastructure. For example, when an application is performing slowly or when storage is running out of space.
- Observe user experience: Response times, throughput, latency, and errors.
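
To make the alerting task more concrete, here is a minimal sketch in plain Python of checking sampled values against limits for both the application and the infrastructure. It is only an illustration, not how any particular monitoring platform expresses alerts. The `AlertRule` type, the `evaluate` function, the metric names, and the limits are all hypothetical and chosen for this example.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AlertRule:
    """A hypothetical alert rule: one metric compared against one limit."""
    name: str
    source: str       # "application" or "infrastructure"
    metric: str       # assumed metric name, for illustration only
    limit: float
    breached: Callable[[float, float], bool]  # e.g. operator.ge or operator.le


# Assumed limits: slow responses (application) and low disk space (infrastructure).
RULES: List[AlertRule] = [
    AlertRule("SlowResponses", "application",
              "http_response_seconds_p95", 0.5, operator.ge),
    AlertRule("LowDiskSpace", "infrastructure",
              "disk_free_ratio", 0.10, operator.le),
]


def evaluate(rules: List[AlertRule], samples: Dict[str, float]) -> List[str]:
    """Return one message per rule whose limit is reached by the sampled value."""
    fired = []
    for rule in rules:
        value = samples.get(rule.metric)
        if value is None:
            # No sample for this metric, so nothing to compare.
            continue
        if rule.breached(value, rule.limit):
            fired.append(
                f"[{rule.source}] {rule.name}: "
                f"{rule.metric}={value} (limit {rule.limit})"
            )
    return fired


if __name__ == "__main__":
    # The application is slow, but there is still plenty of disk space.
    current = {"http_response_seconds_p95": 0.82, "disk_free_ratio": 0.25}
    for message in evaluate(RULES, current):
        print(message)
```

Keeping the `source` field on each rule mirrors the idea of storing application and infrastructure data separately while still evaluating and viewing them in one place.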
In this chapter, we will cover the following topics:
- Prometheus
- Node-exporter
- Grafana