Reading site PV and response time from ES and writing them to MySQL
In our current production environment, Nginx runs dozens of vhosts published on the public Internet. Every month the quality management department asks for reports on each site's PV, access success rate, response time, and similar metrics, which are used as KPIs.
My setup for this is: the Nginx access logs are written in JSON format, collected into ES, and displayed on a Grafana dashboard.
For the first few months I exported the reports from ES by hand and then processed and cross-checked the data, which took roughly 1-2 days each time. Eventually I got tired of doing it manually and wrote a script instead, which is what this document covers.
As mentioned, there are dozens of sites being collected and the daily ES indices are fairly large, so I only keep the raw data for one month; for historical data, one summary record per site per day is inserted into the database.
The idea behind the collection script is:
At 2 a.m. every day, collect the previous day's data from ES and insert it into the database. The whole process is logged, and failed operations are marked with the keyword "failed" so they can be picked up by monitoring.
This job suddenly became a lot easier!
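Before the first run, the kpi table the script writes to has to exist. A minimal sketch of its definition (the column names come from the INSERT statement in the script; the types, lengths, and charset are my assumptions):
create table kpi (
    id          int auto_increment primary key,  -- surrogate key (assumption)
    date        date         not null,           -- statistics date (yesterday)
    sysname     varchar(64)  not null,           -- system name from the site list
    domain      varchar(128) not null,           -- vhost domain
    pv          bigint,                          -- page views for the day
    reptime     decimal(12,8),                   -- average response time in seconds
    create_time datetime                         -- when the record was written
) default charset=utf8mb4;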
The script:
#!/bin/bash
time=`date -d "yesterday" +%Y-%m-%d`
datetime=`date '+%Y-%m-%d %H:%M:%S'`
# Define the log file
topdir=$(cd $(dirname "$0") && pwd)
log_file="${topdir}/message.log"
# Define the ES query time range [format: yyyy-mm-dd hh:mm:ss]
starttime="${time} 00:00:00"
endtime="${time} 23:59:59"
# Convert the query times to Unix timestamps in milliseconds
starttime1=`expr $(date -d "${starttime}" +%s%N) / 1000000`
endtime1=`expr $(date -d "${endtime}" +%s%N) / 1000000`
# Extract the year, month, and date from the query time
year=$(echo "$time" | awk -F'-' '{print $1}')
month=$(echo "$time" | awk -F'-' '{print $2}')
date="$time"
# Define the sites to query (format: domain|system name)
site='
www.test1.com|主数据系统
crm.test2.com|集团CRM
bbs.test.com|xxx论坛
pm.test.com|项目管理
monitor.test.com|监控服务
'
# Define the ES connection info
es_ip='x.x.x.x'
es_port='xxxx'
index="it-cluster_nginx-${year}-${month}-*"  # the year is now taken from the query time as well, instead of being hard-coded
# Define the MySQL connection info
db_ip='x.x.x.x'
db_name='kpireport'
db_user='kpireport'
db_passwd='xxxxxxx'
# Logging helper
function log()
{
echo "$datetime $1" | tee -a "$log_file"
}
# Check that the mysql client is installed
which mysql 1>/dev/null 2>&1
if [ $? -eq 0 ]
then
mysql_path=`which mysql`
else
log "MySQL client not found,you need install first!exit..."
exit 1
fi
# Helper that runs a SQL statement against the kpi database
function insertdb()
{
$mysql_path -h${db_ip} -u${db_user} -p${db_passwd} <<EOF
use ${db_name};
$1;
EOF
}
# Check whether the previous command (here: insertdb) succeeded, using its exit status, and log the result
function check_running_ok()
{
if [ $? -eq 0 ]
then
echo "$datetime $1" | tee -a "$log_file"
else
echo "$datetime $2" | tee -a "$log_file"
fi
}
for i in ${site}
do
domain=$(echo "$i" | awk -F'|' '{print $1}')
sysname=$(echo "$i" | awk -F'|' '{print $2}')
# ES query body for PV: count of requests whose status is not 500
pv='
{
"track_total_hits": true,
"query": {
"bool": {
"filter": [{
"range": {
"@timestamp": {
"gte": '${starttime1}',
"lte": '${endtime1}',
"format": "epoch_millis"
}
}
}, {
"query_string": {
"analyze_wildcard": true,
"query": "domain:'${domain}' AND status:(NOT 500)"
}
}]
}
},
"aggs": {
"2": {
"date_histogram": {
"interval": "6h",
"field": "@timestamp",
"min_doc_count": 0,
"extended_bounds": {
"min": '${starttime1}',
"max": '${endtime1}'
},
"format": "epoch_millis"
},
"aggs": {}
}
}
}
'
# ES query body for the average response time (avgtime aggregation over the responsetime field)
reptime='
{
"track_total_hits": true,
"query": {
"bool": {
"filter": [{
"range": {
"@timestamp": {
"gte": '${starttime1}',
"lte": '${endtime1}',
"format": "epoch_millis"
}
}
}, {
"query_string": {
"analyze_wildcard": true,
"query": "domain:'${domain}' AND status:(NOT 500)"
}
}]
}
},
"aggs": {
"2": {
"date_histogram": {
"interval": "6h",
"field": "@timestamp",
"min_doc_count": 0,
"extended_bounds": {
"min": '${starttime1}',
"max": '${endtime1}'
},
"format": "epoch_millis"
},
"aggs": {
"1": {
"avg": {
"field": "responsetime"
}
}
}
},
"avgtime":
{"avg":
{"field": "responsetime"}
}
}
}
'
# Total PV: grab the "value" field of hits.total from the pretty-printed response
value1=`curl -s -XGET "http://${es_ip}:${es_port}/${index}/_search?pretty" -H 'Content-Type:application/json' -d "${pv}" | grep value | awk -F': ' '{print $2}' | tr -d ','`
# Average response time: take the "value" on the line right after the avgtime aggregation
value2=`curl -s -XGET "http://${es_ip}:${es_port}/${index}/_search?pretty" -H 'Content-Type:application/json' -d "${reptime}" | grep -A 1 'avgtime' | grep value | awk -F':' '{print $2}' | awk '$1=$1'`
echo ${date},${sysname},${domain},${value1},${value2}
insertdb "insert into kpi (date,sysname,domain,pv,reptime,create_time) values ('${date}','${sysname}','${domain}','${value1}','${value2}',now());" 2>/dev/null
check_running_ok "[${date},${domain} collect pv from es insert db sucessed!]" "[${date},${domain} collect pv from es insert db failed,please check!]"
done
Nginx log format
log_format main '{"@timestamp":"$time_iso8601",'
'"@source":"$server_addr",'
'"hostname":"$hostname",'
'"ip":"$http_x_forwarded_for",'
'"client":"$remote_addr",'
'"request_method":"$request_method",'
'"scheme":"$scheme",'
'"domain":"$server_name",'
'"referer":"$http_referer",'
'"request":"$request_url",'
'"args":"$args",'
'"size":$body_bytes_sent,'
'"status": $status,'
'"responsetime":$request_time,'
'"upstreamtime":"$upstream_response_time",'
'"upstreamaddr":"$upstream_addr",'
'"http_user_agent":"$http_user_agent",'
'"https":"$https"'
'}';
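Each vhost then needs to reference this format in its access_log directive so that the JSON lines are actually written out. A minimal sketch (the log path and location block are placeholders; only the format name "main" comes from the definition above):
server {
    listen       80;
    server_name  www.test1.com;
    # Write JSON access logs using the "main" format defined above (path is an example)
    access_log   /var/log/nginx/www.test1.com_access.log  main;
    location / {
        root /usr/share/nginx/html;  # placeholder; the real vhost config goes here
    }
}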
Add a crontab scheduled job
crontab -l
# Fetch PV from ES and insert it into the database
0 2 * * * source /etc/profile;sh /usr/local/kpi/getpv.sh
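The script can also be run once by hand (the same command the cron entry uses) to confirm everything works before leaving it to cron:
sh /usr/local/kpi/getpv.sh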
Script output:
2021-04-11,主数据系统,www.test1.com,101966,0.21503962067708154
2021-04-12 10:08:34 [2021-04-11,www.test1.com collect pv from ES and insert into DB succeeded!]
2021-04-11,集团CRM,crm.test2.com,2707,0.0430384186613155
2021-04-12 10:08:34 [2021-04-11,crm.test2.com collect pv from ES and insert into DB succeeded!]
2021-04-11,xxx论坛,bbs.test.com,1873,0.0029161773233321924
2021-04-12 10:08:34 [2021-04-11,bbs.test.com collect pv from ES and insert into DB succeeded!]
2021-04-11,项目管理,pm.test.com,2405,0.01842370085808373
2021-04-12 10:08:34 [2021-04-11,pm.test.com collect pv from ES and insert into DB succeeded!]
2021-04-11,监控服务,monitor.test.com,186729,0.3296082667271888
2021-04-12 10:08:34 [2021-04-11,monitor.test.com collect pv from ES and insert into DB succeeded!]
Records in the database:
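The inserted rows can be checked with a query along these lines (table and column names as used by the script):
select date, sysname, domain, pv, reptime, create_time from kpi where date = '2021-04-11';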
Add the corresponding monitoring and alert triggers
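As described above, failed operations end up in message.log tagged with the keyword "failed", so monitoring only needs to watch the log for that keyword. A minimal sketch of such a check (how the alert is actually raised depends on your monitoring system; the echo is just a placeholder):
#!/bin/bash
# Count today's "failed" entries in the collection log
log_file=/usr/local/kpi/message.log
today=$(date '+%Y-%m-%d')
failed_count=$(grep "^${today}" "${log_file}" | grep -c 'failed')
if [ "${failed_count}" -gt 0 ]; then
    # Replace this echo with your alerting hook
    echo "kpi collection: ${failed_count} failed entries today, please check ${log_file}"
fi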
Isn't it simple? Go ahead and give it a try!