vlambda Blog

Hitting from a Distance: Parsing RGW Logs with nginx

Requirements and Background

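Tip: In general, every HTTP request a client sends to RGW carries an Authorization field in its header. That field contains the user's Access_key, but the AWS2 and AWS4 signature formats lay it out differently.
Background: RGW writes a log entry whenever the business accesses the service, but compared with a specialized product such as nginx, the native RGW log is crude in both format and content. Changing the RGW code would meet the requirement, but every later format change would then require a bulk RGW upgrade, which is a burden for operations, and from a management point of view the payoff of such changes is low. To keep the two concerns fully decoupled, we let nginx handle log format standardization and put an nginx layer in front of RGW as a reverse proxy. Out of the box, however, the nginx log cannot extract the access key from each request's Authorization field, hence this article.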

Solution

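Approach: use nginx's built-in map directive to read the HTTP Authorization header, match it against regular expressions, and store the captured result in a custom variable, access_key. Writing access_key into the log format then completes the parsing.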
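The matching logic can be tried offline before touching the nginx configuration. The following is a minimal Python sketch, not part of the original setup: it reuses the two regular expressions from this article's map block, and the function name `extract_access_key`, the signature strings, the date and the region in the sample headers are made up for illustration. Python's `re` and nginx's PCRE should behave the same way for patterns this simple, but that is an assumption worth checking against real traffic.

```python
import re

# The same two patterns as the nginx map block. nginx tries regex entries
# in the order they appear, so the list order here mirrors the config.
PATTERNS = [
    re.compile(r"^AWS (.*):(.*)"),  # AWS2: "AWS <access_key>:<signature>"
    re.compile(r"^AWS4-HMAC-SHA256 Credential=(.*)/(.*)/(.*)/s3/aws4_request"),  # AWS4
]


def extract_access_key(authorization: str) -> str:
    """Return the first capture group of the first matching pattern,
    or "anonymous" when nothing matches (the map's default)."""
    for pattern in PATTERNS:
        match = pattern.search(authorization)
        if match:
            return match.group(1)
    return "anonymous"


# Hypothetical header values; only the access key comes from the article's logs.
v2_header = "AWS B45IHF34SQPKDNHAUVVV:ZmFrZS1zaWduYXR1cmU="
v4_header = (
    "AWS4-HMAC-SHA256 "
    "Credential=B45IHF34SQPKDNHAUVVV/20200215/us-east-1/s3/aws4_request, "
    "SignedHeaders=host;x-amz-date, Signature=0123abcd"
)

if __name__ == "__main__":
    for header in (v2_header, v4_header, ""):
        print(repr(header[:30]), "->", extract_access_key(header))
```

An empty string stands in for a request without an Authorization header, which is the case the map's default entry covers.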

Reference example for the map directive: https://www.nginx.com/resources/wiki/start/topics/examples/forwarded/

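The nginx configuration is shown below. The regular expressions are fairly rough and are only meant as a starting point; tune them to fit your own situation.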

http {
    # Extract the access key from the Authorization header into $access_key.
    map $http_authorization $access_key {
        default "anonymous";  # no match: treat the request as an anonymous user
        ~^AWS[\ ](.*):(.*) $1;  # AWS2 signature
        ~^AWS4-HMAC-SHA256[\ ]Credential=(.*)/(.*)/(.*)/s3/aws4_request $1;  # AWS4 signature
    }

    # Note: $upstream_response_time is logged as a bare "-" when no upstream handled
    # the request, so that unquoted field is not valid JSON in such lines.
    log_format  json  '{"scheme":"$scheme","http_host":"$http_host","remote_addr":"$remote_addr","server_addr":"$server_addr","time_local":"[$time_local]","request":"$request","status":$status,"body_bytes_sent":$body_bytes_sent,"http_referer":"$http_referer","http_user_agent":"$http_user_agent","upstream_addr":"$upstream_addr","upsteam_response_time":$upstream_response_time,"request_time":$request_time,"http_x_forwarded_for":"$http_x_forwarded_for","content_length":"$content_length","request_length":$request_length,"request_method":"$request_method","server_protocol":"$server_protocol","request_uri":"$request_uri","x_rgw_request_id":"$upstream_http_x_amz_request_id","access_key":"$access_key"}';
    access_log  /var/log/nginx/access.log  json;

    # server blocks (the reverse proxy to RGW) are omitted here
}

Final Result

Looking at /var/log/nginx/access.log, the final result is as follows:

{"scheme":"http","http_host":"s3.cephbook.com","remote_addr":"127.0.0.1","server_addr":"127.0.0.1","time_local":"[15/Feb/2020:23:49:08 -0500]","request":"GET /rgw-demo/?delimiter=%2F HTTP/1.1","status":404,"body_bytes_sent":3650,"http_referer":"-","http_user_agent":"-","upstream_addr":"-","upsteam_response_time":-,"request_time":0.000,"http_x_forwarded_for":"-","content_length":"0","request_length":453,"request_method":"GET","server_protocol":"HTTP/1.1","request_uri":"/rgw-demo/?delimiter=%2F","x_rgw_request_id":"-","access_key":"B45IHF34SQPKDNHAUVVV"}
{"scheme":"http","http_host":"localhost","remote_addr":"127.0.0.1","server_addr":"127.0.0.1","time_local":"[15/Feb/2020:23:50:30 -0500]","request":"GET / HTTP/1.1","status":200,"body_bytes_sent":4833,"http_referer":"-","http_user_agent":"curl/7.29.0","upstream_addr":"-","upsteam_response_time":-,"request_time":0.000,"http_x_forwarded_for":"-","content_length":"-","request_length":73,"request_method":"GET","server_protocol":"HTTP/1.1","request_uri":"/","x_rgw_request_id":"-","access_key":"anonymous"}
{"scheme":"http","http_host":"s3.cephbook.com","remote_addr":"127.0.0.1","server_addr":"127.0.0.1","time_local":"[15/Feb/2020:23:52:40 -0500]","request":"GET / HTTP/1.1","status":200,"body_bytes_sent":4833,"http_referer":"-","http_user_agent":"-","upstream_addr":"-","upsteam_response_time":-,"request_time":0.000,"http_x_forwarded_for":"-","content_length":"0","request_length":207,"request_method":"GET","server_protocol":"HTTP/1.1","request_uri":"/","x_rgw_request_id":"-","access_key":"B45IHF34SQPKDNHAUVVV"}
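Once access_key is in every log line, per-user statistics become a small scripting task. The sketch below is one possible way to count requests per access key in Python. The helper names `parse_line` and `count_by_access_key` are hypothetical, and the sample lines are shortened versions of the entries above. Because the lines above contain a bare `-` for upsteam_response_time, which a strict JSON parser rejects, the sketch rewrites that value to `null` first. It also assumes no field contains characters that nginx escapes in its default log escaping mode.

```python
import json
import re
from collections import Counter

# A bare "-" used as a value (e.g. "upsteam_response_time":-) is not valid
# JSON, so it is replaced with null before parsing.
BARE_DASH = re.compile(r'":-(?=[,}])')


def parse_line(line: str) -> dict:
    """Parse one access-log line into a dict."""
    return json.loads(BARE_DASH.sub('":null', line))


def count_by_access_key(lines) -> Counter:
    """Count requests per access_key, skipping blank lines."""
    return Counter(
        parse_line(line)["access_key"] for line in lines if line.strip()
    )


# Shortened sample lines; a real run would read /var/log/nginx/access.log.
sample_lines = [
    '{"status":404,"upsteam_response_time":-,"request_time":0.000,"access_key":"B45IHF34SQPKDNHAUVVV"}',
    '{"status":200,"upsteam_response_time":-,"request_time":0.000,"access_key":"anonymous"}',
    '{"status":200,"upsteam_response_time":-,"request_time":0.000,"access_key":"B45IHF34SQPKDNHAUVVV"}',
]

if __name__ == "__main__":
    for key, count in count_by_access_key(sample_lines).most_common():
        print(key, count)
```

Quoting `$upstream_response_time` in the log format would avoid the rewrite step, at the cost of logging the value as a string.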