
Installing Logstash on macOS and Syncing MySQL Data

Before installing Logstash, let's take care of some preparation.

Install the elasticsearch-analysis-ik analyzer
Download links:
  • https://github.com/medcl/elasticsearch-analysis-ik

  • https://github.com/medcl/elasticsearch-analysis-ik/releases


Pick the release that matches your Elasticsearch version; here we use v7.5.2 (the plugin version must match the Elasticsearch version exactly).


Download link:
  • https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.5.2/elasticsearch-analysis-ik-7.5.2.zip


After the download finishes, create the plugin folder with `cd your-es-root/plugins/ && mkdir ik`, unzip the plugin into the `your-es-root/plugins/ik` directory, and then restart Elasticsearch.

```shell
MacdeMacBook-Pro:~ mac$ cd /usr/local/opt/elasticsearch-7.5.2/plugins
MacdeMacBook-Pro:plugins mac$ mkdir ik
MacdeMacBook-Pro:plugins mac$ ls
ik
MacdeMacBook-Pro:plugins mac$ cd ik/
MacdeMacBook-Pro:ik mac$ ls
commons-codec-1.9.jar                httpclient-4.5.2.jar
commons-logging-1.2.jar              httpcore-4.4.4.jar
config                               plugin-descriptor.properties
elasticsearch-analysis-ik-7.5.2.jar  plugin-security.policy
MacdeMacBook-Pro:ik mac$ ps -ef | grep elastic
  501 29044     1   0 10:12AM ttys000    0:55.60 /usr/local/opt/elasticsearch-7.5.2/jdk.app/Contents/Home/bin/java -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.locale.providers=COMPAT -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.io.tmpdir=/var/folders/qn/0y2jrchd39zg39gh33046g_80000gn/T/elasticsearch-7903441459326882255 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=logs/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m -XX:MaxDirectMemorySize=536870912 -Des.path.home=/usr/local/opt/elasticsearch-7.5.2 -Des.path.conf=/usr/local/opt/elasticsearch-7.5.2/config -Des.distribution.flavor=default -Des.distribution.type=tar -Des.bundled_jdk=true -cp /usr/local/opt/elasticsearch-7.5.2/lib/* org.elasticsearch.bootstrap.Elasticsearch -d
  501 29074 29044   0 10:12AM ttys000    0:00.04 /usr/local/opt/elasticsearch-7.5.2/modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/controller
  501 33376 31070   0 10:35AM ttys001    0:00.00 grep elastic
MacdeMacBook-Pro:ik mac$ kill -9 29044
MacdeMacBook-Pro:ik mac$ cd ../../
MacdeMacBook-Pro:elasticsearch-7.5.2 mac$ bin/elasticsearch
...
[2020-05-19T10:38:08,872][INFO ][o.w.a.d.Monitor ] [MacdeMacBook-Pro.local] try load config from /usr/local/opt/elasticsearch-7.5.2/config/analysis-ik/IKAnalyzer.cfg.xml
[2020-05-19T10:38:08,874][INFO ][o.w.a.d.Monitor ] [MacdeMacBook-Pro.local] try load config from /usr/local/opt/elasticsearch-7.5.2/plugins/ik/config/IKAnalyzer.cfg.xml
...
```
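A quick sanity check of the plugin directory can be scripted. Below is a minimal sketch, assuming the folder layout shown in the `ls` output above; it is demonstrated against a throwaway directory rather than a real Elasticsearch home:

```python
import os
import tempfile

# Files the ik plugin folder should contain (per the `ls` output above).
REQUIRED = {"plugin-descriptor.properties", "elasticsearch-analysis-ik-7.5.2.jar"}

def ik_plugin_looks_installed(plugins_dir: str) -> bool:
    """Return True if plugins_dir/ik exists and holds the required files."""
    ik_dir = os.path.join(plugins_dir, "ik")
    if not os.path.isdir(ik_dir):
        return False
    return REQUIRED.issubset(os.listdir(ik_dir))

# Demo against a throwaway directory standing in for your-es-root/plugins:
demo_root = tempfile.mkdtemp()
ik = os.path.join(demo_root, "ik")
os.makedirs(ik)
for name in REQUIRED:
    open(os.path.join(ik, name), "w").close()

print(ik_plugin_looks_installed(demo_root))  # → True
```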


Now let's test whether the installation succeeded.


Open Kibana and go to its console (Dev Tools).



Run the following commands one by one:

```
# Create an index
PUT ik_index

# Create the mapping for the ik_index index
POST /ik_index/_mapping
{
  "properties": {
    "content": {
      "type": "text",
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_smart"
    }
  }
}

# Index some documents
POST /ik_index/_create/1
{"content":"美国留给伊拉克的是个烂摊子吗"}

POST /ik_index/_create/2
{"content":"公安部:各地校车将享最高路权"}

POST /ik_index/_create/3
{"content":"中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"}

POST /ik_index/_create/4
{"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}

# Search with highlighting
POST /ik_index/_search
{
  "query" : { "match" : { "content" : "中国" } },
  "highlight" : {
    "pre_tags" : ["<tag1>", "<tag2>"],
    "post_tags" : ["</tag1>", "</tag2>"],
    "fields" : { "content" : {} }
  }
}
```
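The search body can also be built programmatically before sending it to Elasticsearch. A minimal sketch follows; `build_highlight_query` is a hypothetical helper (not part of any client library), and actually sending the body would of course require a running cluster:

```python
import json

def build_highlight_query(field, text, pre_tags, post_tags):
    # Hypothetical helper mirroring the console request above:
    # match on `field`, wrapping each matched term in the given tags.
    return {
        "query": {"match": {field: text}},
        "highlight": {
            "pre_tags": pre_tags,
            "post_tags": post_tags,
            "fields": {field: {}},
        },
    }

body = build_highlight_query(
    "content", "中国", ["<tag1>", "<tag2>"], ["</tag1>", "</tag2>"]
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```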


Query result:

```json
{
  "took" : 595,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 0.642793,
    "hits" : [
      {
        "_index" : "ik_index", "_type" : "_doc", "_id" : "3", "_score" : 0.642793,
        "_source" : { "content" : "中韩渔警冲突调查:韩警平均每天扣1艘中国渔船" },
        "highlight" : { "content" : [ "中韩渔警冲突调查:韩警平均每天扣1艘<tag1>中国</tag1>渔船" ] }
      },
      {
        "_index" : "ik_index", "_type" : "_doc", "_id" : "4", "_score" : 0.642793,
        "_source" : { "content" : "中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首" },
        "highlight" : { "content" : [ "<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首" ] }
      }
    ]
  }
}
```
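Pulling the highlighted fragments out of a response like this is straightforward. A minimal sketch, run over a trimmed copy of the JSON (only the fields the sketch reads are kept):

```python
# Trimmed copy of the search response: only the fields this sketch reads.
response = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_id": "3", "highlight": {"content": ["中韩渔警冲突调查:韩警平均每天扣1艘<tag1>中国</tag1>渔船"]}},
            {"_id": "4", "highlight": {"content": ["<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"]}},
        ],
    }
}

def highlighted_fragments(resp, field):
    """Collect (doc id, fragment) pairs for one highlighted field."""
    out = []
    for hit in resp["hits"]["hits"]:
        for frag in hit.get("highlight", {}).get(field, []):
            out.append((hit["_id"], frag))
    return out

for doc_id, frag in highlighted_fragments(response, "content"):
    print(doc_id, frag)
```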


The analyzer is now installed successfully.


Next, let's install Logstash.


Download from the official site:

  • https://www.elastic.co/cn/downloads/

  • logstash-7.5.2:

    https://artifacts.elastic.co/downloads/logstash/logstash-7.5.2.tar.gz


After downloading, extract the archive and move it to the /usr/local/opt/ directory.


Before starting it, we need a bit of configuration: prepare a logstash.conf.

```shell
MacdeMacBook-Pro:~ mac$ cd /usr/local/opt/logstash-7.5.2
MacdeMacBook-Pro:logstash-7.5.2 mac$ vi logstash.conf
```

```conf
input { stdin { } }
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
```

Save and quit, then verify the file is in place:

```shell
MacdeMacBook-Pro:logstash-7.5.2 mac$ ls
CONTRIBUTORS   config                    modules
Gemfile        data                      tools
Gemfile.lock   lib                       vendor
LICENSE.txt    logstash-core             x-pack
NOTICE.TXT     logstash-core-plugin-api
bin            logstash.conf
MacdeMacBook-Pro:logstash-7.5.2 mac$
```


We could now start Logstash with the command below; but since our goal is syncing MySQL data, we'll hold off on this step for now.

MacdeMacBook-Pro:logstash-7.5.2 mac$ bin/logstash -f logstash.conf
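With this config, every line typed on stdin becomes an event that gains `@version` and `@timestamp` fields before being shipped to Elasticsearch and pretty-printed by the rubydebug codec. A rough Python analogue of that event shape (simplified; real Logstash also attaches host metadata):

```python
from datetime import datetime, timezone

def to_event(line: str) -> dict:
    # Rough analogue of the event Logstash builds for one stdin line
    # before the rubydebug codec pretty-prints it (simplified sketch).
    return {
        "message": line.rstrip("\n"),
        "@version": "1",
        "@timestamp": datetime.now(timezone.utc).isoformat(),
    }

event = to_event("hello logstash\n")
print(event["message"])  # → hello logstash
```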


To sync MySQL data, some setup is needed:
  • Use Logstash's jdbc input plugin (logstash-input-jdbc).

  • To connect to the MySQL database, we also need the official JDBC driver library (Connector/J).


I downloaded version mysql-connector-java-5.1.49:


  • https://dev.mysql.com/downloads/connector/j/



After the download finishes, extract it and put mysql-connector-java-5.1.49.jar into Logstash's bin directory (the path must match the `jdbc_driver_library` setting below).


First create a my_test database, then create a table and insert some rows:

```sql
CREATE TABLE `admin_log` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'admin log id',
  `admin_id` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'admin id',
  `admin_name` varchar(50) NOT NULL DEFAULT '' COMMENT 'admin name',
  `content` varchar(200) NOT NULL DEFAULT '' COMMENT 'content',
  `create_time` datetime DEFAULT NULL COMMENT 'created at',
  PRIMARY KEY (`id`),
  KEY `idx_admin_id` (`admin_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `admin_log` (`id`, `admin_id`, `admin_name`, `content`, `create_time`)
VALUES
  (1,1,'kly','管理员,测试内容1-1','2019-12-12 10:10:10'),
  (2,1,'kly','测试内容1-2','2019-12-13 10:10:10'),
  (3,1,'kly','测试内容1-3','2019-12-14 10:10:10'),
  (4,2,'lm','测试内容110','2019-12-15 10:10:10'),
  (5,2,'lm','测试内容119','2019-12-16 10:10:10'),
  (6,2,'lm','测试内容120','2019-12-17 10:10:10'),
  (7,2,'lm','测试内容100','2019-12-18 10:10:10'),
  (8,1,'kly','测试内容200','2019-12-19 10:10:10'),
  (9,1,'kly','测试内容201','2019-12-20 10:10:10'),
  (10,1,'kly','测试内容202','2019-12-21 10:10:10');
```


Now let's reconfigure logstash.conf:

```conf
input {
  jdbc {
    jdbc_driver_library => "/usr/local/opt/logstash-7.5.2/bin/mysql-connector-java-5.1.49.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/my_test?useSSL=false"
    jdbc_user => "root"
    jdbc_password => "123456"
    parameters => { "admin_id" => 1 }
    schedule => "* * * * *"
    statement => "SELECT * from admin_log where admin_id = :admin_id"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    document_id => "%{id}"
    index => "logstash-admin-log"
  }
}
```


In this example, we connect to the my_test database as user root and fetch every row in admin_log matching admin_id = 1. The `schedule` option (a cron expression) tells the plugin to run this query once every minute.
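The `:admin_id` placeholder in `statement` is filled from the `parameters` map each time the schedule fires. The same named-parameter style can be tried with Python's sqlite3 module; SQLite stands in for MySQL here purely for illustration, with a simplified copy of the table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE admin_log (id INTEGER PRIMARY KEY, admin_id INTEGER, "
    "admin_name TEXT, content TEXT, create_time TEXT)"
)
# A few of the rows inserted above:
conn.executemany(
    "INSERT INTO admin_log VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "kly", "管理员,测试内容1-1", "2019-12-12 10:10:10"),
        (4, 2, "lm", "测试内容110", "2019-12-15 10:10:10"),
        (8, 1, "kly", "测试内容200", "2019-12-19 10:10:10"),
    ],
)

# Same statement + parameters shape as the jdbc input config:
rows = conn.execute(
    "SELECT * FROM admin_log WHERE admin_id = :admin_id", {"admin_id": 1}
).fetchall()
print([r[0] for r in rows])  # → [1, 8]
```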


Now start Logstash:

```shell
MacdeMacBook-Pro:logstash-7.5.2 mac$ bin/logstash -f logstash.conf
# Some startup output
Thread.exclusive is deprecated, use Thread::Mutex
Sending Logstash logs to /usr/local/opt/logstash-7.5.2/logs which is now configured via log4j2.properties
[2020-05-19T11:56:17,857][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-05-19T11:56:18,075][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.5.2"}
[2020-05-19T11:56:20,754][INFO ][org.reflections.Reflections] Reflections took 55 ms to scan 1 urls, producing 20 keys and 40 values
[2020-05-19T11:56:22,775][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2020-05-19T11:56:23,082][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2020-05-19T11:56:23,162][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-05-19T11:56:23,170][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-05-19T11:56:23,259][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2020-05-19T11:56:23,322][INFO ][logstash.outputs.elasticsearch][main] Using default mapping template
[2020-05-19T11:56:23,444][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-05-19T11:56:23,448][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.specialized.RubyArrayOneObject) has been create for key: cluster_uuids. This may result in invalid serialization. It is recommended to log an issue to the responsible developer/development team.
[2020-05-19T11:56:23,465][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/usr/local/opt/logstash-7.5.2/logstash.conf"], :thread=>"#<Thread:0x3cd23c45 run>"}
[2020-05-19T11:56:23,790][INFO ][logstash.javapipeline ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-05-19T11:56:23,930][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-05-19T11:56:24,348][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
/usr/local/opt/logstash-7.5.2/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/cronline.rb:77: warning: constant ::Fixnum is deprecated
[2020-05-19T11:57:01,665][INFO ][logstash.inputs.jdbc     ][main] (0.028732s) SELECT * from admin_log where admin_id = 1
```


Let's check the result in the Kibana console:

```
POST /logstash-admin-log/_search
```

```json
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 6, "relation" : "eq" },
    "max_score" : 1.0,
    "hits" : [
      { "_index" : "logstash-admin-log", "_type" : "_doc", "_id" : "3", "_score" : 1.0,
        "_source" : { "id" : 3, "@timestamp" : "2020-05-19T04:04:00.300Z", "admin_id" : 1, "@version" : "1", "admin_name" : "kly", "content" : "测试内容1-3", "create_time" : "2019-12-14T16:10:10.000Z" } },
      { "_index" : "logstash-admin-log", "_type" : "_doc", "_id" : "8", "_score" : 1.0,
        "_source" : { "id" : 8, "@timestamp" : "2020-05-19T04:04:00.301Z", "admin_id" : 1, "@version" : "1", "admin_name" : "kly", "content" : "测试内容200", "create_time" : "2019-12-19T16:10:10.000Z" } },
      { "_index" : "logstash-admin-log", "_type" : "_doc", "_id" : "10", "_score" : 1.0,
        "_source" : { "id" : 10, "@timestamp" : "2020-05-19T04:04:00.302Z", "admin_id" : 1, "@version" : "1", "admin_name" : "kly", "content" : "测试内容202", "create_time" : "2019-12-21T16:10:10.000Z" } },
      { "_index" : "logstash-admin-log", "_type" : "_doc", "_id" : "9", "_score" : 1.0,
        "_source" : { "id" : 9, "@timestamp" : "2020-05-19T04:04:00.301Z", "admin_id" : 1, "@version" : "1", "admin_name" : "kly", "content" : "测试内容201", "create_time" : "2019-12-20T16:10:10.000Z" } },
      { "_index" : "logstash-admin-log", "_type" : "_doc", "_id" : "1", "_score" : 1.0,
        "_source" : { "id" : 1, "@timestamp" : "2020-05-19T04:04:00.299Z", "admin_id" : 1, "@version" : "1", "admin_name" : "kly", "content" : "管理员,测试内容1-1", "create_time" : "2019-12-12T16:10:10.000Z" } },
      { "_index" : "logstash-admin-log", "_type" : "_doc", "_id" : "2", "_score" : 1.0,
        "_source" : { "id" : 2, "@timestamp" : "2020-05-19T04:04:00.300Z", "admin_id" : 1, "@version" : "1", "admin_name" : "kly", "content" : "测试内容1-2", "create_time" : "2019-12-13T16:10:10.000Z" } }
    ]
  }
}
```
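Note that each document's `_id` equals the row's `id` column; that is the effect of `document_id => "%{id}"` in the output, where `%{field}` is replaced by the event field's value. A rough sketch of that substitution (an illustration only, not Logstash's actual implementation):

```python
import re

def sprintf_field(template: str, event: dict) -> str:
    # Replace each %{name} with the matching event field, as a string.
    return re.sub(r"%\{(\w+)\}", lambda m: str(event[m.group(1)]), template)

event = {"id": 3, "admin_id": 1, "admin_name": "kly"}
print(sprintf_field("%{id}", event))             # → 3 (the document_id for this row)
print(sprintf_field("admin-%{admin_name}", event))  # → admin-kly
```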


That's all for this time; we'll put more of this into practice in future posts.