ELK Study Notes (1): Service Deployment and Elasticsearch Basics


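Overview

ELK is short for Elasticsearch, Logstash, and Kibana. Elasticsearch handles full-text search, Logstash collects and processes data in real time, and Kibana provides a web UI for analyzing, aggregating, and visualizing the data.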

Environment

Host: CentOS 7

Docker version: 19.03.5

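Deployment

These notes deploy the stack with Docker, using the sebp/elk image, which keeps the setup simple.

First, run the following command on the host. Without it, Elasticsearch fails to start with the error: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]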

sysctl -w vm.max_map_count=262144
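
A value set with sysctl -w is lost on reboot, and the container below is started with --restart=always, so the same error can come back after the host restarts. The following is a minimal sketch of checking the value and persisting it. The drop-in file name 99-elasticsearch.conf is my own arbitrary choice, not something the image requires.

```shell
# Minimum value Elasticsearch asks for in the error message quoted above.
REQUIRED_MAP_COUNT=262144

# Succeeds when the given value (default: the live kernel value) is high enough.
map_count_ok() {
    current="${1:-$(cat /proc/sys/vm/max_map_count)}"
    [ "$current" -ge "$REQUIRED_MAP_COUNT" ]
}

# Writes the setting to a sysctl drop-in file so that it survives a reboot.
# The default file name is an assumption made for this sketch; run as root.
persist_map_count() {
    conf="${1:-/etc/sysctl.d/99-elasticsearch.conf}"
    printf 'vm.max_map_count=%s\n' "$REQUIRED_MAP_COUNT" > "$conf"
}

# Usage (as root):
#   map_count_ok || { persist_map_count && sysctl --system; }
```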

Start the container:

docker run --ulimit nofile=65536:65536 -p 5601:5601 -p 9200:9200 -p 5044:5044 -p 5045:5045 -p 5046:5046 -d --restart=always -v /etc/logstash:/etc/logstash -v /etc/localtime:/etc/localtime --name elk sebp/elk

After a short wait, open http://<server-ip>:9200 in a browser to see the Elasticsearch version and basic information:

{
  "name" : "elk",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "CtPcMe_ERQKzvt8Sul3MNQ",
  "version" : {
    "number" : "7.4.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "22e1767283e61a198cb4db791ea66e3f11ab9910",
    "build_date" : "2019-09-27T08:36:48.569419Z",
    "build_snapshot" : false,
    "lucene_version" : "8.2.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
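
The "short wait" can be scripted instead of guessed. This sketch polls the root endpoint until it answers and then pulls the version number out of a response like the one above. The attempt count and delay are arbitrary defaults of mine, and the sed-based extraction is a shortcut that only suits this simple response.

```shell
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Polls the root endpoint until it answers or the attempts run out.
# $1 = maximum attempts (default 30), $2 = seconds between attempts (default 2).
wait_for_es() {
    attempts="${1:-30}"
    delay="${2:-2}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s hides progress output, -f turns HTTP errors into a non-zero exit.
        if curl -sf "$ES_URL" > /dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Reads the root response on stdin and prints the value of "number".
es_version() {
    sed -n 's/.*"number" *: *"\([^"]*\)".*/\1/p'
}

# Usage:
#   wait_for_es && curl -s "$ES_URL" | es_version
```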

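Elasticsearch exposes REST-style APIs. In general, GET retrieves data, PUT and POST update data, POST also creates new data, and DELETE removes data.

curl makes it easy to call these APIs. The basic options are:

Option  Meaning                  Example
-H      Sets a request header    -H 'Content-Type:application/json' (the submitted body is JSON)
-d      Sets the request body    -d '<request body>'
-X      Sets the request method  -XGET (call the API with GET)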
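
To show how the three options combine, here is a small wrapper sketch. The function name es_request and the ES_URL variable are my own, not part of curl or Elasticsearch. The JSON header and body are only sent when a body is supplied.

```shell
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# es_request METHOD PATH [BODY]
# Sends METHOD to $ES_URL/PATH; with a third argument, sends it as a JSON body.
es_request() {
    method="$1"
    path="$2"
    if [ "$#" -ge 3 ]; then
        curl -s -X "$method" "$ES_URL/$path" \
            -H 'Content-Type: application/json' \
            -d "$3"
    else
        curl -s -X "$method" "$ES_URL/$path"
    fi
}

# Usage:
#   es_request GET 'books_idx/_search?q=Rust'
#   es_request POST 'books_idx/_doc' '{"name": "活着"}'
```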

Creating an index

Create an index to hold the data:

curl -XPUT http://127.0.0.1:9200/books_idx

For nicely formatted JSON output, append the pretty parameter:

curl -XPUT http://127.0.0.1:9200/books_idx?pretty

Deleting an index

curl -XDELETE http://127.0.0.1:9200/books_idx
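
Creating an index that already exists returns an error, so a script that may run more than once can check first. This sketch relies on a HEAD request to the index returning 200 when it exists and 404 when it does not. The function names are hypothetical helpers of mine.

```shell
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Succeeds when the index named in $1 exists.
# -I sends a HEAD request; -w prints only the HTTP status code.
index_exists() {
    status="$(curl -s -o /dev/null -w '%{http_code}' -I "$ES_URL/$1")"
    [ "$status" = "200" ]
}

# Creates the index only when it is missing.
ensure_index() {
    index_exists "$1" || curl -s -X PUT "$ES_URL/$1"
}

# Usage:
#   ensure_index books_idx
```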

Inserting data

The raw data, in JSON format:

{"name": "深入浅出Rust", "author": "范长春", "publish": "机械工业出版社", "isbn": "9787111606420", "pubdate": "2018"}
{"name": "活着", "author": "余华", "publish": "作家出版社", "isbn": "9787506365437", "pubdate": "2017"}
{"name": "数学之美", "author": "吴军", "publish": "人民邮电出版社", "isbn": "9787115373557", "pubdate": "2014"}
{"name": "LaTeX入门", "author": "刘海洋", "publish": "电子工业出版社", "isbn": "9787121202087", "pubdate": "2013"}
{"name": "Rust编程之道", "author": "张汉东", "publish": "电子工业出版社", "isbn": "9787121354854", "pubdate": "2019"}

Converted to curl commands:

curl -XPOST http://127.0.0.1:9200/books_idx/_doc/9787111606420?pretty -H 'Content-Type: application/json' -d '{"name": "深入浅出Rust", "author": "范长春", "publish": "机械工业出版社", "isbn": "9787111606420", "pubdate": "2018"}'

curl -XPOST http://127.0.0.1:9200/books_idx/_doc/9787506365437?pretty -H 'Content-Type: application/json' -d '{"name": "活着", "author": "余华", "publish": "作家出版社", "isbn": "9787506365437", "pubdate": "2017"}'

curl -XPOST http://127.0.0.1:9200/books_idx/_doc/9787115373557?pretty -H 'Content-Type: application/json' -d '{"name": "数学之美", "author": "吴军", "publish": "人民邮电出版社", "isbn": "9787115373557", "pubdate": "2014"}'

curl -XPOST http://127.0.0.1:9200/books_idx/_doc/9787121202087?pretty -H 'Content-Type: application/json' -d '{"name": "LaTeX入门", "author": "刘海洋", "publish": "电子工业出版社", "isbn": "9787121202087", "pubdate": "2013"}'

curl -XPOST http://127.0.0.1:9200/books_idx/_doc/9787121354854?pretty -H 'Content-Type: application/json' -d '{"name": "Rust编程之道", "author": "张汉东", "publish": "电子工业出版社", "isbn": "9787121354854", "pubdate": "2019"}'

In the request URL http://127.0.0.1:9200/books_idx/_doc/9787111606420?pretty, 9787111606420 is the document id, which I took from the data itself (the ISBN).
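
Writing one command per book by hand does not scale, so the conversion can be sketched as a loop that reads one JSON document per line and prints the matching curl command, using the isbn field as the id. It assumes every line has an all-digit "isbn" value and contains no single quotes. The file name books.jsonl in the usage note is hypothetical.

```shell
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Reads one JSON document per line on stdin and prints one curl command per line.
# Lines without an all-digit "isbn" field are skipped.
books_to_curl() {
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue
        isbn="$(printf '%s\n' "$line" | sed -n 's/.*"isbn": *"\([0-9]*\)".*/\1/p')"
        [ -n "$isbn" ] || continue
        printf "curl -XPOST '%s/books_idx/_doc/%s?pretty' -H 'Content-Type: application/json' -d '%s'\n" \
            "$ES_URL" "$isbn" "$line"
    done
}

# Usage: print the commands for review, then pipe them to sh to run them.
#   books_to_curl < books.jsonl
#   books_to_curl < books.jsonl | sh
```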

curl -XPOST http://127.0.0.1:9200/books_idx/_doc?pretty -H 'Content-Type: application/json' -d '{"name": "深入浅出Rust", "author": "范长春", "publish": "机械工业出版社", "isbn": "9787111606420", "pubdate": "2018"}'

If no id is given, as in the command above, Elasticsearch generates one automatically:

{
  "_index" : "books_idx",
  "_type" : "_doc",
  "_id" : "plCP9G8BXUFnJZ4gtu4W",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 8,
  "_primary_term" : 1
}
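
Since the generated id is only known from the response, a script usually has to capture it. This sketch extracts "_id" with sed, which is enough for a response shaped like the one above. A real JSON parser would be more robust, and the helper name extract_id is my own.

```shell
# Reads an index response on stdin and prints the value of "_id".
extract_id() {
    sed -n 's/.*"_id" *: *"\([^"]*\)".*/\1/p'
}

# Usage:
#   id="$(curl -s -XPOST 'http://127.0.0.1:9200/books_idx/_doc?pretty' \
#         -H 'Content-Type: application/json' -d '{"name": "活着"}' | extract_id)"
```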

Updating data

Both PUT and POST can overwrite a previously submitted document:

curl -XPOST http://127.0.0.1:9200/books_idx/_doc/9787111606420?pretty -H 'Content-Type: application/json' -d '{"name": "深入浅出Rust", "author": "范长春", "publish": "机械工业出版社", "isbn": "9787111606420", "pubdate": "2018"}'

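Each submission returns a new version number in the response JSON, for example "_version" : 7.

The update above actually removes the existing document and creates a new one with the same id.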

To change only part of a document, use a partial update:

curl -XPOST http://127.0.0.1:9200/books_idx/_doc/9787111606420/_update?pretty -H 'Content-Type: application/json' -d '{"doc": {"pubdate": "2000"}}'
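
On 7.x the typeless form of this endpoint is POST {index}/_update/{id}. The _doc/{id}/_update path used above still works on 7.4 but is deprecated. Here is a sketch of a helper built on the typeless path. The function names are mine, and the body builder assumes the field name and value contain no quotes or backslashes.

```shell
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Builds the partial-update body {"doc": {"FIELD": "VALUE"}}.
update_body() {
    printf '{"doc": {"%s": "%s"}}' "$1" "$2"
}

# partial_update INDEX ID FIELD VALUE
partial_update() {
    curl -s -X POST "$ES_URL/$1/_update/$2?pretty" \
        -H 'Content-Type: application/json' \
        -d "$(update_body "$3" "$4")"
}

# Usage:
#   partial_update books_idx 9787111606420 pubdate 2000
```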

Querying data

Search across all fields: http://192.168.0.101:9200/books_idx/_search?q=Rust

{"took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":3,"relation":"eq"},"max_score":0.62883455,"hits":[{"_index":"books_idx","_type":"_doc","_id":"9787121354854","_score":0.62883455,"_source":{"name": "Rust编程之道", "author": "张汉东", "publish": "电子工业出版社", "isbn": "9787121354854", "pubdate": "2019"}},{"_index":"books_idx","_type":"_doc","_id":"9787111606421","_score":0.62883455,"_source":{"name": "深入浅出Rust", "author": "范长春", "publish": "机械工业出版社", "isbn": "9787111606420", "pubdate": "2018"}},{"_index":"books_idx","_type":"_doc","_id":"9787111606420","_score":0.62883455,"_source":{"name": "深入浅出Rust", "author": "范长春", "publish": "机械工业出版社", "isbn": "9787111606420", "pubdate": "2018"}}]}}

Match on the name field (the encoded value is 数学): http://192.168.0.101:9200/books_idx/_search?q=name:%E6%95%B0%E5%AD%A6

{"took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":3.0808902,"hits":[{"_index":"books_idx","_type":"_doc","_id":"9787115373557","_score":3.0808902,"_source":{"name": "数学之美", "author": "吴军", "publish": "人民邮电出版社", "isbn": "9787115373557", "pubdate": "2014"}}]}}

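Match on two fields: http://192.168.0.101:9200/books_idx/_search?q=name:Rust&q=author:张汉东 (q is passed twice here, and most likely only one of the two takes effect; query-string syntax can express both conditions in a single parameter, as in q=name:Rust AND author:张汉东)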

{"took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":4.3965135,"hits":[{"_index":"books_idx","_type":"_doc","_id":"9787121354854","_score":4.3965135,"_source":{"name": "Rust编程之道", "author": "张汉东", "publish": "电子工业出版社", "isbn": "9787121354854", "pubdate": "2019"}}]}}
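
From the command line the query has to be percent-encoded by hand unless curl does it. This sketch uses curl's -G and --data-urlencode options so that Chinese text and spaces can be written as-is. The function name es_search is my own.

```shell
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# es_search INDEX QUERY
# -G appends the encoded data to the URL as a query string and sends a GET.
es_search() {
    curl -s -G "$ES_URL/$1/_search" --data-urlencode "q=$2"
}

# Usage:
#   es_search books_idx 'name:数学'
#   es_search books_idx 'name:Rust AND author:张汉东'
```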

Fetch the source of a single document: http://192.168.0.101:9200/books_idx/_doc/9787115373557/_source

{"name": "数学之美", "author": "吴军", "publish": "人民邮电出版社", "isbn": "9787115373557", "pubdate": "2014"}

Fetch the source of a document, limited to specific fields: http://192.168.0.101:9200/books_idx/_doc/9787115373557/_source?_source=author,name

{"author":"吴军","name":"数学之美"}

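Important note

Elasticsearch has been phasing out mapping types in preparation for removing them. Many tutorials online still index documents with:

PUT {index}/{type}/{id}, whereas 7.0 and later should use PUT {index}/_doc/{id}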
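
When adapting commands from older tutorials, the path rewrite is mechanical. This is a sketch of it, where the type name book in the usage note is a made-up example. Paths whose middle segment already starts with an underscore (such as _doc or _update) are left unchanged.

```shell
# Rewrites "index/type/id" to "index/_doc/id".
# Paths of any other shape, or with a middle segment starting with "_",
# are printed unchanged.
to_typeless_path() {
    printf '%s\n' "$1" | sed 's#^\([^/]*\)/[^/_][^/]*/\([^/]*\)$#\1/_doc/\2#'
}

# Usage:
#   curl -XPUT "http://127.0.0.1:9200/$(to_typeless_path books_idx/book/9787111606420)" ...
```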

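After spending some time with Elasticsearch, I found its REST API clean and convenient. CRUD is only the beginning, though: this setup is still a single node, so cluster deployment remains to be learned, ingesting large volumes of data needs testing, and ELK has two other components besides Elasticsearch. There is still a lot to learn before it can be put to real use in a project.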
