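Using ES with MongoDB

Before installing, consult https://github.com/richardwilly98/elasticsearch-river-mongodb to pick the Elasticsearch version and plugin version that match your MongoDB version.

1) Install ES

Download elasticsearch-<version>.tar.gz from the official website, then unpack it: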

tar -zvxf elasticsearch-1.1.0.tar.gz
cd elasticsearch-1.1.0

Optionally, install the Marvel plugin, which is used for monitoring:

./bin/plugin -i elasticsearch/marvel/latest

To learn more about this plugin, see the official documentation at www.elasticsearch.org.

2) Run ES

./bin/elasticsearch

Output like the following means startup succeeded:

[2014-04-09 10:12:41,414][INFO ][node ] [Lorna Dane] version[1.1.0], pid[839], build[2181e11/2014-03-25T15:59:51Z]
[2014-04-09 10:12:41,415][INFO ][node ] [Lorna Dane] initializing ...
[2014-04-09 10:12:41,431][INFO ][plugins ] [Lorna Dane] loaded [], sites []
[2014-04-09 10:12:44,383][INFO ][node ] [Lorna Dane] initialized
[2014-04-09 10:12:44,384][INFO ][node ] [Lorna Dane] starting ...
[2014-04-09 10:12:44,495][INFO ][transport ] [Lorna Dane] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/XXXXXX:9300]}
[2014-04-09 10:12:47,522][INFO ][cluster.service ] [Lorna Dane] new_master [Lorna Dane][Ml-gTu_ZTniHR2mkpbMQ_A][XXXXX][inet[/XXXXXX:9300]], reason: zen-disco-join (elected_as_master)
[2014-04-09 10:12:47,545][INFO ][discovery ] [Lorna Dane] elasticsearch/Ml-gTu_ZTniHR2mkpbMQ_A
[2014-04-09 10:12:47,572][INFO ][http ] [Lorna Dane] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/XXXXX:9200]}
[2014-04-09 10:12:47,607][INFO ][gateway ] [Lorna Dane] recovered [0] indices into cluster_state
[2014-04-09 10:12:47,607][INFO ][node ] [Lorna Dane] started

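To run it in the background:

./bin/elasticsearch -d

To confirm that the process is running:

lsof -i:9200
lsof -i:9300

Port 9200 is the HTTP port that serves external clients; port 9300 is the transport port that nodes use to talk to each other (relevant when you run a cluster).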
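The same check can be scripted. The following is a minimal sketch, assuming ES runs on localhost with the default ports; the helper name port_open is made up for this example.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # 9200 = HTTP port, 9300 = transport port (defaults, assumed here).
    for port in (9200, 9300):
        state = "open" if port_open("localhost", port) else "closed"
        print(f"port {port}: {state}")
```

An open port only shows that something is listening there; it does not prove that the listener is ES.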

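3) Set up a cluster

The configuration file is at:

.....(your actual path)/config/elasticsearch.yml

By default, every setting in it is commented out.

After editing, my settings are:

cluster.name: ctoes   # name of the cluster
node.name: "QiangZiGeGe"   # name of this node; note the double quotes

bootstrap.mlockall: true

Settings not mentioned here keep their defaults; the right values depend on your environment.

After saving the file, start ES. If the log output mentions the names of other nodes, the cluster has formed.

Note: ES automatically discovers nodes on the local network that use the same cluster name (in the 1.x series this relies on multicast discovery, which is enabled by default).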

To check the cluster status:

curl 'http://localhost:9200/_cluster/health?pretty'

The response:

{
"cluster_name" : "ctoes",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
}
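To read such a response programmatically, something like the following sketch could be used. It only parses the sample shown above (copied into SAMPLE); the function name summarize_health and the fields of its result are choices made for this example.

```python
import json

# The health response shown above, copied verbatim.
SAMPLE = """
{
  "cluster_name": "ctoes",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 2,
  "number_of_data_nodes": 2,
  "active_primary_shards": 5,
  "active_shards": 10,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0
}
"""


def summarize_health(raw: str) -> dict:
    """Reduce a _cluster/health response to a few values worth alerting on."""
    h = json.loads(raw)
    return {
        # green: every primary and replica shard is allocated
        "ok": h["status"] == "green" and not h["timed_out"],
        "nodes": h["number_of_nodes"],
        # active_shards counts primaries plus replicas
        "replica_shards": h["active_shards"] - h["active_primary_shards"],
        "pending_shards": h["initializing_shards"] + h["unassigned_shards"],
    }


if __name__ == "__main__":
    print(summarize_health(SAMPLE))
```

A yellow status means all primaries are allocated but some replicas are not; red means at least one primary shard is unallocated.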

Next, let's use it to get a feel for how it works.

4) Try it out

Create an index (roughly equivalent to creating a database)

For example:

[deployer@XXXXXXX0013 ~]$ curl -XPUT 'http://localhost:9200/test1?pretty' -d'

{
"settings":{
"number_of_shards":2,
"number_of_replicas":1
}
}
'
{
"acknowledged" : true
}

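Note: number_of_shards is fixed when the index is created and cannot be changed afterwards, whereas number_of_replicas can be changed later.

The test1 in the URL above is the name of the index (database) being created; change it as needed.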
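Changing the replica count later goes through the index's _settings endpoint. The sketch below only builds the request and does not send it; the base URL and the new replica count of 2 are assumptions for illustration.

```python
import json
import urllib.request


def replica_update_request(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build a PUT request that changes number_of_replicas on an existing index."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed values: a local node and the test1 index created above.
req = replica_update_request("http://localhost:9200", "test1", 2)

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

There is no equivalent request for number_of_shards; changing it means creating a new index and reindexing the data.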

Create a document

curl -XPUT 'http://localhost:9200/test1/table1/1' -d '
{ "first":"dewmobile",
"last":"technology",
"age":3000,
"about":"hello,world",
"interest":["basketball","music"]
}
'
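The response:
{"_index":"test1","_type":"table1","_id":"1","_version":1,"created":true}

This shows that the document was created successfully.

test1: the name of the index (database) created above

table1: the type name; a type corresponds to a table in a relational database

1: the document's primary key, chosen by you; you can also omit it and let ES assign one.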
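The index/type/id addressing can be captured in a small helper. This is an illustrative sketch; document_request is a made-up name, and it only computes the HTTP method and URL without contacting ES.

```python
from typing import Optional, Tuple


def document_request(base_url: str, index: str, doc_type: str,
                     doc_id: Optional[str] = None) -> Tuple[str, str]:
    """Return the (HTTP method, URL) pair used to store a document.

    With an explicit id the document is PUT to /index/type/id.
    Without one it is POSTed to /index/type and ES generates the id.
    """
    if doc_id is None:
        return "POST", f"{base_url}/{index}/{doc_type}"
    return "PUT", f"{base_url}/{index}/{doc_type}/{doc_id}"


if __name__ == "__main__":
    print(document_request("http://localhost:9200", "test1", "table1", "1"))
    print(document_request("http://localhost:9200", "test1", "table1"))
```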

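5) Install the database sync plugins

Our data lives in MongoDB, so this section focuses on syncing from a MongoDB source.

Plugin source: https://github.com/richardwilly98/elasticsearch-river-mongodb/

MongoDB River Plugin (by Richard Louapre)

Overview: this is a MongoDB sync plugin. MongoDB must be deployed as a replica set, because the plugin works by periodically reading MongoDB's oplog to pick up changes.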
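To make the oplog idea concrete, here is a simplified sketch of how oplog entries could be turned into ES bulk actions. It is not the plugin's actual code: the entries in SAMPLE_OPLOG are hand-written, only inserts ("i") and deletes ("d") are handled, and the function name oplog_to_bulk is invented for this example.

```python
import json

# Hand-written, simplified oplog entries (real entries carry more fields).
SAMPLE_OPLOG = [
    {"op": "i", "ns": "zapya_api.resources", "o": {"_id": 1, "name": "a"}},
    {"op": "d", "ns": "zapya_api.resources", "o": {"_id": 1}},
    {"op": "n", "ns": "", "o": {}},  # no-op entry, ignored below
]


def oplog_to_bulk(entries, index: str, doc_type: str) -> str:
    """Translate insert/delete oplog entries into bulk API lines."""
    lines = []
    for entry in entries:
        if entry["op"] == "i":
            doc = dict(entry["o"])
            doc_id = str(doc.pop("_id"))  # _id becomes the ES document id
            meta = {"_index": index, "_type": doc_type, "_id": doc_id}
            lines.append(json.dumps({"index": meta}))
            lines.append(json.dumps(doc))
        elif entry["op"] == "d":
            meta = {"_index": index, "_type": doc_type, "_id": str(entry["o"]["_id"])}
            lines.append(json.dumps({"delete": meta}))
        # Updates and other operations are skipped in this sketch.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(oplog_to_bulk(SAMPLE_OPLOG, "mongotest", "resources"), end="")
```

The oplog only exists when MongoDB runs as a replica set, which is why a standalone mongod cannot be used as a source.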

How do you install it? Two plugins are required.

1) Plugin 1

./bin/plugin --install elasticsearch/elasticsearch-mapper-attachments/2.0.0

2) Plugin 2

./bin/plugin --install com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/2.0.0

The installation output looks like this:

./bin/plugin --install com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/2.0.0
-> Installing com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/2.0.0...
Trying http://download.elasticsearch.org/com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/elasticsearch-river-mongodb-2.0.0.zip...
Trying http://search.maven.org/remotecontent?filepath=com/github/richardwilly98/elasticsearch/elasticsearch-river-mongodb/2.0.0/elasticsearch-river-mongodb-2.0.0.zip...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/com/github/richardwilly98/elasticsearch/elasticsearch-river-mongodb/2.0.0/elasticsearch-river-mongodb-2.0.0.zip...
Downloading .............................................................................................DONE
Installed com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/2.0.0 into /usr/local/elasticsearch_1.1.0/elasticsearch/elasticsearch-1.1.0/plugins/river-mongodb

3) Install the Elasticsearch MySQL plugin (JDBC river)

For details, see https://github.com/jprante/elasticsearch-river-jdbc, where a binary jar can be downloaded directly.

4) Install the MySQL driver jar (required!)

With that, the plugins are installed.

6) Use the plugin to register a sync job with ES

Template:

curl -XPUT localhost:9200/_river/mongo_resource/_meta -d '
{
"type":"mongodb",
"mongodb":{
"servers":
[{"host":"10.XX.XX.XX","port":"60004"}
],
"db":"zapya_api",
"collection":"resources"
},
"index":{
"name":"mongotest",
"type":"resources"
}}'

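If you see a response like the following, the river was created successfully. The _type field echoes the river name from the URL, so with the template above it reads mongo_resource; the output below came from a river named mongodb.

{"_index":"_river","_type":"mongodb","_id":"_meta","_version":1,"created":true}

The data is then imported into ES and the index is built.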
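When several collections need a river each, the definition can be generated rather than typed by hand. The sketch below reproduces the structure of the template above; the function names are invented for this example, and the host value is a placeholder that must be replaced with a real replica set member.

```python
import json


def mongodb_river_meta(host: str, port: str, db: str, collection: str,
                       index_name: str, index_type: str) -> dict:
    """Build the _meta document for a MongoDB river."""
    return {
        "type": "mongodb",
        "mongodb": {
            "servers": [{"host": host, "port": port}],
            "db": db,
            "collection": collection,
        },
        "index": {"name": index_name, "type": index_type},
    }


def river_meta_url(base_url: str, river_name: str) -> str:
    """URL that the _meta document is PUT to."""
    return f"{base_url}/_river/{river_name}/_meta"


if __name__ == "__main__":
    # "mongo-host" is a placeholder, not a real address.
    meta = mongodb_river_meta("mongo-host", "60004", "zapya_api",
                              "resources", "mongotest", "resources")
    print(river_meta_url("http://localhost:9200", "mongo_resource"))
    print(json.dumps(meta, indent=2))
```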



To import from MySQL instead, the template is:

[deployer@XXX0014 ~]$ curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
> "type":"jdbc",
> "jdbc":{
> "url":"jdbc:mysql://localhost:3306/fastooth",
> "user":"XXX",
> "password":"XXX",
> "sql":"select *,base62Decode(display_name) as name from users"
> }
> }
> '

The full set of parameters, with their defaults:

{
    "jdbc" :{
        "strategy" : "simple",
        "url" : null,
        "user" : null,
        "password" : null,
        "sql" : null,
        "schedule" : null,
        "poolsize" : 1,
        "rounding" : null,
        "scale" : 2,
        "autocommit" : false,
        "fetchsize" : 10, /* Integer.MIN for MySQL */
        "max_rows" : 0,
        "max_retries" : 3,
        "max_retries_wait" : "30s",
        "locale" : Locale.getDefault().toLanguageTag(),
        "index" : "jdbc",
        "type" : "jdbc",
        "bulk_size" : 100,
        "max_bulk_requests" : 30,
        "bulk_flush_interval" : "5s",
        "index_settings" : null,
        "type_mapping" : null
    }
}

The schedule parameter sets when the job runs. It takes a Quartz cron expression.

Format reference: http://www.quartz-scheduler.org/documentation/quartz-1.x/tutorials/crontrigger
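As an illustration, the sketch below builds a JDBC river definition with a schedule. The expression "0 0/10 * * * ?" (every ten minutes, at second 0) is an example value, not one taken from the original setup. The field-count check is a rough sanity test only and does not validate the expression fully; the function name and the placeholder credentials are assumptions.

```python
import json
from typing import Optional


def jdbc_river_meta(url: str, user: str, password: str, sql: str,
                    schedule: Optional[str] = None,
                    index: str = "jdbc", doc_type: str = "jdbc") -> dict:
    """Build the _meta document for a JDBC river, optionally scheduled."""
    jdbc = {"url": url, "user": user, "password": password,
            "sql": sql, "index": index, "type": doc_type}
    if schedule is not None:
        # Quartz cron: seconds minutes hours day-of-month month day-of-week [year]
        if len(schedule.split()) not in (6, 7):
            raise ValueError("a Quartz cron expression has 6 or 7 fields")
        jdbc["schedule"] = schedule
    return {"type": "jdbc", "jdbc": jdbc}


if __name__ == "__main__":
    meta = jdbc_river_meta("jdbc:mysql://localhost:3306/fastooth", "XXX", "XXX",
                           "select * from users", schedule="0 0/10 * * * ?")
    print(json.dumps(meta, indent=2))
```

Note that a Quartz expression has a leading seconds field, so a five-field Unix crontab line is rejected by the check.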

http://elasticsearch-users.115913.n3.nabble.com/Ann-JDBC-River-Plugin-for-ElasticSearch-td4019418.html

https://github.com/jprante/elasticsearch-river-jdbc/issues/186

Official documentation:

https://github.com/jprante/elasticsearch-river-jdbc/wiki/JDBC-River-parameters

https://github.com/jprante/elasticsearch-river-jdbc/wiki/Quickstart (includes how to delete a job)

Appendix: http://my.oschina.net/wenhaowu/blog/215219#OSC_h2_7

 

During testing, you may hit this error:

[7]: index [yyyy], type [rrrr], id [1964986], message [RemoteTransportException[[2sdfsdf][inet[/xxxxxxxxxx:9300]][bulk/shard]]; nested: EsRejectedExecutionException[rejected execution (queue capacity 50) on org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@3e82ee89]; ]

 

To fix it, edit the configuration file and append:

threadpool:
    bulk:
        type: fixed
        size: 60
        queue_size: 1000

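In this block, type: fixed gives the bulk thread pool a fixed number of threads, size is that number of threads, and queue_size is how many pending requests may wait in the queue before further ones are rejected. The error above reports a queue capacity of 50, which the bulk import overran.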
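The effect of queue_size can be shown with a toy model. The sketch below is not how ES implements its thread pools; it only models a bounded queue with no workers draining it, so a burst larger than the capacity is partly rejected. The class and function names are invented for this example.

```python
import queue


class BoundedWorkQueue:
    """Toy model of a fixed thread pool's request queue."""

    def __init__(self, queue_size: int):
        self._pending = queue.Queue(maxsize=queue_size)
        self.rejected = 0

    def submit(self, task) -> bool:
        """Queue a task, or count it as rejected if the queue is full."""
        try:
            self._pending.put_nowait(task)
            return True
        except queue.Full:
            self.rejected += 1
            return False


def count_rejections(queue_size: int, burst: int) -> int:
    """Submit `burst` tasks at once and return how many were rejected."""
    q = BoundedWorkQueue(queue_size)
    for task_id in range(burst):
        q.submit(task_id)
    return q.rejected


if __name__ == "__main__":
    print("queue_size=50:  ", count_rejections(50, 80), "rejected")
    print("queue_size=1000:", count_rejections(1000, 80), "rejected")
```

A larger queue absorbs bigger bursts at the cost of memory and latency; lowering the importer's bulk_size or max_bulk_requests attacks the same problem from the client side.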

References:

http://stackoverflow.com/questions/20683440/elasticsearch-gives-error-about-queue-size

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html

 

~~~~~~~~~~~~~~~

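On the client side, we use the Play framework. Just as a database needs a driver package, Play needs a module to talk to ES, and we found this one listed on the official website:

https://github.com/cleverage/play2-elasticsearch

For Chinese word segmentation, you can try Ansj.

Creating the index: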

curl -i -XPUT 'XXX:9200/fasth' -d '
{
"settings" :
{
"number_of_shards" : 3 ,
"number_of_replicas" : 1
}

}
'



Creating the mapping:

 

curl -i -XPUT  'http://localhost:9200/fa/users/_mapping' -d '
{

 "properties":
 {
  "_id":
  { 
  "type":"string",
  "index":"not_analyzed"
  },
  "name":
  {
  "type":"string"
  },
  "gender":
  {
  "type":"string",
  "index":"not_analyzed"
  },
  "primary_avatar":
  {
  "type":"string",
  "index":"not_analyzed"
  },
  "signature":
  {
  "type":"string",
  "index":"not_analyzed"
  }
 }

}
'
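A mapping like the one above is repetitive, so it can be generated from two lists of field names. This sketch assumes the ES 1.x string type used in this post; the function name string_mapping is invented, and the field lists are taken from the mapping above (without _id).

```python
import json


def string_mapping(analyzed, exact) -> dict:
    """Build a mapping body for string fields.

    analyzed: fields that go through the analyzer (full-text search)
    exact:    fields stored as a single term (not_analyzed, exact match only)
    """
    properties = {}
    for name in analyzed:
        properties[name] = {"type": "string"}
    for name in exact:
        properties[name] = {"type": "string", "index": "not_analyzed"}
    return {"properties": properties}


if __name__ == "__main__":
    body = string_mapping(
        analyzed=["name"],
        exact=["gender", "primary_avatar", "signature"],
    )
    print(json.dumps(body, indent=1))
```

Only name is analyzed here, so it is the one field that supports full-text queries; the others match on their exact value.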


 

Full import job:
curl -XPUT  'xxx:9200/_river/mysql_users/_meta' -d '
{
 "type":"jdbc",
 "jdbc":
 {
 "url":"jdbc:mysql://XXX:3306/fastooth",
 "user":"XXX",
 "password":"XXX",
 "sql":"select distinct _id,base62Decode(display_name) as name,gender,primary_avatar,signature from users",
 "index":"XXX",
 "type":"XXX"
 }
}
'

 http://www.nosqldb.cn/1368777378160.html
