elasticsearch-sql 示例

本贴最后更新于 2686 天前,其中的信息可能已经事过景迁

背景

elasticsearch-sql 项目是一个非常活跃的 ES SQL 插件项目,能够紧跟 ES 官方的最新版本。

目前 ES 官方稳定版本为 5.5.1, 开发版本为 6.x;elasticsearch-sql 支持从 1.7.6 到 5.5.1 的所有 ES 稳定版本。

SQL 语法

Query

SELECT * FROM bank WHERE age >30 AND gender = 'm'

Aggregation

select COUNT(*),SUM(age),MIN(age) as m, MAX(age),AVG(age) FROM bank GROUP BY gender ORDER BY SUM(age), m DESC

Delete

DELETE FROM bank WHERE age >30 AND gender = 'm'
SELECT address FROM bank WHERE address = matchQuery('880 Holmes Lane') ORDER BY _score DESC LIMIT 3

Aggregations

range age group 20-25,25-30,30-35,35-40

SELECT COUNT(age) FROM bank GROUP BY range(age, 20,25,30,35,40)

range date group by day

SELECT online FROM online GROUP BY date_histogram(field='insert_time','interval'='1d')

range date group by your config

SELECT online FROM online GROUP BY date_range(field='insert_time','format'='yyyy-MM-dd' ,'2014-08-18','2014-08-17','now-8d','now-7d','now-6d','now')

ES Geographic

SELECT * FROM locations WHERE GEO_BOUNDING_BOX(fieldname,100.0,1.0,101,0.0)

Select type

SELECT * FROM indexName/type

SQL Features

  • SQL Select
  • SQL Delete
  • SQL Where
  • SQL Order By
  • SQL Group By
  • SQL AND & OR
  • SQL Like
  • SQL COUNT distinct
  • SQL In
  • SQL Between
  • SQL Aliases
  • SQL Not Null
  • SQL(ES) Date
  • SQL avg()
  • SQL count()
  • SQL last()
  • SQL max()
  • SQL min()
  • SQL sum()
  • SQL Nulls
  • SQL isnull()
  • SQL now()
  • SQL floor
  • SQL split
  • SQL trim
  • SQL log
  • SQL log10
  • SQL substring
  • SQL round
  • SQL sqrt
  • SQL concat_ws
  • SQL union and minus

Beyond sql features

  • ES TopHits
  • ES MISSING
  • ES STATS
  • ES GEO_INTERSECTS
  • ES GEO_BOUNDING_BOX
  • ES GEO_DISTANCE
  • ES GEOHASH_GRID aggregation

运行方式

官方推荐两种运行方式

  1. 插件方式:通过 rest api 调用
http://localhost:9200/_sql?sql=select * from indexName limit 10
  1. WebUI 方式:通过 Web UI 调用
http://localhost:9200/_plugin/sql

安装

官方安装文档

测试

mvn test

开发

  1. 下载 github 代码
  2. 本地启动一个 ES 5.5
  3. 跑单元测试

算子示例

聚合算子

COUNT

SELECT COUNT(*) FROM testindex/account

SUM

SELECT SUM(balance) FROM testindex/account

MIN

SELECT MIN(age) FROM testindex/account

MAX

SELECT MAX(age) FROM testindex/account

AVG

SELECT AVG(age) FROM testindex/account

STATS

SELECT STATS(age) FROM testindex/account

EXTENDED_STATS

SELECT EXTENDED_STATS(age) FROM testindex/account

PERCENTILES

SELECT PERCENTILES(age) FROM testindex/account

SELECT PERCENTILES(age,25.0,75.0) x FROM testindex/account

GROUP BY

SELECT COUNT(*) FROM testindex/account GROUP BY gender

SELECT COUNT(*) FROM testindex/account GROUP BY gender, terms('field'='age','size'=200,'alias'='age')"

SELECT COUNT(*) FROM testindex/account GROUP BY gender, terms('alias'='ageAgg','field'='age','size'=3)
  • TERMS WITH SIZE
SELECT COUNT(*) FROM testindex/account GROUP BY terms('alias'='ageAgg','field'='age','size'=3)
  • TERMS WITH MISSING
SELECT count(*) FROM testindex/gotCharacters GROUP BY terms('alias'='nick','field'='nickname','missing'='no_nickname')
  • TERMS WITH ORDER
SELECT count(*) FROM testindex/dog GROUP BY terms('field'='dog_name', 'alias'='dog_name', order='desc')

SELECT count(*) FROM testindex/dog GROUP BY terms('field'='dog_name', 'alias'='dog_name', order='asc')
  • ORDER BY ASC
SELECT COUNT(*) FROM testindex/account GROUP BY age ORDER BY COUNT(*)
  • ORDER BY DESC
SELECT COUNT(*) FROM testindex/account GROUP BY age ORDER BY COUNT(*) DESC
  • LIMIT
SELECT COUNT(*) FROM testindex/account GROUP BY age ORDER BY COUNT(*) LIMIT 5
  • COUNT GROUP BY
SELECT COUNT(age) FROM testindex/account GROUP BY range(age, 20,25,30,35,40)
  • COUNT GROUP BY DATE
select insert_time from online group by date_histogram(field='insert_time','interval'='1.5h','format'='yyyy-MM','min_doc_count'=5)
  • COUNT GROUP BY DATE WITH ALIAS
select insert_time from online group by date_histogram(field='insert_time','interval'='1.5h','format'='yyyy-MM','alias'='myAlias')
  • COUNT DATE RANGE
select online from online group by date_range(field='insert_time','format'='yyyy-MM-dd' ,'2014-08-18','2014-08-17','now-8d','now-7d','now-6d','now')

TOP HITS

  • top hit
select topHits('size'=3,age='desc') from testindex/accounts group by gender
  • top hit with include
select topHits('size'=3,age='desc',include=age) from testindex/account group by gender
  • top hit with include two fields
select topHits('size'=3,'include'='age,firstname',age='desc') from testindex/account group by gender
  • top hit with exclude
select topHits('size'=3,'exclude'='lastname',age='desc') from testindex/account group by gender
  • top hit with include and exclude
select topHits('size'=3,'exclude'='lastname','include'='firstname,lastname',age='desc') from testindex/account group by gender

未完待续

从 SQL 到 ES Query

上面啰嗦了一大堆,或许你根本对于 JDBC 不感兴趣,也对直接传入 SQL 返回结果不感兴趣,而仅仅对 SQL 解析为 ES Query 部分感兴趣。

这样,可以从 SQL 到 ES Query,利用熟悉的 SQL 来探索复杂的 ES 语法。

示例

Talk is cheap, Show me the code.

pom.xml

	<dependencies>
		<dependency>
			<groupId>org.elasticsearch</groupId>
			<artifactId>elasticsearch</artifactId>
			<version>5.5.1</version>
		</dependency>
		<dependency>
			<groupId>org.elasticsearch.plugin</groupId>
			<artifactId>parent-join-client</artifactId>
			<version>5.5.1</version>
		</dependency>
		<dependency>
			<groupId>org.elasticsearch.client</groupId>
			<artifactId>transport</artifactId>
			<version>5.5.1</version>
		</dependency>
	</dependencies>

下载 zip 包放到工程的目录下

https://github.com/NLPchina/elasticsearch-sql/releases/download/5.5.1.0/elasticsearch-sql-5.5.1.0.zip

文件

import java.sql.SQLFeatureNotSupportedException;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.nlpcn.es4sql.SearchDao;
import org.nlpcn.es4sql.exception.SqlParseException;
import org.nlpcn.es4sql.query.SqlElasticSearchRequestBuilder;

public class MainElasticsearchSql {
    public static void main(String[] args) throws SQLFeatureNotSupportedException, SqlParseException {
        Settings settings = Settings.builder().put("client.transport.ignore_cluster_name", true).build();

        TransportClient client = new PreBuiltTransportClient(settings);
        SearchDao searchDao = new SearchDao(client);

        String query = String.format("SELECT COUNT(*) as count FROM index/type");

        SqlElasticSearchRequestBuilder select = (SqlElasticSearchRequestBuilder)searchDao.explain(query).explain();
        System.out.println(select);
    }
}

执行结果

{
  "from" : 0,
  "size" : 0,
  "_source" : {
    "includes" : [
      "COUNT"
    ],
    "excludes" : [ ]
  },
  "aggregations" : {
    "count" : {
      "value_count" : {
        "field" : "_index"
      }
    }
  }
}

  • Java

    Java 是一种可以撰写跨平台应用软件的面向对象的程序设计语言,是由 Sun Microsystems 公司于 1995 年 5 月推出的。Java 技术具有卓越的通用性、高效性、平台移植性和安全性。

    3190 引用 • 8214 回帖 • 1 关注
  • Elasticsearch

    Elasticsearch 是一个基于 Lucene 的搜索服务器。它提供了一个分布式多用户能力的全文搜索引擎,基于 RESTful 接口。Elasticsearch 是用 Java 开发的,并作为 Apache 许可条款下的开放源码发布,是当前流行的企业级搜索引擎。设计用于云计算中,能够达到实时搜索,稳定,可靠,快速,安装使用方便。

    117 引用 • 99 回帖 • 209 关注

相关帖子

欢迎来到这里!

我们正在构建一个小众社区,大家在这里相互信任,以平等 • 自由 • 奔放的价值观进行分享交流。最终,希望大家能够找到与自己志同道合的伙伴,共同成长。

注册 关于
请输入回帖内容 ...