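Solr in Practice | Demos of Common Solr Search and Query Scenarios
Overview
This article is for readers who already know the basics of Solr. It provides query demos for a range of scenarios, plus a few analytics and statistics queries. If you have not used Solr before, first read the recommended Solr quick start article at https://yq.aliyun.com/articles/727867. These demos were collected while answering questions from enterprise users, so there are not many of them. That in itself suggests that most enterprise search requirements are covered by Solr's basic features, which are easy to pick up. A few demos on sorted export and on analytics and statistics are also included.
Basic Query Demos
Download the Git project aliyun-apsaradb-hbase-demo: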
https://github.com/aliyun/aliyun-apsaradb-hbase-demo
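The Solr module contains SolrJ API examples for each common type of simple query, along with demos for cursor-based pagination, facet statistics, and more. You can adapt them directly into query logic that fits your own business.
Data preparation:
Create a collection
Use the SolrAddDocumentDemo.java example to insert 100 rows of sample data
Query demos included in SolrQueryDemo.java:
match all query: matches every document
term query: exact term match
wildcard query
fuzzy query
phrase query
proximity query
range query
multi condition query: arbitrary combinations of multiple conditions
pagination: regular paging and deep paging with a cursor
facet statistics, function query, and more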
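For orientation, the query types listed above correspond roughly to the following query strings in Solr's standard query parser. This is a minimal sketch and is not taken from SolrQueryDemo.java: the class name and the field names title_s, content_t, and price_i are made up for illustration. In SolrJ, each string would typically be passed to new SolrQuery(...).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative query strings for Solr's standard query parser.
// The field names (title_s, content_t, price_i) are hypothetical.
final class QuerySyntaxCheatSheet {

    static Map<String, String> examples() {
        Map<String, String> q = new LinkedHashMap<>();
        q.put("match all", "*:*");
        q.put("term", "title_s:solr");
        // * matches any run of characters, ? matches a single character
        q.put("wildcard", "title_s:so*r");
        // terms within edit distance 1 of "roam"
        q.put("fuzzy", "content_t:roam~1");
        // both terms, adjacent and in this order
        q.put("phrase", "content_t:\"apache solr\"");
        // both terms within 3 positions of each other
        q.put("proximity", "content_t:\"apache solr\"~3");
        // inclusive bounds; use * for an open end
        q.put("range", "price_i:[10 TO 100]");
        q.put("multi condition", "title_s:solr AND (price_i:[10 TO *] OR content_t:search)");
        return q;
    }

    public static void main(String[] args) {
        examples().forEach((type, query) -> System.out.println(type + " -> " + query));
    }
}
```

Whether a fuzzy, phrase, or proximity query behaves as expected depends on how the field is analyzed, so these strings should be tried against your own schema.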
Full Match and Export Demos
The following two demos use the pre-inserted data below:
{"id":"1","group_s":"group1","test_i":"5","test_l":"10"},
{"id":"2","group_s":"group1","test_i":"5","test_l":"1000"},
{"id":"3","group_s":"group1","test_i":"5","test_l":"1000"},
{"id":"4","group_s":"group1","test_i":"10","test_l":"10"},
{"id":"5","group_s":"group2","test_i":"5","test_l":"10"},
{"id":"6","group_s":"group2","test_i":"5","test_l":"10"},
{"id":"7","group_s":"group2","test_i":"5","test_l":"1000"},
{"id":"8","group_s":"group2","test_i":"5","test_l":"1000"},
{"id":"9","group_s":"group2","test_i":"10","test_l":"10"},
{"id":"10","group_s":"group3","test_i":"4","test_l":"7"},
{"id":"11","group_s":"group3","test_i":"3","test_l":"9"}
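Demo 1: Export the full, sorted result set of a query with complex conditions
For example, a company with 40 billion records needs to filter them by various matching conditions, export the results sorted by time, and hand the list to a third party for further use.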
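import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.CursorMarkParams;

CloudSolrClient solrClient = new CloudSolrClient.Builder().withZkHost("localhost:9983").withZkChroot("/").build();
String collection = "test";
String currentMark = null;
String nextCursorMark = CursorMarkParams.CURSOR_MARK_START;
do {
    currentMark = nextCursorMark;
    SolrQuery solrQuery = new SolrQuery("test_i:5 AND test_l:[999 TO *]");
    // cursorMark requires the sort to include the uniqueKey field (id) as a tie-breaker.
    solrQuery.setParam("sort", "test_i desc,id desc");
    solrQuery.add(CursorMarkParams.CURSOR_MARK_PARAM, currentMark);
    // One row per page keeps the demo output easy to follow; use a larger page size for real exports.
    solrQuery.setRows(1);
    QueryResponse response = solrClient.query(collection, solrQuery);
    nextCursorMark = response.getNextCursorMark();
    System.out.println(response);
    // The export is complete once the returned cursor mark stops changing.
} while (!nextCursorMark.equals(currentMark));
solrClient.close();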
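The termination rule of the cursor loop (stop when the returned mark equals the mark that was sent) can be tried without a cluster. The sketch below is a hedged illustration, not part of the demo project: Page, exportAll, and fakeFetch are hypothetical stand-ins for QueryResponse and solrClient.query(...), and the fake cursor mark is a plain offset, which is not how Solr encodes real marks. The fake data is the set of sample ids that match Demo 1 (2, 3, 7, 8), ordered by id descending on the assumption that id is a string field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Generic cursor-drain loop, decoupled from SolrJ so it can run without a cluster.
final class CursorExport {

    // Hypothetical stand-in for the parts of QueryResponse the loop needs.
    static final class Page {
        final List<String> docs;
        final String nextCursorMark;

        Page(List<String> docs, String nextCursorMark) {
            this.docs = docs;
            this.nextCursorMark = nextCursorMark;
        }
    }

    // Keeps requesting pages until the returned cursor mark equals the one that was sent.
    static List<String> exportAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String current;
        String next = "*"; // same value as CursorMarkParams.CURSOR_MARK_START
        do {
            current = next;
            Page page = fetchPage.apply(current);
            all.addAll(page.docs);
            next = page.nextCursorMark;
        } while (!next.equals(current));
        return all;
    }

    // In-memory fake that serves two ids per page; the mark is simply the next offset.
    static Page fakeFetch(String cursorMark) {
        List<String> sorted = List.of("8", "7", "3", "2");
        int from = cursorMark.equals("*") ? 0 : Integer.parseInt(cursorMark);
        int to = Math.min(from + 2, sorted.size());
        // An empty page returns the same mark, which ends the loop.
        String next = (from == to) ? cursorMark : String.valueOf(to);
        return new Page(sorted.subList(from, to), next);
    }

    public static void main(String[] args) {
        System.out.println(exportAll(CursorExport::fakeFetch)); // [8, 7, 3, 2]
    }
}
```

Note that the loop always issues one final request that returns no documents; that empty page is what signals the end of the export.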
Demo 2: Group the documents, keep the record with the minimum test_i from each group, and export all groups
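import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.CursorMarkParams;

public class CollapseExportDemo {
    public static void main(String[] args) throws Exception {
        CloudSolrClient solrClient = new CloudSolrClient.Builder().withZkHost("localhost:9983").withZkChroot("/").build();
        String collection = "test";
        String currentMark = null;
        String nextCursorMark = CursorMarkParams.CURSOR_MARK_START;
        do {
            currentMark = nextCursorMark;
            SolrQuery solrQuery = new SolrQuery("*:*");
            // Collapse on group_s; sort=$sort selects each group's head document using the request sort below.
            solrQuery.setParam("fq", "{!collapse field=group_s sort=$sort}");
            // Ascending test_i keeps the record with the minimum test_i in each group; id breaks ties.
            solrQuery.setParam("sort", "test_i asc,id asc");
            solrQuery.add(CursorMarkParams.CURSOR_MARK_PARAM, currentMark);
            solrQuery.setRows(1);
            QueryResponse response = solrClient.query(collection, solrQuery);
            nextCursorMark = response.getNextCursorMark();
            System.out.println(response);
        } while (!nextCursorMark.equals(currentMark));
        solrClient.close();
    }
}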
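To make the goal of Demo 2 concrete, the sketch below works out locally which record heads each group when the record with the smallest test_i is kept and ties are broken by id in ascending order. It is only an in-memory model of the selection logic on the 11 sample documents and does not call Solr. The class and method names are hypothetical, and comparing id as a string is an assumption about the schema.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory model of "collapse on group_s, keep the document with the smallest test_i".
final class CollapseSketch {

    static final class Doc {
        final String id;
        final String group;
        final int testI;

        Doc(String id, String group, int testI) {
            this.id = id;
            this.group = group;
            this.testI = testI;
        }
    }

    // id, group_s, and test_i of the sample documents.
    static final List<Doc> SAMPLE = List.of(
        new Doc("1", "group1", 5), new Doc("2", "group1", 5), new Doc("3", "group1", 5),
        new Doc("4", "group1", 10), new Doc("5", "group2", 5), new Doc("6", "group2", 5),
        new Doc("7", "group2", 5), new Doc("8", "group2", 5), new Doc("9", "group2", 10),
        new Doc("10", "group3", 4), new Doc("11", "group3", 3));

    // Smallest test_i first, then id compared as a string.
    static final Comparator<Doc> ORDER =
        Comparator.comparingInt((Doc d) -> d.testI).thenComparing(d -> d.id);

    // Returns group name -> id of the group's head document.
    static Map<String, String> groupHeads(List<Doc> docs) {
        return docs.stream().collect(Collectors.groupingBy(
            d -> d.group,
            TreeMap::new,
            Collectors.collectingAndThen(Collectors.minBy(ORDER), best -> best.get().id)));
    }

    public static void main(String[] args) {
        System.out.println(groupHeads(SAMPLE)); // {group1=1, group2=5, group3=11}
    }
}
```

With the sample data, the export should therefore contain exactly three documents, one per group.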
Aggregation and Statistics
The following two demos use the pre-inserted data below:
{"id":"1","group_s":"group1","test_i":"5","test_l":"10"},
{"id":"2","group_s":"group1","test_i":"5","test_l":"1000"},
{"id":"3","group_s":"group1","test_i":"5","test_l":"1000"},
{"id":"4","group_s":"group1","test_i":"10","test_l":"10"},
{"id":"5","group_s":"group2","test_i":"5","test_l":"10"},
{"id":"6","group_s":"group2","test_i":"5","test_l":"10"},
{"id":"7","group_s":"group2","test_i":"5","test_l":"1000"},
{"id":"8","group_s":"group2","test_i":"5","test_l":"1000"},
{"id":"9","group_s":"group2","test_i":"10","test_l":"10"},
{"id":"10","group_s":"group3","test_i":"4","test_l":"7"},
{"id":"11","group_s":"group3","test_i":"3","test_l":"9"}
Demo 1: Compute min, max, avg, and sum, either on a single field or on a function over multiple fields
Maximum:
curl "http://localhost:8983/solr/test/query" -d 'q=*:*&rows=0&json.facet={x:"max(sub(test_l,test_i))"}'
Average:
curl "http://localhost:8983/solr/test/query" -d 'q=*:*&rows=0&json.facet={x:"avg(sub(test_l,test_i))"}'
Here sub(test_l,test_i) means "test_l - test_i". For more functions, see the Function Queries section of the official Solr documentation:
http://lucene.apache.org/solr/guide/7_3/function-queries.html#function-queries
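As a cross-check for the two requests above, the following sketch recomputes the aggregations over test_l - test_i locally for the 11 sample documents, and builds a json.facet value in the same form for other aggregation names. The class and method names are hypothetical. The JSON Facet API also lists min and sum among its aggregation functions, but whether they accept a nested function such as sub(...) on your Solr version is an assumption that should be verified.

```java
import java.util.LongSummaryStatistics;
import java.util.stream.IntStream;

// Recomputes the aggregations over sub(test_l,test_i) locally for the sample documents.
final class SubStats {
    // test_i and test_l for ids 1..11, in order.
    static final long[] TEST_I = {5, 5, 5, 10, 5, 5, 5, 5, 10, 4, 3};
    static final long[] TEST_L = {10, 1000, 1000, 10, 10, 10, 1000, 1000, 10, 7, 9};

    static LongSummaryStatistics stats() {
        return IntStream.range(0, TEST_I.length)
            .mapToLong(i -> TEST_L[i] - TEST_I[i]) // same as sub(test_l,test_i)
            .summaryStatistics();
    }

    // Builds a json.facet value in the same form as the curl commands; agg is e.g. "min" or "sum".
    static String jsonFacet(String agg) {
        return "{x:\"" + agg + "(sub(test_l,test_i))\"}";
    }

    public static void main(String[] args) {
        LongSummaryStatistics s = stats();
        // min=0 max=995 sum=4004 avg=364.0
        System.out.println("min=" + s.getMin() + " max=" + s.getMax()
            + " sum=" + s.getSum() + " avg=" + s.getAverage());
        System.out.println(jsonFacet("min"));
    }
}
```

The values Solr returns for x should match these numbers, although the number formatting in the response may differ.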
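Demo 2: Facet the results by category to build sidebar navigation
The use case is the search box found on blogs and e-commerce sites. After the user enters a keyword, the site shows the list of matching results and also groups those results into categories with counts. The counts serve as secondary navigation, so the user can quickly narrow down to more relevant content. For example:
When you search for the keyword "solr" on the Alibaba Cloud Yunqi Community, the result page lists the related articles on the right and also shows category counts over all results that match "solr". The box in the upper-left corner, for example, shows how many of the matching articles fall under topics such as Blog, Q&A, and 聚能聊. A user who wants Q&A content can click the Q&A tab to narrow the search, which improves the search experience.
For an implementation, see the facetRangeDemo and facetFieldDemo examples in SolrQueryDemo.java in aliyun-apsaradb-hbase-demo. You can define various types of queries to produce category counts over the matching results: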
https://github.com/aliyun/aliyun-apsaradb-hbase-demo
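What a field facet returns is a count of matching documents per distinct field value. The sketch below models that locally on the group_s field of the 11 sample documents; it is an illustration only, with hypothetical class and method names, and is not the facetFieldDemo code from the repository. In SolrJ, a field facet is typically requested with SolrQuery.setFacet(true) and SolrQuery.addFacetField("group_s"), and read back with QueryResponse.getFacetField("group_s"); the demo project may do it differently.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Local model of a field facet: count matching documents per distinct field value.
final class FacetFieldSketch {
    // group_s values of sample documents 1..11, in order.
    static final List<String> GROUP_S = List.of(
        "group1", "group1", "group1", "group1",
        "group2", "group2", "group2", "group2", "group2",
        "group3", "group3");

    static Map<String, Long> facetCounts(List<String> values) {
        return values.stream().collect(
            Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(facetCounts(GROUP_S)); // {group1=4, group2=5, group3=2}
    }
}
```

In a sidebar, each key would become a navigation entry and each count the number shown next to it; clicking an entry would add a filter such as fq=group_s:group1 to the next request.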
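Summary
This article collects demo code for common Solr queries, along with some query examples for analytics and statistics. If you have other query requirements or questions, leave a comment. Future updates can add them so that other users can reuse them directly.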