The skip list stored in Lucene's .doc file
Published: 2019-06-26

http://forfuture1978.iteye.com/blog/546841

See the figure:

Searching the source tree (lucene-6.5.1-src/lucene-6.5.1) for the skip list writer:

$ grep "skiplistwriter" * -ril
core/src/java/org/apache/lucene/codecs/lucene50/Lucene50PostingsFormat.java
core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java
core/src/java/org/apache/lucene/codecs/MultiLevelSkipListReader.java
core/src/java/org/apache/lucene/codecs/MultiLevelSkipListWriter.java

 

Test code location:

lucene-6.5.1-src/lucene-6.5.1

$ vim core/src/test/org/apache/lucene/codecs/lucene50/TestBlockPostingsFormat3.java

  /**
   * checks advancing docs
   */
  public void assertDocsSkipping(int docFreq, PostingsEnum leftDocs, PostingsEnum rightDocs) throws Exception {
    if (leftDocs == null) {
      assertNull(rightDocs);
      return;
    }
    int docid = -1;
    int averageGap = MAXDOC / (1+docFreq);
    int skipInterval = 16;

    while (true) {
      if (random().nextBoolean()) {
        // nextDoc()
        docid = leftDocs.nextDoc();
        assertEquals(docid, rightDocs.nextDoc());
      } else {
        // advance()
        int skip = docid + (int) Math.ceil(Math.abs(skipInterval + random().nextGaussian() * averageGap));
        docid = leftDocs.advance(skip);
        assertEquals(docid, rightDocs.advance(skip));
      }

      if (docid == DocIdSetIterator.NO_MORE_DOCS) {
        return;
      }
      // we don't assert freqs, they are allowed to be different
    }
  }

 

  /**
   * checks advancing docs + positions
   */
  public void assertPositionsSkipping(int docFreq, PostingsEnum leftDocs, PostingsEnum rightDocs) throws Exception {
    if (leftDocs == null || rightDocs == null) {
      assertNull(leftDocs);
      assertNull(rightDocs);
      return;
    }

    int docid = -1;
    int averageGap = MAXDOC / (1+docFreq);
    int skipInterval = 16;

    while (true) {
      if (random().nextBoolean()) {
        // nextDoc()
        docid = leftDocs.nextDoc();
        assertEquals(docid, rightDocs.nextDoc());
      } else {
        // advance()
        int skip = docid + (int) Math.ceil(Math.abs(skipInterval + random().nextGaussian() * averageGap));
        docid = leftDocs.advance(skip);
        assertEquals(docid, rightDocs.advance(skip));
      }

      if (docid == DocIdSetIterator.NO_MORE_DOCS) {
        return;
      }

      int freq = leftDocs.freq();
      assertEquals(freq, rightDocs.freq());
      for (int i = 0; i < freq; i++) {
        assertEquals(leftDocs.nextPosition(), rightDocs.nextPosition());
        // we don't compare the payloads, it's allowed that one is empty etc
      }
    }
  }
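
The two test methods above mix nextDoc() and advance() calls and check that both enums land on the same document. To make it clearer what advance() gains from skip data, here is a minimal sketch of a single-level, in-memory skip list over a sorted array of doc IDs. It is not Lucene's implementation: the classes found by the grep above write multi-level skip data into the postings files, whereas the class name SkipListSketch, its fields, and the single-level layout here are assumptions made purely for illustration. Only the interval of 16 (taken from the test) and the NO_MORE_DOCS sentinel (same value as DocIdSetIterator.NO_MORE_DOCS) come from Lucene.

```java
/**
 * Illustrative single-level skip list over a sorted doc-id array.
 * Hypothetical sketch only; not Lucene's on-disk skip format.
 */
class SkipListSketch {
    /** Same sentinel value as DocIdSetIterator.NO_MORE_DOCS. */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs;      // strictly increasing doc ids
    private final int skipInterval;
    private final int[] skipDocs;  // skipDocs[k] = last doc id of the k-th full block
    private int pos = -1;          // index of the current doc, -1 before the first call

    SkipListSketch(int[] sortedDocs, int skipInterval) {
        this.docs = sortedDocs.clone();
        this.skipInterval = skipInterval;
        // One skip entry per complete block of skipInterval docs.
        this.skipDocs = new int[docs.length / skipInterval];
        for (int k = 0; k < skipDocs.length; k++) {
            skipDocs[k] = docs[(k + 1) * skipInterval - 1];
        }
    }

    /** Moves to the next doc, or returns NO_MORE_DOCS when exhausted. */
    int nextDoc() {
        pos++;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    /** Moves to the first doc after the current one whose id is >= target. */
    int advance(int target) {
        // Block that contains the next unread position.
        int block = (pos + 1) / skipInterval;
        // Skip every block whose last doc id is still below the target.
        while (block < skipDocs.length && skipDocs[block] < target) {
            block++;
        }
        // Never move backwards: start at the later of "next unread" and "block start".
        int start = Math.max(pos + 1, block * skipInterval);
        pos = start - 1;
        // Finish with a linear scan inside the block.
        int doc;
        do {
            doc = nextDoc();
        } while (doc < target);
        return doc;
    }
}
```

With 100 docs whose IDs are multiples of 3 and an interval of 16, advance(200) jumps over four whole blocks via the skip entries and then scans only four docs before stopping at 201, instead of walking every doc from the current position.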

 

Reposted from: http://ozrzl.baihongyu.com/
