public class EarlyTerminatingSortingCollector extends Collector

A Collector that early terminates collection of documents on a per-segment basis, if the segment was sorted according to the given Sort.

NOTE: the Collector detects sorted segments according to SortingMergePolicy, so it's best used in conjunction with it. Also, it collects up to a specified numDocsToCollect from each segment, and therefore is mostly suitable for use in conjunction with collectors such as TopDocsCollector, and not e.g. TotalHitCountCollector.

NOTE: If you wrap a TopDocsCollector that sorts in the same order as the index order, the returned TopDocs will be correct. However, the total hit count will be underestimated, since not all matching documents will have been collected.
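
The two notes above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: it does not use the Lucene API, each "segment" is modeled as a list of integer sort keys already stored in index sort order, and the names `EarlyTerminationSketch`, `Result`, and `search` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified model, not the Lucene API: each segment is a list of the sort
// keys of its matching documents, already in index sort order (ascending).
final class EarlyTerminationSketch {
    /** Top hits plus the number of documents that were actually collected. */
    static final class Result {
        final List<Integer> topHits;
        final int collectedCount;

        Result(List<Integer> topHits, int collectedCount) {
            this.topHits = topHits;
            this.collectedCount = collectedCount;
        }
    }

    /** Collects at most numHits matches per sorted segment, then keeps the best numHits overall. */
    static Result search(List<List<Integer>> sortedSegments, int numHits) {
        List<Integer> candidates = new ArrayList<>();
        int collected = 0;
        for (List<Integer> segment : sortedSegments) {
            int segmentCollected = 0;
            for (int key : segment) {
                if (segmentCollected == numHits) {
                    // Early termination: the segment is sorted, so its remaining
                    // documents cannot beat the ones already collected from it.
                    break;
                }
                candidates.add(key);
                segmentCollected++;
                collected++;
            }
        }
        Collections.sort(candidates);
        int end = Math.min(numHits, candidates.size());
        return new Result(new ArrayList<>(candidates.subList(0, end)), collected);
    }
}
```

In this model the top hits match what a full scan would return, while the collected count is smaller than the number of matching documents whenever a segment holds more than numHits matches. That is why a hit-counting collector is a poor fit for wrapping.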

NOTE: This Collector uses Sort.toString() to detect whether a segment was sorted with the same Sort. This has two implications:
- if a custom comparator is not implemented correctly and returns different identifiers for equivalent instances, this Collector will not detect sorted segments,
- if you change the IndexWriter's SortingMergePolicy to sort according to another criterion and both the old and the new Sorts have the same identifier, this Collector will incorrectly detect sorted segments.

Modifier and Type | Field and Description |
---|---|
protected Collector | in: The wrapped Collector |
protected int | numDocsToCollect: Number of documents to collect in each segment |
protected boolean | segmentSorted: True if the current segment being processed is sorted by sort |
protected int | segmentTotalCollect: Number of documents to collect in the current segment being processed |
protected Sort | sort: Sort used to sort the search results |

Constructor and Description |
---|
EarlyTerminatingSortingCollector(Collector in, Sort sort, int numDocsToCollect): Create a new EarlyTerminatingSortingCollector instance. |

Modifier and Type | Method and Description |
---|---|
boolean | acceptsDocsOutOfOrder(): Return true if this collector does not require the matching docIDs to be delivered in int sort order (smallest to largest) to Collector.collect(int). |
void | collect(int doc): Called once for every document matching a query, with the unbased document number. |
void | setNextReader(AtomicReaderContext context): Called before collecting from each AtomicReaderContext. |
void | setScorer(Scorer scorer): Called before successive calls to Collector.collect(int). |
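
protected final Collector in
The wrapped Collector

protected final Sort sort
Sort used to sort the search results

protected final int numDocsToCollect
Number of documents to collect in each segment

protected int segmentTotalCollect
Number of documents to collect in the current segment being processed

protected boolean segmentSorted
True if the current segment being processed is sorted by sort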

public EarlyTerminatingSortingCollector(Collector in, Sort sort, int numDocsToCollect)
Create a new EarlyTerminatingSortingCollector instance.
Parameters:
in - the collector to wrap
sort - the sort you are sorting the search results on
numDocsToCollect - the number of documents to collect on each segment. When wrapping a TopDocsCollector, this number should be the number of hits.

public void setScorer(Scorer scorer) throws IOException
Description copied from class: Collector
Called before successive calls to Collector.collect(int). Implementations that need the score of the current document (passed in to Collector.collect(int)) should save the passed-in Scorer and call scorer.score() when needed.
Specified by: setScorer in class Collector
Throws: IOException

public void collect(int doc) throws IOException
Description copied from class: Collector
Called once for every document matching a query, with the unbased document number.
Note: The collection of the current segment can be terminated by throwing a CollectionTerminatedException. In this case, the remaining docs of the current AtomicReaderContext will be skipped, and IndexSearcher will swallow the exception and continue collection with the next leaf.
Note: This is called in an inner search loop. For good search performance, implementations of this method should not call IndexSearcher.doc(int) or IndexReader.document(int) on every hit. Doing so can slow searches by an order of magnitude or more.
Specified by: collect in class Collector
Throws: IOException
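
The termination behavior described for collect can be sketched as a control-flow model. This is not the Lucene implementation: `SegmentTerminated` is a hypothetical stand-in for CollectionTerminatedException, `LimitedCollector` is a hypothetical collector that stops each segment after a fixed number of documents, and `searchAll` plays the role that the text attributes to IndexSearcher.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the control flow, not the Lucene classes themselves.
final class TerminationFlowSketch {
    /** Hypothetical stand-in for Lucene's CollectionTerminatedException. */
    static final class SegmentTerminated extends RuntimeException {}

    /** Hypothetical collector that stops each segment after maxPerSegment docs. */
    static final class LimitedCollector {
        final int maxPerSegment;
        final List<Integer> collected = new ArrayList<>();
        private int segmentCount;

        LimitedCollector(int maxPerSegment) {
            this.maxPerSegment = maxPerSegment;
        }

        /** Resets the per-segment counter, like setNextReader does for a new leaf. */
        void nextSegment() {
            segmentCount = 0;
        }

        void collect(int doc) {
            collected.add(doc);
            if (++segmentCount >= maxPerSegment) {
                throw new SegmentTerminated(); // skip the rest of this segment
            }
        }
    }

    /** Plays the searcher's role: swallows the exception and moves on to the next segment. */
    static List<Integer> searchAll(int[][] segments, LimitedCollector collector) {
        for (int[] segment : segments) {
            collector.nextSegment();
            try {
                for (int doc : segment) {
                    collector.collect(doc);
                }
            } catch (SegmentTerminated e) {
                // Remaining docs of this segment are skipped; continue with the next one.
            }
        }
        return collector.collected;
    }
}
```

The exception only ends the current segment's inner loop, so the search as a whole still visits every segment.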

public void setNextReader(AtomicReaderContext context) throws IOException
Description copied from class: Collector
Called before collecting from each AtomicReaderContext. All doc ids in Collector.collect(int) will correspond to IndexReaderContext.reader(). Add AtomicReaderContext.docBase to the current IndexReaderContext.reader()'s internal document id to re-base ids in Collector.collect(int).
Specified by: setNextReader in class Collector
Parameters:
context - next atomic reader context
Throws: IOException
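
The re-basing step amounts to a single addition. The sketch below is a stand-alone model rather than Lucene code: it assumes that a segment's docBase equals the total number of documents in all preceding segments, and the names `DocBaseSketch`, `docBases`, and `toGlobal` are hypothetical.

```java
// Simplified model: docBase is taken to be the index-wide id of a segment's first document.
final class DocBaseSketch {
    /** Computes each segment's docBase as the running sum of the preceding segment sizes. */
    static int[] docBases(int[] segmentSizes) {
        int[] bases = new int[segmentSizes.length];
        int next = 0;
        for (int i = 0; i < segmentSizes.length; i++) {
            bases[i] = next;
            next += segmentSizes[i];
        }
        return bases;
    }

    /** Re-bases a segment-local doc id to an index-wide doc id. */
    static int toGlobal(int docBase, int segmentLocalDoc) {
        return docBase + segmentLocalDoc;
    }
}
```

A collector that needs index-wide ids would typically remember the docBase it receives for the current segment and add it to each doc passed to collect.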

public boolean acceptsDocsOutOfOrder()
Description copied from class: Collector
Return true if this collector does not require the matching docIDs to be delivered in ascending docID order (smallest to largest) to Collector.collect(int).
Most Lucene Query implementations will visit matching docIDs in order. However, some queries (currently limited to certain cases of BooleanQuery) can achieve faster searching if the Collector allows them to deliver the docIDs out of order.
Many collectors don't mind getting docIDs out of order, so it's important to return true here.
Specified by: acceptsDocsOutOfOrder in class Collector

Copyright © 2000-2015 The Apache Software Foundation. All Rights Reserved.