Thursday, July 31, 2014

Learning Lucene

Indexing:
Directory dir = FSDirectory.open(new File(indexPath));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_9);
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_4_9, analyzer);
if (create) {
    // Create a new index in the directory, removing any
    // previously indexed documents:
    iwc.setOpenMode(OpenMode.CREATE);
} else {
    // Add new documents to an existing index:
    iwc.setOpenMode(OpenMode.CREATE_OR_APPEND);
}
// iwc.setRAMBufferSizeMB(256.0);
IndexWriter writer = new IndexWriter(dir, iwc);
Document doc = new Document();
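doc.add(new LongField("modified", file.lastModified(), Field.Store.NO));
// New index: just add the document.
writer.addDocument(doc);
// Existing index: delete any document matching the "path" term, then add this one.
writer.updateDocument(new Term("path", file.getPath()), doc);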

Searching:
IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_9);
QueryParser parser = new QueryParser(Version.LUCENE_4_9, field, analyzer);
Query query = parser.parse(line);
searcher.search(query, null, 100);

TopDocs results = searcher.search(query, 5 * hitsPerPage);
ScoreDoc[] hits = results.scoreDocs;
int numTotalHits = results.totalHits;
Syste
QParser parser = QParser.getParser(rb.getQueryString(), defType, req);
Query q = parser.getQuery();

SolrIndexSearcher.QueryCommand cmd = rb.getQueryCommand();
cmd.setTimeAllowed(timeAllowed);
searcher.search(result, cmd);

SolrPluginUtils

How a Solr sharded query works:
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(ResponseBuilder, ShardRequest)

org.apache.solr.handler.component.QueryComponent.mergeIds(ResponseBuilder, ShardRequest)
// id to shard mapping, to eliminate any accidental dups
HashMap<Object,String> uniqueDoc = new HashMap<Object,String>();  

// Merge the docs via a priority queue so we don't have to sort *all* of the
// documents... we only need to order the top (rows+start)
ShardFieldSortedHitQueue queue = new ShardFieldSortedHitQueue(sortFields, ss.getOffset() + ss.getCount());
// this is a min heap

ShardDoc shardDoc = new ShardDoc();
shardDoc.id = id;
shardDoc.shard = srsp.getShard();
shardDoc.orderInShard = i;
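
To make the mergeIds idea above concrete for myself, here is a minimal sketch in plain Java (no Solr classes). It is not the Solr implementation: `ShardMergeSketch`, `Hit`, and `merge` are hypothetical names, sorting is by score only (the real queue compares on the request's sort fields), and I am assuming that a duplicate id is simply skipped once it has been seen from another shard. It keeps only the top (offset + count) hits in a min-heap, so the weakest kept hit is always the one evicted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of a cross-shard merge; not Solr's actual code.
final class ShardMergeSketch {

    // Simplified stand-in for ShardDoc: only the fields this sketch needs.
    static final class Hit {
        final Object id;
        final String shard;
        final int orderInShard;
        final float score;

        Hit(Object id, String shard, int orderInShard, float score) {
            this.id = id;
            this.shard = shard;
            this.orderInShard = orderInShard;
            this.score = score;
        }
    }

    // Merges per-shard hits and returns the requested page, best score first.
    static List<Hit> merge(Map<String, List<Hit>> shardHits, int offset, int count) {
        // id to shard mapping, used to drop duplicate ids (assumption: first seen wins)
        Map<Object, String> uniqueDoc = new HashMap<>();
        int limit = offset + count;
        // Min-heap on score: the head is the weakest hit currently kept.
        PriorityQueue<Hit> queue =
                new PriorityQueue<>(Math.max(1, limit), (a, b) -> Float.compare(a.score, b.score));

        for (List<Hit> hits : shardHits.values()) {
            for (Hit hit : hits) {
                if (uniqueDoc.putIfAbsent(hit.id, hit.shard) != null) {
                    continue; // same id already returned by another shard
                }
                queue.add(hit);
                if (queue.size() > limit) {
                    queue.poll(); // evict the weakest so only the top `limit` remain
                }
            }
        }

        // The heap drains weakest-first, so reverse to get best-first order.
        List<Hit> ranked = new ArrayList<>();
        while (!queue.isEmpty()) {
            ranked.add(queue.poll());
        }
        Collections.reverse(ranked);
        // Skip the first `offset` hits to get the requested page.
        return new ArrayList<>(ranked.subList(Math.min(offset, ranked.size()), ranked.size()));
    }

    public static void main(String[] args) {
        Map<String, List<Hit>> shards = new LinkedHashMap<>();
        shards.put("shard1", Arrays.asList(
                new Hit("a", "shard1", 0, 0.9f), new Hit("b", "shard1", 1, 0.5f)));
        shards.put("shard2", Arrays.asList(
                new Hit("c", "shard2", 0, 0.8f), new Hit("a", "shard2", 1, 0.7f)));
        List<Object> ids = new ArrayList<>();
        for (Hit h : merge(shards, 0, 2)) {
            ids.add(h.id);
        }
        System.out.println(ids); // prints [a, c]
    }
}
```

The point of the heap is that memory stays bounded by offset + count no matter how many docs the shards return, which matches the "we only need to order the top (rows+start)" comment.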

String path = doc.get("path");