Notes from an attempt to move from Indri to Lucene for evaluation.

Similarity

  • The LMSimilarity classes implement basic query likelihood (QL), not KL divergence.
  • Similarities are used at both index and query time.
  • At index time:
    • computeNorm: stores a per-document normalization value, later read via getNormValues
  • At query time:
    • computeWeight: called once per query
    • getValueForNormalization: returns the query normalization value; called once per query
    • score(): called for each matching document
    • exactSimScorer: scorer for exact matches (term and exact phrase queries)
    • sloppySimScorer: scorer for sloppy matches (sloppy phrase and span queries)
  • Document length is only accessible to the Similarity. Our re-ranking approach cannot read it unless it is explicitly stored as a field.
  • Lucene JM (Jelinek-Mercer smoothing) gives very different results from Indri JM; the cause has not been identified yet.
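
When comparing JM scores across systems, it may help to keep in mind that JM-smoothed query likelihood can be written in two algebraically related forms. The sketch below is plain Java with no Lucene or Indri dependency; the class name JmScoring, its method names, and the convention that lambda is the weight on the collection model are all assumptions made for illustration. The two forms differ only by a document-independent offset, so they rank documents identically but give different absolute scores. Which form and which lambda convention each system actually uses is not confirmed in these notes and should be checked against its source.

```java
public class JmScoring {
    // Direct form: sum over query terms of
    // log((1 - lambda) * tf/docLen + lambda * p(w|C)).
    // Assumption: lambda weights the collection model.
    public static double direct(int[] tf, int docLen, double[] pColl, double lambda) {
        double score = 0.0;
        for (int i = 0; i < tf.length; i++) {
            score += Math.log((1 - lambda) * tf[i] / docLen + lambda * pColl[i]);
        }
        return score;
    }

    // Ratio form: sum over query terms of
    // log(1 + ((1 - lambda) * tf/docLen) / (lambda * p(w|C))).
    // Terms absent from the document (tf = 0) contribute exactly 0.
    public static double ratio(int[] tf, int docLen, double[] pColl, double lambda) {
        double score = 0.0;
        for (int i = 0; i < tf.length; i++) {
            score += Math.log(1 + ((1 - lambda) * tf[i] / docLen) / (lambda * pColl[i]));
        }
        return score;
    }

    // Document-independent offset: direct = ratio + offset.
    public static double offset(double[] pColl, double lambda) {
        double sum = 0.0;
        for (double p : pColl) {
            sum += Math.log(lambda * p);
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] tf = {2, 0};                 // hypothetical term frequencies in one document
        double[] pColl = {0.01, 0.001};    // hypothetical collection probabilities
        int docLen = 100;
        double lambda = 0.4;
        System.out.println("direct = " + direct(tf, docLen, pColl, lambda));
        System.out.println("ratio  = " + ratio(tf, docLen, pColl, lambda));
        System.out.println("offset = " + offset(pColl, lambda));
    }
}
```

If two systems use different forms, their raw scores are not directly comparable, but their rankings should agree when lambda, document length, and collection statistics match. A difference in rankings would point to one of those inputs instead.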