We currently use Indri for our evaluation process. The goal of NDS-867 is to implement a similar evaluation framework based on ElasticSearch.

Requirements

Basic requirements for an evaluation framework:

  • Ability to create an index controlling for specific transformations (stemming, stopping, field storage, etc.)
  • Ability to index standard TREC collection formats as well as the BioCADDIE JSON or XML data
  • Using a single index, ability to dynamically change retrieval models and parameters (as IndriRunQuery allows; see the Lucene sketch after this list)
  • Output in TREC run format for evaluation using trec_eval and related tools
  • Ability to add new retrieval model implementations
  • Standard baselines for comparison
  • Handles standard TREC topic formats
  • Multi-threaded and distributed processing for parameter sweeps
  • Cross-validation
  • Hypothesis/significance testing
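As a reference point for the "single index, dynamic retrieval model" and TREC-output requirements above, here is a minimal Lucene sketch. The index path, field names ("contents", "docno"), topic ID, and run tag are placeholders; the model-swapping itself uses standard Lucene APIs (IndexSearcher.setSimilarity, BM25Similarity, LMDirichletSimilarity).

```java
import java.io.PrintWriter;
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.similarities.BM25Similarity;
import org.apache.lucene.search.similarities.Similarity;
import org.apache.lucene.store.FSDirectory;

public class RunQuery {
    public static void main(String[] args) throws Exception {
        // Open a single index once; the retrieval model is chosen per run.
        DirectoryReader reader =
                DirectoryReader.open(FSDirectory.open(Paths.get("/path/to/index")));
        IndexSearcher searcher = new IndexSearcher(reader);

        // Swap the retrieval model and its parameters at query time --
        // the Lucene analog of changing rules with IndriRunQuery.
        Similarity sim = new BM25Similarity(1.2f, 0.75f);
        // e.g., new LMDirichletSimilarity(2500f) for a language-model run
        searcher.setSimilarity(sim);

        QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
        Query query = parser.parse("information retrieval evaluation");

        // Write results in TREC run format: qid Q0 docno rank score tag
        TopDocs hits = searcher.search(query, 1000);
        try (PrintWriter out = new PrintWriter("bm25.run")) {
            int rank = 1;
            for (ScoreDoc sd : hits.scoreDocs) {
                String docno = searcher.doc(sd.doc).get("docno");
                out.printf("%s Q0 %s %d %.4f %s%n", "401", docno, rank++, sd.score, "bm25");
            }
        }
        reader.close();
    }
}
```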

With Indri (and related tools) we can do the following:

...

In short, there has been recent work to develop an evaluation framework around Lucene. We have some support for this in ir-utils, but it hasn't been widely used (we've always used the Indri implementation for consistency). So we have a choice: work with the lucene4ir workshop code, which is open source but was primarily developed for a single workshop, or continue working in ir-utils, since that's what we've got. In the latter case, we'd need to extend ir-utils with improved support for Lucene similarities (one possible shape for this is sketched below).
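A minimal sketch of what "improved support for Lucene similarities" in ir-utils could look like: a factory that maps a model name and parameter map (e.g., parsed from a run configuration) onto a Lucene Similarity. The class name, model keys, and parameter names here are hypothetical, not existing ir-utils code; only the Lucene similarity classes are real.

```java
import java.util.Map;

import org.apache.lucene.search.similarities.BM25Similarity;
import org.apache.lucene.search.similarities.ClassicSimilarity;
import org.apache.lucene.search.similarities.LMDirichletSimilarity;
import org.apache.lucene.search.similarities.LMJelinekMercerSimilarity;
import org.apache.lucene.search.similarities.Similarity;

/** Hypothetical ir-utils extension: select a Lucene Similarity by name. */
public class SimilarityFactory {
    public static Similarity fromParams(String model, Map<String, Float> p) {
        switch (model.toLowerCase()) {
            case "bm25":
                return new BM25Similarity(p.getOrDefault("k1", 1.2f),
                                          p.getOrDefault("b", 0.75f));
            case "dirichlet":
                return new LMDirichletSimilarity(p.getOrDefault("mu", 2500f));
            case "jm":
                return new LMJelinekMercerSimilarity(p.getOrDefault("lambda", 0.7f));
            default:
                return new ClassicSimilarity(); // Lucene's classic TF-IDF
        }
    }
}
```

This keeps parameter sweeps simple: the sweep driver varies the parameter map and reuses a single open index, setting the returned Similarity on the searcher for each run.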

Lucene4IR Framework

Supports the following:

  • Indexing parameters in XML format
  • Retrieval parameters in XML format (illustrative example below)
  • Index support for CACM, TRECAquaint, TRECNEWS, and Tipster formats
  • In addition to Lucene's built-in similarities: BM25L, Okapi BM25, and SMART BNNBNN
  • IndexerApp
  • RetrievalApp
  • RetrievalAppQueryExpansion
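For orientation, a retrieval parameter file in this style might look roughly like the following. The element names and values here are illustrative guesses, not copied from the repository; the actual schema is defined by the example parameter files shipped with the lucene4ir project.

```xml
<retrievalParams>
  <indexName>index/trec_aquaint</indexName>
  <queryFile>topics/topics.401-450</queryFile>
  <resultFile>runs/bm25.res</resultFile>
  <maxResults>1000</maxResults>
  <model>bm25</model>
  <k>1.2</k>
  <b>0.75</b>
</retrievalParams>
```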

Other notes

Re-reading Zhai's SLMIR (Statistical Language Models for Information Retrieval), I noticed it gives different ranges for the Okapi BM25 parameters.
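For reference, the standard form of the Okapi BM25 scoring function (the widely used formulation, not taken from SLMIR) is below; the commonly cited defaults are $k_1 \in [1.2, 2.0]$ and $b = 0.75$, so the ranges given in SLMIR would need to be compared against these.

$$
\mathrm{score}(D, Q) = \sum_{t \in Q} \log\frac{N - n_t + 0.5}{n_t + 0.5} \cdot
\frac{f(t, D)\,(k_1 + 1)}{f(t, D) + k_1\left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}
$$

where $f(t, D)$ is the frequency of term $t$ in document $D$, $N$ is the number of documents in the collection, $n_t$ is the document frequency of $t$, and $\mathrm{avgdl}$ is the average document length.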

...