...

  • Ability to create an index while controlling specific transformations (stemming, stopping, field storage, etc.)
  • Ability to index standard TREC collection formats as well as BioCADDIE JSON, XML, HTML data, etc.
  • Using a single index, ability to dynamically change retrieval models and parameters (e.g., as with IndriRunQuery)
  • Output in TREC format for evaluation using trec_eval and related tools
  • Ability to add new retrieval model implementations
  • Standard baselines for comparison
  • Handles standard TREC topic formats
  • Multi-threaded and distributed processing for parameter sweeps
    • Ideally, works with large collections, such as ClueWeb
  • Cross validation
  • Hypothesis/significance testing
  • Query performance prediction: implement the basics
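
To make the "output in TREC format" requirement concrete, here is a minimal sketch of writing ranked results in the six-column run format that trec_eval reads (query ID, the literal Q0, document ID, rank, score, run tag). The function name `format_trec_run`, the run tag, and the sample document IDs are hypothetical and used only for illustration; they are not part of Indri or of any existing tool.

```python
from typing import Dict, List, Tuple


def format_trec_run(
    results: Dict[str, List[Tuple[str, float]]],
    run_tag: str = "baseline",
    max_rank: int = 1000,
) -> List[str]:
    """Turn {query_id: [(docno, score), ...]} into TREC run-file lines.

    Hypothetical helper: each output line has the form
    "<qid> Q0 <docno> <rank> <score> <run_tag>".
    """
    lines = []
    for qid in sorted(results):
        # Highest score first; break ties on docno so output is deterministic.
        ranked = sorted(results[qid], key=lambda pair: (-pair[1], pair[0]))
        for rank, (docno, score) in enumerate(ranked[:max_rank], start=1):
            lines.append(f"{qid} Q0 {docno} {rank} {score:.6f} {run_tag}")
    return lines


# Made-up scores for a single topic, purely for illustration.
example = {"101": [("DOC-B", -5.25), ("DOC-A", -4.5)]}
for line in format_trec_run(example, run_tag="demo"):
    print(line)
```

Writing these lines to a file, one per line, gives a run that can be passed to trec_eval together with a qrels file.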

With Indri (and related tools) we can do the following:

...