Query performance prediction

A central goal of query performance prediction (also called query difficulty estimation) is to identify features, both pre- and post-retrieval, that can be used to predict the performance of a query. This is generally done by predicting average precision. The predicted average precision can then be used, for example, to select between two different models. Unfortunately, there are no comprehensive reviews of predictor effectiveness.

For BioCADDIE, we are focused on expansion models and are therefore primarily concerned with adaptive feedback. Lv and Zhai's (2009) approach, which estimates the feedback mixing parameter per query, seems to be the most applicable. This will require the following:

  • A framework for implementing baseline and custom predictors (ir-utils or otherwise).
  • The ability to generate a set of pre- and post-retrieval predictor values for each query across multiple collections. The output is a query-by-predictor matrix.
  • The ability to calculate the correlation (Pearson and Spearman) between each predictor and a given metric or parameter (e.g., RM3 lambda).
  • The ability to select features (manually or automatically) and to construct a predictive model (e.g., regression) using one or more predictors.
  • The ability to evaluate the predictive model via cross-validation.
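
The correlation and cross-validation steps listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the ir-utils implementation: the predictor names (`clarity`, `avg_idf`), the query IDs, and all numeric values are made up, and the "model" is a single-predictor least-squares line evaluated with leave-one-out cross-validation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def fit_line(x, y):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def loo_cv_rmse(x, y):
    """Leave-one-out cross-validated RMSE of the one-predictor model."""
    squared_errors = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((slope * x[i] + intercept - y[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical query-by-predictor matrix (all values invented).
matrix = {
    "q1": {"clarity": 1.2, "avg_idf": 4.1},
    "q2": {"clarity": 0.7, "avg_idf": 5.3},
    "q3": {"clarity": 2.1, "avg_idf": 3.8},
    "q4": {"clarity": 1.6, "avg_idf": 4.9},
    "q5": {"clarity": 0.9, "avg_idf": 6.0},
}
# Hypothetical per-query target, e.g. the best RM3 lambda found by a sweep.
best_lambda = {"q1": 0.5, "q2": 0.3, "q3": 0.8, "q4": 0.6, "q5": 0.4}

queries = sorted(matrix)
target = [best_lambda[q] for q in queries]
for name in ("clarity", "avg_idf"):
    column = [matrix[q][name] for q in queries]
    print(name,
          "pearson=%.3f" % pearson(column, target),
          "spearman=%.3f" % spearman(column, target),
          "loo_rmse=%.3f" % loo_cv_rmse(column, target))
```

With real data, the same loop would run over every predictor column and every collection; a multi-predictor regression and k-fold splits would likely be delegated to an existing statistics library rather than written by hand.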

Adaptive feedback

One approach explored by Lv and Zhai (2009) is to learn a model to predict the expansion mixing weight. They found six features to be predictive of the feedback weight in a linear model:

...