...

From the BioCADDIE Results, we can see that the PubMed and Wikipedia expansion models provide some improvement, but not at the higher ranks. As is often the case with expansion, inspection of individual queries shows that, while some queries benefit from expansion, others do not.

The following plots illustrate the effect of varying the Dirichlet mu parameter (a) and the fbOrigWeight parameter for RM3 (b), PubMed RM3 (c), and Wikipedia RM3 (d) on NDCG@1000 and NDCG@20. For the RM3 models, all other parameters are fixed at their cross-validated values. The X-axis shows the BioCADDIE topics; the Y-axis is the difference in NDCG from the cross-validated QL/Dirichlet baseline. Each boxplot shows one point per parameter value, blue for lower and red for higher parameter values. The green dots are the cross-validated fbOrigWeight values for the RM3 models.
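As a rough illustration of the quantity on the Y-axis, the sketch below computes NDCG@k for a single topic and its difference from a baseline run. The gain and discount used here (graded relevance divided by log2(rank + 1)) are one common formulation and an assumption; the evaluation tool behind the plots may differ in details. The function names and the toy relevance grades are hypothetical.

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain over the top-k relevance grades."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))

def ndcg_at_k(ranked_gains, judged_gains, k):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(judged_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0

# Hypothetical relevance grades for one topic, in ranked order.
judged = [2, 1, 0]
baseline = ndcg_at_k([0, 2, 1], judged, k=20)   # baseline run
expanded = ndcg_at_k([2, 1, 0], judged, k=20)   # expanded run
delta = expanded - baseline                      # value plotted per topic
```

A positive delta means the run outperformed the baseline on that topic; a negative delta means it hurt.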

...

  • In plot (a) we see that NDCG decreases for topic T10 as mu increases, but for T13 varying mu has little effect.
  • In plot (b) we see that decreasing the fbOrigWeight (and therefore increasing the effect of the expansion terms) has a positive effect for topics T11 and T13, but a negative effect for topic T7.
  • In plot (c) we see that PubMed expansion has a negative effect for T3 as compared to RM3 expansion. PubMed expansion is riskier than RM3 expansion.
  • Similarly, in plot (d) we see that Wikipedia expansion in general has a negative effect compared to RM3 or PubMed RM3 expansion.
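The two parameters varied in these plots can be sketched as follows. This is a minimal illustration, assuming the standard Dirichlet-smoothed query likelihood estimate and a linear interpolation in which fbOrigWeight is the weight on the original query model; the function names and toy values are hypothetical and are not taken from the experiments.

```python
from collections import Counter

def dirichlet_prob(term, doc_tokens, collection_prob, mu):
    """Dirichlet-smoothed P(term | doc) = (tf + mu * P(term | C)) / (|doc| + mu)."""
    tf = Counter(doc_tokens)[term]
    return (tf + mu * collection_prob[term]) / (len(doc_tokens) + mu)

def rm3_mix(query_model, feedback_model, fb_orig_weight):
    """Interpolate the original query model with the feedback model.

    A lower fb_orig_weight gives the expansion terms more influence.
    """
    terms = set(query_model) | set(feedback_model)
    return {
        t: fb_orig_weight * query_model.get(t, 0.0)
        + (1.0 - fb_orig_weight) * feedback_model.get(t, 0.0)
        for t in terms
    }

# Toy values (hypothetical).
doc = ["gene", "expression", "gene"]
coll = {"gene": 0.01, "expression": 0.005}
p_gene = dirichlet_prob("gene", doc, coll, mu=1000)

mixed = rm3_mix({"gene": 1.0}, {"gene": 0.5, "protein": 0.5}, fb_orig_weight=0.5)
```

As mu grows, the document estimate is pulled toward the collection model; as fbOrigWeight shrinks, the expanded query drifts further from the original query.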

...

Looking instead at the higher ranks, we see a much more muted effect of expansion with the BioCADDIE test collection. As above, expansion appears to be effective for a small number of test queries (< 0.5).

(a) Dirichlet (mu); (b) RM3 (fbOrigWeight); (c) PubMed RM3 (fbOrigWeight); (d) Wikipedia RM3 (fbOrigWeight)

...

In each of these cases, the fbOrigWeight that controls the mixing of the original and feedback query is relatively fixed: it is learned on average from the training queries. In the next section, we explore whether we can reliably predict when to apply one model or another, or predict the fbOrigWeight mixing parameter, via query performance prediction methods.
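To make the contrast concrete, the sketch below compares a single weight chosen by average training performance with the best weight per query (an oracle that a predictor would try to approximate). The scores table is entirely hypothetical and only mirrors the qualitative pattern described for topics T7 and T11; the function names are illustrative.

```python
def best_average_weight(scores):
    """scores: {weight: {query_id: metric}}. Return the weight with the highest mean metric."""
    return max(scores, key=lambda w: sum(scores[w].values()) / len(scores[w]))

def per_query_oracle(scores):
    """Return, for each query, the weight that maximizes its metric."""
    queries = next(iter(scores.values())).keys()
    return {q: max(scores, key=lambda w: scores[w][q]) for q in queries}

# Hypothetical NDCG values for two fbOrigWeight settings.
scores = {
    0.2: {"T7": 0.30, "T11": 0.60},
    0.8: {"T7": 0.50, "T11": 0.45},
}
fixed_weight = best_average_weight(scores)
oracle_weights = per_query_oracle(scores)
```

The gap between the fixed weight and the per-query oracle is the headroom that per-query prediction could, at best, recover.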

...

A central goal of query performance prediction or query difficulty estimation is to identify features, both pre- and post-retrieval, that can be used to predict the performance of a query. This is generally done by predicting average precision. The predicted average precision can be used, for example, to select between two different models or to predict a parameter, such as the RM3 fbOrigWeight. Unfortunately, there are no comprehensive reviews of predictor effectiveness.
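For reference, the target quantity can be computed as below. This is a minimal sketch of (non-interpolated) average precision for one query, assuming binary relevance judgments; the document identifiers are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of precision@rank over the ranks where a relevant document appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)

ap = average_precision(["d1", "d2", "d3"], {"d1", "d3"})
```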

...

One approach explored by Lv and Zhai (2009) is to learn a model to predict the expansion mixing weight. They found six features to be predictive of the feedback weight in a linear model (in order of significance):

  • Topic model clarity (FBEnt_R3^): Relative entropy of the feedback document topic model compared to the collection
  • Exponentiated feedback clarity (FBEnt_R2^): Exponentiated relative entropy of the feedback documents compared to the collection
  • Divergence (QFBDiv_A): KL-divergence of query and feedback documents
  • Feedback radius (FBRadius): Average divergence between each document and the centroid of the feedback documents
  • Query clarity (QEnt_R1): Relative entropy of the query compared to the collection
  • Log query clarity (QEnt_R3): Log of the relative entropy of the query compared to the collection

All of these are post-retrieval predictors, some with significant overhead (marked ^).
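Several of these features are relative entropies between a language model and the collection model. The sketch below shows that basic computation, assuming both models are given as term-to-probability dictionaries and that every term in the model has non-zero collection probability. It is not Lv and Zhai's implementation, and the toy probabilities are hypothetical.

```python
import math

def relative_entropy(model, collection_model):
    """KL divergence (in bits) of a language model from the collection model."""
    return sum(
        p * math.log2(p / collection_model[term])
        for term, p in model.items()
        if p > 0
    )

# Toy models (hypothetical probabilities).
collection = {"gene": 0.01, "expression": 0.005, "the": 0.985}
query_model = {"gene": 0.5, "expression": 0.5}

query_clarity = relative_entropy(query_model, collection)
```

The same function can be applied to a model estimated from the feedback documents; a higher value indicates a model that is more distinct from the collection as a whole.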

There are a variety of other predictors, such as those discussed in Carmel and Yom-Tov's (2010) monograph. These include:

Pre-retrieval predictors:

  • IDF (mean, min, max, variance)
  • Inverse collection term frequency (ICTF: mean, min, max, variance)
  • Collection query similarity (SCQ, Zhao et al.)
  • Simplified clarity score (SCS)
  • Predictors requiring additional computation
    • Coherency (He et al.)
    • PMI/avgPMI/maxPMI
    • Term-weight variability (maxVar): Zhao et al., also Hauff
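Two of the cheaper pre-retrieval predictors can be sketched from collection statistics alone. The example below assumes IDF is computed as log(N / df) and that the simplified clarity score is the relative entropy between the maximum-likelihood query model and the collection model; other IDF variants and log bases exist. The function names and statistics are hypothetical.

```python
import math
import statistics
from collections import Counter

def idf_features(query_terms, doc_freq, num_docs):
    """Summary statistics of per-term IDF, assuming idf = log(N / df)."""
    idfs = [math.log(num_docs / doc_freq[t]) for t in query_terms]
    return {
        "mean": statistics.mean(idfs),
        "min": min(idfs),
        "max": max(idfs),
        "variance": statistics.pvariance(idfs),
    }

def simplified_clarity(query_terms, collection_prob):
    """Sum over query terms of P(t|Q) * log2(P(t|Q) / P(t|C)), with P(t|Q) = qtf / |Q|."""
    n = len(query_terms)
    return sum(
        (count / n) * math.log2((count / n) / collection_prob[t])
        for t, count in Counter(query_terms).items()
    )

# Toy collection statistics (hypothetical).
features = idf_features(["a", "b"], {"a": 10, "b": 100}, num_docs=1000)
scs = simplified_clarity(["a"], {"a": 0.25})
```

Neither function needs a retrieval run, which is why such predictors are inexpensive compared to the post-retrieval ones.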

Post-retrieval predictors:

  • Clarity (Cronen-Townsend, 2002)
  • Drift (Shtok, Kurland, Carmel, 2009)
  • Deviation (Perez-Iglesias and Araujo, 2010)
  • Absolute/relative divergence (Lv & Zhai)
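As one simple example of a post-retrieval signal, the sketch below computes the standard deviation of the top-k retrieval scores for a query. It is only in the spirit of the deviation-based predictors listed above and is not the exact formulation from any of the cited papers; the cutoff k and the scores are hypothetical.

```python
import statistics

def score_deviation(scores, k=100):
    """Population standard deviation of the top-k retrieval scores."""
    top = sorted(scores, reverse=True)[:k]
    return statistics.pstdev(top) if len(top) > 1 else 0.0

# Hypothetical retrieval scores for one query.
deviation = score_deviation([3.0, 1.0], k=100)
```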

References

Carmel, D., & Yom-Tov, E. (2010). Estimating the Query Difficulty for Information Retrieval. Synthesis Lectures on Information Concepts, Retrieval, and Services, 2(1), 1–89. http://doi.org/10.2200/S00235ED1V01Y201004ICR015

Lv, Y., & Zhai, C. (2009). Adaptive Relevance Feedback in Information Retrieval. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (pp. 255–264). New York, NY, USA: ACM. http://doi.org/10.1145/1645953.1645988

...