Versions Compared

...

The following baselines were run with the combined, short topics on the biocaddie_all index. The baselines/ scripts swept the parameter combinations, and the mkeval.sh script performed leave-one-out cross-validation (LOOCV) for each desired metric.

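Changed cells are shown as old → new.

| Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
|---|---|---|---|---|---|---|---|---|
| TFIDF | 0.2524 → 0.2781 | 0.459 → 0.5324 | 0.4643 → 0.5405 | 0.4144 → 0.4593 | 0.3638 → 0.4024 | 0.3973 → 0.4369 | Sweep b and k1 | 5/17/17 |
| Okapi | 0.2548 → 0.2894 | 0.466 → 0.5629+ | 0.5+ → 0.5643 | 0.4414 → 0.4927+ | 0.3419 → 0.4029 | 0.3981 → 0.4516 | Sweep b, k1, k3 | 5/17/17 |
| QL (JM) | 0.2573 → 0.2661 | 0.5159 → 0.5311 | 0.5524+ → 0.5381 | 0.4886+ → 0.4853 | 0.3752 → 0.3852 | 0.4249 → 0.437 | Sweep lambda | 5/17/17 |
| QL (Dir) | 0.2837 → 0.2976+ | 0.5306 → 0.5656+ | 0.5667+ → 0.5643 | 0.5054+ → 0.5004 | 0.3981 → 0.4119 | 0.4541+ → 0.457 | Sweep mu | 5/17/17 |
| QL (TS) | 0.2794 → 0.2931+ | 0.5391 → 0.5653+ | 0.531 → 0.5714 | 0.4702 → 0.4827 | 0.391 → 0.4024 | 0.4455 → 0.4507 | Sweep mu and lambda | 5/17/17 |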

+ indicates significant improvement over TFIDF baselines (p < 0.05)
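
The sweep-then-LOOCV procedure described for these baselines can be sketched as follows. This is a minimal illustration, assuming per-topic metric values are already available for every parameter setting; the function name `loocv`, the `sweep` dictionary, and its numbers are hypothetical, and the actual logic in mkeval.sh may differ.

```python
from statistics import mean

def loocv(scores):
    """Leave-one-out cross-validation over a parameter sweep.

    scores maps each parameter setting to {topic_id: metric_value}.
    For every held-out topic, the setting with the best mean metric on
    the remaining topics is chosen, and its value on the held-out topic
    is recorded.
    """
    topics = sorted(next(iter(scores.values())))
    held_out = {}
    for topic in topics:
        train = [t for t in topics if t != topic]
        best = max(scores, key=lambda p: mean(scores[p][t] for t in train))
        held_out[topic] = scores[best][topic]
    return held_out

# Hypothetical per-topic average precision for two Dirichlet mu settings.
sweep = {
    "mu=500":  {"T1": 0.30, "T2": 0.10, "T3": 0.20},
    "mu=2500": {"T1": 0.20, "T2": 0.40, "T3": 0.25},
}
result = loocv(sweep)
print(result)                  # per-topic held-out values
print(mean(result.values()))   # the figure reported for the metric
```

The mean of the held-out values is what would be reported for a metric such as MAP; running the procedure once per metric allows a different best setting for each.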

The following results use the combined, orig topics on the biocaddie_all index. As above, the baselines/ scripts swept the parameter combinations, and the mkeval.sh script performed LOOCV for each desired metric.

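| Model | MAP | p-value | NDCG | p-value | P@20 | p-value | NDCG@20 | p-value | P@100 | p-value | NDCG@100 | p-value | Notes | Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TFIDF | 0.1661 | - | 0.3749 | - | 0.395 | - | 0.3234 | - | 0.2505 | - | 0.2962 | - | Sweep b and k1 | 5/16/17 |
| Okapi | 0.1348- | 0.9938 | 0.3279- | 0.9935 | 0.3225- | 0.9626 | 0.2558- | 0.9766 | 0.212 | 0.9482 | 0.2547- | 0.9717 | Sweep b, k1, k3 | 5/16/17 |
| QL (JM) | 0.1283- | 0.98 | 0.3388 | 0.7939 | 0.2825 | 0.9267 | 0.2079- | 0.9628 | 0.23 | 0.7278 | 0.2508 | 0.891 | Sweep lambda | 5/16/17 |
| QL (Dir) | 0.1447 | 0.9051 | 0.3689 | 0.5642 | 0.285- | 0.9688 | 0.2324- | 0.9676 | 0.2415 | 0.6319 | 0.2666 | 0.8402 | Sweep mu | 5/16/17 |
| QL (TS) | 0.1606 | 0.632 | 0.3724 | 0.5259 | 0.36 | 0.7977 | 0.304 | 0.738 | 0.2475 | 0.5424 | 0.2991 | 0.4605 | Sweep mu and lambda | 5/16/17 |
| RM3 | 0.1795 | 0.3093 | 0.373 | 0.5156 | 0.4025 | 0.4428 | 0.3234 | 0.8438 | 0.247 | 0.5367 | 0.2913 | 0.5462 | Sweep mu, fbDocs, fbTerms, and lambda | 5/22/17 |
| SDM | 0.1575 | 0.7099 | 0.3962 | 0.3021 | 0.42 | 0.3554 | 0.3446 | 0.2989 | 0.2695 | 0.2764 | 0.3045 | 0.3995 | Sweep mu, w1, w2, w3 | 5/22/17 |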

No model significantly improves over the TFIDF baseline when using the original queries.
- indicates a significant decrease in performance relative to the TFIDF baseline (p > 0.95)
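
The + and - markers correspond to a one-sided paired comparison against the baseline. The page does not say which test produced the p-values, so the sketch below swaps in an exact paired sign-flip randomization test as one plausible choice; the function names and the per-topic scores are hypothetical.

```python
from itertools import product

def paired_randomization_p(system, baseline):
    """One-sided exact sign-flip test on per-topic differences.

    Returns the fraction of sign assignments whose summed difference is
    at least as large as the observed one. Small values suggest the
    system beats the baseline; values near 1 suggest it is worse.
    Enumerates 2**n assignments, so only suitable for small n.
    """
    diffs = [s - b for s, b in zip(system, baseline)]
    observed = sum(diffs)
    count = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(sg * d for sg, d in zip(signs, diffs)) >= observed - 1e-12:
            count += 1
    return count / total

def marker(p):
    """Map a one-sided p-value to the table's + / - annotation."""
    if p < 0.05:
        return "+"
    if p > 0.95:
        return "-"
    return ""

# Hypothetical per-topic scores for six topics.
baseline = [0.20, 0.35, 0.10, 0.50, 0.25, 0.40]
system = [0.30, 0.45, 0.15, 0.60, 0.35, 0.45]

p_up = paired_randomization_p(system, baseline)
p_down = paired_randomization_p(baseline, system)
print(p_up, marker(p_up))
print(p_down, marker(p_down))
```

With real topic sets the exact enumeration becomes too large, and a sampled randomization test or a paired t-test would be the usual substitute.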


Feedback/SDM models

Green rows are runs with Garrick's framework; all other rows are runs with Craig's framework.

Changed cells are shown as old → new.

| Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
|---|---|---|---|---|---|---|---|---|
| QL (Dir) | 0.2837 → 0.2976+ | 0.5306 → 0.5656+ | 0.5667 → 0.5643 | 0.5054 → 0.5004 | 0.3981 → 0.4119 | 0.4541 → 0.457 | | 5/15/17 |
| QL (Dir) | 0.2853 | 0.5304 | 0.5667 | 0.5054 | 0.3986 | 0.4541 | Sweep mu={50, 250, 500, 1000, 2500, 5000, 10000} | 5/15/17 |
| RM3 | 0.2846 | 0.5381 | 0.5738 | 0.5054 | 0.4157 | 0.4710 | Sweep mu, fbDocs, fbTerms, and lambda | 5/15/17 |
| RM3 | 0.2982 → 0.2846 | 0.5381 → 0.5395 | 0.5452 → 0.5619 | 0.5152 → 0.5072 | 0.4038 → 0.4157 | 0.4768 → 0.4710 | Sweep mu={100, 250, 500, 750, 1000, 2000, 3000}, fbDocs={10, 25, 50, 75, 100}, fbTerms={10, 25, 50, 75, 100}, and lambda=[0.0, 1.0] | 5/16/17 |
| RM3 (stopped) | 0.3023 → 0.2948 | 0.5471 → 0.5834 | 0.5476 → 0.5571 | 0.5064 → 0.5004 | 0.3886 → 0.3805 | 0.463 → 0.4599 | Sweep mu, fbDocs, fbTerms, and lambda | 5/17/17 |
| RM3 (stopped) | 0.3001 → 0.3023 | 0.5423 → 0.5483 | 0.5833 → 0.5738 | 0.5150 → 0.5118 | 0.3938 → 0.3905 | 0.4534 → 0.4626 | Sweep mu={100, 250, 500, 750, 1000, 2000, 3000}, fbDocs={10, 25, 50, 75, 100}, fbTerms={10, 25, 50, 75, 100}, and lambda=[0.0, 1.0] | 5/16/17 |
| Pubmed | 0.3076+ → 0.2945 | 0.5853 → 0.5998+ | 0.5381 → 0.5190 | 0.4855 → 0.4759 | 0.4248 → 0.4190 | 0.4501 → 0.4607 | Use mu=2500 for pubmed query sweeping fbDocs, fbTerms, and lambda. Sweep mu for final retrieval. | 5/15/17 |
| Wikipedia | 0.2947 → 0.3051 | 0.5956 → 0.5894+ | 0.5738 → 0.5619 | 0.4956 → 0.4821 | 0.4062 → 0.3948 | 0.4487 → 0.4613 | Use mu=2500 for wikipedia query sweeping fbDocs, fbTerms, and lambda. Sweep mu for final retrieval. | 5/15/17 |
| SDM | 0.2874 | 0.5558 | 0.4833 | 0.4706 | 0.4019 | 0.4559 | Sweep mu, w1, w2, w3 | 5/16/17 |
| FDM | 0.293 | 0.5594+ | 0.5500 | 0.4960 | 0.3933 | 0.4560 | Sweep mu, w1, w2, w3 | 5/8/17 |
| OKAPI Exp | 0.3011 → 0.3189 | 0.5106 → 0.5745 | 0.5071 → 0.5857 | 0.4552 → 0.4894 | 0.3871 → 0.4238 | 0.4375 → 0.4783 | OKAPI expansion assuming k1=1 and k3=1, for now | 5/17/17 |

+ indicates significant improvement over QL (Dir) baselines (p < 0.05)
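
For a sense of scale, the RM3 sweep listed in the table notes can be enumerated as a parameter grid. This is a sketch only: the mu, fbDocs, and fbTerms values are taken from the notes, but the step size for lambda over [0.0, 1.0] is not given, so a step of 0.1 is assumed here, and the variable names are hypothetical.

```python
from itertools import product

# Values taken from the RM3 table notes.
mu_values = [100, 250, 500, 750, 1000, 2000, 3000]
fb_docs_values = [10, 25, 50, 75, 100]
fb_terms_values = [10, 25, 50, 75, 100]

# Assumption: lambda is swept over [0.0, 1.0] in steps of 0.1.
lambda_values = [round(i / 10, 1) for i in range(11)]

# One dictionary per retrieval run in the sweep.
grid = [
    {"mu": mu, "fbDocs": docs, "fbTerms": terms, "lambda": lam}
    for mu, docs, terms, lam in product(
        mu_values, fb_docs_values, fb_terms_values, lambda_values
    )
]

print(len(grid))   # number of runs under the assumed lambda step
print(grid[0])     # first combination in the sweep
```

Under these assumptions each topic set requires 7 x 5 x 5 x 11 runs, which is why the expansion models fix mu=2500 while sweeping the feedback parameters.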

Document expansion models

| Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
|---|---|---|---|---|---|---|---|---|
| QL (Dir) | 0.2853 | 0.5304 | 0.5667 | 0.5054 | 0.3986 | 0.4541 | Sweep mu | 5/15/17 |
| Pubmed QL | 0.2467 | 0.5416 | 0.4147 | 0.3500 | 0.3506 | 0.3608 | Sweep alpha | 5/22/17 |
| Pubmed RM3 | 0.2949 | 0.5806+ | 0.5643 | 0.4762 | 0.3819 | 0.4431 | Sweep mu, fbDocs, fbTerms, alpha, and lambda | 5/17/17 |
| Wikipedia QL | 0.2391 | 0.5267 | 0.4500 | 0.3727 | 0.3535 | 0.3610 | Sweep alpha | 5/22/17 |
| Wikipedia RM3 | 0.2800 | 0.5795+ | 0.5524 | 0.4712 | 0.3724 | 0.4254 | Sweep mu, fbDocs, fbTerms, alpha, and lambda | 5/17/17 |
| Pubmed+Wikipedia QL | 0.2467 | 0.5416 | 0.4147 | 0.3500 | 0.3459 | 0.3608 | Sweep alpha | 5/22/17 |
| Pubmed+Wikipedia RM3 | 0.2923 | 0.5725+ | 0.5667 | 0.4600 | 0.3933 | 0.4378 | Sweep mu, fbDocs, fbTerms, alpha, and lambda | 5/17/17 |


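Changed cells are shown as old → new.

| Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
|---|---|---|---|---|---|---|---|---|
| Dir (Krovetz) | 0.2944 → 0.2959+ | 0.5443+ | 0.5905+ | 0.5150 | 0.4033 → 0.4124+ | 0.4601 → 0.4655 | + means significant improvement over Dir (unstemmed) | 5/15/17 |
| RM3 (Krovetz) | 0.2858 → 0.3029 | 0.5592 → 0.5734 | 0.5643 | 0.4999 → 0.4714 | 0.3952 → 0.4033 | 0.4605 → 0.4753 | | 5/17/17 |
| Pubmed (Krovetz) | 0.3285+ | 0.6008+ | 0.5667 | 0.4821 | 0.4333+ | 0.4627 | + means improvement over Dir (Krovetz) | |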

...