...
E.g.: Rscript scripts/new/compare.R test tfidf dir short biocaddie
4. Results
Using biocaddie_all indexes.
Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
---|---|---|---|---|---|---|---|---|
classic tfidf | 0.3282 | 0.5824 | 0.6867 | 0.5478 | 0.5013 | 0.5018 | No parameters | 07/05/17 |
BM25 | 0.3543 | 0.6105+ | 0.7467+ | 0.5917+ | 0.506 | 0.5186 | Sweep b, k1 | 07/05/17 |
QL (JM) | 0.3382 | 0.6022 | 0.7233 | 0.571 | 0.5 | 0.4996 | Sweep lambda | 07/05/17 |
QL (Dir) | 0.3675 (p-value=0.0548) | 0.6163 (p-value=0.0502) | 0.6567 | 0.5664 | 0.5213 | 0.522 | Sweep mu | 07/05/17 |
Rocchio | 0.4044+ | 0.6417 (p-value=0.0533) | 0.6967 | 0.5403 | 0.492 | 0.4912 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/05/17 |
Using biocaddie_all.snowball indexes:
Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
---|---|---|---|---|---|---|---|---|
classic tfidf | 0.3375 | 0.5944 | 0.6667 | 0.5256 | 0.4987 | 0.5002 | No parameters | 07/05/17 |
BM25 | 0.3764+ | 0.6239+ | 0.73+ | 0.6006+ | 0.5413+ | 0.539+ | Sweep b, k1 | 07/05/17 |
QL (JM) | 0.3448 | 0.6058 | 0.67 | 0.5813+ | 0.4987 | 0.5289+ | Sweep lambda | 07/05/17 |
QL (Dir) | 0.3776+ | 0.6315+ | 0.7033 | 0.6006+ | 0.5307 | 0.5365+ | Sweep mu | 07/05/17 |
Rocchio | | | | | | | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/05/17 |
Verification
Using biocaddie_all indexes:
No Format |
---|
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf dir short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3675 p= 0.0548"
[1] "ndcg 0.5824 0.6163 p= 0.0502"
[1] "P_20 0.6867 0.6567 p= 0.9461"
[1] "ndcg_cut_20 0.5478 0.5664 p= 0.186"
[1] "P_100 0.5013 0.5213 p= 0.2168"
[1] "ndcg_cut_100 0.5018 0.522 p= 0.1401"
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf jm short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3382 p= 0.1719"
[1] "ndcg 0.5824 0.6022 p= 0.0932"
[1] "P_20 0.6867 0.7233 p= 0.0831"
[1] "ndcg_cut_20 0.5478 0.571 p= 0.145"
[1] "P_100 0.5013 0.5 p= 0.5301"
[1] "ndcg_cut_100 0.5018 0.4996 p= 0.5552"
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf bm25 short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3543 p= 0.0846"
[1] "ndcg 0.5824 0.6105 p= 0.0148"
[1] "P_20 0.6867 0.7467 p= 0.0491"
[1] "ndcg_cut_20 0.5478 0.5917 p= 0.0496"
[1] "P_100 0.5013 0.506 p= 0.428"
[1] "ndcg_cut_100 0.5018 0.5186 p= 0.2195"
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf rocchio short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.4044 p= 0.0188"
[1] "ndcg 0.5824 0.6417 p= 0.0533"
[1] "P_20 0.6867 0.6967 p= 0.3785"
[1] "ndcg_cut_20 0.5478 0.5403 p= 0.6276"
[1] "P_100 0.5013 0.492 p= 0.6184"
[1] "ndcg_cut_100 0.5018 0.4912 p= 0.6071" |
Using biocaddie_all.snowball indexes:
No Format |
---|
root@integration-1:~/biocaddie# Rscript scripts/new/compare.R test tfidf dir short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3776 p= 0.0387"
[1] "ndcg 0.5944 0.6315 p= 0.0072"
[1] "P_20 0.6667 0.7033 p= 0.1042"
[1] "ndcg_cut_20 0.5256 0.6006 p= 0.0046"
[1] "P_100 0.4987 0.5307 p= 0.0652"
[1] "ndcg_cut_100 0.5002 0.5365 p= 0.0207"
root@integration-1:~/biocaddie# Rscript scripts/new/compare.R test tfidf jm short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3448 p= 0.2782"
[1] "ndcg 0.5944 0.6058 p= 0.2069"
[1] "P_20 0.6667 0.67 p= 0.475"
[1] "ndcg_cut_20 0.5256 0.5813 p= 0.0161"
[1] "P_100 0.4987 0.4987 p= 0.5"
[1] "ndcg_cut_100 0.5002 0.5289 p= 0.0117"
root@integration-1:~/biocaddie# Rscript scripts/new/compare.R test tfidf bm25 short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3764 p= 0.0284"
[1] "ndcg 0.5944 0.6239 p= 0.011"
[1] "P_20 0.6667 0.73 p= 0.0331"
[1] "ndcg_cut_20 0.5256 0.6006 p= 0.0045"
[1] "P_100 0.4987 0.5413 p= 0.0326"
[1] "ndcg_cut_100 0.5002 0.539 p= 0.0149" |
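The result tables above can be cross-checked against this console output mechanically. The sketch below parses the `[1] "metric baseline score p= value"` lines and builds the metric cells of a table row, appending `+` where the p-value falls under a threshold. The 0.05 threshold is an assumption inferred from the numbers on this page, not something the page states, and the helper names `parse_compare_output` and `format_row` are hypothetical; they are not part of `compare.R`.

```python
import re

# Assumed significance threshold; the page does not state one explicitly.
ALPHA = 0.05

# Matches lines such as: [1] "map 0.3282 0.3675 p= 0.0548"
LINE_RE = re.compile(r'^\[1\] "(\S+) (\S+) (\S+) p= (\S+)"')

# Metric names in the order of the table columns (MAP ... NDCG@100).
METRICS = ["map", "ndcg", "P_20", "ndcg_cut_20", "P_100", "ndcg_cut_100"]


def parse_compare_output(text):
    """Return {metric: (baseline, score, p_value)} parsed from console output.

    Scores stay as strings so the printed precision (e.g. "0.506") is kept.
    """
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            metric, baseline, score, p_value = match.groups()
            results[metric] = (baseline, score, float(p_value))
    return results


def format_row(model, results, metrics=METRICS, alpha=ALPHA):
    """Build the model and metric cells of one row, adding '+' where p < alpha."""
    cells = [model]
    for metric in metrics:
        _, score, p_value = results[metric]
        cells.append(score + ("+" if p_value < alpha else ""))
    return " | ".join(cells) + " |"


# Console output copied from the tfidf vs. bm25 run on the biocaddie_all indexes.
SAMPLE = '''[1] "map 0.3282 0.3543 p= 0.0846"
[1] "ndcg 0.5824 0.6105 p= 0.0148"
[1] "P_20 0.6867 0.7467 p= 0.0491"
[1] "ndcg_cut_20 0.5478 0.5917 p= 0.0496"
[1] "P_100 0.5013 0.506 p= 0.428"
[1] "ndcg_cut_100 0.5018 0.5186 p= 0.2195"'''

if __name__ == "__main__":
    print(format_row("BM25", parse_compare_output(SAMPLE)))
    # prints: BM25 | 0.3543 | 0.6105+ | 0.7467+ | 0.5917+ | 0.506 | 0.5186 |
```

The printed cells match the BM25 row of the biocaddie_all table; the Notes and Date columns would still be filled in by hand.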