...
Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
---|---|---|---|---|---|---|---|---|
classic tfidf | 0.3282 | 0.5824 | 0.6867 | 0.5478 | 0.5013 | 0.5018 | No parameters | 07/10/17 |
classic tfidf (tfidf-snowball) | 0.3375 | 0.5944 | 0.6667 | 0.5256 | 0.4987 | 0.5002 | No parameters | 07/10/17 |
BM25 | 0.3543 | 0.6105 | 0.7467 | 0.5917 | 0.506 | 0.5186 | Sweep b, k1 | 07/10/17 |
BM25 (bm25-snowball) | 0.3764+ | 0.6239 | 0.73 | 0.6006 | 0.5413+ | 0.539+ | Sweep b, k1 | 07/10/17 |
QL (JM) | 0.3382 | 0.6022 | 0.7233 | 0.571 | 0.5 | 0.4996 | Sweep lambda | 07/10/17 |
QL (JM) (jm-snowball) | 0.3448 | 0.6058 | 0.67 | 0.5813 | 0.4987 | 0.5289+ | Sweep lambda | 07/10/17 |
QL (Dir) | 0.3675 | 0.6163 | 0.6567 | 0.5664 | 0.5213 | 0.522 | Sweep mu | 07/10/17 |
QL (Dir) (dir-snowball) | 0.3776 | 0.6315 (p-value=0.0534) | 0.7033+ | 0.6006+ | 0.5307 | 0.5365 | Sweep mu | 07/10/17 |
Rocchio | 0.4044 | 0.6417 | 0.6967 | 0.5403 | 0.492 | 0.4912 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/1011/17 |
Rocchio (rocchio-snowball) | 0.3959 | 0.6052- | 0.7267 | 0.598+ | 0.5453+ | 0.525 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/1011/17 |
Verification
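Using biocaddie_all indexes, each stemmed run is compared against the tfidf-snowball baseline (rocchio-snowball is compared against unstemmed tfidf):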
...
No Format |
---|
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball dir-snowball short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3776 p= 0.0387"
[1] "ndcg 0.5944 0.6315 p= 0.0072"
[1] "P_20 0.6667 0.7033 p= 0.1042"
[1] "ndcg_cut_20 0.5256 0.6006 p= 0.0046"
[1] "P_100 0.4987 0.5307 p= 0.0652"
[1] "ndcg_cut_100 0.5002 0.5365 p= 0.0207"
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball jm-snowball short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3448 p= 0.2782"
[1] "ndcg 0.5944 0.6058 p= 0.2069"
[1] "P_20 0.6667 0.67 p= 0.475"
[1] "ndcg_cut_20 0.5256 0.5813 p= 0.0161"
[1] "P_100 0.4987 0.4987 p= 0.5"
[1] "ndcg_cut_100 0.5002 0.5289 p= 0.0117"
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball bm25-snowball short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3764 p= 0.0284"
[1] "ndcg 0.5944 0.6239 p= 0.011"
[1] "P_20 0.6667 0.73 p= 0.0331"
[1] "ndcg_cut_20 0.5256 0.6006 p= 0.0045"
[1] "P_100 0.4987 0.5413 p= 0.0326"
[1] "ndcg_cut_100 0.5002 0.539 p= 0.0149"
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf rocchio-snowball short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3959 p= 0.0427"
[1] "ndcg 0.5824 0.6052 p= 0.2869"
[1] "P_20 0.6867 0.7267 p= 0.1189"
[1] "ndcg_cut_20 0.5478 0.598 p= 0.0424"
[1] "P_100 0.5013 0.5453 p= 0.1152"
[1] "ndcg_cut_100 0.5018 0.525 p= 0.2733" |
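Comparison of unstemmed and stemmed (Snowball) indexes for each model. The + and - markers in the results table are consistent with p < 0.05 and p > 0.95, respectively, in these comparisons: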
No Format |
---|
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf tfidf-snowball short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3375 p= 0.1463"
[1] "ndcg 0.5824 0.5944 p= 0.1454"
[1] "P_20 0.6867 0.6667 p= 0.808"
[1] "ndcg_cut_20 0.5478 0.5256 p= 0.8715"
[1] "P_100 0.5013 0.4987 p= 0.5819"
[1] "ndcg_cut_100 0.5018 0.5002 p= 0.5652"
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test dir dir-snowball short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3675 0.3776 p= 0.0842"
[1] "ndcg 0.6163 0.6315 p= 0.0534"
[1] "P_20 0.6567 0.7033 p= 0.0011"
[1] "ndcg_cut_20 0.5664 0.6006 p= 0.0222"
[1] "P_100 0.5213 0.5307 p= 0.1942"
[1] "ndcg_cut_100 0.522 0.5365 p= 0.0645"
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test jm jm-snowball short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3382 0.3448 p= 0.2603"
[1] "ndcg 0.6022 0.6058 p= 0.3358"
[1] "P_20 0.7233 0.67 p= 0.8885"
[1] "ndcg_cut_20 0.571 0.5813 p= 0.2551"
[1] "P_100 0.5 0.4987 p= 0.55"
[1] "ndcg_cut_100 0.4996 0.5289 p= 0.0026"
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test bm25 bm25-snowball short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.3543 0.3764 p= 0.0317"
[1] "ndcg 0.6105 0.6239 p= 0.0945"
[1] "P_20 0.7467 0.73 p= 0.8548"
[1] "ndcg_cut_20 0.5917 0.6006 p= 0.2775"
[1] "P_100 0.506 0.5413 p= 0.0209"
[1] "ndcg_cut_100 0.5186 0.539 p= 0.0441"
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test rocchio rocchio-snowball short biocaddie
Please enter run methods for comparison:
0: both are Indri
1: both are Lucene
2: from is Indri, to is Lucene
3: from is Lucene, to is Indri
1
[1] "map 0.4044 0.3959 p= 0.6841"
[1] "ndcg 0.6417 0.6052 p= 0.9667"
[1] "P_20 0.6967 0.7267 p= 0.2037"
[1] "ndcg_cut_20 0.5403 0.598 p= 0.0465"
[1] "P_100 0.492 0.5453 p= 0.0035"
[1] "ndcg_cut_100 0.4912 0.525 p= 0.0625" |
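The mapping from the console output above to the + and - markers in the results table can be sketched in Python. This is only an illustration under stated assumptions: the p-values are treated as one-sided (the output includes values such as 0.5 and 0.9667, which suggests this, but the internals of `compare.R` are not shown here), and the 0.05 threshold is inferred from the table rather than documented. The names `LINE_RE`, `mark`, and `alpha` are hypothetical and are not part of the project scripts.

```python
import re

# Matches one result line of the console output, e.g.
#   [1] "ndcg_cut_20 0.5403 0.598 p= 0.0465"
# Groups: metric name, "from" score, "to" score, p-value.
LINE_RE = re.compile(r'"(\S+)\s+([\d.]+)\s+([\d.]+)\s+p=\s*([\d.]+)"')


def mark(line, alpha=0.05):
    """Return (metric, marked "to" score) for a result line, or None otherwise.

    Assumption: p is one-sided, so p < alpha is read as a significant
    improvement ("+") and p > 1 - alpha as a significant decrease ("-").
    """
    m = LINE_RE.search(line)
    if m is None:
        return None  # prompts, menu lines, and the typed "1" are skipped
    metric, to_score, p = m.group(1), m.group(3), float(m.group(4))
    if p < alpha:
        suffix = "+"
    elif p > 1 - alpha:
        suffix = "-"
    else:
        suffix = ""
    return metric, to_score + suffix


# Output of the rocchio vs. rocchio-snowball comparison, copied from above.
sample = '''Please enter run methods for comparison:
1
[1] "map 0.4044 0.3959 p= 0.6841"
[1] "ndcg 0.6417 0.6052 p= 0.9667"
[1] "P_20 0.6967 0.7267 p= 0.2037"
[1] "ndcg_cut_20 0.5403 0.598 p= 0.0465"
[1] "P_100 0.492 0.5453 p= 0.0035"
[1] "ndcg_cut_100 0.4912 0.525 p= 0.0625"'''

for line in sample.splitlines():
    result = mark(line)
    if result is not None:
        print(result[0], result[1])
```

Run on the rocchio comparison, the sketch marks ndcg with "-" and ndcg_cut_20 and P_100 with "+", which matches the Rocchio (rocchio-snowball) row of the table.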
...