...
Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
---|---|---|---|---|---|---|---|---|
classic tfidf | 0.3282 | 0.5824 | 0.6867 | 0.5478 | 0.5013 | 0.5018 | No parameters | 07/05/17 |
BM25 | 0.3543 | 0.6105+ | 0.7467+ | 0.5917+ | 0.506 | 0.5186 | Sweep b, k1 | 07/05/17 |
QL (JM) | 0.3382 | 0.6022 | 0.7233 | 0.571 | 0.5 | 0.4996 | Sweep lambda | 07/05/17 |
QL (Dir) | 0.3675 (p-value=0.0548) | 0.6163+ (p-value=0.0502) | 0.6567 | 0.5664 | 0.5213 | 0.522 | Sweep mu | 07/05/17 |
Rocchio | 0.4044+ | 0.6417 (p-value=0.0533) | 0.6967 | 0.5403 | 0.492 | 0.4912 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/05/17 |
...
Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
---|---|---|---|---|---|---|---|---|
classic tfidf | 0.3282 | 0.5824 | 0.6867 | 0.5478 | 0.5013 | 0.5018 | No parameters | 07/10/17 |
classic tfidf (tfidf-snowball) | 0.3375 | 0.5944 | 0.6667 | 0.5256 | 0.4987 | 0.5002 | No parameters | 07/10/17 |
BM25 | 0.3543 | 0.6105 | 0.7467 | 0.5917 | 0.506 | 0.5186 | Sweep b, k1 | 07/10/17 |
BM25 (bm25-snowball) | 0.3764+ | 0.6239 | 0.73 | 0.6006 | 0.5413+ | 0.539+ | Sweep b, k1 | 07/10/17 |
QL (JM) | 0.3382 | 0.6022 | 0.7233 | 0.571 | 0.5 | 0.4996 | Sweep lambda | 07/10/17 |
QL (JM) (jm-snowball) | 0.3448 | 0.6058 | 0.67 | 0.5813 | 0.4987 | 0.5289+ | Sweep lambda | 07/10/17 |
QL (Dir) | 0.3675 | 0.6163 | 0.6567 | 0.5664 | 0.5213 | 0.522 | Sweep mu | 07/10/17 |
QL (Dir) (dir-snowball) | 0.3776 | 0.6315 (p-value=0.0534) | 0.7033+ | 0.6006+ | 0.5307 | 0.5365 | Sweep mu | 07/10/17 |
Rocchio | 0.4044 | 0.6417 | 0.6967 | 0.5403 | 0.492 | 0.4912 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/11/17 |
Rocchio (rocchio-snowball) | 0.3959 | 0.6052- | 0.7267 | 0.598+ | 0.5453+ | 0.525 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/11/17 |
Verification
Using biocaddie_all indexes:
No Format |
---|
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf dir short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3282 0.3675 p= 0.0548" [1] "ndcg 0.5824 0.6163 p= 0.0502" [1] "P_20 0.6867 0.6567 p= 0.9461" [1] "ndcg_cut_20 0.5478 0.5664 p= 0.186" [1] "P_100 0.5013 0.5213 p= 0.2168" [1] "ndcg_cut_100 0.5018 0.522 p= 0.1401" |
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf jm short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3282 0.3382 p= 0.1719" [1] "ndcg 0.5824 0.6022 p= 0.0932" [1] "P_20 0.6867 0.7233 p= 0.0831" [1] "ndcg_cut_20 0.5478 0.571 p= 0.145" [1] "P_100 0.5013 0.5 p= 0.5301" [1] "ndcg_cut_100 0.5018 0.4996 p= 0.5552" |
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf bm25 short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3282 0.3543 p= 0.0846" [1] "ndcg 0.5824 0.6105 p= 0.0148" [1] "P_20 0.6867 0.7467 p= 0.0491" [1] "ndcg_cut_20 0.5478 0.5917 p= 0.0496" [1] "P_100 0.5013 0.506 p= 0.428" [1] "ndcg_cut_100 0.5018 0.5186 p= 0.2195" |
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf rocchio short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3282 0.4044 p= 0.0188" [1] "ndcg 0.5824 0.6417 p= 0.0533" [1] "P_20 0.6867 0.6967 p= 0.3785" [1] "ndcg_cut_20 0.5478 0.5403 p= 0.6276" [1] "P_100 0.5013 0.492 p= 0.6184" [1] "ndcg_cut_100 0.5018 0.4912 p= 0.6071" |
Using biocaddie_all.snowball indexes:
No Format |
---|
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball dir-snowball short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3375 0.3776 p= 0.0387" [1] "ndcg 0.5944 0.6315 p= 0.0072" [1] "P_20 0.6667 0.7033 p= 0.1042" [1] "ndcg_cut_20 0.5256 0.6006 p= 0.0046" [1] "P_100 0.4987 0.5307 p= 0.0652" [1] "ndcg_cut_100 0.5002 0.5365 p= 0.0207" |
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball jm-snowball short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3375 0.3448 p= 0.2782" [1] "ndcg 0.5944 0.6058 p= 0.2069" [1] "P_20 0.6667 0.67 p= 0.475" [1] "ndcg_cut_20 0.5256 0.5813 p= 0.0161" [1] "P_100 0.4987 0.4987 p= 0.5" [1] "ndcg_cut_100 0.5002 0.5289 p= 0.0117" |
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball bm25-snowball short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3375 0.3764 p= 0.0284" [1] "ndcg 0.5944 0.6239 p= 0.011" [1] "P_20 0.6667 0.73 p= 0.0331" [1] "ndcg_cut_20 0.5256 0.6006 p= 0.0045" [1] "P_100 0.4987 0.5413 p= 0.0326" [1] "ndcg_cut_100 0.5002 0.539 p= 0.0149" |
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball rocchio-snowball short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3282 0.3959 p= 0.0427" [1] "ndcg 0.5824 0.6052 p= 0.2869" [1] "P_20 0.6867 0.7267 p= 0.1189" [1] "ndcg_cut_20 0.5478 0.598 p= 0.0424" [1] "P_100 0.5013 0.5453 p= 0.1152" [1] "ndcg_cut_100 0.5018 0.525 p= 0.2733" |
Compare results between unstemmed and stemmed indexes:
No Format |
---|
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf tfidf-snowball short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3282 0.3375 p= 0.1463" [1] "ndcg 0.5824 0.5944 p= 0.1454" [1] "P_20 0.6867 0.6667 p= 0.808" [1] "ndcg_cut_20 0.5478 0.5256 p= 0.8715" [1] "P_100 0.5013 0.4987 p= 0.5819" [1] "ndcg_cut_100 0.5018 0.5002 p= 0.5652" |
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test dir dir-snowball short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3675 0.3776 p= 0.0842" [1] "ndcg 0.6163 0.6315 p= 0.0534" [1] "P_20 0.6567 0.7033 p= 0.0011" [1] "ndcg_cut_20 0.5664 0.6006 p= 0.0222" [1] "P_100 0.5213 0.5307 p= 0.1942" [1] "ndcg_cut_100 0.522 0.5365 p= 0.0645" |
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test jm jm-snowball short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3382 0.3448 p= 0.2603" [1] "ndcg 0.6022 0.6058 p= 0.3358" [1] "P_20 0.7233 0.67 p= 0.8885" [1] "ndcg_cut_20 0.571 0.5813 p= 0.2551" [1] "P_100 0.5 0.4987 p= 0.55" [1] "ndcg_cut_100 0.4996 0.5289 p= 0.0026" |
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test bm25 bm25-snowball short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.3543 0.3764 p= 0.0317" [1] "ndcg 0.6105 0.6239 p= 0.0945" [1] "P_20 0.7467 0.73 p= 0.8548" [1] "ndcg_cut_20 0.5917 0.6006 p= 0.2775" [1] "P_100 0.506 0.5413 p= 0.0209" [1] "ndcg_cut_100 0.5186 0.539 p= 0.0441" |
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test rocchio rocchio-snowball short biocaddie Please enter run methods for comparison: 0: both are Indri 1: both are Lucene 2: from is Indri, to is Lucene 3: from is Lucene, to is Indri 1 [1] "map 0.4044 0.3959 p= 0.6841" [1] "ndcg 0.6417 0.6052 p= 0.9667" [1] "P_20 0.6967 0.7267 p= 0.2037" [1] "ndcg_cut_20 0.5403 0.598 p= 0.0465" [1] "P_100 0.492 0.5453 p= 0.0035" [1] "ndcg_cut_100 0.4912 0.525 p= 0.0625" |
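The `compare.R` output above prints one line per metric in the form `[1] "metric from_score to_score p= value"`. The sketch below shows one way to pull those lines out of a pasted terminal dump and to attach the `+` marker used in the results tables. It is not part of the project scripts: the function names `parse_compare_output` and `annotate` are hypothetical, and the 0.05 threshold is an assumption inferred from which table cells carry a `+`.

```python
import re

# Matches one R print line such as: [1] "map 0.3282 0.3675 p= 0.0548"
LINE_RE = re.compile(r'\[1\]\s+"(\S+)\s+([\d.]+)\s+([\d.]+)\s+p=\s*([\d.]+)"')


def parse_compare_output(text):
    """Return (metric, from_score, to_score, p_value) tuples found in text."""
    return [
        (metric, float(from_score), float(to_score), float(p_value))
        for metric, from_score, to_score, p_value in LINE_RE.findall(text)
    ]


def annotate(score, p_value, alpha=0.05):
    """Append '+' to a score when p_value is below alpha (assumed threshold)."""
    marker = "+" if p_value < alpha else ""
    return f"{score:g}{marker}"


if __name__ == "__main__":
    sample = (
        '[1] "map 0.3282 0.3543 p= 0.0846" '
        '[1] "ndcg 0.5824 0.6105 p= 0.0148"'
    )
    for metric, _, to_score, p_value in parse_compare_output(sample):
        print(metric, annotate(to_score, p_value))
```

Run on the two sample lines (taken from the tfidf vs. BM25 comparison), this leaves MAP unmarked and marks NDCG with `+`, matching the BM25 row of the first table.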