...

| Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
|---|---|---|---|---|---|---|---|---|
| classic tfidf | 0.3282 | 0.5824 | 0.6867 | 0.5478 | 0.5013 | 0.5018 | No parameters | 07/10/17 |
| classic tfidf (tfidf-snowball) | 0.3375 | 0.5944 | 0.6667 | 0.5256 | 0.4987 | 0.5002 | No parameters | 07/10/17 |
| BM25 | 0.3543 | 0.6105 | 0.7467 | 0.5917 | 0.506 | 0.5186 | Sweep b, k1 | 07/10/17 |
| BM25 (bm25-snowball) | 0.3764+ | 0.6239 | 0.73 | 0.6006 | 0.5413+ | 0.539+ | Sweep b, k1 | 07/10/17 |
| QL (JM) | 0.3382 | 0.6022 | 0.7233 | 0.571 | 0.5 | 0.4996 | Sweep lambda | 07/10/17 |
| QL (JM) (jm-snowball) | 0.3448 | 0.6058 | 0.67 | 0.5813 | 0.4987 | 0.5289+ | Sweep lambda | 07/10/17 |
| QL (Dir) | 0.3675 | 0.6163 | 0.6567 | 0.5664 | 0.5213 | 0.522 | Sweep mu | 07/10/17 |
| QL (Dir) (dir-snowball) | 0.3776 | 0.6315 (p-value=0.0534) | 0.7033+ | 0.6006+ | 0.5307 | 0.5365 | Sweep mu | 07/10/17 |
| Rocchio | 0.4044 | 0.6417 | 0.6967 | 0.5403 | 0.492 | 0.4912 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/11/17 |
| Rocchio (rocchio-snowball) | 0.3959 | 0.6052- | 0.7267 | 0.598+ | 0.5453+ | 0.525 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/11/17 |


Verification

Using biocaddie_all indexes:

...

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball dir-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3776 p= 0.0387"
[1] "ndcg 0.5944 0.6315 p= 0.0072"
[1] "P_20 0.6667 0.7033 p= 0.1042"
[1] "ndcg_cut_20 0.5256 0.6006 p= 0.0046"
[1] "P_100 0.4987 0.5307 p= 0.0652"
[1] "ndcg_cut_100 0.5002 0.5365 p= 0.0207"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball jm-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3448 p= 0.2782"
[1] "ndcg 0.5944 0.6058 p= 0.2069"
[1] "P_20 0.6667 0.67 p= 0.475"
[1] "ndcg_cut_20 0.5256 0.5813 p= 0.0161"
[1] "P_100 0.4987 0.4987 p= 0.5"
[1] "ndcg_cut_100 0.5002 0.5289 p= 0.0117"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball bm25-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3764 p= 0.0284"
[1] "ndcg 0.5944 0.6239 p= 0.011"
[1] "P_20 0.6667 0.73 p= 0.0331"
[1] "ndcg_cut_20 0.5256 0.6006 p= 0.0045"
[1] "P_100 0.4987 0.5413 p= 0.0326"
[1] "ndcg_cut_100 0.5002 0.539 p= 0.0149"
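
The comparison output above has a regular shape (`[1] "<metric> <from> <to> p= <p-value>"`), so it can be collected into records instead of being copied by hand. The following is a minimal sketch, assuming only that line shape. The names `Comparison`, `LINE_RE` and `parse_compare_output` are hypothetical helpers and are not part of `compare.R`.

```python
import re
from typing import List, NamedTuple


class Comparison(NamedTuple):
    metric: str      # e.g. "map", "ndcg_cut_20"
    from_run: float  # score of the first run named on the command line
    to_run: float    # score of the second run named on the command line
    p_value: float   # p-value as printed by the script


# Matches lines such as: [1] "map 0.3375 0.3776 p= 0.0387"
LINE_RE = re.compile(r'^\[1\] "(\S+) ([\d.]+) ([\d.]+) p= ([\d.]+)"$')


def parse_compare_output(text: str) -> List[Comparison]:
    """Keep only the result lines; the prompt and menu lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            metric, from_run, to_run, p_value = match.groups()
            rows.append(
                Comparison(metric, float(from_run), float(to_run), float(p_value))
            )
    return rows


# Transcript of the tfidf-snowball vs. bm25-snowball comparison shown above.
sample = '''Please enter run methods for comparison:
        1: both are Lucene
1
[1] "map 0.3375 0.3764 p= 0.0284"
[1] "ndcg 0.5944 0.6239 p= 0.011"
[1] "P_20 0.6667 0.73 p= 0.0331"
[1] "ndcg_cut_20 0.5256 0.6006 p= 0.0045"
[1] "P_100 0.4987 0.5413 p= 0.0326"
[1] "ndcg_cut_100 0.5002 0.539 p= 0.0149"
'''

rows = parse_compare_output(sample)
for row in rows:
    print(row.metric, row.from_run, row.to_run, row.p_value)
```

Matching the whole line rather than splitting on whitespace keeps the menu text and the typed `1` out of the result.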

Compare results between unstemmed and stemmed indexes:

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf rocchio-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3959 p= 0.0427"
[1] "ndcg 0.5824 0.6052 p= 0.2869"
[1] "P_20 0.6867 0.7267 p= 0.1189"
[1] "ndcg_cut_20 0.5478 0.598 p= 0.0424"
[1] "P_100 0.5013 0.5453 p= 0.1152"
[1] "ndcg_cut_100 0.5018 0.525 p= 0.2733"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf tfidf-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3375 p= 0.1463"
[1] "ndcg 0.5824 0.5944 p= 0.1454"
[1] "P_20 0.6867 0.6667 p= 0.808"
[1] "ndcg_cut_20 0.5478 0.5256 p= 0.8715"
[1] "P_100 0.5013 0.4987 p= 0.5819"
[1] "ndcg_cut_100 0.5018 0.5002 p= 0.5652"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test dir dir-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3675 0.3776 p= 0.0842"
[1] "ndcg 0.6163 0.6315 p= 0.0534"
[1] "P_20 0.6567 0.7033 p= 0.0011"
[1] "ndcg_cut_20 0.5664 0.6006 p= 0.0222"
[1] "P_100 0.5213 0.5307 p= 0.1942"
[1] "ndcg_cut_100 0.522 0.5365 p= 0.0645"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test jm jm-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3382 0.3448 p= 0.2603"
[1] "ndcg 0.6022 0.6058 p= 0.3358"
[1] "P_20 0.7233 0.67 p= 0.8885"
[1] "ndcg_cut_20 0.571 0.5813 p= 0.2551"
[1] "P_100 0.5 0.4987 p= 0.55"
[1] "ndcg_cut_100 0.4996 0.5289 p= 0.0026"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test bm25 bm25-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3543 0.3764 p= 0.0317"
[1] "ndcg 0.6105 0.6239 p= 0.0945"
[1] "P_20 0.7467 0.73 p= 0.8548"
[1] "ndcg_cut_20 0.5917 0.6006 p= 0.2775"
[1] "P_100 0.506 0.5413 p= 0.0209"
[1] "ndcg_cut_100 0.5186 0.539 p= 0.0441"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test rocchio rocchio-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.4044 0.3959 p= 0.6841"
[1] "ndcg 0.6417 0.6052 p= 0.9667"
[1] "P_20 0.6967 0.7267 p= 0.2037"
[1] "ndcg_cut_20 0.5403 0.598 p= 0.0465"
[1] "P_100 0.492 0.5453 p= 0.0035"
[1] "ndcg_cut_100 0.4912 0.525 p= 0.0625"
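
The page does not state how the `+` and `-` markers in the results table are assigned. The sketch below encodes one rule that is consistent with every marker shown: `+` when the printed p-value is below 0.05 and `-` when it is above 0.95. This rests on the assumption that `compare.R` reports a one-sided p-value for "the second run is better", and that 0.05 is the significance level. Both are inferences from the numbers, not documented behaviour, and `significance_marker` is a hypothetical helper.

```python
def significance_marker(p_value: float, alpha: float = 0.05) -> str:
    """Return '+', '-' or '' for a one-sided p-value.

    Assumed convention (inferred from the table, not documented):
    a small p-value means a significant gain, a p-value close to 1
    means a significant loss.
    """
    if p_value < alpha:
        return "+"
    if p_value > 1 - alpha:
        return "-"
    return ""


# (metric, rocchio, rocchio-snowball, p) copied from the rocchio vs.
# rocchio-snowball comparison above.
rocchio_vs_snowball = [
    ("map", 0.4044, 0.3959, 0.6841),
    ("ndcg", 0.6417, 0.6052, 0.9667),
    ("P_20", 0.6967, 0.7267, 0.2037),
    ("ndcg_cut_20", 0.5403, 0.598, 0.0465),
    ("P_100", 0.492, 0.5453, 0.0035),
    ("ndcg_cut_100", 0.4912, 0.525, 0.0625),
]

# Build the cell text for the stemmed row: score followed by its marker.
annotated = {
    metric: f"{stemmed}{significance_marker(p)}"
    for metric, _unstemmed, stemmed, p in rocchio_vs_snowball
}

for metric, cell in annotated.items():
    print(f"{metric}: {cell}")
```

Under these assumptions the generated cells match the Rocchio (rocchio-snowball) row of the table: `0.6052-` for NDCG, `0.598+` for NDCG@20 and `0.5453+` for P@100, with the remaining scores unmarked.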