| Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
|---|---|---|---|---|---|---|---|---|
| classic tfidf | 0.3282 | 0.5824 | 0.6867 | 0.5478 | 0.5013 | 0.5018 | No parameters | 07/05/17 |
| BM25 | 0.3543 | 0.6105+ | 0.7467+ | 0.5917+ | 0.506 | 0.5186 | Sweep b, k1 | 07/05/17 |
| QL (JM) | 0.3382 | 0.6022 | 0.7233 | 0.571 | 0.5 | 0.4996 | Sweep lambda | 07/05/17 |
| QL (Dir) | 0.3675 (p-value=0.0548) | 0.6163+ (p-value=0.0502) | 0.6567 | 0.5664 | 0.5213 | 0.522 | Sweep mu | 07/05/17 |
| Rocchio | 0.4044+ | 0.6417 (p-value=0.0533) | 0.6967 | 0.5403 | 0.492 | 0.4912 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/05/17 |

| Model | MAP | NDCG | P@20 | NDCG@20 | P@100 | NDCG@100 | Notes | Date |
|---|---|---|---|---|---|---|---|---|
| classic tfidf | 0.3282 | 0.5824 | 0.6867 | 0.5478 | 0.5013 | 0.5018 | No parameters | 07/10/17 |
| classic tfidf (tfidf-snowball) | 0.3375 | 0.5944 | 0.6667 | 0.5256 | 0.4987 | 0.5002 | No parameters | 07/10/17 |
| BM25 | 0.3543 | 0.6105 | 0.7467 | 0.5917 | 0.506 | 0.5186 | Sweep b, k1 | 07/10/17 |
| BM25 (bm25-snowball) | 0.3764+ | 0.6239 | 0.73 | 0.6006 | 0.5413+ | 0.539+ | Sweep b, k1 | 07/10/17 |
| QL (JM) | 0.3382 | 0.6022 | 0.7233 | 0.571 | 0.5 | 0.4996 | Sweep lambda | 07/10/17 |
| QL (JM) (jm-snowball) | 0.3448 | 0.6058 | 0.67 | 0.5813 | 0.4987 | 0.5289+ | Sweep lambda | 07/10/17 |
| QL (Dir) | 0.3675 | 0.6163 | 0.6567 | 0.5664 | 0.5213 | 0.522 | Sweep mu | 07/10/17 |
| QL (Dir) (dir-snowball) | 0.3776 | 0.6315 (p-value=0.0534) | 0.7033+ | 0.6006+ | 0.5307 | 0.5365 | Sweep mu | 07/10/17 |
| Rocchio | 0.4044 | 0.6417 | 0.6967 | 0.5403 | 0.492 | 0.4912 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/11/17 |
| Rocchio (rocchio-snowball) | 0.3959 | 0.6052- | 0.7267 | 0.598+ | 0.5453+ | 0.525 | Sweep b, k1, fbTerms, fbDocs, fbOrigWeight | 07/11/17 |


Verification

Using biocaddie_all indexes:

No Format
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf dir short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3675 p= 0.0548"
[1] "ndcg 0.5824 0.6163 p= 0.0502"
[1] "P_20 0.6867 0.6567 p= 0.9461"
[1] "ndcg_cut_20 0.5478 0.5664 p= 0.186"
[1] "P_100 0.5013 0.5213 p= 0.2168"
[1] "ndcg_cut_100 0.5018 0.522 p= 0.1401"


thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf jm short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3382 p= 0.1719"
[1] "ndcg 0.5824 0.6022 p= 0.0932"
[1] "P_20 0.6867 0.7233 p= 0.0831"
[1] "ndcg_cut_20 0.5478 0.571 p= 0.145"
[1] "P_100 0.5013 0.5 p= 0.5301"
[1] "ndcg_cut_100 0.5018 0.4996 p= 0.5552"


thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf bm25 short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3543 p= 0.0846"
[1] "ndcg 0.5824 0.6105 p= 0.0148"
[1] "P_20 0.6867 0.7467 p= 0.0491"
[1] "ndcg_cut_20 0.5478 0.5917 p= 0.0496"
[1] "P_100 0.5013 0.506 p= 0.428"
[1] "ndcg_cut_100 0.5018 0.5186 p= 0.2195"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf rocchio short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.4044 p= 0.0188"
[1] "ndcg 0.5824 0.6417 p= 0.0533"
[1] "P_20 0.6867 0.6967 p= 0.3785"
[1] "ndcg_cut_20 0.5478 0.5403 p= 0.6276"
[1] "P_100 0.5013 0.492 p= 0.6184"
[1] "ndcg_cut_100 0.5018 0.4912 p= 0.6071"

Using biocaddie_all.snowball indexes

No Format
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball dir-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3776 p= 0.0387"
[1] "ndcg 0.5944 0.6315 p= 0.0072"
[1] "P_20 0.6667 0.7033 p= 0.1042"
[1] "ndcg_cut_20 0.5256 0.6006 p= 0.0046"
[1] "P_100 0.4987 0.5307 p= 0.0652"
[1] "ndcg_cut_100 0.5002 0.5365 p= 0.0207"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball jm-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3448 p= 0.2782"
[1] "ndcg 0.5944 0.6058 p= 0.2069"
[1] "P_20 0.6667 0.67 p= 0.475"
[1] "ndcg_cut_20 0.5256 0.5813 p= 0.0161"
[1] "P_100 0.4987 0.4987 p= 0.5"
[1] "ndcg_cut_100 0.5002 0.5289 p= 0.0117"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball bm25-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3375 0.3764 p= 0.0284"
[1] "ndcg 0.5944 0.6239 p= 0.011"
[1] "P_20 0.6667 0.73 p= 0.0331"
[1] "ndcg_cut_20 0.5256 0.6006 p= 0.0045"
[1] "P_100 0.4987 0.5413 p= 0.0326"
[1] "ndcg_cut_100 0.5002 0.539 p= 0.0149"


thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf-snowball rocchio-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3959 p= 0.0427"
[1] "ndcg 0.5824 0.6052 p= 0.2869"
[1] "P_20 0.6867 0.7267 p= 0.1189"
[1] "ndcg_cut_20 0.5478 0.598 p= 0.0424"
[1] "P_100 0.5013 0.5453 p= 0.1152"
[1] "ndcg_cut_100 0.5018 0.525 p= 0.2733"

Compare results between unstemmed and stemmed indexes:

No Format
thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test tfidf tfidf-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3282 0.3375 p= 0.1463"
[1] "ndcg 0.5824 0.5944 p= 0.1454"
[1] "P_20 0.6867 0.6667 p= 0.808"
[1] "ndcg_cut_20 0.5478 0.5256 p= 0.8715"
[1] "P_100 0.5013 0.4987 p= 0.5819"
[1] "ndcg_cut_100 0.5018 0.5002 p= 0.5652"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test dir dir-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3675 0.3776 p= 0.0842"
[1] "ndcg 0.6163 0.6315 p= 0.0534"
[1] "P_20 0.6567 0.7033 p= 0.0011"
[1] "ndcg_cut_20 0.5664 0.6006 p= 0.0222"
[1] "P_100 0.5213 0.5307 p= 0.1942"
[1] "ndcg_cut_100 0.522 0.5365 p= 0.0645"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test jm jm-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3382 0.3448 p= 0.2603"
[1] "ndcg 0.6022 0.6058 p= 0.3358"
[1] "P_20 0.7233 0.67 p= 0.8885"
[1] "ndcg_cut_20 0.571 0.5813 p= 0.2551"
[1] "P_100 0.5 0.4987 p= 0.55"
[1] "ndcg_cut_100 0.4996 0.5289 p= 0.0026"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test bm25 bm25-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.3543 0.3764 p= 0.0317"
[1] "ndcg 0.6105 0.6239 p= 0.0945"
[1] "P_20 0.7467 0.73 p= 0.8548"
[1] "ndcg_cut_20 0.5917 0.6006 p= 0.2775"
[1] "P_100 0.506 0.5413 p= 0.0209"
[1] "ndcg_cut_100 0.5186 0.539 p= 0.0441"

thphan@biocaddie-dev:/data/thphan/biocaddie$ Rscript scripts/new/compare.R test rocchio rocchio-snowball short biocaddie
Please enter run methods for comparison:
        0: both are Indri
        1: both are Lucene
        2: from is Indri, to is Lucene
        3: from is Lucene, to is Indri
1
[1] "map 0.4044 0.3959 p= 0.6841"
[1] "ndcg 0.6417 0.6052 p= 0.9667"
[1] "P_20 0.6967 0.7267 p= 0.2037"
[1] "ndcg_cut_20 0.5403 0.598 p= 0.0465"
[1] "P_100 0.492 0.5453 p= 0.0035"
[1] "ndcg_cut_100 0.4912 0.525 p= 0.0625"
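
Every compare.R result line above has the same shape: metric name, score of the "from" run, score of the "to" run, then a p-value. The Python sketch below shows one possible way to turn a pasted transcript into rows and attach a "+" or "-" marker like the ones in the results tables. It is not part of the biocaddie scripts: the names `parse_compare_output` and `marker` are hypothetical, and the marking rule (a one-sided p-value, "+" below 0.05, "-" above 0.95) is an assumption rather than something compare.R is known to implement.

```python
import re

# Matches the quoted payload of one compare.R result line, for example:
#   [1] "map 0.3282 0.3675 p= 0.0548"
LINE_RE = re.compile(r'"(\S+)\s+([\d.]+)\s+([\d.]+)\s+p=\s*([\d.]+)"')


def parse_compare_output(text):
    """Return one dict per result line; prompts and shell lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue
        metric, from_score, to_score, p_value = match.groups()
        rows.append({
            "metric": metric,
            "from": float(from_score),
            "to": float(to_score),
            "p": float(p_value),
        })
    return rows


def marker(p_value, alpha=0.05):
    """Assumed convention: one-sided p-value, '+' if small, '-' if close to 1."""
    if p_value < alpha:
        return "+"
    if p_value > 1 - alpha:
        return "-"
    return ""


if __name__ == "__main__":
    sample = '''
1
[1] "map 0.3282 0.3543 p= 0.0846"
[1] "ndcg 0.5824 0.6105 p= 0.0148"
'''
    for row in parse_compare_output(sample):
        print(row["metric"], row["to"], marker(row["p"]))
```

Because the parser only looks for the quoted `metric from to p= value` pattern, a whole terminal session can be pasted in unchanged, including the run-method prompt.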