The goal of this task is to run our baselines against the CLEF eHealth test collections.
https://sites.google.com/site/clefehealth2016/
Basic steps:
- Download the test collection. Put the data in the shared data directory.
- Create new build_index scripts in biocaddie/index for the collection.
- Build the index and put the output in the shared directory (but also keep a copy on your VM for performance).
- Convert the topics to Indri format. Check the converted topics into biocaddie/queries.
- Run the baselines (all non-feedback + RM3) and add the results to the Wiki.
When done, create a PR with your changes to the biocaddie repo and assign this ticket to Craig for review.
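
For the topic conversion step, a minimal sketch of what the converter might look like is below. It assumes the topics have already been parsed into (id, text) pairs; parsing the CLEF topic file itself is left out because its exact layout is not described here. The function names `clean_query_text` and `topics_to_indri` are hypothetical, and stripping punctuation and wrapping each query in `#combine(...)` are assumptions, not necessarily the convention used by the existing files in biocaddie/queries. The output uses the `<parameters>`/`<query>`/`<number>`/`<text>` layout that IndriRunQuery reads.

```python
import re
from xml.sax.saxutils import escape


def clean_query_text(text):
    """Lowercase the text and replace runs of non-alphanumeric characters
    with single spaces, so punctuation is not parsed as Indri query syntax."""
    return " ".join(re.sub(r"[^A-Za-z0-9]+", " ", text).lower().split())


def topics_to_indri(topics):
    """Render (topic_id, text) pairs as an Indri query parameter file.

    `topics` is any iterable of (id, text) pairs; how they are read from
    the CLEF topic file is outside the scope of this sketch.
    """
    lines = ["<parameters>"]
    for topic_id, text in topics:
        lines.append("  <query>")
        lines.append("    <number>{}</number>".format(escape(str(topic_id))))
        # Assumption: a plain #combine over the cleaned terms.
        lines.append("    <text>#combine({})</text>".format(clean_query_text(text)))
        lines.append("  </query>")
    lines.append("</parameters>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up topic, for illustration only.
    print(topics_to_indri([("101", "What's the cause of headaches?")]))
```

The cleaning step is deliberately aggressive: it drops apostrophes and hyphens too, which may or may not match how the index was tokenized, so it is worth comparing against the queries already checked in for other collections.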