5/25/2017
Notes from NDS/BioCADDIE team meeting. This meeting is primarily to plan for the next sprint. The following are up for discussion:
- Evaluation framework -- where should we go from here?
- Clean-up/prune ir-utils
- Lucene-centric evaluation
- Distributed evaluation
- Stemming in ES
- Performance characterization (recommended by Kirk)
- ES RM plugin
- Custom scoring exploration
- VM resources:
- SDSC vs NCSA
- Shared data directories
- New ideas?
- Boolean/"sufficient" query
- Structured search (using the document structure somehow)
- Try other collections (UMLS/MeSH, medical subsets)
- Analyze relevance judgments
- Compare baselines against medical collections
- TREC CDS (note: this is the PubMed Open Access collection)
- CLEF eHealth
- OHSUMED
- Cluster-based expansion models
5/23/2017
Notes from BioCADDIE core developer meeting
- Presented status update
- BioCADDIE is running ES 1.7.5 in production, but uses more recent versions in development
- Xiaoling emailed results from DataMed system for full test collection in TREC format.
- Kirk suggested that we look at a fallback strategy: use one model for higher precision, another for the long tail
- When does it work? What queries does it work for?
- Better characterization of what's working
- DataMed is a P@20 system, mainly
- Gerard? has installed the current pipeline and will document it. Maybe we can do the same.
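Possible starting point for the "what queries does it work for?" question: a minimal sketch that computes per-query P@20 from a TREC-format run (like the DataMed results) and a qrels file. This is not trec_eval and not our ir-utils code. The function names and the toy data are made up for illustration. It assumes the standard six-column run format (qid Q0 docid rank score tag) and four-column qrels format (qid iteration docid relevance), treats any relevance > 0 as relevant, ranks by score rather than by the rank column, and only reports queries that appear in the run.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Map each query id to its set of relevant doc ids (relevance > 0)."""
    relevant = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        qid, _iteration, docid, rel = parts
        if int(rel) > 0:
            relevant[qid].add(docid)
    return relevant


def parse_run(lines):
    """Map each query id to its doc ids, ordered by descending score."""
    scored = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        qid, _q0, docid, _rank, score, _tag = parts
        scored[qid].append((float(score), docid))
    ranked = {}
    for qid, pairs in scored.items():
        # Assumption: order by score; ties keep their order in the file.
        pairs.sort(key=lambda pair: -pair[0])
        ranked[qid] = [docid for _score, docid in pairs]
    return ranked


def precision_at_k(ranked, relevant, k=20):
    """Per-query P@k; queries returning fewer than k docs still divide by k."""
    per_query = {}
    for qid, docs in ranked.items():
        rel_docs = relevant.get(qid, set())
        hits = sum(1 for docid in docs[:k] if docid in rel_docs)
        per_query[qid] = hits / k
    return per_query


# Toy data, invented for illustration only.
qrels = ["T1 0 d1 1", "T1 0 d2 0", "T1 0 d3 2", "T2 0 d4 1"]
run = [
    "T1 Q0 d1 1 3.0 demo",
    "T1 Q0 d2 2 2.0 demo",
    "T1 Q0 d3 3 1.0 demo",
    "T2 Q0 d9 1 1.5 demo",
]
p_at_2 = precision_at_k(parse_run(run), parse_qrels(qrels), k=2)
```

Sorting the per-query values for two runs side by side would be one way to see which queries each model handles well, which is the input the fallback idea needs.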