5/25/2017
Notes from NDS/BioCADDIE team meeting. This meeting is primarily to plan for the next sprint. The following are up for discussion:
- Evaluation framework -- where should we go from here?
- Clean-up/prune ir-utils
- Lucene-centric evaluation (lucene4ir)
- Improving the shell-script approach (balance understandability/simplicity with scale)
- Possible tasks:
- Tie breaking
- Retrieval models without rescoring
- Hack Indri or extend Lucene
- Extend Lucene
- Dirichlet + TwoStage
- RM/RM3
- Is it KL?
- PLM
- LDA
- K-means
- Handling priors
- CER
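As a reference point for the tie-breaking and Dirichlet items above, here is a minimal sketch (not project code; the function names, mu default, and data shapes are all invented for illustration) of Dirichlet-smoothed query likelihood plus deterministic tie-breaking by docid, which is what makes runs reproducible across tools:

```python
import math

def dirichlet_score(query_terms, doc_tf, doc_len, coll_prob, mu=2500.0):
    """Query likelihood with Dirichlet smoothing:
    log P(q|d) = sum_t log((tf(t,d) + mu * P(t|C)) / (|d| + mu))."""
    score = 0.0
    for t in query_terms:
        p = (doc_tf.get(t, 0) + mu * coll_prob[t]) / (doc_len + mu)
        score += math.log(p)
    return score

def rank(scored):
    # scored: list of (docid, score) pairs. Sort by descending score,
    # breaking exact score ties lexicographically by docid so that
    # different runs/shells produce identical orderings.
    return sorted(scored, key=lambda x: (-x[1], x[0]))
```

The same tie-breaking rule would need to match whatever trec_eval assumes, otherwise reported metrics can shift between otherwise identical runs.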
- Distributed evaluation (Kubernetes)
- Mike has a prototype working with hyperkube
- Comment about missing Okapi expansion
- Possible tasks:
- Test on a real cluster via deploy-tools (NDS-hackathon project)
- Provision attached storage for each node (already done with deploy-tools?)
- How can we get data to and from all of the nodes? (For the prototype, manual transfer is fine.) Ideally, something similar to hdfs put / hdfs get from Hadoop.
- Garrick: qrels/topics?
- Explore AWS/GCE/Azure?
- ES RM plugin
- Possible tasks:
- 1.7.5 support!! (NDS-897)
- Actually implement the plugin (NDS-868)
- Custom scoring exploration (Garrick)
- Possible tasks:
- Stemming in ES (NDS-885)
- Create index both stemmed (Snowball) and unstemmed
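One possible shape for the stemmed/unstemmed setup (a sketch only, not the actual NDS-885 design; analyzer, type, and field names are all assumptions) is an ES 1.x multi-field mapping that analyzes each text field twice, once through Snowball and once unstemmed:

```python
# Illustrative ES 1.x index settings/mappings: the same source field is
# indexed via a Snowball analyzer and, as a sub-field, unstemmed.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "english_snowball": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "snowball"],
                },
                "unstemmed": {
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                },
            }
        }
    },
    "mappings": {
        "dataset": {
            "properties": {
                "body": {
                    "type": "string",  # ES 1.x text type
                    "analyzer": "english_snowball",
                    "fields": {
                        "raw": {"type": "string", "analyzer": "unstemmed"}
                    },
                }
            }
        }
    },
}
```

Queries could then target body for stemmed matching or body.raw for exact-term matching, which keeps the comparison within a single index.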
- VM resources:
- SDSC vs NCSA
- Shared data directories
- Performance characterization (recommended by Kirk)
- New ideas?
- Boolean/"sufficient" query - (Garrick)
- Boolean queries in Indri: the #scoreif operator
- Structured search (using the document structure somehow)
- Try other collections (UMLS/MeSH, medical subsets)
- Analyze relevance judgments
- Compare baselines against medical collections
- TREC CDS (note: this is the PubMed Open Access collection)
- CLEF eHealth
- OHSUMED
- Cluster-based expansion models
- Query performance prediction
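For the query performance prediction idea, one simple post-retrieval predictor is a clarity-style score: the KL divergence between a query language model estimated from the top-ranked documents and the collection model. A toy sketch, with every name invented here and the query model crudely approximated by an equal-weight mixture over top documents:

```python
import math
from collections import Counter

def clarity(top_docs, coll_prob, lam=0.5):
    """Clarity-style score: KL(P(w|q) || P(w|C)), where P(w|q) is a
    smoothed equal-weight mixture of top-ranked document models.
    top_docs: list of token lists; coll_prob: collection unigram model."""
    mixed = Counter()
    for doc in top_docs:
        n = len(doc)
        tf = Counter(doc)
        for w, c in tf.items():
            mixed[w] += (c / n) / len(top_docs)
    score = 0.0
    for w in set(mixed) | set(coll_prob):
        p_q = lam * mixed.get(w, 0.0) + (1 - lam) * coll_prob.get(w, 1e-9)
        score += p_q * math.log(p_q / coll_prob.get(w, 1e-9))
    return score
```

A query whose top documents look like the collection scores near zero (likely hard); a focused result set scores higher. This could feed the fallback-strategy discussion from the 5/23 meeting, i.e. picking a model per query.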
5/23/2017
Notes from BioCADDIE core developer meeting
- Presented status update
- BioCADDIE is running ES 1.7.5 in production, but more recent versions in development
- Xiaoling emailed results from the DataMed system for the full test collection in TREC format.
- Kirk suggested that we look at a fallback strategy: use one model for higher precision, another for the long tail
- When does it work? What queries does it work for?
- Better characterization of what's working
- DataMed is mainly a P@20 system
- Gerard? has installed the current pipeline and will document it. Maybe we can do the same.