
6/27/2017

  • Thuong's last day ~7/15; Garrick out next week.
  • Sprint 28 priorities
    • Create ElasticSearch indexes for PubMed and Wikipedia
    • Lucene baseline runs: Use LuceneRunQuery to run baselines for all collections for comparison
    • Lucene Rocchio runs: Once reviewed/merged, use LuceneRunQuery for Rocchio baselines for all collections
    • Plugin implementation: With the Rocchio implementation, it should be straightforward to finalize the ElasticSearch plugin
    • Audit/clean up results: Review everything we've done and make sure we've run all of the models we intended to run
    • Finalize QPP analysis
    • Revisit repository priors
  • Revisit statement of work and task status (BioCADDIE)
    • What we've done:
      • Comparative evaluation of RM and Rocchio using BioCADDIE test collection
      • Comparative evaluation of SDM
      • Decided what to implement (ElasticSearch plugin, Rocchio expansion)
    • Still need to do:
      • Implement actual plugin
      • Implement PubMed OA index and ingest process (ElasticSearch)
      • Testing (test plan, integration, performance, execution)
      • Release packaging (in progress)
      • Documentation
    • What we can't do:
      • Analysis with respect to current pipeline (we never got it running)
    • What we did that wasn't on the SOW:
      • Comparative evaluation with CDS, OHSUMED, Genomics
      • Document expansion
      • Train/test analysis
      • Query performance prediction
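
The Rocchio expansion referenced above can be sketched roughly as follows. This is a minimal, positive-feedback-only sketch in Python with illustrative weights (alpha/beta values and the dict-based term vectors are assumptions; the actual Lucene implementation works against index term vectors):

```python
from collections import Counter

def rocchio_expand(query_vec, feedback_docs, alpha=1.0, beta=0.75, k=10):
    """Expand a query with the centroid of pseudo-relevant feedback docs.

    query_vec: Counter of term -> weight for the original query.
    feedback_docs: list of Counters (term -> weight) for top-ranked docs.
    Returns the reweighted query merged with the top-k centroid terms
    (simplification: no negative/non-relevant component).
    """
    centroid = Counter()
    for doc in feedback_docs:
        for term, w in doc.items():
            centroid[term] += w / len(feedback_docs)

    expanded = Counter({t: alpha * w for t, w in query_vec.items()})
    for term, w in centroid.most_common(k):
        expanded[term] += beta * w
    return expanded
```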

6/20/2017

  • Sprint 27 extended until June 23
  • ElasticSearch 1.7.5: the plugin framework is not working; we will implement the plugin against a newer ElasticSearch version for the BioCADDIE deliverable.
  • Train/test query analysis, rerunning test queries only (NDS-939)
  • Rocchio expansion with Lucene
  • Query performance prediction/adaptive feedback
  • TREC Genomics baseline
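
The baseline runs above are compared using trec_eval-style metrics. As a reminder, average precision (the per-query basis of MAP) can be sketched as:

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query: the mean of precision@k at each
    rank k where a relevant document appears, normalized by the total
    number of relevant documents."""
    relevant = set(relevant_ids)
    hits = 0
    total = 0.0
    for k, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0
```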

6/13/2017

  • Sprint 28 extended until June 23
  • Craig in Seattle
  • Dirichlet scorer
    • Lucene does not support true language modeling; the index structure is designed for TFIDF/BM25
    • We will abandon LM in Lucene and focus on Rocchio expansion
  • CDS/OHSUMED analysis
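
For reference, the Dirichlet scorer discussed above computes, per query term t, log((tf(t,d) + mu * p(t|C)) / (|d| + mu)). A minimal Python sketch (dict-based inputs and the default mu are illustrative assumptions):

```python
import math

def dirichlet_score(query_terms, doc_tf, doc_len, coll_prob, mu=2000.0):
    """Dirichlet-smoothed query-likelihood score in log space.

    doc_tf: dict term -> term frequency in the document.
    coll_prob: dict term -> collection probability p(t|C).
    Terms unseen in the collection are skipped (p(t|C) = 0).
    """
    score = 0.0
    for t in query_terms:
        p_c = coll_prob.get(t, 0.0)
        if p_c == 0.0:
            continue
        score += math.log((doc_tf.get(t, 0) + mu * p_c) / (doc_len + mu))
    return score
```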

6/8/2017

  • Mike is on vacation
  • Craig in Seattle next week
  • Dirichlet scorer (NDS-914)
    • Dense to get through
  • Boolean retrieval (NDS-912)
    • Surprising result: RM3 did reasonably well
    • Will not pursue
  • TREC-CDS (NDS-917)
    • Why does OKAPI do so poorly?
    • RM3 is just as expected
    • Conclusion: 
  • OHSUMED (NDS-929)
    • Surprising that LM is lower
    • RM3 is better
    • No judged non-relevant documents
    • Why is TFIDF so much better?
  • Query performance prediction
    • Craig to send QPP papers
  • Query characterization
    • Garrick: 
      • There are a couple of queries that are really similar – look at query pairs
    • Error analysis
  • Sprint 27 tasks
    • Differences in qrels for example/test queries; we haven't looked at this yet
      • Analysis of variance of scores for example/test
    • Error analysis
    • More on query characterization
    • More on QPP
    • More on Lucene

5/25/2017

Notes from the NDS/BioCADDIE team meeting. This meeting is primarily to plan for the next sprint. The following are up for discussion:

  • Evaluation framework -- where should we go from here? 
    • Clean-up/prune ir-utils
    • Lucene-centric evaluation (lucene4ir)
    • Improving the shell-script approach (balance understandability/simplicity with scale)
    • Possible tasks:
      • Tie breaking
      • Retrieval models without rescoring
        • Hack Indri or extend Lucene
      • Extend Lucene
        • Dirichlet + TwoStage
        • RM/RM3
        • Is it KL
        • PLM
        • LDA
        • Kmeans
        • Handling priors
        • CER 
  • Distributed evaluation (Kubernetes)
    • Mike has a prototype working with hyperkube
    • Comment about missing Okapi expansion
    • Possible tasks: 
      • Test on a real cluster via deploy-tools (NDS-hackathon project)
      • Provision attached storage for each node (already done with deploy-tools?)
      • How can we get data to and from all of the nodes? (For the prototype, manual is fine.) Ideally, something similar to hdfs put / hdfs get from Hadoop.
      • Garrick: qrels/topics?
      • Explore AWS/GCE/Azure?
  • ES RM plugin
    • Possible tasks:
      • 1.7.5 support!! (NDS-897)
      • Actually implement the plugin (NDS-868)
      • Custom scoring exploration  (Garrick)
  • Stemming in ES (NDS-885)
    • Create index both stemmed (Snowball) and unstemmed
  • VM resources: 
    • SDSC vs NCSA
    • Shared data directories
  • Performance characterization (recommended by Kirk)
  • New ideas?
    • Boolean/"sufficient" query - (Garrick)
      • Boolean queries in Indri: scoreif
    • Structured search (using the document structure somehow)
    • Try other collections (UMLS/MeSH, medical subsets)
    • Analyze relevance judgments
    • Compare baselines against medical collections
    • Cluster-based expansion models
    • Query performance prediction
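
For the stemming task (NDS-885), indexing both stemmed and unstemmed forms could use a multi-field mapping with a custom Snowball analyzer. A hypothetical settings sketch for ES 1.x (index, type, field, and analyzer names are all illustrative; the body would be sent with a PUT on index creation):

```python
import json

# Hypothetical settings: a custom Snowball-stemmed analyzer alongside the
# default (unstemmed) standard analyzer, exposed as a sub-field.
STEMMED_INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "analyzer": {
                "snowball_en": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "snowball"],
                }
            }
        }
    },
    "mappings": {
        "dataset": {
            "properties": {
                "body": {
                    "type": "string",  # ES 1.x string type; "text" on 5.x+
                    "fields": {
                        "stemmed": {"type": "string", "analyzer": "snowball_en"}
                    },
                }
            }
        }
    },
}

print(json.dumps(STEMMED_INDEX_SETTINGS, indent=2))
```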

Sprint 27 tasks

  • Thuong:
    • Finalize stemming work
    • TREC-CDS baseline runs
    • Boolean/sufficient-query runs
  • Garrick
    • Boolean/sufficient-query runs
    • Lucene Dirichlet implementation
    • Custom scoring exploration 
    • QPP
    • ir-utils cleanup
  • Craig
    • LOOCV tie-breaking
    • Output performance characterization
    • ir-utils evaluation framework
  • Mike
    • 1.7.5 plugin support (NDS-897)
    • Implement RM plugin (NDS-868)
    • Distributed evaluation on real cluster (NDS-hackathon)
    • Define process for copying index data to nodes. Ideally, similar to hadoop fs -put
    • Explore running on AWS/GCE or Azure


5/23/2017

Notes from BioCADDIE core developer meeting

  • Presented status update
  • BioCADDIE is running ES 1.7.5 in production, but runs more recent versions in development
  • Xiaoling emailed results from DataMed system for full test collection in TREC format.
  • Kirk suggested that we look at a fallback strategy – use one model for higher precision, another for long tail
    • When does it work? What queries does it work for?
    • Better characterization of what's working
    • DataMed is a P@20 system, mainly
  • Gerard? has installed the current pipeline and will document. Maybe we can do the same.

