7/18

  • Contract ends 7/30
  • What's left
    • ElasticSearch plugin – move repo (Mike)
    • Testing – at least a manual test plan; automated would be great (Mike)
    • PubMed ingest process (Craig)
    • biocaddie + plugin repo release (Craig)
    • Collect all data in place
    • Documentation/presentation
  • Bonus
    • Parallel documentation
    • Kubernetes review
    • Publish data?
    • Doc expansion on OHSUMED + Genomics (Garrick)
      • Also PubMed expansion (Craig)
    • "Priors" – if we wanted to implement priors in Lucene/ElasticSearch, how would we?

7/11

  • Mike at PEARC this week; Thuong's last week
  • Final deliverables:
    • ElasticSearch plugin (NDS-868) and test process (NDS-956)
    • PubMed ingest process (new)
    • biocaddie repo release
    • Documentation/whitepaper
      • Results of comparative evaluation
      • Indri v Lucene
      • Baselines
      • BM25, BM25+Rocchio, BM25+PubMed Rocchio
  • Others
    • Kubernetes + parallel
    • Publish data?
  • Report/paper points (ECIR/10-16-17; 
    • BioCADDIE
      • Baseline results
      • Query expansion and document expansion results
      • Indri > Lucene/ElasticSearch
        • Lucene's models aren't valid
        • No built-in query expansion
        • Limitations of the real-world search engine
      • Test collection
        • Train v test
        • Short v orig
      • Query characterization and QPP
    • Other collections
      • OHSUMED/TRECDS?/Genomics
    • Infrastructure
      • ir-tools/Maven
      • Cross-validation
      • Kubernetes/parallel

...