National Data Service / NDS-930

Consider TREC Genomics track for baseline comparison


    • Type: Task
    • Resolution: Fixed
    • Priority: Normal
    • NDS Sprint 27, NDS Sprint 28

      One goal of the CDS and OHSUMED baselines is to find test collections that are comparable to the BioCADDIE collection, but for full-text search instead of data search. As we've discovered, CDS isn't a perfect fit: its queries are very different, focusing on clinical summaries.

      The Genomics track isn't a perfect fit either, but may be closer than some of the medical/health records collections.

      Take a look at the Genomics track guidelines and data:

      http://trec.nist.gov/data/genomics.html

      Do you think this is a reasonable baseline for comparison to BioCADDIE?  Are there any issues with comparing results from the two? 

      Write up a summary of your findings in this ticket or in the Wiki.  If it seems like a reasonable fit, create a new JIRA ticket describing what needs to be done to run the baselines. 

              thphan2 Thuong Phan
              willis8 Craig Willis