Terra

  • 1 TB of image data per day
  • Images are transferred from Gantry to "Data Cache" via FTP
  • Images are transferred from Data Cache to Roger using Globus Transfer
  • Globus monitor watches endpoints for transfer completions. Once complete, files are ingested into Clowder
  • Files are stored on disk, metadata is stored in Mongo
  • Clowder/Globus monitor/Tool Launcher are all run in OpenStack
  • Storage: 1 PB online and 3 PB nearline
  • CyberGIS HPC allocation is used for extractor "elasticity"
  • Clowder uses a centralized RabbitMQ and extractor bus, hosted at NCSA
  • BrownDog elasticity module is capable of expanding capacity via Docker or PBS.
  • Consider cases of Matlab licences (BYO) or Windows VMs used by commercial partners
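The monitor step above (watch for completed transfers, then ingest the files) can be sketched roughly as follows. This is a minimal illustration, not the actual Globus monitor: `Transfer`, `ingest_completed`, `get_status`, and `ingest` are hypothetical names, and the status lookup and the Clowder upload are passed in as plain callables rather than real Globus or Clowder API calls. The `"SUCCEEDED"` status string is assumed to be what the status lookup reports for a finished transfer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Transfer:
    """A pending transfer and the files it carries (hypothetical record)."""
    task_id: str
    files: List[str]


def ingest_completed(
    pending: List[Transfer],
    get_status: Callable[[str], str],
    ingest: Callable[[str], None],
) -> List[Transfer]:
    """Ingest the files of every completed transfer.

    get_status and ingest are assumed stand-ins for the transfer-status
    lookup and the Clowder upload. Returns the transfers that have not
    completed yet, so the caller can poll them again later.
    """
    still_pending = []
    for transfer in pending:
        if get_status(transfer.task_id) == "SUCCEEDED":
            for path in transfer.files:
                ingest(path)
        else:
            # Not finished (or failed): keep it for the next polling round.
            still_pending.append(transfer)
    return still_pending
```

A caller would invoke `ingest_completed` on a timer, feeding the returned list back in on the next round; how failed transfers should be retried or reported is left open here.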

terra-overview

 

MDF

  • 3-year pilot: submit raw data, exchange big data, network of Globus endpoints with front end (DSpace), Petrel
  • Administration: requires manual addition of users to endpoints for transfers
  • Keep endpoint up and running
  • Community outreach (get more data)
  • Metadata schema (electron microscopy)

 

mdf-overview

 

schleife-overview

Developer environment:

  • VM; 4CPU; TextWrangler + IntelliJ
  • To be able to replicate the Schleife process

HTRC

 

SALAMI

This is the basic workflow used for Stephen Downie's Structural Analysis of Large Amounts of Music Information (SALAMI).

  • Seven algorithms for music structure analysis were taken from Music Information Retrieval Exchange (MIREX) competition. These take a standard input (path to files) and produce a standard output (structure information), but could be written in a variety of languages.
  • SALAMI had a 250K-hour allocation on XSEDE via ICHASS; Kraken was used because of Matlab licensing.
  • 252,169 songs, ~18,000 hours of audio, generated 1.8 million structure files
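The "standard input, standard output" arrangement above can be sketched as a loop that applies every algorithm to every audio file. This is only an illustration: `run_all` and the `Algorithm` type are hypothetical, and each algorithm is modeled as a Python callable from a file path to structure text, whereas the real MIREX programs were written in a variety of languages. The closing arithmetic assumes one structure file per song per algorithm, which the notes do not state explicitly but which is consistent with the ~1.8 million figure.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical stand-in: an algorithm maps an audio file path to
# structure information (here, plain text).
Algorithm = Callable[[str], str]


def run_all(
    algorithms: Dict[str, Algorithm],
    audio_paths: Iterable[str],
) -> Dict[Tuple[str, str], str]:
    """Run every algorithm on every audio file.

    Returns one result per (algorithm name, audio path) pair.
    """
    results = {}
    for path in audio_paths:
        for name, algorithm in algorithms.items():
            results[(name, path)] = algorithm(path)
    return results


# Consistency check on the numbers above, assuming one structure file
# per song per algorithm.
SONGS = 252_169
ALGORITHMS = 7
expected_files = SONGS * ALGORITHMS  # 1,765,183, i.e. roughly 1.8 million
```

Because every algorithm shares the same interface, adding an eighth only means adding one more entry to the `algorithms` mapping.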

salami-overview

PSI

 
