This page outlines an approach for making Brown Dog the definitive source of Box Skills curated by scientific communities.

Architecture

When a Skill is registered with a Box account, the invocation URL is provided. This URL will resolve to an endpoint in Fence.

Extractor Invocation

The RabbitMQ message for invoking an extractor will be changed to add a new property source which can be set to:

  • clowder
  • box
  • dataverse
  • etc...

The pyclowder library will use this source property to determine how to download the file and to update metadata. The extractor may use the source property to determine how to format the metadata.
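As a rough illustration of the proposed source property, the sketch below builds and reads a message body. The field name "id", the fixed set of sources, and the fallback to clowder when the property is missing are assumptions made for this example, not the actual message schema.

```python
import json

# Assumed set of sources; the page lists clowder, box, dataverse, "etc...".
KNOWN_SOURCES = {"clowder", "box", "dataverse"}


def build_message(file_id, source="clowder"):
    """Serialize an invocation message body that carries the source property."""
    if source not in KNOWN_SOURCES:
        raise ValueError(f"unknown source: {source}")
    # "id" is an illustrative field name, not the real message schema.
    return json.dumps({"id": file_id, "source": source})


def read_source(body):
    """Return the source of a message, assuming clowder when it is absent."""
    return json.loads(body).get("source", "clowder")
```

Defaulting to clowder would let messages from existing publishers keep working unchanged, which is one possible way to stay backward compatible.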

Tools Catalog

The Brown Dog Tools Catalog will be the source of extractors curated by scientific communities. Extractors will be grouped into organizations, and into repositories within each organization. This pattern matches GitHub and DockerHub. Each organization will represent a scientific community, which will be responsible for curating the extractors in its repos.

Different versions and configurations of extractors can be specified by the use of tags.


The Tools Catalog will rely on an underlying Git repo for storing extractor_info documents and for keeping track of versions, issues, branches, and pull requests. It will download the extractor_info.json file from that repo to populate the information shown on the page.

It will furnish Box enterprise admins with URLs that expose the tool as a skill. Initially they will have to copy and paste the URL. Once Box exposes management of Skills through an API this can be further automated.

Stories

Here are some initial stories to help us implement this vision:

Skills Workflow

Endpoints for Box Skills

Add unsecured endpoints to Fence for skills notifications in the form of /skills/repo/extractor?tag=tag
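A minimal sketch of how such a URL could be split into its parts is shown below. The function name parse_skill_url and the choice to return None for a missing tag are assumptions for illustration; Fence's actual routing is not described on this page.

```python
from urllib.parse import parse_qs, urlsplit


def parse_skill_url(url):
    """Split /skills/<repo>/<extractor>?tag=<tag> into (repo, extractor, tag)."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 3 or segments[0] != "skills":
        raise ValueError(f"not a skills URL: {url}")
    _, repo, extractor = segments
    # The tag is treated as optional here; that is an assumption.
    tag = parse_qs(parts.query).get("tag", [None])[0]
    return repo, extractor, tag
```

For example, parse_skill_url("/skills/ncsa/tesseract?tag=v1") yields the repo "ncsa", the extractor "tesseract", and the tag "v1", where all three values are made up for the example.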

Source property for RabbitMQ message

Add a new property to the RabbitMQ message that contains the source of the invocation (clowder, box, dataverse...)

Route Box Skills Invocations to correct Extractor

Translate the repo and extractor name into a queue name. Translate the tag into a routing key. Implement bindings so that messages are routed according to their Tools Catalog tag.
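The translation could look something like the sketch below. The dotted queue naming scheme and the "latest" fallback for a missing tag are assumptions chosen for the example; the page does not define a naming convention.

```python
def queue_name(repo, extractor):
    """Derive a queue name from the repo and extractor name (assumed scheme)."""
    return f"{repo}.{extractor}"


def routing_key(tag=None):
    """Use the tag as the routing key, assuming 'latest' when no tag is given."""
    return tag or "latest"


def binding_for(repo, extractor, tag=None):
    """Return the (queue, routing key) pair that a binding would connect."""
    return queue_name(repo, extractor), routing_key(tag)
```

The pair returned by binding_for would then be passed to whichever RabbitMQ client declares the bindings, so that each tagged version of an extractor only receives the messages addressed to it.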

Pyclowder downloads file from Box

Extend Pyclowder to observe the source property in the message and use the Box SDK to download the file locally to the Docker container. Retain the existing functionality for Clowder-sourced files.

Pyclowder uploads metadata to Box

Extend Pyclowder to make the files.upload_metadata method respect the source property and use the Box SDK to upload metadata to a Box file. Retain the existing functionality for Clowder-sourced files.
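One way to make the upload respect the source property is a small dispatch table keyed by source, sketched below. Every name here is hypothetical: upload_metadata is a stand-in and not pyclowder's real signature, and the two handlers are placeholders where the existing Clowder logic and the Box SDK calls would go.

```python
# Registry of per-source upload handlers (hypothetical, for illustration only).
HANDLERS = {}


def handler(source):
    """Decorator that registers a function as the handler for one source."""
    def register(func):
        HANDLERS[source] = func
        return func
    return register


@handler("clowder")
def upload_to_clowder(file_id, metadata):
    # Placeholder for the existing Clowder behaviour.
    return ("clowder", file_id, metadata)


@handler("box")
def upload_to_box(file_id, metadata):
    # Placeholder: a real implementation would call the Box SDK here.
    return ("box", file_id, metadata)


def upload_metadata(message, metadata):
    """Pick a handler from the message's source, assuming clowder if absent."""
    source = message.get("source", "clowder")
    if source not in HANDLERS:
        raise ValueError(f"no handler for source: {source}")
    return HANDLERS[source](message["id"], metadata)
```

The download story could follow the same pattern with a second registry, so supporting another source such as dataverse would mean registering new handlers without touching the dispatch function.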

Tools Catalog

Hello, Tools Catalog

Create a new Play 2.6 app based on the Clusterman code base. 

GitHub Social Auth

Configure Silhouette with GitHub Social Auth

Organization Page

Configure a MongoDB Collection with basic organization data (basically the name and the GitHub URL)

Display a basic organization page that lists all of the repos owned by that organization

Repository Page

User can click on a repository link from the Organization page and see information about that repository

Link to BDFiddle to try out tool

Repo page shows a link to BDFiddle where the user can try out a tool

Tags

Repo page interrogates GitHub for list of tags and displays them

Initial Skills

Prioritise these, port them to the new Pyclowder, and deploy them to the Tools Catalog

  • Langid
  • DBPedia
  • Census From Cell
  • Handwritten Decimals
  • Killed Photos
  • Mean Grey
  • Faces
  • Eyes
  • Profiles
  • Closeups
  • NLTK Summary
  • Stanford CoreNLP
  • Tesseract
  • Tika
  • Versus
  • VLFeat


