Blog from March, 2015

To highlight what can be done with the Brown Dog services, we have built a number of clients (e.g. a JavaScript library, bookmarklets for the DAP and DTS, and a Chrome extension).  We have now added a couple more, including a Python library and a command line interface (shown below).  In the first video we use a conversion to access a directory of Kodak Photo CD images, then chain that with an extraction call to pull information from the images.  We emphasize that none of these capabilities reside locally; they are pulled together from an extensible collection of tools distributed within the cloud.



In the next video we show how to use the extracted data to build an index over a collection of files and then perform a search to find similar data:
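As a rough illustration of the indexing and search step, here is a minimal sketch that assumes each file has already been reduced to a numeric feature vector by some extractor. The `FeatureIndex` class, the file names, and the choice of cosine similarity are all assumptions made for this example; they are not the index or the similarity measure used by the Brown Dog services.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


class FeatureIndex:
    """Hypothetical in-memory index mapping file ids to extracted features."""

    def __init__(self):
        self._entries = {}

    def add(self, file_id, features):
        # Store a copy so later changes by the caller do not alter the index.
        self._entries[file_id] = list(features)

    def search(self, query, k=3):
        """Return up to k (file_id, score) pairs, most similar first."""
        scored = [
            (cosine_similarity(query, features), file_id)
            for file_id, features in self._entries.items()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(file_id, score) for score, file_id in scored[:k]]


# Illustrative feature vectors; a real extractor would produce these.
index = FeatureIndex()
index.add("a.jpg", [1.0, 0.0, 0.0])
index.add("b.jpg", [0.0, 1.0, 0.0])
index.add("c.jpg", [0.9, 0.1, 0.0])

results = index.search([1.0, 0.0, 0.0], k=2)
```

A linear scan like this is only practical for small collections; a larger collection would likely need an approximate nearest-neighbour structure instead.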


The Brown Dog Tools Catalog will serve both as a means of collecting new conversion/extraction tools from the community and as a platform for finding and preserving such tools.  We have populated the Tools Catalog with a handful of the Medici extractors and Polyglot conversion scripts to get it started:

Adding a new tool is a matter of registering the tool along with a simple wrapper script that allows either the DAP or DTS to make use of it.  Below we show an example of adding scripts to make use of PEcAn for ecological model conversions:

We are working to add data conversions in support of ecological models via the tools developed in PEcAn.  For example, converting AmeriFlux data to the netCDF CF standard used with PEcAn:

and converting that to the format needed by various models, e.g. SIPNET:

The DAP, built on top of NCSA Polyglot, chains conversion tools together within the "cloud", allowing one to jump from format to format as needed.  The DAP will eventually support intelligently moving data and computation to handle large datasets (e.g. NARR) and will support a variety of models within ecology (e.g. ED) as well as other domains.
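One way to picture the chaining is as a path search over a graph whose nodes are formats and whose edges are conversion tools. The sketch below uses a breadth-first search to find the shortest chain between two formats. The tool names, the format pairs, and the `find_conversion_chain` function are hypothetical, and this is not a description of how Polyglot itself selects a path.

```python
from collections import deque

# Hypothetical tool table: (tool name, input format, output format).
# These entries are illustrative and do not reflect the actual DAP catalog.
TOOLS = [
    ("tool_a", "pcd", "png"),
    ("tool_b", "png", "jpg"),
    ("tool_c", "jpg", "pdf"),
    ("tool_d", "pcd", "tiff"),
]


def find_conversion_chain(tools, source, target):
    """Return the shortest list of (tool, src, dst) steps, or None if unreachable."""
    if source == target:
        return []

    # Build an adjacency list: format -> [(tool, output format), ...]
    graph = {}
    for name, src, dst in tools:
        graph.setdefault(src, []).append((name, dst))

    # Breadth-first search, remembering how each format was first reached.
    came_from = {source: None}
    queue = deque([source])
    while queue:
        fmt = queue.popleft()
        for name, nxt in graph.get(fmt, []):
            if nxt in came_from:
                continue  # already reached by an equally short or shorter chain
            came_from[nxt] = (fmt, name)
            if nxt == target:
                # Walk back from the target to the source to recover the steps.
                steps = []
                cur = nxt
                while came_from[cur] is not None:
                    prev, tool = came_from[cur]
                    steps.append((tool, prev, cur))
                    cur = prev
                return steps[::-1]
            queue.append(nxt)
    return None


chain = find_conversion_chain(TOOLS, "pcd", "pdf")
```

Breadth-first search minimises the number of conversion steps only; a real service might also weigh information loss or tool availability when picking a chain.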

Here are some videos of the new Brown Dog Google Chrome extension, which allows the Data Tilling Service (DTS), based on NCSA Medici, to be called on arbitrary pages in order to index collections of data.  Note that the text in the queries is not part of the page or of the images on the page but is extracted from the image contents using cloud-hosted tools within the DTS, specifically the face detector within OpenCV and the Tesseract OCR engine.  The DTS will host a suite of such tools and make it easy for users to add additional tools:


Added some higher resolution videos of the DAP bookmarklet being used for images:

for documents:

for 3D data:

and for archive/container files (e.g. zip, rar):

Added a couple of higher resolution videos of the dap.js library, which allows one to change the formats of data within HTML so that they are more broadly viewable over the web.  An example within links, converting a JPEG-2000 and a SID image to JPEG and a PNG image to XML via the open source Daffodil implementation of DFDL:

and an example within image tags: