Adding Conversions

The DAP manages conversions through wrapper scripts that control input/output operations within specific applications. Any text-based scripting language can be used to write these wrapper scripts; the only requirement is that they follow certain naming, comment, and input/output conventions (see http://opensource.ncsa.illinois.edu/projects/artifacts/POL/2.0/documentation/ScriptingManual.pdf for full details). Specifically, a script should be named as follows:

[Application Alias][#Optional Comment]_[operation].[extension]

where the underscore is required and separates the application alias, which uniquely identifies the program carrying out the conversion, from the operation (e.g. open, save, convert).  The open and save operations each perform one half of a conversion: open loads a file into a program and save writes out a file that is already open in a program. The convert operation does both within the same script.  As an example, consider an R script that uses PEcAn to convert data in the PEcAn standard netCDF CF format to the format required by the SIPNET model:

PEcAn#Sipnet_convert.R
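
As a rough illustration of the naming convention, the Python sketch below splits a script name into its parts. The function name parse_script_name and the regular expression are illustrative only and are not part of Polyglot; the sketch assumes that the alias contains neither "#" nor "_" and that the comment contains no "_".

```python
import re
from typing import NamedTuple, Optional


class ScriptName(NamedTuple):
    alias: str
    comment: Optional[str]
    operation: str
    extension: str


# Assumption: alias has no "#" or "_", comment has no "_".
_NAME_PATTERN = re.compile(
    r"^(?P<alias>[^#_]+)"          # application alias
    r"(?:#(?P<comment>[^_]*))?"    # optional "#comment"
    r"_(?P<operation>[^.]+)"       # required underscore, then the operation
    r"\.(?P<extension>.+)$"        # script extension
)


def parse_script_name(file_name: str) -> ScriptName:
    """Split [Alias][#Comment]_[operation].[extension] into its parts."""
    match = _NAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"not a valid wrapper script name: {file_name!r}")
    return ScriptName(**match.groupdict())


parts = parse_script_name("PEcAn#Sipnet_convert.R")
print(parts.alias, parts.comment, parts.operation, parts.extension)
```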

In the name PEcAn#Sipnet_convert.R, "PEcAn" is the alias identifying PEcAn as the software being used.  This alias is used to group together operations that use the same application.  The comment "Sipnet" is helpful to a user reading the script name, and it also gives the script a unique name should there be multiple scripts for PEcAn.  Inside the script there must be a small header made up of comments.  What this header looks like depends on the operation, as outlined in the scripting manual mentioned above.  For the convert operation carried out by the above PEcAn script, the header looks as follows:

#Full software name (Version)
#Data types dealt with
#Comma separated list of inputs accepted
#Comma separated list of outputs accepted

and specifically for the example PEcAn script:

#Predictive Ecosystem Analyzer (v1.4.1)
#data
#pecan.nc, pecan.zip
#clim
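
To make the header layout concrete, here is a small Python sketch that reads the four comment lines of a convert script. It is only an illustration: parse_convert_header is a hypothetical helper, and the sketch assumes the four header lines are the first "#" comment lines in the file (ignoring a possible "#!" line) and that the version is given in parentheses on the first line.

```python
import re
from typing import List, NamedTuple


class ConvertHeader(NamedTuple):
    software: str
    version: str
    data_types: str
    inputs: List[str]
    outputs: List[str]


def _split_list(text: str) -> List[str]:
    """Split a comma separated list and trim whitespace around each item."""
    return [item.strip() for item in text.split(",") if item.strip()]


def parse_convert_header(script_text: str) -> ConvertHeader:
    """Read the four header comments of a convert script, in order."""
    comments = [
        line[1:].strip()
        for line in script_text.splitlines()
        if line.startswith("#") and not line.startswith("#!")
    ][:4]
    if len(comments) < 4:
        raise ValueError("a convert header needs four comment lines")

    # First line: "Full software name (Version)"; the version is optional here.
    match = re.match(r"^(.*?)\s*\(([^)]*)\)\s*$", comments[0])
    software, version = match.groups() if match else (comments[0], "")

    return ConvertHeader(
        software=software,
        version=version,
        data_types=comments[1],
        inputs=_split_list(comments[2]),
        outputs=_split_list(comments[3]),
    )


example = "#Predictive Ecosystem Analyzer (v1.4.1)\n#data\n#pecan.nc, pecan.zip\n#clim\n"
print(parse_convert_header(example))
```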

The rest of the script can be essentially a black box that carries out the conversion itself or calls other pieces of code to do so (as is the case for the PEcAn script), so long as it understands that the arguments given to the script represent:

  1. the file name, containing an absolute path, to the input file
  2. the file name, containing an absolute path, to the output file that will be generated
  3. the absolute path to a temporary scratch directory that can be used by the script
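
The three arguments above can be sketched as a minimal wrapper script. The example below is written in Python purely for illustration; the tool name, the header values ("document", "txt"), and the upper-casing "conversion" are all made up, and such a script would need a name like Example#Upper_convert.py to follow the naming convention. The scripting manual remains the authority on what a real script must contain.

```python
#Hypothetical Text Tool (v0.1)
#document
#txt
#txt
# The four comment lines above are placeholder header values for illustration.
import sys
from pathlib import Path


def convert(input_path: str, output_path: str, scratch_dir: str) -> None:
    """Stand-in conversion: upper-case a text file.

    input_path  -- absolute path of the input file
    output_path -- absolute path of the output file to generate
    scratch_dir -- absolute path of a temporary directory the script may use
    """
    # Stage intermediate data in the scratch directory, as a real tool might.
    staged = Path(scratch_dir) / "intermediate.tmp"
    staged.write_text(Path(input_path).read_text())

    # Write the final result to the requested output location.
    Path(output_path).write_text(staged.read_text().upper())


# Only run when called with exactly the three expected arguments.
if __name__ == "__main__" and len(sys.argv) == 4:
    convert(sys.argv[1], sys.argv[2], sys.argv[3])
```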

Once a wrapper script is created it can be added to the Polyglot repository for deployment to the DAP:

https://opensource.ncsa.illinois.edu/stash/scm/pol/polyglot.git

Specifically:

https://opensource.ncsa.illinois.edu/stash/projects/POL/repos/polyglot/browse/scripts

This will eventually be simplified to make use of the Brown Dog Tools Catalog (analogous to an app store).

Calling the DAP

The DAP can be accessed at:

http://dap.ncsa.illinois.edu:8184/convert/<output_format>/<input_file>

For example, posting the first file below to the endpoint that follows it will convert the native Ameriflux data, over the range specified in the XML, to the PEcAn netCDF CF format:

http://browndog.ncsa.illinois.edu/examples/US-Dk3-2001-2003.xml

http://dap.ncsa.illinois.edu:8184/convert/pecan.zip/

For web-hosted files such as the XML file above, one can simply URL-encode the file's address and append it to the end of the REST endpoint:

http://dap.ncsa.illinois.edu:8184/convert/pecan.zip/http%3A%2F%2Fbrowndog.ncsa.illinois.edu%2Fexamples%2FUS-Dk3-2001-2003.xml
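
The encoded address above can be reproduced programmatically. The following Python sketch uses the standard library's urllib.parse.quote; the helper name build_convert_url is an assumption made for this example and is not part of the DAP.

```python
from urllib.parse import quote

DAP_CONVERT = "http://dap.ncsa.illinois.edu:8184/convert"


def build_convert_url(output_format: str, input_url: str,
                      base: str = DAP_CONVERT) -> str:
    """Append the URL-encoded address of a web-hosted file to the endpoint."""
    # safe="" makes quote() encode ":" and "/" as well.
    return f"{base}/{output_format}/{quote(input_url, safe='')}"


print(build_convert_url(
    "pecan.zip",
    "http://browndog.ncsa.illinois.edu/examples/US-Dk3-2001-2003.xml",
))
```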

Note that this link will actually execute if clicked.  The endpoint returns immediately with a URL pointing to the eventual location of the resulting output file.  This file will not exist until the needed Software Servers pick up the job and carry out the conversion.  If the result URL is accessed before the job has completed, a 404 File Not Found is returned; an application can use this to poll the DAP until the job is done.  Note also that, because this is a distributed service, the result may never appear (e.g. if the needed Software Servers are all killed), so programs should handle this case, for instance by giving up after a timeout.  Most modern programming languages provide a means of making HTTP requests, which is all that is required to access a REST service.  For example, in R one option is:

http://cran.r-project.org/web/packages/httpRequest/index.html
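
As another illustration, the polling behaviour described above could be sketched in Python with only the standard library. The function names, the default wait times, and the injectable exists/sleep/clock parameters are choices made for this example; the sketch assumes only what is stated above, namely that the result URL answers 404 until the output file exists.

```python
import time
import urllib.error
import urllib.request


def result_exists(url: str, timeout_s: float = 10.0) -> bool:
    """Return True once the result URL can be fetched, False while it is a 404."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # job not finished yet
        raise  # any other HTTP error is unexpected here


def wait_for_result(url: str, max_wait_s: float = 600.0, interval_s: float = 5.0,
                    exists=result_exists, sleep=time.sleep,
                    clock=time.monotonic) -> bool:
    """Poll until the result appears; give up after max_wait_s seconds.

    Returns True if the file appeared and False on timeout, since the
    job may never complete.
    """
    deadline = clock() + max_wait_s
    while True:
        if exists(url):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would pass the URL returned by the convert endpoint to wait_for_result and download the file only if it returns True.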
