...
The purpose of this document is to provide quick examples of each means of incorporating tools, so as to bootstrap one's ability to include code within one of the two services.
Start Here
To begin, does your code, software, or tool carry out a data conversion or a data extraction? If it performs a conversion, the tool should be included in the Data Access Proxy (DAP). If it performs an extraction, the tool should be included in the Data Tilling Service (DTS).
The Data Access Proxy handles data conversions. If a piece of software or a tool already exists to carry out the conversion, it is incorporated into the DAP through Polyglot. If the specification of the file format is known, the format can be incorporated as a DFDL schema within Daffodil.
Polyglot Software Server Scripts
Software Server scripts are used by Polyglot to automate interaction with software capable of converting from one file format to another. These scripts can directly wrap command line utilities that carry out conversions, or they can split a conversion into the separate steps of opening a file in one format and saving it in a different format, as is typical of GUI driven applications. The wrapper scripts can be written in nearly any text based scripting language. Below we show a few simple examples. For full details on creating these wrapper scripts, including the required naming and header conventions, please refer to the Scripting Manual.
Command Line Applications
Bash Script
The following is an example of a bash wrapper script for ImageMagick. Note that it is fairly straightforward. The comments at the top contain the information Polyglot needs to use the application: the name and version of the application, the type of data it supports, the input formats it supports, and the output formats it supports.
```shell
#!/bin/sh
#ImageMagick (v6.5.2)
#image
#bmp, dib, eps, fig, gif, ico, jpg, jpeg, pdf, pgm, pict, pix, png, pnm, ppm, ps, rgb, rgba, sgi, sun, svg, tga, tif, tiff, ttf, x, xbm, xcf, xpm, xwd, yuv
#bmp, dib, eps, gif, jpg, jpeg, pdf, pgm, pict, png, pnm, ppm, ps, rgb, rgba, sgi, sun, svg, tga, tif, tiff, ttf, x, xbm, xpm, xwd, yuv
convert "$1" "$2"
```
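To make the header convention more concrete, the sketch below reads the four comment lines from a wrapper script and prints them as labelled fields. It is only an illustration: `read_header` is a hypothetical helper, not part of Polyglot, and it assumes the header occupies lines 2 to 5 directly after the shebang, as in the ImageMagick example. The demo file uses shortened format lists. The Scripting Manual remains the authority on the actual conventions.

```shell
#!/bin/sh
# Hypothetical helper (not part of Polyglot): print the header fields of a
# wrapper script, assuming they sit on lines 2-5 right after the shebang.
read_header() {
    script=$1
    # Strip the leading '#' from each header line.
    app=$(sed -n '2s/^#//p' "$script")
    dtype=$(sed -n '3s/^#//p' "$script")
    inputs=$(sed -n '4s/^#//p' "$script")
    outputs=$(sed -n '5s/^#//p' "$script")
    printf 'application: %s\n' "$app"
    printf 'data type:   %s\n' "$dtype"
    printf 'inputs:      %s\n' "$inputs"
    printf 'outputs:     %s\n' "$outputs"
}

# Write a shortened copy of the ImageMagick wrapper to a temporary file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/sh
#ImageMagick (v6.5.2)
#image
#bmp, gif, jpg, png
#bmp, gif, jpg, pdf, png
convert "$1" "$2"
EOF

read_header "$demo"
rm -f "$demo"
```

Running a check like this over a new wrapper is a quick way to confirm that each header line carries the field you intended before handing the script to Polyglot.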
Batch File
Some GUI-based applications can be called in a headless mode. The following is an example wrapper script for OpenOffice called in its headless mode.
```bat
REM OpenOffice (v3.1.0)
REM document
REM doc, odt, rtf, txt
REM doc, odt, pdf, rtf, txt
"C:\Program Files\OpenOffice.org 3\program\soffice.exe" -headless -norestore "-accept=socket,host=localhost,port=8100;urp;StarOffice.ServiceManager"
"C:\Program Files\OpenOffice.org 3\program\python.exe" "C:\Converters\DocumentConverter.py" "%1" "%2"
```
AutoHotKey
The following is an example of an AutoHotKey script to convert files with Adobe Acrobat, a GUI driven application. Note that it contains a similar header in the comments at the beginning of the script. Also note that the open and save operations can be broken into two separate scripts.
...
```xml
<ex:file xmlns:ex="http://example.com">
  <ex:header>
    <ex:type>P2</ex:type>
    <ex:dimensions>
      <ex:width>16</ex:width>
      <ex:height>16</ex:height>
    </ex:dimensions>
    <ex:depth>255</ex:depth>
  </ex:header>
  <ex:pixels>
    <ex:pixel>136</ex:pixel>
    <ex:pixel>136</ex:pixel>
    <ex:pixel>136</ex:pixel>
    ...
    <ex:pixel>136</ex:pixel>
    <ex:pixel>136</ex:pixel>
  </ex:pixels>
</ex:file>
```
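Once a file has been parsed into XML of this shape, its fields can be read by ordinary tools. The sketch below pulls the type, dimensions, and depth out of the header portion of the XML shown above. The `field` helper is hypothetical and assumes one element per line; a real XML parser would be more robust, and nothing here is part of Daffodil itself.

```shell
#!/bin/sh
# Hypothetical helper: print the text of the first <ex:NAME> element found
# in the file. Assumes one element per line; a real XML parser is more robust.
field() {
    sed -n "s|.*<ex:$1>\(.*\)</ex:$1>.*|\1|p" "$2" | head -n 1
}

# Header portion of the example XML (pixel values omitted for brevity).
infoset=$(mktemp)
cat > "$infoset" <<'EOF'
<ex:file xmlns:ex="http://example.com">
  <ex:header>
    <ex:type>P2</ex:type>
    <ex:dimensions>
      <ex:width>16</ex:width>
      <ex:height>16</ex:height>
    </ex:dimensions>
    <ex:depth>255</ex:depth>
  </ex:header>
</ex:file>
EOF

width=$(field width "$infoset")
height=$(field height "$infoset")
echo "type:   $(field type "$infoset")"
echo "size:   ${width} x ${height} ($((width * height)) pixel values expected)"
echo "depth:  $(field depth "$infoset")"
rm -f "$infoset"
```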
The Data Tilling Service (DTS)
The Data Tilling Service handles data extractions. If your code, tool, or software extracts information such as keywords from a file or its contents, then it should be included in the DTS as a Medici extractor. If your code, tool, or software extracts a signature from the file's contents, which can in turn be compared to the signatures of other files via some distance measure to find similar pieces of data, then it should be included in the DTS as a Versus extractor.
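To illustrate the idea of comparing signatures with a distance measure, the sketch below treats a signature as a space-separated list of numbers (for instance a small histogram) and computes the Euclidean distance between two of them. Both the signature format and the `distance` function are assumptions made for illustration; they are not the Versus API, and Versus extractors may use entirely different signatures and measures.

```shell
#!/bin/sh
# Hypothetical sketch: Euclidean distance between two equal-length
# signatures given as space-separated numbers. Not the Versus API.
distance() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { n = split($0, a, " ") }
        NR == 2 {
            m = split($0, b, " ")
            if (m != n) { print "length mismatch" > "/dev/stderr"; exit 1 }
            sum = 0
            for (i = 1; i <= n; i++) { d = a[i] - b[i]; sum += d * d }
            printf "%.3f\n", sqrt(sum)
        }'
}

# Three made-up signatures for demonstration.
sig_a="3 0 4 1"
sig_b="0 0 0 1"
sig_c="3 0 4 2"

echo "a vs b: $(distance "$sig_a" "$sig_b")"
echo "a vs c: $(distance "$sig_a" "$sig_c")"
```

A smaller distance indicates more similar signatures, so in this made-up data `sig_c` would be ranked as closer to `sig_a` than `sig_b` is.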
Medici Extractors
...