Software Server scripts are used by Polyglot to automate interaction with software that is capable of converting from one file format to another.  These scripts can directly wrap command line utilities that carry out conversions, or they can split a conversion into two steps, opening a file in one format and saving it in a different format, as is typical of GUI driven applications.  These wrapper scripts can be written in almost any text based scripting language.  Below we show a few simple examples.  For full details on creating these wrapper scripts, including the required naming conventions and header conventions, please refer to the Scripting Manual.

Command Line Applications

Bash Script

The following is an example of a bash wrapper script for ImageMagick.  Note that it is fairly straightforward.  The comments at the top contain the information Polyglot needs to use the application: the name and version of the application, the type of data it supports, the input formats it supports, and the output formats it supports.

Code Block: ImgMgk_convert.sh
#!/bin/sh
#ImageMagick (v6.5.2)
#image
#bmp, dib, eps, fig, gif, ico, jpg, jpeg, pdf, pgm, pict, pix, png, pnm, ppm, ps, rgb, rgba, sgi, sun, svg, tga, tif, tiff, ttf, x, xbm, xcf, xpm, xwd, yuv
#bmp, dib, eps, gif, jpg, jpeg, pdf, pgm, pict, png, pnm, ppm, ps, rgb, rgba, sgi, sun, svg, tga, tif, tiff, ttf, x, xbm, xpm, xwd, yuv

# Quote the arguments so that paths containing spaces are passed intact
convert "$1" "$2"
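
To make the header convention described above concrete, the following is a minimal Python sketch of how the four comment lines could be read back out of a wrapper script. It is an illustration only, not Polyglot's actual parser: the function name `parse_header`, the returned dictionary keys, and the rule that the header ends at the first non-comment line are all assumptions made for this example. The Scripting Manual remains the authority on the real format.

```python
import re


def parse_header(script_text, comment_prefix="#"):
    """Read the four leading comment lines of a wrapper script.

    Hypothetical helper: assumes the header is the first run of comment
    lines (after an optional shebang) and holds, in order, the application
    name and version, the data type, the input formats, the output formats.
    """
    lines = []
    for raw in script_text.splitlines():
        line = raw.strip()
        if line.startswith("#!"):
            continue  # a shebang line is not part of the header
        if not line.startswith(comment_prefix):
            break  # the header ends at the first blank or code line
        lines.append(line[len(comment_prefix):].strip())

    if len(lines) != 4:
        raise ValueError("expected 4 header lines, found %d" % len(lines))

    name_version, data_type, inputs, outputs = lines

    # Split "ImageMagick (v6.5.2)" into a name and a version string.
    match = re.match(r"^(.*?)\s*\(v([^)]*)\)$", name_version)
    application = match.group(1) if match else name_version
    version = match.group(2) if match else None

    def split_formats(text):
        return [fmt.strip() for fmt in text.split(",") if fmt.strip()]

    return {
        "application": application,
        "version": version,
        "type": data_type,
        "inputs": split_formats(inputs),
        "outputs": split_formats(outputs),
    }


# Abbreviated copy of the ImageMagick wrapper, for illustration only.
example = """#!/bin/sh
#ImageMagick (v6.5.2)
#image
#bmp, gif, jpg
#bmp, png

convert "$1" "$2"
"""

info = parse_header(example)
print(info)
```

The same function could be pointed at a batch file or an AutoHotKey script by passing `comment_prefix="REM"` or `comment_prefix=";"`, assuming those scripts carry all four header lines.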

Batch File

Some GUI based applications are capable of being called in a headless mode.  The following is an example wrapper script for OpenOffice called in its headless mode.

Code Block
REM OpenOffice (v3.1.0)
REM document
REM doc, odt, rtf, txt
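REM doc, odt, pdf, rtf, txt

REM Start OpenOffice headless, listening on port 8100 (commas need no escaping inside quotes)
"C:\Program Files\OpenOffice.org 3\program\soffice.exe" -headless -norestore "-accept=socket,host=localhost,port=8100;urp;StarOffice.ServiceManager"
REM Convert the first argument (input file) to the second argument (output file)
"C:\Program Files\OpenOffice.org 3\program\python.exe" "C:\Converters\DocumentConverter.py" "%~1" "%~2"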

GUI Applications

AutoHotKey

The following is an example of AutoHotKey scripts that convert files with Adobe Acrobat, a GUI driven application.  Note that each script contains a similar header in the comments at its beginning.  Also note that the open and save operations can be broken into two separate scripts.

The script below opens a .pdf file for conversion.

Code Block: Acrobat_open.ahk
;Adobe Acrobat (v9.3.0 Pro Extended)
;document
;pdf

;Parse input filename
arg1 = %1%
StringGetPos, index, arg1, \, R
ifLess, index, 0, ExitApp
index += 2
input_filename := SubStr(arg1, index)

;Run program if not already running
IfWinNotExist, Adobe Acrobat Pro Extended
{
  Run, C:\Program Files\Adobe\Acrobat 9.0\Acrobat\Acrobat.exe
  WinWait, Adobe Acrobat Pro Extended
}

;Activate the window
WinActivate, Adobe Acrobat Pro Extended
WinWaitActive, Adobe Acrobat Pro Extended

;Open document
Send, ^o
WinWait, Open
ControlSetText, Edit1, %1%
ControlSend, Edit1, {Enter}

;Make sure document is loaded before exiting
Loop
{
  IfWinExist, %input_filename% - Adobe Acrobat Pro Extended
  {
    break
  }

  Sleep, 500
}

The script below saves a converted .pdf file to the specified output format.

Code Block: Acrobat_save.ahk
;Adobe Acrobat (v9.3.0 Pro Extended)
;document
;doc, html, jpg, pdf, ps, rtf, txt

;Parse output format
arg1 = %1%
StringGetPos, index, arg1, ., R
ifLess, index, 0, ExitApp
index += 2
out := SubStr(arg1, index)

;Parse filename root
StringGetPos, index, arg1, \, R
ifLess, index, 0, ExitApp
index += 2
name := SubStr(arg1, index)
StringGetPos, index, name, ., R
ifLess, index, 0, ExitApp
name := SubStr(name, 1, index)

;Activate the window
WinActivate, %name%.pdf - Adobe Acrobat Pro Extended
WinWaitActive, %name%.pdf - Adobe Acrobat Pro Extended

;Save document
Send, ^S
WinWait, Save As

;Choose the output type in the file type list by sending its first letter;
;repeated letters step through types that share that letter
if(out = "doc"){
  ControlSend, ComboBox3, m
}else if(out = "html"){
  ControlSend, ComboBox3, h
}else if(out = "jpg"){
  ControlSend, ComboBox3, j
}else if(out = "pdf"){
  ControlSend, ComboBox3, a
}else if(out = "ps"){
  ControlSend, ComboBox3, p
  ControlSend, ComboBox3, p
  ControlSend, ComboBox3, p
  ControlSend, ComboBox3, p
  ControlSend, ComboBox3, p
}else if(out = "rtf"){
  ControlSend, ComboBox3, r
}else if(out = "txt"){
  ControlSend, ComboBox3, t
  ControlSend, ComboBox3, t
}

ControlSetText, Edit1, %1%
ControlSend, Edit1, {Enter}

;Return to main window before exiting
Loop
{
  ;Continue on if main window is active
  IfWinActive, %name%.pdf - Adobe Acrobat Pro Extended
  {
    break
  }

  ;Click "Yes" if asked to overwrite files
  IfWinExist, Save As
  {
    ControlGetText, tmp, Button1, Save As

    if(tmp = "&Yes")
    {
      ControlClick, Button1, Save As
    }
  }

  Sleep, 500
}

;Wait a little bit more just in case
Sleep, 1000

;Close whatever document is currently open
Send, ^w

;Make sure it actually closed before exiting
Loop
{
  ;Continue on if main window is active
  IfWinActive, Adobe Acrobat Pro Extended
  {
    break
  }

  Sleep, 500
}

...

Applescript is also supported by Polyglot.  An example script will be provided in the future.

...

Python is also supported by Polyglot.  An example script will be provided in the future.
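
Since Python is also supported by Polyglot, a wrapper along the lines of the bash ImageMagick example might look like the sketch below. This is not an official example: it assumes that Python wrappers use the same four-line comment header as the other scripts, the file name `ImgMgk_convert.py` and the helper names `build_command` and `run` are hypothetical, and the format lists are abbreviated for illustration. The Scripting Manual should be consulted for the actual conventions.

```python
#!/usr/bin/env python
#ImageMagick (v6.5.2)
#image
#bmp, gif, jpg, png, tif
#bmp, gif, jpg, png, tif

# Hypothetical wrapper (ImgMgk_convert.py); the header above is assumed to
# follow the same convention as the bash wrapper.
import subprocess
import sys


def build_command(input_path, output_path):
    """Return the argument list for ImageMagick's convert tool."""
    # Passing a list avoids shell quoting problems with spaces in paths.
    return ["convert", input_path, output_path]


def run(input_path, output_path):
    """Run the conversion and return the exit status of convert."""
    return subprocess.call(build_command(input_path, output_path))


if __name__ == "__main__":
    if len(sys.argv) == 3:
        sys.exit(run(sys.argv[1], sys.argv[2]))
    sys.stderr.write("usage: ImgMgk_convert.py <input> <output>\n")
```

As with the bash version, the first argument is the input file and the second is the output file, and ImageMagick infers the target format from the output file's extension.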

Medici Extractors

Medici extractors typically serve to automatically extract some new kind of information from a file's content when it is uploaded into Medici.  These extractors do this by connecting to a shared RabbitMQ bus.  When a new file is uploaded to Medici it is announced on this bus.  Extractors that can handle a file of the type posted on the bus are triggered and the data they in turn create is returned to Medici as derived data to be associated with that file.  The extractors themselves can be implemented in a variety of languages.
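
The flow described above (a file is announced, extractors that handle its type are triggered, and their output is returned as derived data) can be sketched in a few lines of Python. This is an in-memory simulation only and does not talk to RabbitMQ or Medici: the `Bus` class, the two extractor functions, and the file identifiers are hypothetical names invented for this illustration.

```python
class Bus:
    """In-memory stand-in for the shared message bus (not RabbitMQ)."""

    def __init__(self):
        # Maps a content type to the extractors registered for it.
        self._extractors = {}

    def register(self, content_type, extractor):
        """Declare that an extractor can handle files of this type."""
        self._extractors.setdefault(content_type, []).append(extractor)

    def announce(self, file_id, content_type, content):
        """Announce a new file and collect derived data from matching extractors."""
        derived = {}
        for extractor in self._extractors.get(content_type, []):
            derived.update(extractor(content))
        return {"file_id": file_id, "derived": derived}


def word_count_extractor(content):
    """Hypothetical extractor: count whitespace-separated words."""
    return {"word_count": len(content.split())}


def line_count_extractor(content):
    """Hypothetical extractor: count lines of text."""
    return {"line_count": len(content.splitlines())}


bus = Bus()
bus.register("text/plain", word_count_extractor)
bus.register("text/plain", line_count_extractor)

# A text file triggers both extractors; an image triggers none.
text_result = bus.announce("file-1", "text/plain", "hello extractor world\nsecond line")
image_result = bus.announce("file-2", "image/png", "")
print(text_result)
print(image_result)
```

In a real deployment the registration and announcement steps would happen over the message bus, with each extractor running as its own process and returning its results to Medici.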

...

Code Block: Receiver
#include <cstddef>  // size_t

namespace CPPExample {

  /**
   *  Parse data that was received from RabbitMQ
   *
   *  Every time that data comes in from RabbitMQ, you should call this method to parse
   *  the incoming data and let the AMQP-CPP library handle it. This method returns the number
   *  of bytes that were processed.
   *
   *  If not all bytes could be processed because the buffer only contained a partial frame, you
   *  should call this same method later on when more data is available. The AMQP-CPP library does
   *  not do any buffering, so it is up to the caller to ensure that the old data is also passed
   *  in that later call.
   *
   *  @param  buffer      buffer to decode
   *  @param  size        size of the buffer to decode
   *  @return             number of bytes that were processed
   */
  size_t parse(char *buffer, size_t size)
  {
     // _implementation is the underlying AMQP-CPP connection object (not shown in this excerpt)
     return _implementation.parse(buffer, size);
  }
}

Python

 

Code Block: Instantiating the logger and starting the extractor
import logging
import sys

def main():
    global logger

    # name of receiver
    receiver = 'ExamplePythonExtractor'

    # configure the logging system
    logging.basicConfig(format="%(asctime)-15s %(name)-10s %(levelname)-7s : %(message)s", level=logging.WARN)
    logger = logging.getLogger(receiver)
    logger.setLevel(logging.DEBUG)

    # expect exactly three arguments: RabbitMQ username, RabbitMQ password, Medici REST API key
    if len(sys.argv) != 4:
        logger.info("Input RabbitMQ username, followed by RabbitMQ password and Medici REST API key.")
        sys.exit()

    global playserverKey
    playserverKey = sys.argv[3]

...