
...

Code Block (bash)
cd /etc/init
# Start every Clowder upstart job that has a .conf file in this directory.
for x in clowder-*.conf; do
  start "$(basename "$x" .conf)"
done

 

Converting from pyClowder to pyClowder2

Migrating an extractor written for pyClowder 1 to pyClowder 2 is fairly straightforward.

Key differences

  • config.py is no longer used or needed.
    • Several of the common entries in config.py are accessible to all extractors via the basic Extractor class: https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/pyclowder2/browse/pyclowder/extractors.py#66 (here you can also see defaults)
    • You can implement your own command line arguments to replace any special parameters that were in config.py. Another option is to read them from environment variables.
    • The 'messageType' parameter (which tells the extractor what types of messages to listen for) now goes into extractor_info.json and uses a more MIME-like definition format.
  • Your extractor is now a subclass of pyClowder 2's Extractor class, which contains many useful methods.
    • __init__ is where you can define custom command line arguments beyond the standard ones.
    • check_message and process_message now get explicit parameters such as the Clowder host and secret key, rather than embedding them in a 'params' object. Information about the entity in Clowder that triggered the extraction (file, dataset, etc.) is in the 'resource' parameter. The old 'parameters' argument is kept for backward compatibility, but is deprecated.
  • Because config.py is going away, you should provide parameters at runtime, for example:
    • python my_extractor.py --rabbitmqExchange="terra" --rabbitmqURI="rabbitmw.ncsa.illinois.edu/clowder-dev"
  • New, cleaner functions in pyClowder 2 for interacting with Clowder, organized into modules for files, datasets, etc. For example:
    • OLD - extractors.upload_file_to_dataset(outfile, parameters)
    • NEW - pyclowder.files.upload_to_dataset(connector, host, secret_key, resource['id'], outfile)
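
As a rough illustration of replacing config.py with runtime parameters, the sketch below parses settings from the command line and falls back to environment variables. It uses only the Python standard library, not pyClowder itself. The flag names --rabbitmqURI and --rabbitmqExchange come from the example above; the environment variable names, the fallback defaults, and the --tileSize parameter are hypothetical and stand in for whatever your own config.py contained.

```python
import argparse
import os


def parse_settings(argv=None, env=None):
    """Parse settings that used to live in config.py.

    Precedence: command line value, then environment variable, then a
    hard-coded fallback. All fallbacks here are placeholders.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Settings formerly kept in config.py")
    # Flag names taken from the runtime example above; env names are hypothetical.
    parser.add_argument("--rabbitmqURI",
                        default=env.get("RABBITMQ_URI", "amqp://localhost"))
    parser.add_argument("--rabbitmqExchange",
                        default=env.get("RABBITMQ_EXCHANGE", "clowder"))
    # Hypothetical extractor-specific parameter, cast to int on input.
    parser.add_argument("--tileSize", type=int,
                        default=int(env.get("TILE_SIZE", "256")))
    return parser.parse_args(argv)


# Simulate: one value on the command line, one from the environment.
settings = parse_settings(["--rabbitmqExchange", "terra"],
                          env={"TILE_SIZE": "512"})
print(settings.rabbitmqExchange, settings.tileSize, settings.rabbitmqURI)
```

In a real extractor these add_argument calls would go on the parser owned by the Extractor class rather than on a separate ArgumentParser.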

Migration steps

  1. If config.py contains parameters that differ from the default values in the link under Key differences, define them as command line parameters in your new extractor class's __init__, or simply hard-code them in the script. The parameters can also be read from environment variables.
    1. https://github.com/terraref/extractors-stereo-rgb/pull/3/files - in this example, 
      1. https://github.com/terraref/extractors-stereo-rgb/pull/3/files#diff-6be4f9dea03b90eac1407a1012cdf34eL42 is moved to
      2. https://github.com/terraref/extractors-stereo-rgb/pull/3/files#diff-f53b0090553dbecd9e15f5eb59549c00R32
    2. ...and below the self.parser.add_argument calls, the input values can be adjusted before assigning them to self.args (e.g. cast a string to an int):
      1. https://github.com/terraref/extractors-stereo-rgb/pull/3/files#diff-f53b0090553dbecd9e15f5eb59549c00R48
    3. Add the messageType from config.py into extractor_info.json
      1. Before: https://github.com/terraref/extractors-stereo-rgb/pull/3/files#diff-38d737ae3b969ee995bd1b34ebe93be4L25
      2. After: https://github.com/terraref/extractors-stereo-rgb/pull/3/files#diff-40099abc8fb726838bb4c7a44b8b5958R10 
  2. Move your extractor python functions into a new Extractor subclass 
    1. https://github.com/terraref/extractors-stereo-rgb/pull/3/files#diff-924a575b0595fcd52d5531433471b109R23 Here a new extractor class called StereoBin2JpgTiff is created.
    2. main() -> __init__(self) (but only for handling inputs)
    3. check_message() and process_message() must be named as such now, and receive explicit inputs:
      1. self, connector, host, secret_key, resource, parameters
      2. typically, old references to parameters['xyz'] can be replaced either with resource['xyz'] or with secret_key, host, etc.
      3. if you aren't sure when writing, you can use print(resource) in your extractor testing to see what fields are included.
  3. Replace old extractor.method() calls with the new pyclowder.files.method() or pyclowder.datasets.method() equivalents. The available modules are:
    1. https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/pyclowder2/browse/pyclowder/files.py
    2. https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/pyclowder2/browse/pyclowder/datasets.py
    3. https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/pyclowder2/browse/pyclowder/collections.py
    4. https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/pyclowder2/browse/pyclowder/sections.py
    5. https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/pyclowder2/browse/pyclowder/utils.py
    6. more to come
  4. Finally, replace the call to main() with a simple instantiation of your extractor class.
    1. https://github.com/terraref/extractors-stereo-rgb/pull/3/files#diff-924a575b0595fcd52d5531433471b109R174
    2. extractor = StereoBin2JpgTiff(); extractor.start()
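
The overall shape of a migrated extractor can be sketched as follows. So that the example runs without pyClowder or RabbitMQ installed, it defines a small stand-in Extractor class; in a real extractor you would import Extractor from pyclowder.extractors instead. The stand-in's setup() helper, the simulated message in start(), the MyExtractor class, and the --scale argument are all assumptions made for illustration, and the return value of check_message is a placeholder, so check the linked pyClowder 2 source for the actual behaviour.

```python
import argparse


class Extractor(object):
    """Stand-in for pyclowder.extractors.Extractor (assumption, not the real class)."""

    def __init__(self):
        # The real base class registers the standard arguments on this parser.
        self.parser = argparse.ArgumentParser()
        self.args = None

    def setup(self, argv=None):
        # Assumed helper: parse the command line into self.args.
        self.args = self.parser.parse_args(argv)

    def start(self):
        # The real start() listens for messages; here one message is simulated.
        resource = {"id": "abc123", "type": "file"}
        host, secret_key = "http://localhost:9000", "secret"
        if self.check_message(None, host, secret_key, resource, {}):
            return self.process_message(None, host, secret_key, resource, {})
        return None


class MyExtractor(Extractor):
    """Hypothetical extractor following the migration steps above."""

    def __init__(self, argv=None):
        Extractor.__init__(self)
        # Step 1: a parameter that used to live in config.py (hypothetical name).
        self.parser.add_argument("--scale", default="1")
        self.setup(argv)
        # Adjust input values after parsing, e.g. cast a string to an int.
        self.args.scale = int(self.args.scale)

    # Step 2: fixed method names with explicit inputs.
    def check_message(self, connector, host, secret_key, resource, parameters):
        return True  # placeholder: accept every message

    def process_message(self, connector, host, secret_key, resource, parameters):
        # Old parameters['xyz'] lookups become resource['xyz'], host, secret_key.
        return "processed %s from %s (scale=%d)" % (
            resource["id"], host, self.args.scale)


# Step 4: instantiate and start instead of calling main().
extractor = MyExtractor(argv=["--scale", "2"])
result = extractor.start()
print(result)
```

In a deployed script the argv argument would be omitted so that the values come from the actual command line.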