NDS-1004

Background

The TERRA-REF project provides, in part, a computing pipeline for the LemnaTec Scanalyzer Field deployed at the University of Arizona Maricopa Agricultural Center.  The Scanalyzer includes a variety of devices and sensors, including stereo RGB cameras.  One requirement for the computing pipeline is to generate a daily full-field image of the RGB data.


The full-field "stitching" process is implemented as the fieldmosaic extractor in the Clowder platform. The stitching process consists of the following steps:

  • Convert all raw sensor output to georeferenced GeoTIFFs
  • For a given day, generate a list of files to be included in the full-field image
  • Create a VRT from the input file list
  • Use gdal_translate to convert the VRT to a stitched GeoTIFF, in both full-sized and thumbnail versions

The extractor is part of the TERRA-REF compute pipeline and is executed automatically each day.
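The last two steps above can be sketched as a pair of GDAL command lines. This is a minimal illustration, not the extractor's actual code: the helper names, the file names, the 10% thumbnail scale, and the COMPRESS and BIGTIFF creation options are all assumptions chosen for the example.

```python
def build_vrt_command(file_list_path, vrt_path):
    """Return the gdalbuildvrt arguments that mosaic every file named in the list."""
    return ["gdalbuildvrt", "-input_file_list", file_list_path, vrt_path]


def build_translate_command(vrt_path, tif_path, scale_percent=None):
    """Return the gdal_translate arguments that render the VRT to a GeoTIFF.

    If scale_percent is given, the output is resized to that percentage of the
    source in both dimensions (used here for the thumbnail).
    """
    # Creation options are assumptions for this sketch, not taken from the extractor.
    cmd = ["gdal_translate", "-of", "GTiff", "-co", "COMPRESS=LZW", "-co", "BIGTIFF=YES"]
    if scale_percent is not None:
        pct = f"{scale_percent}%"
        cmd += ["-outsize", pct, pct]
    cmd += [vrt_path, tif_path]
    return cmd


# Hypothetical file names for one day of data.
commands = [
    build_vrt_command("files.txt", "fullfield.vrt"),
    build_translate_command("fullfield.vrt", "fullfield.tif"),
    build_translate_command("fullfield.vrt", "fullfield_thumb.tif", scale_percent=10),
]
for cmd in commands:
    print(" ".join(cmd))
```

To run the commands rather than print them, each list could be passed to subprocess.run(cmd, check=True) in an environment where GDAL is installed.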

Jupyter Notebook

We have developed a simple Jupyter Notebook to demonstrate the stitching process on a subset of the TERRA-REF data.  

https://github.com/craig-willis/stitching-demo/

This notebook must be run in the TERRA-REF Analysis Workbench, as it requires read-only access to the TERRA-REF data.  The TERRA-REF Jupyter environment contains all of the required dependencies (e.g., GDAL).

Because the notebook cannot be run outside the workbench, example screenshots are provided:

 

From Jupyter to Clowder

The provided notebook doesn't require any interaction with Clowder or the extractor bus framework.  It is a subset of the fieldmosaic extractor code that operates only on data available via the filesystem.  Scaling this process up via Clowder would require the ability to register a custom extractor on a private development queue and to trigger dataset processing.  This workflow is not easily supported today.

From Jupyter to HPC

Since the data is also on ROGER, an alternative approach is to launch a standard TORQUE job on ROGER, possibly via the Agave API.  Since ROGER doesn't support Singularity, the job cannot run in a container, so all of the dependencies must be available on ROGER itself (they should be).
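As a rough sketch of what such a TORQUE job could look like, the following generates a job script with standard PBS directives. The job name, the resource values, and the stitch.py command are hypothetical placeholders, not part of the existing pipeline.

```python
def render_torque_script(job_name, command, walltime="02:00:00", nodes=1, ppn=1):
    """Return the text of a TORQUE/PBS job script that runs a single command."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",               # job name shown in the queue
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processors per node
        f"#PBS -l walltime={walltime}",      # maximum run time
        "#PBS -j oe",                        # merge stderr into stdout
        'cd "$PBS_O_WORKDIR"',               # start in the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical stitching command; the real entry point would come from the notebook code.
script = render_torque_script("fullfield-stitch", "python stitch.py 2017-06-01")
print(script)
```

The resulting text would be saved to a file and submitted with qsub.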

Story: A TERRA-REF Workbench user can launch the full-field image stitching process on ROGER via Jupyter.

This will require the following:

  • Configure Agave system and storage objects for ROGER (proof-of-concept completed with John Fonner)
  • Develop an Agave app implementing the full-field stitch process
  • Add the Agave API client to the Jupyter environment (https://github.com/TACC/agavepy)
  • Optionally, implement a Jupyter Agave "magic" plugin (similar to slurm-magic)
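To make the requirements above more concrete, here is a minimal sketch of the job request a notebook might assemble for such an Agave app. The app ID, the input name geotiff_dir, the parameter name date, and the data path are hypothetical; the real names would be defined by the app once it is developed.

```python
import json


def build_job_request(app_id, date, input_dir, max_run_time="02:00:00"):
    """Assemble an Agave job request for one day of full-field stitching."""
    return {
        "name": f"fullfield-stitch-{date}",
        "appId": app_id,
        "maxRunTime": max_run_time,
        "archive": True,                        # keep job outputs after completion
        "inputs": {"geotiff_dir": input_dir},   # hypothetical input name
        "parameters": {"date": date},           # hypothetical parameter name
    }


# Hypothetical app ID and data location.
request = build_job_request("fullfield-stitch-0.1", "2017-06-01", "/data/rgb_geotiff/2017-06-01")
print(json.dumps(request, indent=2))
```

The dictionary would then be submitted to the Agave jobs service through the API client added to the Jupyter environment.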

A ticket has been created to address this requirement for SC17: NDS-1025


Fullfield stitch process

  • Start from a set of GeoTIFFs for a given time period
  • Create a VRT from the input file list (built with a find command on the date)
  • Use gdal_translate to convert the VRT to a stitched GeoTIFF, in both full-sized and thumbnail versions
  • The thumbnail process can be parallelized
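The file-list step can be approximated in Python instead of with a find command. This sketch assumes a hypothetical layout in which each day's GeoTIFFs live under a directory named after the date (for example root/2017-06-01/...); the actual TERRA-REF directory structure may differ.

```python
import os


def list_geotiffs_for_date(root, date):
    """Return a sorted list of GeoTIFF paths found under root/date."""
    day_dir = os.path.join(root, date)  # assumed layout: one directory per day
    matches = []
    for dirpath, _dirnames, filenames in os.walk(day_dir):
        for name in filenames:
            if name.lower().endswith((".tif", ".tiff")):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def write_file_list(paths, list_path):
    """Write one path per line, the format gdalbuildvrt -input_file_list expects."""
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return list_path
```

The written list file can then be handed to gdalbuildvrt to create the VRT.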


Additional notes:

  • gdal_translate can be parallelized
  • gdalwarp, gdal_translate, or gdal_merge.py can each be used to create the TIFF from the VRT (it is not yet clear which performs better)
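One possible way to parallelize gdal_translate, offered only as an assumption about what that could mean here, is to split the VRT into pixel windows and translate each window concurrently with the -srcwin option. The tile size, worker count, and output naming below are illustrative.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def tile_windows(width, height, tile_size):
    """Split a width x height raster into (xoff, yoff, xsize, ysize) windows."""
    windows = []
    for yoff in range(0, height, tile_size):
        for xoff in range(0, width, tile_size):
            xsize = min(tile_size, width - xoff)   # clip the last column
            ysize = min(tile_size, height - yoff)  # clip the last row
            windows.append((xoff, yoff, xsize, ysize))
    return windows


def tile_command(vrt_path, out_prefix, window):
    """Return the gdal_translate arguments that extract one window of the VRT."""
    xoff, yoff, xsize, ysize = window
    out_path = f"{out_prefix}_{xoff}_{yoff}.tif"
    return ["gdal_translate", "-srcwin", str(xoff), str(yoff), str(xsize), str(ysize),
            vrt_path, out_path]


def run_tiles(vrt_path, out_prefix, width, height, tile_size, workers=4,
              runner=subprocess.check_call):
    """Translate all windows concurrently; threads suffice because each call
    just waits on an external gdal_translate process."""
    commands = [tile_command(vrt_path, out_prefix, w)
                for w in tile_windows(width, height, tile_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(runner, commands))
```

Note that this produces a set of tiles rather than a single stitched GeoTIFF, so a further merge step (or a second VRT over the tiles) would still be needed.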


