NDS-1004
Background
The TERRA-REF project provides, in part, a computing pipeline for the LemnaTec Scanalyzer Field deployed at the University of Arizona Maricopa Agricultural Center. The Scanalyzer includes a variety of devices and sensors, including stereo RGB cameras. One requirement for the computing pipeline is to generate a daily full-field image from the RGB data.
The full-field "stitching" process is implemented as the fieldmosaic extractor in the Clowder platform. The stitching process consists of the following steps:
- Convert all raw sensor output to georeferenced GeoTIFFs
- For a given day, generate a list of files to be included in the full-field image
- Create a VRT from the input file list
- Use gdal_translate to convert the VRT to a stitched GeoTIFF, both full-sized and thumbnail
The extractor is part of the TERRA-REF compute pipeline and is executed automatically each day.
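The last two steps above can be sketched as a short command plan. This is a minimal illustration, not the extractor's actual code: the output file names, the `COMPRESS=LZW` and `BIGTIFF=YES` creation options, and the 10% thumbnail size are assumptions chosen for the example, and the function names are hypothetical.

```python
import subprocess


def build_vrt_cmd(list_path, vrt_path):
    """Command that builds a virtual mosaic (VRT) from a text file of GeoTIFF paths."""
    return ["gdalbuildvrt", "-input_file_list", str(list_path), str(vrt_path)]


def translate_cmd(vrt_path, tif_path, scale_percent=None):
    """Command that renders the VRT to a GeoTIFF.

    scale_percent=None keeps full resolution; a value such as 10 produces
    a thumbnail at 10% of the original width and height.
    """
    # Creation options are assumptions, not taken from the extractor.
    cmd = ["gdal_translate", "-of", "GTiff",
           "-co", "COMPRESS=LZW", "-co", "BIGTIFF=YES"]
    if scale_percent is not None:
        cmd += ["-outsize", f"{scale_percent}%", f"{scale_percent}%"]
    return cmd + [str(vrt_path), str(tif_path)]


def plan_stitch(list_path, out_dir, day, thumb_percent=10):
    """Return the commands for one day's stitch, in execution order."""
    base = f"{out_dir}/fullfield_{day}"  # hypothetical naming scheme
    vrt = f"{base}.vrt"
    return [
        build_vrt_cmd(list_path, vrt),
        translate_cmd(vrt, f"{base}.tif"),
        translate_cmd(vrt, f"{base}_thumb.tif", thumb_percent),
    ]


def run_plan(commands):
    """Run each command in turn; raises CalledProcessError on failure."""
    for cmd in commands:
        subprocess.check_call(cmd)
```

Calling `run_plan(plan_stitch("files.txt", "out", "2017-06-01"))` would require the GDAL command-line tools on the `PATH`, as they are in the TERRA-REF Jupyter environment.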
Jupyter Notebook
We have developed a simple Jupyter Notebook to demonstrate the stitching process on a subset of the TERRA-REF data.
https://github.com/craig-willis/stitching-demo/
This notebook must be run in the TERRA-REF Analysis Workbench, as it requires access to the TERRA-REF data (read-only). The TERRA-REF Jupyter environment contains all of the required dependencies (e.g., GDAL).
Because the notebook can only be run on the TERRA-REF Workbench, example screenshots are provided:
From Jupyter to Clowder
The provided notebook doesn't require any interaction with Clowder or the extractor bus framework. It is a subset of the fieldmosaic extractor code that operates only on data available via the filesystem. Scaling this process up via Clowder would require the ability to register a custom extractor on a private development queue and trigger dataset processing. This process is not easily supported today.
From Jupyter to HPC
Since the data is also on ROGER, an alternative approach is to launch a standard TORQUE job on ROGER, possibly via the Agave API. Since ROGER doesn't support Singularity, this will require ensuring all of the dependencies are available (they should be).
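As an illustration of the TORQUE route, the following sketch generates a PBS job script that wraps a list of commands. The job name, resource requests, and the `gdal` module name are assumptions for the example; the modules actually available on ROGER would need to be checked.

```python
import shlex


def torque_script(job_name, commands, nodes=1, ppn=1,
                  walltime="02:00:00", modules=()):
    """Return the text of a TORQUE/PBS job script.

    commands is a list of argv lists; each becomes one shell line.
    modules lists environment modules to load (names are site-specific).
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        f"#PBS -l nodes={nodes}:ppn={ppn}",
        f"#PBS -l walltime={walltime}",
        "#PBS -j oe",            # merge stderr into stdout
        "cd $PBS_O_WORKDIR",     # start in the submission directory
    ]
    lines += [f"module load {m}" for m in modules]
    lines += [shlex.join(cmd) for cmd in commands]
    return "\n".join(lines) + "\n"
```

The resulting text could be written to a file and submitted with `qsub`.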
Story: A TERRA-REF Workbench user can launch the full-field image stitching process on ROGER via Jupyter.
This will require the following:
- Configure Agave system and storage objects for ROGER (proof-of-concept completed with John Fonner)
- Develop Agave App implementing the full-field stitch process
- Add Agave API client to Jupyter environment (https://github.com/TACC/agavepy)
- Optionally, implement Jupyter Agave "magic" plugin (similar to slurm-magic)
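To make the Agave items above more concrete, here is a hedged sketch of the job request such a notebook might build. The top-level field names follow the Agave job request format; the app ID, the queue name, and the `geotiffDir` and `date` keys are hypothetical and depend on how the Agave App is eventually defined.

```python
import json


def agave_job_request(app_id, day, input_dir,
                      queue="normal", max_run_time="02:00:00"):
    """Build a job request dict for a (hypothetical) full-field stitch app."""
    return {
        "name": f"fullfield-stitch-{day}",
        "appId": app_id,                       # hypothetical app ID
        "batchQueue": queue,                   # assumed queue name
        "nodeCount": 1,
        "maxRunTime": max_run_time,
        "archive": True,
        "inputs": {"geotiffDir": input_dir},   # hypothetical input key
        "parameters": {"date": day},           # hypothetical parameter key
    }


def to_json(job):
    """Serialize the request, e.g. for inspection or submission."""
    return json.dumps(job, indent=2, sort_keys=True)
```

With agavepy installed and an authenticated client, a request like this would typically be passed to the client's job submission call; that step is omitted here because it depends on the system and app configuration described above.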
A ticket has been created to address this requirement for SC17: NDS-1025
Fullfield stitch process
- Given a set of GeoTIFFs for a given time-period
- Create a VRT from the input file list (built with a find command on the date)
- Use gdal_translate to convert the VRT to a stitched GeoTIFF
- The thumbnail process can be parallelized
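The "find command on date" step could look roughly like the sketch below. It assumes a hypothetical directory layout in which the date appears as one path component (for example `<root>/2017-06-01/.../left.tif`); the real TERRA-REF layout may differ.

```python
from pathlib import Path, PurePosixPath


def select_for_day(paths, day):
    """Keep GeoTIFF paths that have `day` as one of their path components."""
    selected = []
    for p in paths:
        pp = PurePosixPath(p)
        if pp.suffix.lower() in (".tif", ".tiff") and day in pp.parts:
            selected.append(str(pp))
    return sorted(selected)


def find_geotiffs(root, day):
    """Walk `root` recursively, roughly like `find <root> -name '*.tif'`
    restricted to one day."""
    return select_for_day((p.as_posix() for p in Path(root).rglob("*")), day)


def write_file_list(paths, list_path):
    """Write one path per line, the format read by gdalbuildvrt -input_file_list."""
    Path(list_path).write_text("".join(f"{p}\n" for p in paths))
    return list_path
```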
Additional notes:
- Can parallelize gdal_translate
- Can use gdalwarp, gdal_translate, or gdal_merge.py to create the TIFF from the VRT (it is not clear which performs better).
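One simple reading of the parallelization note is to run the full-sized and thumbnail conversions at the same time, since both read the same VRT and write different files. The sketch below assumes that interpretation; the helper name and the `run` parameter are hypothetical. Threads are sufficient here because the heavy work happens in the child GDAL processes.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_parallel(commands, max_workers=2, run=subprocess.check_call):
    """Run independent commands concurrently.

    Returns the results in the same order as `commands`. With the default
    runner, a failing command raises CalledProcessError.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))
```

For example, `run_parallel([full_cmd, thumb_cmd])` would launch both `gdal_translate` invocations together, where `full_cmd` and `thumb_cmd` are argument lists built elsewhere. Timing the same call with `gdalwarp` or `gdal_merge.py` commands would be one way to compare the tools.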