Overview

Resources:

Tentative plan

  • Day 1:
    • Roman: 
      • Timeseries of plants in AZ 
      • Kinship data (1400 time series from AZ) and matrix (300 genotypes in fields), 85 measurements
      • Need data from BETYdb (height histogram)
      • Octave and Python scripts to get started
    • David: 
      • SWIR prediction problem
      • NREL training data
      • As many spectra as we can get (we have 5)
      • Daily measurements over the season – we may only have some made by Solmaz on one day
      • 2500 solar spectra, 10nm resolution
  • Day 2:
    • Jack:
      • Commercial small satellite data (Planet Labs), licensed data
      • Good temporal resolution that coincides with field sampling
      • ~270 images
      • Python notebooks for exploratory analysis
      • QGIS for satellite data (ideally using Geoserver or similar for desktop access)

Workbench requirements

  • SDSC Workbench supporting up to 50 users (2 core x GB RAM)
  • www.workshop1.nationaldataservice.org (TLS/DNS)
  • Data mounted via NFS /data/ and symlinked to ~/data for all containers
  • Images should pre-checkout tutorial materials
  • Shared data mounted via usual Gluster
  • Disable approval
  • No timeouts?
  • Cloud9 upload size limits?
  • Pre-load accounts
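
The NFS requirement above (data mounted at /data/ and symlinked to ~/data in every container) could be handled by a small startup step along these lines. This is only a sketch: the function name `link_shared_data` and the idea of running it at container start are assumptions, not part of the actual Workbench images.

```python
import os
from pathlib import Path


def link_shared_data(home: Path, target: Path = Path("/data")) -> Path:
    """Make home/data a symlink to the shared mount (hypothetical helper).

    Safe to call more than once: an existing correct link is left alone,
    a stale link is replaced, and a real file or directory is never removed.
    """
    link = home / "data"
    if link.is_symlink():
        if Path(os.readlink(link)) == target:
            return link  # already points at the shared mount
        link.unlink()  # stale link, replace it
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.symlink_to(target, target_is_directory=True)
    return link
```

A container entrypoint could call `link_shared_data(Path.home())` before starting the IDE or notebook server. Refusing to overwrite a real ~/data directory is a deliberate choice so that user files are never lost.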

Deliverables


Deliverable | Status | Notes
Workbench instance | | Deployed master+gfs+1 at SDSC as https://www.datadrivenag.ndslabs.org. Scaled up to 4 nodes (48 cores, 128G)
Move DNS/TLS for workshop1.nationaldataservice.org | Done |
In-cluster BETY-db instance | Done |
Day 1/Roman: Timeseries data | Done |
Day 1/Roman: Kinship matrix | TBD |
Day 1/Roman: NREL data | | Available under /data/shared/roman/nrel
Day 1/Roman: Day 1 Octave script | In github |
Day 1/Roman: Day 1 Python notebook | In github |
Day 1/Roman: Octave environment | Done |
Day 1/Roman: Python environment | Done |
Day 1/David: Spectra | |
TERRA-REF subset | Week of 6/18 | Level_1 vnir_netcdf, laser3d_mergedlas, fullfield, rgb_geotiff,
UAV data | Done |
Planet Labs data | Won't do |
Day 2 notebooks | Won't do |
Geoserver | Done |
Pre-load user accounts | Done |


What we should have working (2/26/2018)




Setup and support log

  • 2/6/2018
    • Initial conference call with organizers
  • 2/9/2018
    • Sent new estimates reflecting the increased scope agreed at the organizers' meeting
    • Confirmed resource availability at SDSC
  • 2/12/2018
    • Decided to go forward with instance at SDSC
    • Began provisioning instance. This workshop is unusual in that we'll be setting up an external NFS server and transferring ~2+TB of sample data for users
    • Created NFS instance manually (in hindsight, should've just made it a labeled node) and basic master+GFS (2) + node (1) Kubernetes cluster
    • Set up local Globus personal endpoint on NFS server to handle transfer from TERRA-REF endpoint
  • 2/13/2018
    • Set up datadrivenag.slack.com and github.com/datadrivenag organizations
  • 2/15/2018
    • Extended cloud9all image to include gcc and octave 
    • After some coordination with organizers, began transfer of sample data (~1 week duration). Babysat transfers, troubleshot a performance problem, and resolved permission and other issues
  • 2/16/2018
    • Basic Geoserver configuration to run under Kubernetes
    • Requested DNS change
    • Began investigating how to set up and scale Geoserver (Geoserver on Steroids: https://www.slideshare.net/geosolutions/geoserver-on-steroids)
    • Updated Jupyter image to include octave kernel, provided sample notebook for Roman's plant height estimation
    • Had to scale up NFS volume size from 2TB to 3TB to support RGB, LAS and VNIR data
  • 2/21 - 2/22/2018
    • Began tracking down UAV data. Loaded UAV data into endpoint and Geoserver
    • Had to transfer via Drive due to lack of access to Globus endpoint
  • 2/23/2018
    • Scaled up cluster to 4 nodes (48 cores) with monitoring
    • Confirmed Geoserver can indeed scale horizontally (scaled RC), but this appears to cause problems for UI sessions?
    • Created simple NGINX server to allow users to browse data for download
  • 2/24/2018
    • Had to troubleshoot networking issue (flannel subnet bug)
    • Fixed Jupyter image dependency problems for terrautils
    • Tried to create example code to get sensor files by plot (sensorquery API)
    • Updated github repo, documentation, created videos for Workbench, QGIS and Xpra
    • etcd2 error on master suggests possible problems with slow filesystem at SDSC.
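
The 2/16 entry notes that the NFS volume had to grow from 2TB to 3TB partway through the transfer. A pre-transfer check along the following lines might catch that earlier. It is a minimal sketch only: the helper name `has_room`, the 10% headroom default, and the decimal definition of a terabyte are all assumptions, not something that was run during this setup.

```python
import shutil

TB = 10**12  # decimal terabyte (assumption; the log does not say TB vs TiB)


def has_room(path: str, needed_bytes: int, headroom: float = 0.1) -> bool:
    """Return True if the filesystem holding `path` has enough free space.

    `headroom` is an extra fraction reserved on top of the expected
    transfer size (hypothetical default of 10%).
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    required = needed_bytes * (1 + headroom)
    return usage.free >= required
```

For example, `has_room("/data", 2 * TB)` could be checked on the NFS server before starting a Globus transfer of roughly 2TB.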

