Thoughts on generalizing what we currently call "Labs Workbench" into a platform that can support multiple distinct use cases.

Table of Contents

Potential Use Cases

NDS Labs Workbench

...

  • Ability to install Labs Workbench at Cyverse or 
  • Ability to use Labs Workbench to access Cyverse data
  • Start an R or Jupyter container that can access the Cyverse data store via iRODS
    • Data mounted directly (local install at Cyverse)
    • Data transferred via iRODS
  • Ability to handle Cyverse authentication

Other: Collaborative Development Cloud (work in progress)

One issue that has come up recently on the KnowEnG UI development is the need for TLS-protected development instances with basic auth in front. Since we offer a slew of development environments with built-in TLS and basic auth, this seemed like a natural fit.

We also offer Jenkins CI. ARI has already set this up for some of the KnowEnG folks, but it could help other similar teams gain experience with setting up their own CI, and even with testing applications that they develop from within Labs. I played around over the weekend and discovered that there are also several GitLab and Atlassian suite (JIRA + Confluence) images floating around that might be usable from within Labs.

Given the above, we have the potential to offer the following powerful combination of tools for any team of collaborating developers:

  • Private Source Control (via GitLab)
  • Project Documentation / Internal Development Wiki (via Confluence)
  • Ticketing workflow system (via JIRA)
  • Continuous Integration (via Jenkins CI)
  • Development Environments for several popular languages (via Cloud9 and friends - with the potential to add more)

True, you could outsource any one of these (Atlassian provides the first three), but Labs is the only place I can think of where you could get them all! (wink)

Pros:

  • development-in-a-box: give teams all the tools they need to succeed right away
  • no need to remember 10 different URLs (if development teams shared a project in Labs) - access all of your development tools from one place!
  • automatic TLS with basic auth protecting all services (albeit self-signed, unless you have a CA)
  • quickly spin up new team members without spending a week installing dependencies and preparing environments

Cons:

  • full disclosure: I made this use case up... I have no idea if this is a real need that is unmet
  • storage is flaky, and hosting a source-code repository or ticket backlog directly violates my original ideology
    • "DO NOT store anything critical on Workbench. Storage is volatile and may go away at any point - save a hard copy externally."
  • requires that any service developed be runnable from within Labs, or else testing your code becomes more difficult than on a VM
    • currently, this would require that all services run from Labs (and, by extension, everything developed in Labs) be available via Docker Hub, which may be too public for KnowEnG / ARI's current licensing needs

Other: Workflow Orchestration (work in progress)

See NDS-664 (JIRA).

Another need that has come up on the KnowEnG project is the ability to run a cluster of compute resources for scheduling analysis jobs. These jobs come in the form of a DAG (directed acyclic graph) and are effectively Docker containers with dependencies. Since the API server already contains much of the logic to talk with etcd and Kubernetes, it might not be so difficult to extend Workbench to run these types of analysis jobs.
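The core scheduling logic described above (run each container only after its dependencies have completed) can be sketched as a topological ordering of the DAG. This is only an illustration of the idea, not the API server's actual code; the job names and pipeline below are hypothetical.

```python
from collections import deque

def topological_order(jobs):
    """Return an execution order for a DAG of jobs (Kahn's algorithm).

    `jobs` maps each job name to the list of jobs it depends on.
    Raises ValueError if the graph contains a cycle.
    """
    # Count unmet dependencies for each job.
    pending = {name: len(deps) for name, deps in jobs.items()}
    # Reverse edges: which jobs are unblocked when this one completes?
    dependents = {name: [] for name in jobs}
    for name, deps in jobs.items():
        for dep in deps:
            dependents[dep].append(name)

    ready = deque(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)  # a real scheduler would launch the container
                           # here and wait for it to reach "Completed"
        for child in dependents[job]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(order) != len(jobs):
        raise ValueError("dependency cycle detected")
    return order

# Hypothetical KnowEnG-style pipeline: each step lists its dependencies.
pipeline = {
    "fetch":   [],
    "clean":   ["fetch"],
    "analyze": ["clean"],
    "report":  ["analyze", "clean"],
}
print(topological_order(pipeline))  # ['fetch', 'clean', 'analyze', 'report']
```

In the extended Workbench, "launch the container" would translate to submitting a Kubernetes Job and polling its status, rather than printing the name.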

Our "spec" architecture is already set up to handle running dependent containers and ensuring that they are running before continuing on to the next containers in the chain. If we were to add a flag (i.e. type == "job") to the specs, that could signal to the API server to run a job, instead of a service/rc/ingress, and to wait for the job to be "Completed" before running the next dependency.

I created a simple example of a Job spec YAML on raw Kubernetes just to see how a multi-container job would run and be scheduled. Apparently multiple Jobs can be scheduled at once, containing multiple containers. Each container within the Job will run sequentially (in the order listed in the spec).
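For reference, a minimal multi-container Job on raw Kubernetes (batch/v1) looks roughly like the following sketch; image names and commands are placeholders, not the spec I actually ran.

```yaml
# Minimal sketch of a multi-container Job on raw Kubernetes (batch/v1).
apiVersion: batch/v1
kind: Job
metadata:
  name: example-dag-step
spec:
  template:
    spec:
      containers:
        - name: step-1
          image: busybox
          command: ["sh", "-c", "echo step 1"]
        - name: step-2
          image: busybox
          command: ["sh", "-c", "echo step 2"]
      restartPolicy: Never   # required for Jobs: do not restart on success
```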

I still need to ask Charles Blatti for real-life examples of both a simple and a complex DAG to gather more details and create a more realistic prototype. We had previously discussed investigating Kubernetes to handle the scheduling, but we decided to look into BD2K's cwltoil framework instead.

Pros:

  • seems relatively small-effort to extend Labs in this way
  • more control over the scheduler than with raw Kubernetes, with direct access to the developers (ourselves)
  • we offer a user interface, which toil and Kubernetes do not (aside from the Mesos / Kubernetes dashboards, which are fairly limited)

Cons:

  • BD2K created cwltoil, and KnowEnG is a product of BD2K, so we would miss out on a political win by using Labs instead
  • toil was created for exactly this purpose: scalable DAG / CWL jobs
  • toil would allow us to run jobs using the CGC's CWL system
  • still some kinks in our platform (actual bugs, storage, commercial cloud deployment is not formalized, etc)

...