Thoughts on generalizing workbench as Project X based on recent discussions.
One of the clearest proven uses of the platform is for education and training purposes. Labs Workbench was used for:
Each environment is unique, but there are a few basic requirements:
We can also envision the platform working as a replacement for the TERRA-REF toolserver or as a DataDNS analysis service. In this case, the requirements are:
Another use case, really a re-purposing of the platform, is to support the development and deployment of research data portals – aka, the Zuhone case. Requirements include:
We currently have two methods of deploying the Labs Workbench service: 1) ndslabs-startup (single node) and 2) deploy-tools (multi-node OpenStack).
The ndslabs-startup tool provides a set of scripts to deploy NDS Labs services to a single VM, intended primarily for development and testing. The deployment is incomplete (no shared storage, NRPE, LMA, or backup), but adding these services would be a minor effort. Minikube was considered as an alternative, but it is problematic when run on a VM in OpenStack and would require additional investigation.
The deploy-tools image provides a set of Ansible plays designed specifically to support the provisioning and deployment of a Kubernetes cluster on OpenStack, with hard dependencies on CoreOS and GlusterFS. It's unclear whether this can be replaced by OpenStack Heat. Deploy-tools has three parts: 1) OpenStack provisioning, 2) Kubernetes install, and 3) Labs components install. The OpenStack provisioning step uses the OpenStack API and Ansible's OpenStack support to provision instances and volumes. The Kubernetes install is based on the community contrib/ansible tools with very minor local modifications. The Labs components install primarily deploys Kubernetes objects.
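As a rough illustration, the third phase (Labs components install) amounts to applying a set of Kubernetes manifests from a control host. This is a hedged sketch only; the host group, paths, and file names below are hypothetical, not the actual deploy-tools layout:

```yaml
# Hypothetical excerpt of phase 3 (Labs components install).
# Host group and manifest paths are illustrative assumptions.
- hosts: kube-master
  tasks:
    - name: Deploy Workbench components as Kubernetes objects
      command: kubectl apply -f /opt/workbench/manifests/{{ item }}
      with_items:
        - apiserver.yaml
        - webui.yaml
        - ingress-controller.yaml
```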
For commercial cloud providers, we cannot use our current deployment process. Fortunately, these providers already offer managed Kubernetes cluster provisioning: AWS, Azure, and GCE.
The deploy-tools image assumes CoreOS. This choice is somewhat arbitrary, but many assumptions in the deploy-tools component are bound to the OS choice. Different providers make different OS decisions: Kubernetes seems to lean toward Fedora and Debian, GCE itself uses Debian, Azure uses Ubuntu, and so on. This may not matter if we can rely on the Kubernetes deployment provided by each commercial cloud provider.
The Labs Workbench system assumes Docker, but there are other container options. Kubernetes also supports rkt. This is something we've discussed but never explored.
Labs Workbench relies heavily on Kubernetes itself. The API server integrates directly with the Kubernetes API. Of all basic requirements, this seems to be one that's unlikely to change.
Labs Workbench uses a custom GlusterFS solution for shared storage. A single Gluster volume is provisioned (4 GlusterFS servers) and mounted on each host. The shared volume is accessed by containers via hostPath.
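In a pod spec, this pattern looks roughly like the following (a sketch only; the mount paths are hypothetical, not the actual Workbench configuration):

```yaml
# Hypothetical pod spec fragment: the Gluster volume is mounted on every
# host (assumed here at /volumes/global) and exposed via hostPath.
spec:
  volumes:
    - name: shared
      hostPath:
        path: /volumes/global
  containers:
    - name: app
      image: example/app
      volumeMounts:
        - name: shared
          mountPath: /home/user
```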
This approach was necessary due to lack of support for persistent volume claims on OpenStack. For commercial cloud providers, we'll need to rethink this approach. We could use a single volume claim (one giant shared disk), a volume claim per user, or a volume claim per application. Each approach has benefits and weaknesses; for example, with a cloud provider you don't want a giant provisioned disk sitting mostly unused. The per-user approach may be better.
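For illustration, a per-user claim against a cloud provider's dynamic provisioner might look like this (claim name, storage class, and size are all assumptions for the sketch):

```yaml
# Hypothetical per-user PersistentVolumeClaim; the storage class name
# and requested size would be provider- and policy-specific.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: workbench-home-jdoe
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```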
Other storage includes mounted volumes for /var/lib/docker and /var/lib/kubelet.
Labs Workbench provides a thin REST interface over Kubernetes. Basic operations include: authentication, account management (register, approve, deny, delete), service management (add/update/remove), application instance management (add/update/remove/start/stop/logs), console access. The primary purpose of the REST API is to support the Angular Web UI. The API depends on Kubernetes API, etcd, Gluster for shared volume support, and SMTP support.
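Illustratively, that REST surface covers operations along these lines (the paths below are invented for this sketch, not the actual API routes):

```
POST   /api/authenticate           # obtain a session token
POST   /api/register               # request a new account
PUT    /api/accounts/{id}/approve  # admin approves or denies an account
GET    /api/services               # list catalog services
POST   /api/services               # add or update a service spec
POST   /api/instances              # add an application instance
PUT    /api/instances/{id}/start   # start or stop an instance
GET    /api/instances/{id}/logs    # retrieve instance logs
GET    /api/console?id={id}        # interactive console access
```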
The Web UI is an AngularJS application that interfaces with the REST API.
Labs Workbench provides the ability to support custom application catalogs via GitHub. Eventually, it may be nice to provide a more user-friendly method for adding and removing services.
Labs Workbench relies on the Kubernetes contrib Nginx ingress controller (reverse proxy) to provide access to running services, including authentication. We've made only minor modifications to some of the configuration options.
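For example, access to a running user service might be expressed as an Ingress with an external-auth annotation in the style the contrib controller supports (hostnames, service names, and the auth endpoint below are placeholders, not our actual configuration):

```yaml
# Hypothetical Ingress for one user application; the auth-url annotation
# delegates authentication to an assumed Workbench auth endpoint.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: user-app
  annotations:
    ingress.kubernetes.io/auth-url: "https://workbench.example.org/auth"
spec:
  rules:
    - host: app.workbench.example.org
      http:
        paths:
          - path: /
            backend:
              serviceName: user-app
              servicePort: 80
```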
We know that GCE uses a version of the Nginx controller, but it's unclear whether it's the same as the version we use.
A backup container is provided to backup Gluster volumes, etcd, and Kubernetes configs. This is tightly coupled to the Workbench architecture. The backup server is hosted at SDSC. We should be able to generalize this solution, if needed.
A Nagios NRPE image is provided to support monitoring instances with some Kubernetes support. We also use the contrib addons (Grafana, etc), deployed as standard services.
Commercial cloud providers provide their own monitoring tools, e.g., GCE Monitoring.
The Labs Workbench system deployed via deploy-tools includes a local Docker cache to minimize network traffic for image pulls.
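One common way to implement such a cache (assumed here; deploy-tools may do it differently) is a registry pull-through mirror, with each node's Docker daemon pointed at it via /etc/docker/daemon.json (the mirror hostname is a placeholder):

```json
{
  "registry-mirrors": ["http://registry-cache.workbench.local:5000"]
}
```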
The Angular Web UI includes a facility for executing automated Selenium smoke tests.