Prototype status:
- Working nginx LB with Kubernetes ingress controller integration
- The LB runs under Kubernetes as a system service
- Instructions/test harnesses in
- The LB is unopinionated: it works at the system level with any Kubernetes service, as long as the service conforms to the standard Kubernetes network model. The requirements below are specific to NDSLabs test-drive/workbench, but the LB is general-purpose and supports test-drive/workbench as long as they are standard Kubernetes services (assumed to be true).
- Tested with vhost and path routing (basic testing only, not thorough)
- The Ingress interface is based on the Kubernetes 1.2.0-alpha release and needs updating
- Vhost/path routing verified
Tasks required for Production Deployments:
- Test the LB prototype with the test-drive interfaces/specs (path-based for Odum) (NDS-239)
- Update Go dependencies and the ingress API to the current production release of Kubernetes. Currently based on 1.2.0-alpha; as of 2016-05-02 the current release is 1.2.3. We should evaluate the diff between 1.2.3 and 1.3.0-alpha and pick appropriately for the future. (NDS-240)
- Update the load balancer build: go build produces a static binary. The build should produce an image from Alpine with net-tools and the single static binary.
Info on golang:onbuild is here: https://hub.docker.com/_/golang/ (NDS-241)
- Addressing startup
- Label the LB node so the LB pod deploys there, and add anti-affinity to the label/scheduler/system to avoid scheduling other pods on the LB node; i.e., the ingress LB should always be the only thing running on the LB node (NDS-242). A minimal sketch follows this list.
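Below is a minimal sketch of the node labeling and pod pinning described above. The label key/value, controller name, and image are illustrative placeholders, not the project's actual manifests; keeping other pods off the node (the anti-affinity piece) would be handled separately, e.g., via scheduler policy or taints, depending on the Kubernetes version.

```yaml
# Label the dedicated LB node first (illustrative label):
#   kubectl label node <lb-node-name> role=loadbalancer
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx-ingress-lb
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx-ingress-lb
    spec:
      # Pin the ingress LB pod to the labeled node.
      nodeSelector:
        role: loadbalancer
      containers:
      - name: nginx-ingress-lb
        image: nginx-ingress-controller   # placeholder; use the contrib controller image
        ports:
        - containerPort: 80
        - containerPort: 443
```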
Background
The NDS Labs "Workbench" service provides NDSC stakeholders with the ability to quickly launch and explore a variety of data management tools. Users select from a list of available services to configure and deploy instances of them. Workbench "services" are composed of a set of integrated Docker containers all deployed on a Kubernetes cluster.
In the following screenshot, the user has started a single instance of Dataverse, which includes containers running Glassfish, PostgreSQL, Solr, Rserve, and TwoRavens (Apache + R). The user is attached to a Kubernetes namespace and can start instances of multiple different services:
Currently, remote access to running services is implemented using the Kubernetes "NodePort" mechanism. In essence, a given service (e.g., a webserver) is mapped to a cluster-wide port in some configured range (default 30000-32767). Remote users access the running service on the specified port. In the above screenshot, the Dataverse web interface is accessible via http://141.142.210.130:30233. This solution has worked for development purposes but is 1) not scalable and 2) difficult to secure. We are exploring options to provide scalable and secure access to running services in the NDS Labs workbench for tens to hundreds of users working with multiple instances of services, i.e., hundreds of service endpoints.
Requirements
Use case: The workbench user (project administrator) configures a service via the workbench. Once configured, external endpoints are accessible via TLS/SSL.
- Ability for the user to securely access NDS Labs workbench services, which include web-based HTTP and TCP services.
- Service endpoints are secured using TLS
- Special handling for NDS Labs workbench API server and GUI requests, including CORS support
- Resilient to failure
Options:
Option | Description | Pro | Con |
---|---|---|---|
Path based | Services accessed via URL + path, e.g., labs.nds.org/namespace/dataverse | Single DNS entry for labs.nds.org; single SSL certificate; simple | Only supports HTTP-based services; requires that every deployed service support a context path, or the load balancer must rewrite requests |
Port based | Services accessed via URL + port, e.g., labs.nds.org:33333 | Single DNS entry for labs.nds.org; single SSL certificate; simple | Requires use of non-standard ports; possible port collisions if services are stopped and started across projects (i.e., I stop my stack, the port is freed, you start your stack and are assigned my port, and my users now access your service); only scales to the number of ports |
CNAME | Services accessed via CNAME URL + path or port, e.g., project.labs.nds.org/dataverse or project.labs.nds.org:port | One DNS entry, IP address, and certificate per project (or possibly a wildcard cert); supports both HTTP and TCP services; port collisions are confined to a project | Requires an IP address per project; requires a DNS/CNAME request to neteng |
Requirements
- When a new project is created, if the admin anticipates needing remote access to non-HTTP services, a static IP address and CNAME are assigned to the project.
- The load balancer routes requests to services configured in Kubernetes. This means that the LB must be Namespace and service aware – which means monitoring Etcd or the Kubernetes API for changes.
- When a new HTTP service is added, the load balancer config is updated to proxy via path (see the Ingress sketch after this list):
- If no CNAME, paths are in the form labs.nds.org/namespace/serviceId
- If a CNAME exists, paths are in the form namespace.labs.nds.org/serviceId
- When a new TCP service is added, the load balancer config is updated to proxy via port, but only if the project has a CNAME/IP:
- namespace.labs.nds.org:port
- For the GUI and API, paths are labs.nds.org/ and labs.nds.org/api, respectively
- The load balancer must be resilient – if restarted, the previous configuration is maintained, possibly via a failover configuration
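The path forms above map naturally onto Kubernetes Ingress rules. The following is a minimal sketch, assuming the 1.2-era extensions/v1beta1 Ingress API; the namespace (demo), service name (dataverse), and port (8080) are illustrative only.

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: dataverse-http          # illustrative name
  namespace: demo               # illustrative project namespace
spec:
  rules:
  # No CNAME: path-based routing under the shared host.
  - host: labs.nds.org
    http:
      paths:
      - path: /demo/dataverse   # labs.nds.org/namespace/serviceId
        backend:
          serviceName: dataverse
          servicePort: 8080
  # With a per-project CNAME: project vhost, path per service.
  - host: demo.labs.nds.org
    http:
      paths:
      - path: /dataverse        # namespace.labs.nds.org/serviceId
        backend:
          serviceName: dataverse
          servicePort: 8080
```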
Preliminary Design
Based on the prototype, we will move forward with the Kubernetes ingress-based nginx load balancer model. The current version from the Kubernetes contrib repo works in preliminary tests.
- Load balancer node: A VM node will serve as the dedicated load-balancer node and run the Nginx LB replication controller using node labels
- Nginx ingress controller: The nginx ingress controller is deployed as a replication controller
- DNS:
- "A" record points to load balancer node (e.g., test.ndslabs.org A 141.142.210.172)
- Per-project wildcard CNAME (e.g., "*.demo.ndslabs.org." CNAME test.ndslabs.org)
- Per-service Ingress resource:
- For each exposed service endpoint, an ingress rule will be created (a sketch manifest is shown at the end of this section)
- host: <service>.<namespace>.ndslabs.org
- path: "/"
- backend:
- serviceName: <service name>
- servicePort: <service port>
- These resources will be created/updated/deleted with the associated service
- The <service> value in the host will be the stack service ID (e.g., srz4wj-clowder)
- GUI/CLI: Instead of NodePort URLs, change to use the LB URL
- TLS: Add TLS termination support
- TCP support:
- The nginx controller supports access to TCP services using the ConfigMap resource. The ConfigMap is simply a map of keys/values containing the exposed port and the namespace/service:port. We will need to update the ConfigMap when services are added and removed, and we will also need to handle assignment of ports. Unfortunately, the port assignments appear to be system-wide; it would be nice if we could assign ports within a host (i.e., in the Ingress rules), but this isn't possible today. A sketch of the ConfigMap is shown at the end of this section.
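The per-service Ingress resource described above might look like the following minimal sketch, again assuming the extensions/v1beta1 Ingress API; the project namespace (demo) and service port are illustrative, and the stack service ID follows the srz4wj-clowder example.

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: srz4wj-clowder          # created/updated/deleted with the service
  namespace: demo               # illustrative project namespace
spec:
  rules:
  - host: srz4wj-clowder.demo.ndslabs.org   # <service>.<namespace>.ndslabs.org
    http:
      paths:
      - path: /
        backend:
          serviceName: srz4wj-clowder
          servicePort: 80       # illustrative service port
```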
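For TCP services, a minimal ConfigMap sketch is shown below, assuming the contrib nginx ingress controller is pointed at this map via its --tcp-services-configmap flag; the map name, namespace, port, and backing service are illustrative.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services            # illustrative name
  namespace: default
data:
  # exposed port -> namespace/service:port (port assignments are cluster-wide)
  "5432": "demo/postgres:5432"
```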