Note: Also see architecture/design notes within the source code under docs
Concepts and Terminology
- Infrastructure: The compute and storage resources in a cloud or infrastructure service (AWS, etc.) that an NDS Labs cluster runs on. The NDS Labs reference infrastructure is OpenStack.
- Site: An administrative organization that provides resources to and operates one or more NDS Labs clusters.
- Cluster/NDS Labs cluster: The NDS Labs software platform that runs on the infrastructure.
- Project/Namespace: An isolated, named environment within the cluster that contains a set of services that are managed and operated independently of other projects. Projects typically implement the equivalent of a "website".
- Administrator: An authenticated person who manages and operates a part of the system.
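The containment relationships among these terms can be sketched as a small data model. This is only an illustration: the class names, fields, and the `add_project` helper are hypothetical and are not taken from the NDS Labs code base.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Project:
    """An isolated, named environment (namespace) inside a cluster."""
    name: str
    services: List[str] = field(default_factory=list)


@dataclass
class Cluster:
    """The NDS Labs platform running on one infrastructure."""
    name: str
    infrastructure: str  # e.g. "OpenStack"
    projects: Dict[str, Project] = field(default_factory=dict)

    def add_project(self, name: str) -> Project:
        # Project names act as namespaces, so they must be unique per cluster.
        if name in self.projects:
            raise ValueError(f"project {name!r} already exists in {self.name!r}")
        project = Project(name)
        self.projects[name] = project
        return project


@dataclass
class Site:
    """An organization that operates one or more clusters."""
    name: str
    clusters: List[Cluster] = field(default_factory=list)


# Hypothetical example: one site, one cluster, one project with one service.
site = Site("example-site")
cluster = Cluster("cluster-1", infrastructure="OpenStack")
site.clusters.append(cluster)
cluster.add_project("demo").services.append("web")
```

Keying projects by name mirrors the "Project/Namespace" definition: a second project with the same name in the same cluster is rejected.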
NDS Labs System - Roles and Responsibilities
- Infrastructure Administrator:
  - Provisions infrastructure to run an NDS Labs cluster
    - On OpenStack, AWS, GCE, Rackspace, MaaS, ...
  - Deploys the NDS Labs base cluster software
  - Registers resources from infrastructure with NDS Labs cluster resource pool
  - Provides API and credentials to Cluster Administrator
- Cluster Administrator:
  - Manages and operates the NDS Labs cluster infrastructure
  - Manages Projects in the cluster
  - Provisions Projects on the cluster
  - Manages resource assignments from the cluster pool to project pools
  - Provides API and credentials per-project to Project Administrators
- Project Administrator:
  - Provisions and deploys services in a project using resources granted to the project pool by the cluster administrator.
  - Manages, monitors, and administers services within independent projects.
- User/Project User: A client/user of the services within a project.
- Tool/Service Provider: An NDSC partner that provides a tool or service in a set of containers that include NDS Labs service descriptors to enable the service to be integrated in an NDS Labs cluster.
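The resource flow in these roles (infrastructure resources registered into a cluster pool, then assigned from the cluster pool to project pools) can be sketched as follows. The `ResourcePool` class, its method names, and the resource kinds are assumptions made for illustration only; they do not describe an actual NDS Labs API.

```python
from typing import Dict


class ResourcePool:
    """Tracks named resource quantities, e.g. {"cpu": 8, "volume_gb": 500}."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.available: Dict[str, int] = {}

    def register(self, kind: str, amount: int) -> None:
        """Add resources to this pool (the infrastructure admin step)."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.available[kind] = self.available.get(kind, 0) + amount

    def grant(self, target: "ResourcePool", kind: str, amount: int) -> None:
        """Move resources from this pool to `target` (the cluster admin step)."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.available.get(kind, 0) < amount:
            raise ValueError(f"{self.name}: not enough {kind} to grant {amount}")
        self.available[kind] -= amount
        target.register(kind, amount)


# Hypothetical walk-through of the two administrative steps.
cluster_pool = ResourcePool("cluster")
cluster_pool.register("cpu", 24)          # infrastructure admin registers resources
project_pool = ResourcePool("project-demo")
cluster_pool.grant(project_pool, "cpu", 8)  # cluster admin assigns to a project
```

A project administrator would then deploy services only against what is in the project's pool, which is why `grant` refuses to hand out more than the cluster pool holds.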
NDS Labs Architecture
NDS Labs extends the Kubernetes base system with NDS-specific services and REST APIs that support NDS Labs cluster services, project services, and inter-cluster services. NDS Labs services are implemented as cluster-specific Kubernetes pods and as sidekick containers deployed alongside service pods, both in the cluster and in individual projects. The sidekick containers "extend" cluster-specific and project-specific pods with integration to services such as monitoring, volume management, etc.
- NDS Labs Cluster Services:
  - API Manager: Manages cluster-wide API naming and public API exposure from the cluster public IP firewall/load-balancing system.
  - Catalog, Configuration, and Deployment (CCD) Service: Automatically updated catalog of NDS Labs services available for deployment in the cluster (for cluster admins), and for projects (for project admins). The service catalog manager is configured with NDS Labs-specific container repositories, and periodically pulls service descriptions from the containers.
  - Cluster Admin Project Administration (CADM): Provides the cluster administrator with project provisioning, including project admin credentials. Provides management of infrastructure resources for projects, including volumes and differentiated compute resources.
  - Cluster Administrator Monitoring (CMON) Tools/Service: Provides services for cluster administrators to monitor cluster operations, including logging, performance analysis, and resource utilization. Monitor services include ELK, Prometheus, etc., in addition to Kubernetes-provided tools like cAdvisor.
- NDS Labs Project Services:
  - Project Manager Administration (PADM): Allows the project manager to deploy, monitor, and manage application services within their project.
  - Per-project Monitoring (PMON): Provides project-specific monitoring of project resources, utilization, and performance, as well as application/service-specific monitoring and logging.
- Inter-Cluster/Integrated-Cluster Services (ICS): Provide NDS Labs web services across multiple distributed clusters to implement global NDS Labs services such as global resource search and distributed data access, and give distributed application developers services for implementing service discovery and distributed API access within their own services.
  - Distributed search: Locates named data and services in the NDS Labs global system.
  - Resource discovery: Locates attribute-specified resources in the NDS Labs global system, for example specifically sized data-storage resources, or specific compute resources such as HPC or accelerator-enabled compute resources.
  - Advanced Data Management: Allows composing cross-cluster data management applications.
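The CCD behavior described above, in which the catalog manager pulls service descriptions from its configured repositories into a local catalog, might look roughly like the sketch below. The description format, the repository names, and the `fetch` callback are all hypothetical; the real service-descriptor format is not specified on this page.

```python
from typing import Callable, Dict, Iterable, List

ServiceDescription = Dict[str, str]
Fetcher = Callable[[str], Iterable[ServiceDescription]]


def refresh_catalog(
    local: Dict[str, ServiceDescription],
    repositories: List[str],
    fetch: Fetcher,
) -> Dict[str, ServiceDescription]:
    """Merge service descriptions from every configured repository into `local`."""
    for repo in repositories:
        for description in fetch(repo):
            # On a name clash, a later repository overrides an earlier one.
            local[description["name"]] = description
    return local


def fake_fetch(repo: str) -> List[ServiceDescription]:
    """Stand-in fetcher for this example only; no network access is involved."""
    samples = {
        "repo-a": [{"name": "search-ui", "image": "repo-a/search-ui"}],
        "repo-b": [{"name": "metrics", "image": "repo-b/metrics"}],
    }
    return samples.get(repo, [])


catalog = refresh_catalog({}, ["repo-a", "repo-b"], fake_fetch)
```

Passing the fetcher in as a parameter keeps the merge logic testable without a registry; a periodic job would simply call `refresh_catalog` on a timer with a real fetcher.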
High Level Global Architecture
The NDS Labs system comprises NDS Labs cloud services that run on clusters at various sites. Each cluster provides resources to one or more projects, each made up of a related set of cloud-based services targeted to a specific community or application. In a service-oriented model, the site is equivalent to an IaaS provider, the cluster is equivalent to a PaaS provider, and a project is equivalent to a uniquely configured and deployed platform on the PaaS system. Specific NDS Labs services are implemented within each layer to assist with convenient deployment and operation of the PaaS and platform layers. Global distributed data services, such as search across all NDS Labs sites and projects, will be provided by the Inter-Cluster Services (ICS), which will supply infrastructure building blocks for implementing wide-area cross-cluster services.
Single Cluster Architecture Diagrams
Layer 0 - Single Cluster Infrastructure
The NDS Labs reference infrastructure is OpenStack. A cluster begins with a set of 6 OpenStack VMs. The cluster admin can add compute nodes as needed based on dynamic demand.
Layer 1 - Kubernetes Container and Service Orchestration Layer
The initial 6-VM system is provisioned as a CoreOS cluster, with 3 VMs serving as etcd masters (not shown for simplicity) and 3 serving as the initial Kubernetes infrastructure, as shown in the architecture diagram. Additional compute resources can be added to the Kubernetes cluster as demand requires.
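The node split described above can also be written down as a small planning function. This is a sketch under stated assumptions: the VM names and both function names are hypothetical, and the code only models role assignment, not actual CoreOS or Kubernetes provisioning.

```python
from typing import Dict, List


def plan_initial_cluster(vms: List[str]) -> Dict[str, List[str]]:
    """Split the initial VMs into 3 etcd masters and 3 Kubernetes nodes."""
    if len(vms) != 6:
        raise ValueError("the initial cluster is described as exactly 6 VMs")
    return {"etcd": list(vms[:3]), "kubernetes": list(vms[3:])}


def add_compute_nodes(
    plan: Dict[str, List[str]], new_vms: List[str]
) -> Dict[str, List[str]]:
    """Return a new plan with extra compute nodes in the Kubernetes group."""
    return {
        "etcd": list(plan["etcd"]),
        "kubernetes": plan["kubernetes"] + list(new_vms),
    }


# Hypothetical VM names vm-1 .. vm-6, later grown by two compute nodes.
plan = plan_initial_cluster([f"vm-{i}" for i in range(1, 7)])
grown = add_compute_nodes(plan, ["vm-7", "vm-8"])
```

Only the Kubernetes group grows with demand; the etcd group stays at its initial size in this sketch.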
Layer 2 - Single-Cluster Detail - NDS Labs Services and APIs
The NDS Labs architecture layers services on top of Kubernetes to manage and monitor the cluster, provision and manage resources for projects in the cluster, and give project managers the ability to manage software service stacks within their projects. These services leverage the facilities of the underlying Kubernetes cluster orchestration system and use etcd to manage the services' configuration and state information.
API/Service Catalog
Service | APIs | UsedBy |
Service/Component/Role Matrix with Descriptions
Stage of Development Color Key: | Completed | In Development | In Design | Future |
| Service | Component | Planning Notes | Role/Use: Infrastructure Admin | Role/Use: Cluster Admin | Role/Use: Project Admin | Role/Use: Tool Developer | Role/Use: System Service |
|---|---|---|---|---|---|---|---|
| DEVENV: Developers Environment and tooling | Kubernetes Devenv Host-node network IPaddrs | NDSC - Planned for managed small-scale release to handful of early adopters | NA | NA | Test project deploy | Test tools | NA |
| | Kubernetes Devenv w/External firewall IPaddrs | Needs tests/design etcd/confd/nginx | NA | NA | Test project with proper public interface | Develop to proper external interface | NA |
| | Container build support Makefiles | Needs: docs, instructions, catalog yml support, publish process integration | NA | NA | NA | NA | |
| OpenStack | Production Cluster Deploy | Infrastructure provision done. Needs production config: TLS, security, data persistence | Deploy Cluster Infrastructure | NA | NA | NA | NA |
| | Volume Interface service | Needed for OpenStack deploy | Provide vol resources | Allocate vol resources to projects | Implicit use of auto-named vols | NA | Register/track resources |
| CCD | CATADM | NDSC demo component | NA | Admin Catalogs - register catalog URLs | NA | Publish service. Needs service format | |
| | CATSVC: Update local service catalog from configured catalogs | NDSC demo component | NA | NA | NA | NA | Pull catalogs, maintain in etcd |
| | Project Deploy CLI: Deploy service stacks in project | NDSC demo | NA | NA | Deploy named service stacks in project | NA | Uses Kubernetes API |
| | CCDSRV Project Deploy GUI/Server: Web deploy tool on CLI | NDSC demo | NA | NA | Web configurator and deploy | Use to test newly developed tools | NA |
| | PMON: Project Service Monitor | NDSC demo | NA | NA | CCD GUI | NA | NA |
| CADM: Cluster Administrator/Ops | CMON - Cluster Monitor | NDSC demo component | NA | Monitor cluster health and performance | NA | NA | NA |
| ICS: Inter-cluster services | Search | Search across all NDSL clusters. Needs research, requirements, plan | NA | Registration | Register data resources | Relevant for developing search interfaces/tools | External interface to cluster. Distributed global service |
| | Registration: Cluster registration/federation | Needs development | NA | Global registration | Project resource registration | ?? | Local and global distributed service |
...