AWS offers many different ways to run containers, so we would like to determine the best way to run Clowder in containers on AWS.


Scenario A: Docker Compose via EC2 - Identical to what we do on OpenStack

Overview

The most familiar option would be to use AWS EC2 to spin up one or more raw virtual machines, precisely as we do on OpenStack today.

We would just need to install Docker on the host, and use Clowder's docker-compose.yml to bring up the core pieces of the Clowder stack.
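As a sketch, the bring-up on a fresh EC2 instance would look roughly like the following (Amazon Linux 2 package names, the pinned Compose version, and the "clowder" service name are assumptions for illustration):

```shell
# Sketch only: provision Docker + Docker Compose on a fresh EC2 host
# (Amazon Linux 2 assumed; the pinned Compose version is illustrative).
sudo yum install -y docker
sudo systemctl enable --now docker

sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" \
     -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose

# Bring up the core Clowder stack from the repo's docker-compose.yml,
# detached so the services keep running after logout.
docker-compose up -d

# Check service status and tail logs for the main service.
docker-compose ps
docker-compose logs -f clowder
```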

After deployment we would maintain the individual EC2 VMs that run the stack, while Docker Compose runs the containers and restarts them if they fail.

The compose file can be edited to the user's liking to provide the desired set of configuration options / extractors needed.

The main downside here would likely be long-term cost - while we wouldn't need to learn anything new to get started, there may be more AWS-native ways to save money on long-running infrastructure.

Pros

Cons

Example

https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/clowder/browse/docker-compose.yml

Pricing

Open Questions

Scenario B: Kubernetes via EC2 (kops) - Classic Kubernetes with an AWS twist

Overview

kops is a community-maintained tool (not an AWS product) that can deploy several EC2 instances and install the services necessary to run them as a Kubernetes cluster.

We would just need to spin up an EC2 instance that can house our deployment/configuration data, then use kops from that machine to provision master/worker VMs and bring up a cluster.
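A rough sketch of that provisioning step (the state-store bucket, cluster name, zone, and instance sizes are all placeholders, not values we have settled on):

```shell
# kops keeps cluster state in an S3 bucket (bucket name is a placeholder).
export KOPS_STATE_STORE=s3://clowder-kops-state

# Define a small cluster: one master, two workers (sizes illustrative).
kops create cluster \
    --name=clowder.k8s.local \
    --zones=us-east-2a \
    --master-size=t2.medium \
    --node-count=2 \
    --node-size=t2.medium

# Apply the configuration and actually create the EC2 instances.
kops update cluster --name=clowder.k8s.local --yes

# Verify the worker nodes registered with the cluster.
kubectl get nodes
```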

The compose file (mentioned in Scenario A above) can be edited to the user's liking to provide the desired set of configuration options / extractors, though it would need to be translated into Kubernetes manifests before deployment.

After deployment we would maintain the individual EC2 VMs that run the cluster components, while Kubernetes manages the containers and keeps them all healthy and running.

We also retain control over the master node(s) in this case and are free to modify them as we wish, but we would then be footing the bill for the master node(s) and would need to maintain/monitor them to keep them running.

Autoscaling components can be deployed to keep the cluster size to a minimum when there is less load, and to automatically increase the size of the cluster by spinning up new nodes once a bottleneck is hit.
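With kops, one way to set this up (a sketch, not something we have tested) is to give the worker InstanceGroup min/max bounds and run the Kubernetes cluster-autoscaler inside the cluster:

```shell
# Open the worker InstanceGroup spec for editing and set, e.g.:
#   minSize: 1
#   maxSize: 5
kops edit ig nodes --name=clowder.k8s.local

# Push the new bounds out to AWS.
kops update cluster --name=clowder.k8s.local --yes

# Then deploy the cluster-autoscaler (manifest from the upstream
# kubernetes/autoscaler project) so Kubernetes can resize the node
# group within those bounds.
kubectl apply -f cluster-autoscaler-autodiscover.yaml
```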

Pros

Cons

Example

KnowEnG Kubernetes Deployment

Pricing

Open Questions

Scenario C: Kubernetes via EKS - Kubernetes where AWS manages your master nodes?

Overview

AWS has a newer service called EKS (Elastic Kubernetes Service) that can be used to deploy several EC2 instances as a Kubernetes cluster.

We would just need to use the AWS CLI to create a new EKS cluster - unlike kops, this provisions only the managed control plane; we would then attach worker EC2 instances to the cluster ourselves.
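A sketch of that CLI call (the IAM role ARN, subnet IDs, and security group are placeholders):

```shell
# Create the managed control plane (this does not create worker nodes).
aws eks create-cluster \
    --name clowder \
    --role-arn arn:aws:iam::123456789012:role/eksServiceRole \
    --resources-vpc-config subnetIds=subnet-aaaa,subnet-bbbb,securityGroupIds=sg-cccc

# Poll until the control plane reports ACTIVE.
aws eks describe-cluster --name clowder --query cluster.status

# Point kubectl at the new cluster.
aws eks update-kubeconfig --name clowder
```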

The compose file (mentioned in Scenario A above) can be edited to the user's liking to provide the desired set of configuration options / extractors, though it would need to be translated into Kubernetes manifests before deployment.

After deployment AWS manages the control plane and we would maintain the individual worker EC2 VMs, while Kubernetes manages the containers and keeps them all healthy and running.

We lose control over the master node(s) in this scenario (versus Scenario B, above), but this may save money in the long term.

Autoscaling components can be deployed to keep the cluster size to a minimum when there is less load, and to automatically increase the size of the cluster by spinning up new nodes once a bottleneck is hit.

Pros

Cons

Example

https://github.com/aws-samples/aws-workshop-for-kubernetes

Pricing

Open Questions

Scenario D: ECS via Fargate - Containers, minus the Infrastructure?

Overview

AWS has an even newer service called Fargate that can be used to deploy containers directly to the cloud without worrying about the machines they will run on.

In this scenario, we pass Fargate the docker-compose.yml for Clowder (or whichever other services we want it to run).

Fargate then figures out where to run the containers and charges us for the memory/CPU they consume while running.
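A sketch using the ECS CLI (cluster name and region are placeholders; note the ECS CLI only understands a subset of compose syntax, so Clowder's file may need trimming):

```shell
# Point the ECS CLI at a Fargate-backed cluster (names illustrative).
ecs-cli configure --cluster clowder --region us-east-2 --default-launch-type FARGATE
ecs-cli up

# Launch the compose file as a long-running service on Fargate.
ecs-cli compose --project-name clowder --file docker-compose.yml service up

# List the running containers and their endpoints.
ecs-cli ps
```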

While we could likely test out this scenario immediately with few or no modifications to Clowder's docker-compose file, it may end up being more expensive in the long term.

Pros

Cons

Example

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-cli-tutorial-fargate.html

Pricing

NOTE: Some users report that Fargate is better suited to batch operations than to services that are expected to run 24/7.

Open Questions

Scenario E...? ECS via EC2 - Containers, minus the familiar parts of the Infrastructure?

Overview

Before Fargate, the container option offered by ECS was centered around EC2 and custom taskDefinitions.

You first need to spin up an ECS cluster (using the UI or CLI). Your cluster size determines how many tasks you can run concurrently.

You can then define your tasks using a special AWS-flavored JSON/YAML syntax, then submit them to ECS to run in containers.
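A minimal task definition sketch (the image name, port, and CPU/memory sizes are assumptions for illustration, not values verified against the actual Clowder image):

```json
{
  "family": "clowder",
  "containerDefinitions": [
    {
      "name": "clowder",
      "image": "clowder/clowder:latest",
      "cpu": 512,
      "memory": 1024,
      "essential": true,
      "portMappings": [{ "containerPort": 9000, "hostPort": 9000 }]
    }
  ]
}
```

which would then be registered and launched with something like:

```shell
aws ecs register-task-definition --cli-input-json file://clowder-task.json
aws ecs run-task --cluster clowder --task-definition clowder
```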

This scenario appears to be the most work, with fewer (or less obvious) methods for cutting costs.

Pros

Cons

Example

https://aws.amazon.com/getting-started/tutorials/deploy-docker-containers/

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ECS_GetStarted_EC2.html

Pricing

Unclear, but I suspect that EC2 instance pricing applies - with the EC2 launch type, ECS itself adds no extra charge and you pay for the underlying instances.

Open Questions