There are many options for running containers on AWS, so we would like to determine the best way to run Clowder in containers on AWS.
Raw EC2 - Docker Compose on raw EC2 instances
Example: https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/clowder/browse/docker-compose.yml
Pricing:
- Hourly pricing depends on chosen instance sizes
- I think that "On Demand" is the most basic/common setup
- See EC2 On Demand instance pricing: https://aws.amazon.com/ec2/pricing/on-demand/
- Autoscaling is available and works fairly well given reasonable constraints
- See EC2 Pricing: https://aws.amazon.com/ec2/pricing/
Pros:
- No additional learning required
Cons:
- Price of this option could prove to be higher than others
Open Questions:
- Prices? What do we know about our expected workload?
- We could do some CPU/memory/IO profiling on existing systems
- Are there better options out there that could bring down price without too much additional architecture?
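To start on the pricing question above, a quick arithmetic sketch of what a single always-on EC2 instance would cost. The hourly rates below are illustrative assumptions (roughly us-east-1 On-Demand), not quotes; check the linked pricing page for current numbers.

```python
# Rough monthly-cost sketch for hosting the docker-compose stack on a single
# On-Demand EC2 instance. Rates are illustrative assumptions, NOT quotes.

HOURS_PER_MONTH = 730  # AWS's conventional monthly average


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Approximate monthly cost for one instance at the given hourly rate."""
    return hourly_rate * hours


# Assumed On-Demand hourly rates (illustrative only)
assumed_rates = {
    "t3.large   (2 vCPU,  8 GB)": 0.0832,
    "m5.xlarge  (4 vCPU, 16 GB)": 0.192,
    "m5.2xlarge (8 vCPU, 32 GB)": 0.384,
}

for instance, rate in assumed_rates.items():
    print(f"{instance}: ~${monthly_cost(rate):,.2f}/month")
```

The CPU/memory/IO profiling mentioned above would tell us which row of this table (if any) actually fits our workload.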
Kubernetes on EC2 (kops) - Classic Kubernetes with an AWS twist
Pricing:
- Hourly pricing depends on chosen instance sizes
- I think that "On Demand" is the most basic/common setup
- See EC2 On Demand instance pricing: https://aws.amazon.com/ec2/pricing/on-demand/
- Autoscaling is available and works fairly well given reasonable constraints
- See EC2 Pricing: https://aws.amazon.com/ec2/pricing/
Pros:
- Maximum flexibility / control over your cluster
Cons:
- Most complex to set up (but I've done it before)
Open Questions:
- Prices? What do we know about our expected workload?
- We could do some CPU/memory/IO profiling on existing systems
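One pricing wrinkle specific to kops: we run the Kubernetes control plane ourselves, so master nodes add to the bill on top of the workers. A quick sketch; the hourly rate is an illustrative assumption (roughly an m5.large-class On-Demand instance), not a quote.

```python
# With kops, master nodes are EC2 instances we pay for ourselves,
# on top of the worker nodes. Rates are illustrative assumptions.

HOURS_PER_MONTH = 730


def monthly_cost(hourly_rate, count=1, hours=HOURS_PER_MONTH):
    """Approximate monthly On-Demand cost for `count` instances."""
    return hourly_rate * count * hours


ASSUMED_RATE = 0.096  # assumed $/hour for an m5.large-class instance

masters = monthly_cost(ASSUMED_RATE, count=1)  # control plane we must run
workers = monthly_cost(ASSUMED_RATE, count=3)  # worker nodes for Clowder
print(f"masters: ~${masters:.2f}/month")
print(f"workers: ~${workers:.2f}/month")
print(f"total:   ~${masters + workers:.2f}/month")
```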
EKS - Managed Kubernetes control plane from AWS
Example: https://github.com/aws-samples/aws-workshop-for-kubernetes
Pricing:
- "You pay $0.20 per hour for each Amazon EKS cluster that you create."
- "You can use a single Amazon EKS cluster to run multiple applications by taking advantage of Kubernetes namespaces and IAM security policies."
- "You pay for AWS resources (e.g. EC2 instances or EBS volumes) you create to run your Kubernetes worker nodes."
- "You only pay for what you use, as you use it; there are no minimum fees and no upfront commitments."
- See EKS Pricing: https://aws.amazon.com/eks/pricing/
Pros:
- Less infrastructure to manage
Cons:
- Assumption: we may have less control over what gets deployed (kube-apiserver flags, K8S versions, etc)
Open Questions:
- Prices? What do we know about our expected workload?
- We could do some CPU/memory/IO profiling on existing systems
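The quoted $0.20 per cluster-hour translates into a fixed monthly control-plane overhead, paid whether or not anything is running. A quick sketch; the worker rate below is an illustrative assumption, not a quote.

```python
# EKS billing = fixed control-plane fee (quoted above) + worker EC2 instances.
# Worker rate is an illustrative assumption, NOT a quote.

HOURS_PER_MONTH = 730
EKS_FEE_PER_HOUR = 0.20       # quoted EKS control-plane fee, per cluster
ASSUMED_WORKER_RATE = 0.096   # assumed $/hour per m5.large-class worker

control_plane = EKS_FEE_PER_HOUR * HOURS_PER_MONTH
workers = ASSUMED_WORKER_RATE * 3 * HOURS_PER_MONTH  # e.g. three workers

print(f"EKS control plane: ~${control_plane:.2f}/month")
print(f"3 worker nodes:    ~${workers:.2f}/month")
```

Compared to the kops option, we trade roughly this fixed fee for not having to run and patch master nodes ourselves.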
ECS via Fargate - effectively docker-compose on invisible EC2?
Example: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-cli-tutorial-fargate.html
Pricing:
- "The price per vCPU is $0.00001406 per second ($0.0506 per hour) and per GB memory is $0.00000353 per second ($0.0127 per hour)."
- "You pay only for what you use."
- See Fargate Pricing: https://aws.amazon.com/fargate/pricing/
NOTE: Some users say that Fargate is better suited for batch operations than for services that are expected to run 24/7
Pros:
- Least effort to manage - no underlying infrastructure to worry about, and you pay based on usage
- Easiest migration, as it appears to take a format similar to docker-compose
Cons:
- Assumption: Least flexible - we may not be able to run all types of workloads here, but we shouldn't need cronjobs, batch jobs, or anything like that
- Many unknowns - it's unclear what potential features we might be losing here (for example, LMA: logging/monitoring/alerting)
Open Questions:
- Prices? What do we know about our expected workload?
- We could do some CPU/memory/IO profiling on existing systems
- LMA: do we need to provide monitoring/logging/alerts ourselves? Perhaps this is something that AWS Fargate provides for its containers?
- Persistent storage: are Docker volumes safely backed by EBS, EFS, or some other persistent storage options?
- Shared storage: do we need shared storage (e.g. NFS/EFS)? Is such a thing provided by Fargate?
- Networking: How do the container networks work? Does it simply use the configured networks in the docker-compose.yaml as expected?
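The quoted per-vCPU and per-GB rates make it easy to sanity-check the "not for 24/7 services" note above with quick arithmetic. The EC2 comparison rate is an illustrative assumption, not a quote.

```python
# Sanity check: what does an always-on task cost at the quoted Fargate rates,
# versus a comparably sized EC2 instance (assumed rate, NOT a quote)?

HOURS_PER_MONTH = 730
VCPU_HOUR = 0.0506   # quoted Fargate price per vCPU-hour
GB_HOUR = 0.0127     # quoted Fargate price per GB of memory per hour


def fargate_monthly(vcpus, memory_gb):
    """Monthly cost of one task running 24/7 at the quoted rates."""
    return (vcpus * VCPU_HOUR + memory_gb * GB_HOUR) * HOURS_PER_MONTH


# A 2 vCPU / 8 GB task running all month...
print(f"Fargate 2 vCPU / 8 GB: ~${fargate_monthly(2, 8):.2f}/month")
# ...versus an m5.large-class instance (2 vCPU / 8 GB, assumed ~$0.096/hr):
print(f"EC2 equivalent:        ~${0.096 * HOURS_PER_MONTH:.2f}/month")
```

Under these assumptions an always-on Fargate task comes out around twice the price of the equivalent On-Demand instance, which is consistent with the note above that it suits bursty/batch workloads better than 24/7 services.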