This article begins a series about Amazon’s Elastic Compute Cloud (EC2) Container Service (ECS), which is Amazon’s answer to managing a cluster of Docker containers. Before we begin exploring ECS, there are several core Amazon Web Services (AWS) offerings that you need to be familiar with. This article reviews those services and lays a foundation for learning about ECS.
Elastic Compute Cloud (EC2)
In a previous article I reviewed EC2, but I think it is important to review it again, and this time in the context of ECS. At its core, EC2 provides virtual machines and an interface and API to manage them. But the motivation behind EC2 is so much more, which can be discovered by diving deeper into its name:
- Cloud Compute Instances: we should think of EC2 instances not merely as virtual machines, but as cloud engines that provide computing power. This is a subtle but important distinction: although the two are very similar, an EC2 instance is focused on executing a task, and that task just happens to be an operating system.
- Elastic: the most important part of EC2 is its elasticity. Elasticity means that an environment can grow and shrink to meet user load. As load increases you can add more instances to your cluster and as load decreases you can remove instances from your cluster. Because you only pay for what you use, you can quickly scale up to meet user demand, but, at the same time, scale back down to reduce costs.
ECS does not define a new computing platform; rather, it leverages EC2 instances that have two components installed on them:
- Docker Daemon: as we saw in our Docker series, Docker runs through a daemon that sits between a Docker container and the underlying operating system. This is that same Docker daemon that sits between your containers and EC2.
- ECS Agent: ECS provides management capabilities and an abstraction of a Docker cluster upon which it runs Docker containers (I will elaborate in the next article). For ECS to identify the EC2 instances that are available to host Docker containers, each of those instances needs to have an ECS agent installed on it.
Amazon provides Amazon Machine Images (AMIs) that have both Docker and the ECS Agent installed on them, but if you want more control, you can create your own image, install Docker on it, and install the ECS Agent yourself. Amazon has open sourced the ECS Agent and made it available on GitHub. When you leverage the ECS wizard to create an ECS environment, it will choose the preconfigured AMI for you, so you will not need to do any research to find it on your own.
ECS itself is free, but you still pay for the AWS services that you use. This means that you control your own cost by choosing the type, size, and number of EC2 instances that you want to dedicate to ECS. Obviously, the bigger the EC2 instance, the more containers that it can run, but the more it costs.
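The size-versus-cost trade-off can be sketched with a little arithmetic. The following is only an illustrative model, not an AWS API: the `InstanceType` class, the instance names, the prices, and the convention of 1024 CPU units per virtual CPU are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    cpu_units: int          # assumed convention: 1024 units per virtual CPU
    memory_mib: int
    hourly_cost_cents: int  # hypothetical price, not a real AWS rate


def containers_per_instance(instance, container_cpu, container_memory):
    """How many identical containers fit, limited by the scarcer resource."""
    return min(instance.cpu_units // container_cpu,
               instance.memory_mib // container_memory)


def fleet_cost_cents(instance, containers, container_cpu, container_memory):
    """Hourly cost of the smallest fleet of `instance` that hosts all containers."""
    per_instance = containers_per_instance(instance, container_cpu, container_memory)
    if per_instance == 0:
        raise ValueError(f"{instance.name} is too small for this container")
    instances_needed = -(-containers // per_instance)  # ceiling division
    return instances_needed * instance.hourly_cost_cents


# Two made-up instance sizes and a container needing 256 CPU units and 512 MiB.
small = InstanceType("small", cpu_units=1024, memory_mib=2048, hourly_cost_cents=2)
large = InstanceType("large", cpu_units=4096, memory_mib=16384, hourly_cost_cents=10)

for instance in (small, large):
    fits = containers_per_instance(instance, 256, 512)
    cost = fleet_cost_cents(instance, 20, 256, 512)
    print(f"{instance.name}: {fits} containers each, {cost} cents/hour for 20 containers")
```

With these invented numbers the larger instance hosts more containers, but a fleet of smaller ones happens to be cheaper for 20 containers. The point is only that the choice of type, size, and number of instances drives the bill.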
Auto Scaling Groups
One of the interesting features that EC2 provides, and ECS leverages, is auto-scaling groups. An auto-scaling group defines the minimum, maximum, and desired number of EC2 instances that you want to run of a particular instance type. When the ECS wizard defines a cluster of services that run Docker containers, it will create an auto-scaling group of EC2 instances upon which to run those Docker containers.
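To make the three numbers concrete, here is a minimal sketch of an auto-scaling group as plain data. It is a toy model rather than the AWS API: the class, its field names, and the `reconcile` helper are hypothetical, and the instance type is just an example value.

```python
from dataclasses import dataclass


@dataclass
class AutoScalingGroup:
    instance_type: str
    min_size: int
    max_size: int
    desired_capacity: int

    def __post_init__(self):
        # The desired count only makes sense inside the [min, max] range.
        if not 0 <= self.min_size <= self.max_size:
            raise ValueError("need 0 <= min_size <= max_size")
        if not self.min_size <= self.desired_capacity <= self.max_size:
            raise ValueError("desired_capacity must lie between min_size and max_size")

    def reconcile(self, running: int) -> int:
        """Instances to launch (positive) or terminate (negative) to reach desired."""
        return self.desired_capacity - running


group = AutoScalingGroup("t2.micro", min_size=1, max_size=4, desired_capacity=2)
print(group.reconcile(running=0))  # nothing is running yet, so launch two
print(group.reconcile(running=3))  # one too many, so terminate one
```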
Beyond ECS, auto-scaling groups allow you to define policies that automatically add or remove a specific number of EC2 instances based on alarms. Alarms define rules about when to fire using CloudWatch metrics, which include EC2 instance CPU utilization, disk read and write bytes and operations, and network in and out. For example, you might define an alarm that fires when the average CPU utilization is greater than 80% for two consecutive 5-minute periods (that is, for the past 10 minutes). The auto-scaling group policy might respond by adding two new instances to the group when this alarm fires.
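The alarm-and-policy example above can be sketched in a few lines. This is a simplified model under stated assumptions, not how CloudWatch is implemented: the function names are hypothetical, each list entry stands for one 5-minute average of CPU utilization, and the policy is assumed to cap growth at the group's maximum size.

```python
def alarm_fires(period_averages, threshold=80.0, evaluation_periods=2):
    """True when each of the most recent periods averaged above the threshold."""
    if len(period_averages) < evaluation_periods:
        return False  # not enough data yet to evaluate the rule
    recent = period_averages[-evaluation_periods:]
    return all(average > threshold for average in recent)


def scale_out(current_size, adjustment, max_size):
    """Add `adjustment` instances without exceeding the group's maximum."""
    return min(current_size + adjustment, max_size)


# Hypothetical 5-minute CPU averages, oldest first.
cpu_averages = [62.0, 85.5, 91.0]
size = 3
if alarm_fires(cpu_averages):
    size = scale_out(size, adjustment=2, max_size=6)
print(size)
```

Here the last two periods (85.5 and 91.0) both exceed 80%, so the alarm fires and the group grows from three instances to five. A single quiet period among the last two would keep the alarm silent.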
The reason that you need to be comfortable with auto-scaling groups when using ECS is that ECS runs Docker containers across a cluster of EC2 instances, and those instances are configured to grow and shrink via auto-scaling groups. As we’ll see, Docker containers will be our level of abstraction, not EC2 instances. EC2 instances will provide the underlying infrastructure upon which our Docker containers run, so other than in our pocketbooks, all that we care about is whether or not our Docker containers have the resources they need to run optimally.
For more information about Auto Scaling Groups, see the AWS Auto Scaling documentation.
Elastic Load Balancer (ELB)
Auto-scaling groups are great because they add and remove EC2 instances based on policies that correlate to user demand, but this still raises the question of how your application can leverage those new EC2 instances. How can your application discover that a new instance has been added and direct traffic to it? Likewise, how does it know that a server is no longer available and that it should stop sending traffic to it? It sounds like a mess of configuration and application logic that I would not want to deal with!
Fortunately, AWS provides another service called an Elastic Load Balancer (ELB) that solves this problem for us. ELBs, as their name implies, are load balancers that distribute load across a set of EC2 instances.
An auto-scaling group can be associated with an ELB (from your EC2 “Auto Scaling Group” configuration, see Details → Load Balancers) so that when the auto-scaling group adds or removes EC2 instances, it automatically updates the load balancer. Your job, then, is to send load to the ELB, and it will distribute that load across your EC2 instances.
ELB can be configured to execute a health check on your EC2 instances, such as by pinging a port or web page on a fixed interval and measuring the response time against a timeout value. If it determines that a server is unhealthy then it will remove it from its group and stop sending load to it.
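The health check logic can be modeled roughly as follows. This is a sketch under assumptions, not ELB's actual algorithm: the function names are hypothetical, a response time of `None` stands for no response at all, and the rule that an instance is marked unhealthy only after a number of consecutive failed checks (`unhealthy_threshold`) is an assumption added for the example.

```python
from typing import Optional, Sequence


def check_passed(response_seconds: Optional[float], timeout_seconds: float) -> bool:
    """A single check passes if the instance answered within the timeout."""
    return response_seconds is not None and response_seconds <= timeout_seconds


def is_healthy(history: Sequence[Optional[float]],
               timeout_seconds: float = 5.0,
               unhealthy_threshold: int = 2) -> bool:
    """Unhealthy once the most recent `unhealthy_threshold` checks all failed."""
    recent = history[-unhealthy_threshold:]
    if len(recent) < unhealthy_threshold:
        return True  # too few checks so far to condemn the instance
    return any(check_passed(response, timeout_seconds) for response in recent)


# Response times in seconds for successive checks; None means no answer.
print(is_healthy([0.2, 0.3, 6.1]))   # one slow answer is tolerated
print(is_healthy([0.2, None, 6.1]))  # two failures in a row: stop sending load
```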
Beyond just being a load balancer, another powerful feature of ELB is that it can distribute load across multiple availability zones. Availability zones allow you to run instances of your servers in multiple data centers for high availability. If one data center becomes unavailable then you still have servers available to you.
Because ELB distributes load across multiple availability zones and performs health checks, when it detects that a data center is unavailable it sends all traffic to the data centers that remain available. Furthermore, auto-scaling groups can span availability zones and assess instance health themselves, so if a data center were completely lost, the auto-scaling group would start enough instances in the remaining zones to satisfy its policy.
All of this is to say that ELB distributes load across all of the EC2 instances in an auto-scaling group and, if there is an availability issue, ELB will stop sending load to instances in that zone while the auto-scaling group rebuilds itself in good zones!
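The behavior summarized above can be illustrated with a toy load balancer. This is a sketch, not ELB itself: the `Instance` and `LoadBalancer` classes, the instance IDs, and the zone names are hypothetical, and simple round-robin is assumed as the distribution strategy.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    zone: str
    healthy: bool = True


class LoadBalancer:
    def __init__(self, instances):
        self.instances = list(instances)
        self._next = 0  # position of the next round-robin candidate

    def route(self) -> Instance:
        """Return the next healthy instance, skipping unhealthy ones."""
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next % len(self.instances)]
            self._next += 1
            if candidate.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


balancer = LoadBalancer([
    Instance("i-a1", "zone-a"),
    Instance("i-b1", "zone-b"),
    Instance("i-a2", "zone-a"),
])
before = [balancer.route().instance_id for _ in range(4)]

# Simulate losing zone-a entirely: its instances fail their health checks.
for instance in balancer.instances:
    if instance.zone == "zone-a":
        instance.healthy = False
after = [balancer.route().instance_id for _ in range(3)]

print(before)
print(after)
```

While both zones are up, requests rotate across all three instances. Once zone-a is marked unhealthy, every request lands on the surviving instance in zone-b until replacements appear.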
For more information about ELB, see the AWS Elastic Load Balancing documentation.
Figure 1 shows how these all work together.
In Figure 1 we see three EC2 instances managed by an Auto Scaling Group. That Auto Scaling Group is associated with an ELB, which means that the ELB distributes its load across all of the EC2 instances. The user interacts with the ELB, which sends the user’s request to one of the EC2 instances.
The Auto Scaling Group has a scaling policy that is driven by an alarm, which in turn is fed metrics by CloudWatch. Thus, as load on the EC2 instances climbs, CloudWatch reports metrics to the alarm; when those metrics cross the alarm’s threshold, the alarm fires and the scaling policy adds more EC2 instances to the Auto Scaling Group.
We have covered a lot of ground here in preparing the basics for our Elastic Container Service implementation, so stay tuned as we dive further in our next article into VPC (Virtual Private Cloud) implementation and much more!
Image sources: Screenshot from video at https://aws.amazon.com/ecs/ , Flowchart: Steven Haines