AWS Compute Services: Amazon EC2 Auto Scaling
Amazon EC2 Auto Scaling helps ensure that you have the correct number of Amazon EC2 instances available to handle the load for your application. It helps you maintain application availability by automatically adding or removing EC2 instances according to conditions you define.
The Restaurant Analogy
Imagine a popular restaurant. On Friday nights, the place is packed, and you need 10 waiters to provide good service. On Monday afternoons, it’s quiet, and 2 waiters are enough. Auto Scaling is like a manager who calls in extra staff exactly when the line starts forming at the door and sends them home when the crowd thins out, ensuring customers are happy (Performance) and the owner isn’t paying for idle staff (Cost Optimization).
Core Components
- Groups: Logical collections of EC2 instances (Auto Scaling Groups – ASG). You define the Minimum, Maximum, and Desired capacity.
- Configuration Templates: Define what to launch (AMI, instance type, security groups, and so on). AWS strongly recommends Launch Templates over the legacy Launch Configurations.
- Scaling Options: Define when and how to scale.
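The relationship between Minimum, Maximum, and Desired capacity can be sketched in a few lines. This is a minimal illustration, not AWS code: `GroupSize` and `clamp_desired` are hypothetical names, and the numbers reuse the restaurant analogy (2 to 10 waiters).

```python
from dataclasses import dataclass


@dataclass
class GroupSize:
    """Hypothetical stand-in for an ASG's size settings."""
    minimum: int
    maximum: int
    desired: int


def clamp_desired(size: GroupSize, requested: int) -> int:
    """Limit a requested capacity to the range [minimum, maximum]."""
    return max(size.minimum, min(size.maximum, requested))


restaurant = GroupSize(minimum=2, maximum=10, desired=2)
print(clamp_desired(restaurant, 14))  # a request above the maximum is capped
print(clamp_desired(restaurant, 0))   # a request below the minimum is raised
```

Scaling policies only ever move the Desired capacity inside this range, which is why the Minimum and Maximum act as guardrails for cost and availability.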
Comparison: Launch Template vs. Launch Configuration
| Feature | Launch Template (Recommended) | Launch Configuration (Legacy) |
|---|---|---|
| Versioning | Yes (Multiple versions allowed) | No (Must recreate for changes) |
| T2/T3 Unlimited | Supported | Not Supported |
| Spot & On-Demand | Can mix in a single ASG | Single purchase option only |
| Subsetting | Can use partial parameters | Must define all parameters |
Scaling Policies
1. Target Tracking Scaling
The simplest way to scale. You pick a metric and a target value (e.g., Average CPU Utilization at 50%), and the ASG adds or removes instances to keep the metric near that target, much like a thermostat holding a room at a set temperature.
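To build intuition for the thermostat behavior, here is a rough sketch of the arithmetic. It assumes the metric scales inversely with the number of instances, which is a simplification; it does not reproduce the service's actual calculation, and `target_tracking_capacity` is a hypothetical helper.

```python
import math


def target_tracking_capacity(
    current_instances: int,
    current_metric: float,
    target_metric: float,
    minimum: int,
    maximum: int,
) -> int:
    """Estimate the capacity that would bring the metric back to its target.

    Assumption: total load is constant, so the per-instance metric is
    inversely proportional to the instance count.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    needed = math.ceil(current_instances * current_metric / target_metric)
    # Never leave the group's configured size range.
    return max(minimum, min(maximum, needed))


# 4 instances averaging 75% CPU, with a 50% target
print(target_tracking_capacity(4, 75, 50, minimum=2, maximum=10))
```

With 4 instances at 75% CPU and a 50% target, the estimate is 6 instances, since the same load spread over 6 instances averages 50%.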
2. Step Scaling
Adjusts capacity in “steps” sized to how far the metric has moved past a threshold. Example: if CPU > 50%, add 1 instance; if CPU > 80%, add 3 instances. Useful for rapid spikes.
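The step example above can be sketched as a small lookup. This is an illustration only: in the actual service, step adjustments are expressed as intervals relative to a CloudWatch alarm threshold, whereas the flat thresholds and the `step_adjustment` helper here are simplifying assumptions.

```python
# (CPU threshold in percent, instances to add), checked from highest to lowest
STEPS = [(80.0, 3), (50.0, 1)]


def step_adjustment(cpu_percent: float) -> int:
    """Return how many instances to add for the given average CPU."""
    for threshold, instances_to_add in STEPS:
        if cpu_percent > threshold:
            return instances_to_add
    return 0  # below every threshold: no scale-out


for cpu in (45.0, 65.0, 95.0):
    print(cpu, "->", step_adjustment(cpu))
```

Checking the largest threshold first means a severe breach gets the larger adjustment rather than stopping at the first, smaller step.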
3. Scheduled Scaling
Scaling based on known time patterns. Example: Scaling up every Friday at 5:00 PM for a weekend sale.
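As a sketch of the Friday example, the snippet below computes when the next scale-up would fire. The helper `next_friday_5pm` is hypothetical and works on naive local datetimes. In the service itself, a recurring schedule is written as a cron expression (shown as `RECURRENCE`), which is evaluated in UTC unless a time zone is set on the scheduled action.

```python
from datetime import datetime, timedelta

# Cron fields: minute hour day-of-month month day-of-week (5 = Friday)
RECURRENCE = "0 17 * * 5"

FRIDAY = 4  # datetime.weekday() counts Monday as 0


def next_friday_5pm(now: datetime) -> datetime:
    """Return the next Friday 17:00 strictly after `now`."""
    candidate = now.replace(hour=17, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(FRIDAY - now.weekday()) % 7)
    if candidate <= now:
        # Already past this week's slot, so move to next week.
        candidate += timedelta(days=7)
    return candidate


print(next_friday_5pm(datetime(2024, 1, 1, 9, 0)))  # called on a Monday morning
```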
4. Predictive Scaling
Uses Machine Learning to analyze historical traffic patterns and schedules scaling actions in advance of predicted load changes.
Decision Matrix: Which Scaling Policy?
- If you want to maintain a specific metric (CPU/Request Count)… Then use Target Tracking.
- If your traffic spikes are aggressive and vary in size… Then use Step Scaling.
- If you have a predictable weekly event… Then use Scheduled Scaling.
- If you have long-term historical data and want to be proactive… Then use Predictive Scaling.
Health Checks and Termination
An ASG performs EC2 Health Checks (Status Checks) by default. However, if your ASG is behind an Elastic Load Balancer (ELB), you should also enable ELB Health Checks. This ensures that if an instance is “Running” but the application is failing the load balancer's health check (for example, by returning HTTP 500 errors), the ASG will mark it unhealthy and replace it.
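The difference between the two health check types comes down to a small decision rule, sketched below. `should_replace` is a hypothetical helper, and the sketch ignores the health check grace period that applies to newly launched instances.

```python
def should_replace(ec2_status_ok: bool, elb_check_ok: bool, use_elb_checks: bool) -> bool:
    """Decide whether the group would treat an instance as unhealthy."""
    if not ec2_status_ok:
        # A failed EC2 status check always counts.
        return True
    # ELB results only matter when ELB health checks are enabled on the group.
    return use_elb_checks and not elb_check_ok


# Instance is running, but the application fails the load balancer check.
print(should_replace(ec2_status_ok=True, elb_check_ok=False, use_elb_checks=False))
print(should_replace(ec2_status_ok=True, elb_check_ok=False, use_elb_checks=True))
```

The first call shows the gap: with EC2 checks alone, a running instance with a broken application stays in the group.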
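Termination Policy: By default, the ASG keeps Availability Zones (AZs) balanced during scale-in. It identifies the AZ with the most instances, then terminates the instance in that AZ that uses the oldest launch configuration or launch template. If several instances still tie, it chooses the one closest to the next billing hour.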
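A simplified sketch of that default selection follows. It covers only the two steps described here (busiest AZ, then oldest template); the `Instance` class, the integer `template_version`, and the alphabetical tie-breaking are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    instance_id: str
    az: str
    template_version: int  # lower number = older template (assumption)


def pick_instance_to_terminate(instances: list[Instance]) -> Instance:
    """Pick a scale-in victim: busiest AZ first, then oldest template."""
    if not instances:
        raise ValueError("no instances to choose from")
    per_az = Counter(i.az for i in instances)
    # Ties between AZs are broken alphabetically here, purely for determinism.
    busiest_az = max(sorted(per_az), key=per_az.get)
    candidates = [i for i in instances if i.az == busiest_az]
    return min(candidates, key=lambda i: (i.template_version, i.instance_id))


fleet = [
    Instance("i-aaa", "us-east-1a", template_version=2),
    Instance("i-bbb", "us-east-1a", template_version=1),
    Instance("i-ccc", "us-east-1b", template_version=1),
]
print(pick_instance_to_terminate(fleet).instance_id)
```

Here `us-east-1a` holds two of the three instances, so the victim comes from that AZ, and `i-bbb` is chosen because it runs the older template version.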
Exam Tips and Gotchas
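- Cooldown Periods: Prevent the ASG from launching or terminating additional instances before the previous scaling activity takes effect. The default is 300 seconds. Cooldowns apply to simple scaling policies; target tracking and step scaling rely on instance warmup instead.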
- ASG spans AZs, not Regions: An Auto Scaling Group can span multiple Availability Zones within the same region, but it cannot span multiple regions.
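- Instance Scale-In Protection: You can protect specific instances from being terminated during scale-in events. This is separate from EC2 termination protection, which does not stop the ASG from terminating an instance.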
- Lifecycle Hooks: Use these to perform custom actions (like downloading logs or installing software) before an instance is fully “InService” or “Terminated.”
- Suspended Processes: You can stop specific ASG actions like “Launch” or “Terminate” for troubleshooting without deleting the group.
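The cooldown tip above amounts to a simple time check before acting on a new scaling request. The sketch below is illustrative: `in_cooldown` is a hypothetical helper, and timestamps are plain seconds.

```python
DEFAULT_COOLDOWN_SECONDS = 300  # default stated in the cooldown tip


def in_cooldown(
    last_activity_at: float,
    now: float,
    cooldown: float = DEFAULT_COOLDOWN_SECONDS,
) -> bool:
    """Return True while a new scaling activity should still be held back."""
    return (now - last_activity_at) < cooldown


# A scaling activity finished at t=0 seconds.
print(in_cooldown(last_activity_at=0, now=120))  # still cooling down
print(in_cooldown(last_activity_at=0, now=300))  # cooldown has elapsed
```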
Topics Covered
Key subtopics covered in this guide:
- ASG Core Concepts (Min/Max/Desired)
- Launch Templates vs. Launch Configurations
- Scaling Policies (Target, Step, Scheduled, Predictive)
- Health Check types (EC2 vs. ELB)
- Termination Policies and Cooldowns
- Lifecycle Hooks and Instance Refresh
Infographic: The Auto Scaling Ecosystem
CloudWatch triggers ASG based on metrics -> ASG launches/terminates instances -> ELB distributes traffic.
- CloudWatch: The “eyes” that monitor metrics and trigger scaling alarms.
- IAM: Roles attached to instances (through an instance profile) via Launch Templates.
- VPC: Defines the subnets/AZs where instances live.
- Warm Pools: Keep pre-initialized instances (stopped by default) on standby to reduce “boot-up” lag for applications with long initialization times.
- Instance Refresh: Roll out new AMI versions without manual termination.
- Mixed Instances Policy: Use an ASG to mix Spot and On-Demand instances. Spot capacity can cost up to 90% less than On-Demand, which suits fault-tolerant apps.
- Rebalancing: The ASG automatically rebalances instances across AZs when the group becomes unevenly distributed, for example after an AZ becomes unavailable and later recovers.