AWS Compute Services: Amazon EC2 Auto Scaling

Amazon EC2 Auto Scaling ensures that you have the correct number of Amazon EC2 instances available to handle the load for your application. It helps you maintain application availability and allows you to automatically add or remove EC2 instances according to conditions you define.

The Restaurant Analogy

Imagine a popular restaurant. On Friday nights, the place is packed, and you need 10 waiters to provide good service. On Monday afternoons, it’s quiet, and 2 waiters are enough. Auto Scaling is like a manager who calls in extra staff exactly when the line starts forming at the door and sends them home when the crowd thins out, ensuring customers are happy (Performance) and the owner isn’t paying for idle staff (Cost Optimization).

Core Components

Groups: Logical collections of EC2 instances (Auto Scaling Groups – ASG). You define the Minimum, Maximum, and Desired capacity.
Configuration Templates: Define what to launch. AWS strongly recommends Launch Templates over the legacy Launch Configurations.
Scaling Options: Define when and how to scale.
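
To make the Minimum, Maximum, and Desired settings concrete, here is a minimal sketch of how a capacity requested by a scaling policy stays inside the group's bounds. The GroupSize class and its request method are hypothetical names for illustration, not part of any AWS SDK.

```python
from dataclasses import dataclass


@dataclass
class GroupSize:
    """Hypothetical container for an ASG's three capacity settings."""
    minimum: int
    maximum: int
    desired: int

    def __post_init__(self):
        # Reject settings that could never be satisfied.
        if not 0 <= self.minimum <= self.maximum:
            raise ValueError("need 0 <= minimum <= maximum")
        if not self.minimum <= self.desired <= self.maximum:
            raise ValueError("desired must lie between minimum and maximum")

    def request(self, new_desired: int) -> int:
        """Clamp a requested capacity into [minimum, maximum] and store it."""
        self.desired = max(self.minimum, min(self.maximum, new_desired))
        return self.desired


group = GroupSize(minimum=2, maximum=10, desired=4)
group.request(15)  # a policy asks for 15, the group stops at its maximum
```

In restaurant terms: the manager can call in more waiters, but never more than the roster allows and never fewer than the minimum shift.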

Comparison: Launch Template vs. Launch Configuration

Feature	Launch Template (Recommended)	Launch Configuration (Legacy)
Versioning	Yes (Multiple versions allowed)	No (Must recreate for changes)
T2/T3 Unlimited	Supported	Not Supported
Spot & On-Demand	Can mix in a single ASG	Single purchase option only
Subsetting	Can use partial parameters	Must define all parameters

Scaling Policies

1. Target Tracking Scaling

The simplest way to scale. You pick a metric and a target value (e.g., Average CPU Utilization at 50%), and the ASG adds or removes instances to keep the metric near that target, much like a thermostat.
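
The thermostat idea can be sketched with a simple proportional model. This is an assumption for illustration, not the algorithm AWS actually runs: if N instances average a given metric value, roughly N * value / target instances are needed to bring the average back to target. The function name, group name, and policy name below are hypothetical; the dictionary follows the parameter shape that boto3's put_scaling_policy accepts for a target tracking policy.

```python
import math


def target_tracking_capacity(current_capacity, metric_value, target_value,
                             min_size, max_size):
    """Estimate the capacity that brings an average metric back to target.

    Simplified proportional model (an assumption, not the AWS algorithm).
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    needed = math.ceil(current_capacity * metric_value / target_value)
    # Never go outside the group's configured bounds.
    return max(min_size, min(max_size, needed))


# Parameters for a 50% average CPU target. With credentials configured,
# these could be passed as:
#   boto3.client("autoscaling").put_scaling_policy(**cpu_policy)
cpu_policy = {
    "AutoScalingGroupName": "web-asg",       # hypothetical group name
    "PolicyName": "keep-cpu-at-50",          # hypothetical policy name
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,
    },
}
```

For example, 4 instances averaging 75% CPU against a 50% target suggests 6 instances under this model.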

2. Step Scaling

Scales in increments (“steps”) sized by how far the metric breaches its alarm threshold. For example: if CPU is between 50% and 80%, add 1 instance; if CPU is above 80%, add 3 instances. Useful for rapid spikes.
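
The step logic above can be sketched as a lookup over ranges. This is a simplified illustration with hypothetical names; it uses absolute CPU values for readability, whereas real step adjustments express their bounds as offsets from the CloudWatch alarm threshold. Treating the lower bound as inclusive is also an assumption of this sketch.

```python
def step_adjustment(cpu_percent, steps):
    """Return the instance change for the first step containing cpu_percent.

    `steps` is a list of (lower_bound, upper_bound, change) tuples; a bound
    of None means "unbounded" on that side. Ranges are [lower, upper).
    """
    for lower, upper, change in steps:
        above_lower = lower is None or cpu_percent >= lower
        below_upper = upper is None or cpu_percent < upper
        if above_lower and below_upper:
            return change
    return 0  # no step matched, so no scaling action


STEPS = [
    (50, 80, 1),    # 50% <= CPU < 80%: add 1 instance
    (80, None, 3),  # CPU >= 80%: add 3 instances
]
```

Because the ranges do not overlap, a 92% spike triggers only the larger adjustment rather than both.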

3. Scheduled Scaling

Scaling based on known time patterns. Example: Scaling up every Friday at 5:00 PM for a weekend sale.
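
A recurring scheduled action is typically written as a cron expression; "every Friday at 5:00 PM" would be 0 17 * * 5 (minute, hour, day of month, month, day of week), evaluated in UTC unless a time zone is set. The sketch below only illustrates what that recurrence means by computing the next run time locally; the function name is hypothetical.

```python
from datetime import datetime, timedelta

FRIDAY = 4  # datetime.weekday(): Monday is 0
RECURRENCE = "0 17 * * 5"  # cron form of "Fridays at 17:00"


def next_weekly_run(now, weekday=FRIDAY, hour=17, minute=0):
    """Return the first weekly occurrence strictly after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Move forward to the requested weekday (0 days if it is already today).
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        # Today's slot has already passed, so use next week's.
        candidate += timedelta(days=7)
    return candidate
```

Called on a Monday morning, this returns 17:00 on the Friday of the same week.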

4. Predictive Scaling

Uses Machine Learning to analyze historical traffic patterns and schedules scaling actions in advance of predicted load changes.
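
As a toy stand-in for the idea (not the machine learning model AWS uses), the sketch below forecasts load per hour of day by averaging historical samples, then converts each forecast into an instance count ahead of time. All names and the load-per-instance figure are hypothetical.

```python
import math
from statistics import mean


def forecast_by_hour(history):
    """Average the observed load for each hour of the day.

    `history` is a list of (hour_of_day, load) samples.
    """
    by_hour = {}
    for hour, load in history:
        by_hour.setdefault(hour, []).append(load)
    return {hour: mean(loads) for hour, loads in by_hour.items()}


def planned_capacity(forecast, load_per_instance, min_size, max_size):
    """Turn each forecast load into an instance count within group bounds."""
    return {
        hour: max(min_size, min(max_size, math.ceil(load / load_per_instance)))
        for hour, load in forecast.items()
    }


history = [(9, 400), (9, 600), (18, 1800), (18, 2200)]
plan = planned_capacity(forecast_by_hour(history), load_per_instance=250,
                        min_size=2, max_size=6)
```

The point of the sketch is the ordering: capacity is planned from history before the load arrives, rather than in reaction to a metric breach.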

Decision Matrix: Which Scaling Policy?

If you want to maintain a specific metric (CPU/Request Count)… Then use Target Tracking.
If your traffic spikes are aggressive and vary in size… Then use Step Scaling.
If you have a predictable weekly event… Then use Scheduled Scaling.
If you have long-term historical data and want to be proactive… Then use Predictive Scaling.

Health Checks and Termination

ASG performs EC2 Health Checks (Status Checks) by default. However, if your ASG is behind an Elastic Load Balancer (ELB), you should also enable ELB Health Checks. This ensures that if an instance is “Running” but fails the load balancer's health check (for example, the application keeps returning 5xx errors), the ASG will replace it.
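
The difference between the two health check types can be sketched as a small decision function. The function name and boolean inputs are hypothetical; "EC2" and "ELB" match the values used for the group's health check type setting.

```python
def is_unhealthy(ec2_status_ok, elb_check_ok, health_check_type="EC2"):
    """Decide whether the group should replace an instance.

    With type "EC2", only the instance status checks count. With type "ELB",
    the load balancer's health check is considered as well.
    """
    if health_check_type not in ("EC2", "ELB"):
        raise ValueError("health_check_type must be 'EC2' or 'ELB'")
    if not ec2_status_ok:
        return True  # a failed status check is unhealthy under either type
    return health_check_type == "ELB" and not elb_check_ok
```

An instance that is running but failing the load balancer's check is left alone under "EC2" and replaced under "ELB", which is the gap the paragraph above describes.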

Termination Policy: By default, the ASG preserves the balance across Availability Zones (AZs). It identifies the AZ with the most instances, then terminates the instance in that AZ with the oldest launch configuration/template first.
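
The two steps described above can be sketched as follows. This is a simplified model with hypothetical names: it covers only the AZ and template-age steps, represents template age as a version number, and breaks ties by list order, which is an assumption of the sketch.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    az: str
    template_version: int  # lower number = older launch template version


def pick_for_termination(instances):
    """Pick the instance in the most-populated AZ with the oldest template."""
    if not instances:
        raise ValueError("no instances to choose from")
    counts = Counter(inst.az for inst in instances)
    busiest_az = max(counts, key=counts.get)  # step 1: AZ with most instances
    candidates = [inst for inst in instances if inst.az == busiest_az]
    # Step 2: within that AZ, prefer the oldest template version.
    return min(candidates, key=lambda inst: inst.template_version)


fleet = [
    Instance("i-a", "us-east-1a", 2),
    Instance("i-b", "us-east-1a", 1),
    Instance("i-c", "us-east-1b", 1),
    Instance("i-d", "us-east-1a", 3),
]
victim = pick_for_termination(fleet)  # i-b: busiest AZ, oldest template
```

Note that i-c also runs the oldest template but is spared, because removing it would make the AZ imbalance worse.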

Exam Tips and Gotchas

Cooldown Periods: Prevent the ASG from launching or terminating additional instances before the previous scaling activity takes effect. The default is 300 seconds.
ASG spans AZs, not Regions: An Auto Scaling Group can span multiple Availability Zones within the same region, but it cannot span multiple regions.
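Scale-In Protection: You can protect specific instances from being terminated during scale-in events. This is an ASG setting and is separate from EC2 termination protection, which does not stop the ASG from terminating an instance.
Lifecycle Hooks: Use these to perform custom actions (like installing software before an instance is fully “InService,” or downloading logs before it is “Terminated”).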
Suspended Processes: You can stop specific ASG actions like “Launch” or “Terminate” for troubleshooting without deleting the group.
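
The cooldown rule from the list above can be sketched as a simple time gate. The function and constant names are hypothetical; the 300-second default comes from the text.

```python
from datetime import datetime, timedelta

DEFAULT_COOLDOWN = timedelta(seconds=300)


def in_cooldown(last_activity_finished, now, cooldown=DEFAULT_COOLDOWN):
    """Return True while a new scaling action should still be held back."""
    return now < last_activity_finished + cooldown


finished = datetime(2024, 1, 5, 17, 0, 0)
in_cooldown(finished, datetime(2024, 1, 5, 17, 2, 0))  # 2 minutes later: wait
in_cooldown(finished, datetime(2024, 1, 5, 17, 5, 0))  # 5 minutes later: go
```

The gate exists so that the metrics have time to reflect the instances just added or removed before the group reacts again.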

Topics covered:

ASG Core Concepts (Min/Max/Desired)
Launch Templates vs. Launch Configurations
Scaling Policies (Target, Step, Scheduled, Predictive)
Health Check types (EC2 vs. ELB)
Termination Policies and Cooldowns
Lifecycle Hooks and Instance Refresh

Infographic: The Auto Scaling Ecosystem

CloudWatch triggers ASG based on metrics -> ASG launches/terminates instances -> ELB distributes traffic.

Service Ecosystem

CloudWatch: The “eyes” that trigger scaling alarms.

IAM: Roles attached to instances via Launch Templates.

VPC: Defines subnets/AZs where instances live.

Performance

Warm Pools: Keep stopped instances ready to reduce “boot-up” lag for applications with long initialization times.

Instance Refresh: Roll out new AMI versions without manual termination.

Cost Optimization

Mixed Instances (Spot + On-Demand): Use an ASG with a mixed instances policy to combine Spot and On-Demand instances. Spot can save up to 90% compared with On-Demand pricing for fault-tolerant apps.

Rebalancing: If the group becomes unbalanced across AZs (for example, after one AZ becomes unavailable and later recovers), the ASG automatically re-balances instances across the AZs.
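
The Spot and On-Demand mix described above is usually controlled by two settings: an On-Demand base capacity and an On-Demand percentage for everything above that base. The sketch below shows the arithmetic under those two settings; the function name is hypothetical, and rounding fractions up in favor of On-Demand is an assumption of this sketch.

```python
def split_capacity(desired, on_demand_base, on_demand_percent_above_base):
    """Return (on_demand, spot) instance counts for a desired capacity."""
    if not 0 <= on_demand_percent_above_base <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # The base is filled with On-Demand first, up to the desired capacity.
    base = min(desired, on_demand_base)
    above_base = desired - base
    # Integer ceiling of above_base * percent / 100 (rounds toward On-Demand).
    extra_on_demand = (above_base * on_demand_percent_above_base + 99) // 100
    on_demand = base + extra_on_demand
    return on_demand, desired - on_demand


split_capacity(10, on_demand_base=2, on_demand_percent_above_base=25)
```

With a desired capacity of 10, a base of 2, and 25% above base, the group would run 4 On-Demand and 6 Spot instances under this model.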

Production Use Case: A media streaming site uses Scheduled Scaling to ramp up capacity 30 minutes before a major live sports event starts, and Target Tracking (CPU 60%) to handle unexpected surges during the broadcast.