AWS Database Services: Amazon Aurora
Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud. It combines the performance and availability of traditional commercial databases with the simplicity and cost-effectiveness of open-source databases.
Core Concepts & Architecture
Aurora features a distributed, fault-tolerant, self-healing storage system whose cluster volume auto-scales up to 128 TiB per DB cluster. Compute is decoupled from storage, which is what makes its rapid failover and low-lag replicas possible.
1. High Availability and Durability
- Data Replication: Aurora automatically replicates your data in 6 copies across 3 Availability Zones (AZs).
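- Quorum System: A write must be acknowledged by 4 of the 6 copies, and a quorum read needs 3 of the 6. Losing an entire AZ (2 copies) therefore leaves both reads and writes available, and losing an AZ plus one more copy still leaves reads available.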
- Self-Healing: Data blocks and disks are continuously scanned for errors and repaired automatically.
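The quorum arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting argument (6 copies, write quorum 4, read quorum 3, 2 copies per AZ); the function names are made up for illustration and nothing here talks to AWS.

```python
COPIES = 6         # total copies of each piece of data
WRITE_QUORUM = 4   # copies that must acknowledge a write
READ_QUORUM = 3    # copies needed for a quorum read
COPIES_PER_AZ = 2  # 6 copies spread evenly over 3 AZs


def quorums_overlap(copies: int, write_q: int, read_q: int) -> bool:
    """True when any read set must intersect any write set,
    and any two write sets must intersect each other."""
    return read_q + write_q > copies and 2 * write_q > copies


def availability(copies_lost: int) -> dict:
    """Which operations still reach quorum after losing some copies."""
    surviving = COPIES - copies_lost
    return {
        "writes": surviving >= WRITE_QUORUM,
        "reads": surviving >= READ_QUORUM,
    }


if __name__ == "__main__":
    print(quorums_overlap(COPIES, WRITE_QUORUM, READ_QUORUM))
    print("one AZ down:", availability(COPIES_PER_AZ))
    print("one AZ plus one copy down:", availability(COPIES_PER_AZ + 1))
```

Because 3 + 4 > 6, every read quorum contains at least one copy that saw the latest acknowledged write, which is the property exam questions on "durability" are usually probing.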
2. Scaling and Replicas
- Read Replicas: You can add up to 15 Aurora Replicas. They read from the same underlying storage volume as the primary instance, so replica lag is typically in the low tens of milliseconds (AWS cites well under 100 ms).
- Aurora Serverless v2: Automatically scales compute capacity (measured in ACUs) up and down in fractions of a second based on application demand.
- Global Database: For disaster recovery and low-latency local reads, Aurora can replicate data to up to 5 secondary AWS Regions, with typical cross-Region replication lag under 1 second.
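As a rough illustration of the Serverless v2 capacity range, the sketch below builds the `ServerlessV2ScalingConfiguration` parameter that boto3's `create_db_cluster` accepts. It assumes ACUs are set in 0.5 steps and that one ACU corresponds to roughly 2 GiB of memory. The cluster identifier and the helper names are hypothetical, and the current minimum and maximum ACU limits should be checked against the AWS documentation rather than taken from this code.

```python
GIB_PER_ACU = 2  # assumption: one ACU is roughly 2 GiB of memory


def serverless_v2_scaling(min_acu: float, max_acu: float) -> dict:
    """Build the MinCapacity/MaxCapacity pair for Serverless v2."""
    for value in (min_acu, max_acu):
        # ACUs are configured in half-ACU increments.
        if value < 0 or value * 2 != int(value * 2):
            raise ValueError(f"ACU values must be non-negative multiples of 0.5: {value}")
    if min_acu > max_acu:
        raise ValueError("min_acu must not exceed max_acu")
    return {"MinCapacity": min_acu, "MaxCapacity": max_acu}


def memory_range_gib(scaling: dict) -> tuple:
    """Approximate memory available at the bottom and top of the range."""
    return (scaling["MinCapacity"] * GIB_PER_ACU,
            scaling["MaxCapacity"] * GIB_PER_ACU)


# Parameters one might pass as boto3.client("rds").create_db_cluster(**cluster_params).
cluster_params = {
    "DBClusterIdentifier": "demo-cluster",  # hypothetical name
    "Engine": "aurora-postgresql",
    "ServerlessV2ScalingConfiguration": serverless_v2_scaling(0.5, 16),
}

if __name__ == "__main__":
    print(cluster_params)
    print(memory_range_gib(cluster_params["ServerlessV2ScalingConfiguration"]))
```

A low minimum keeps idle cost down, while the maximum acts as a spending cap during traffic spikes.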
Comparison: Aurora vs. RDS Standard
| Feature | RDS (MySQL/Postgres) | Amazon Aurora |
|---|---|---|
| Max Storage | 64 TiB | 128 TiB |
| Replication | Binlog-based (Slower) | Storage-level (Faster) |
| Max Read Replicas | 5-15 (depends on engine) | 15 |
| Failover Time | 60-120 seconds | < 30 seconds (often < 15s) |
| Self-Healing Storage | No | Yes |
Security & Integrations
- IAM Auth: You can authenticate to your Aurora DB with a short-lived token issued for an AWS IAM user or role, so no DB password has to be stored in code.
- Encryption: At-rest encryption using AWS KMS and in-transit encryption using SSL/TLS.
- Backtrack: Allows you to “rewind” the cluster in place to a specific point in time (up to 72 hours back) without restoring from a backup, which is useful for accidental deletes. It is available for Aurora MySQL only.
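IAM authentication and TLS are usually combined in one connection. The sketch below assumes boto3 is installed and AWS credentials are configured when `fetch_iam_token` is called; it uses boto3's `generate_db_auth_token` to obtain the token. The `connection_settings` helper is hypothetical and uses PostgreSQL (libpq-style) keyword names; the host, user, and CA bundle path are placeholders.

```python
def fetch_iam_token(host: str, port: int, user: str, region: str) -> str:
    """Request a short-lived auth token to use in place of a DB password."""
    import boto3  # third-party; imported lazily so the rest runs without it

    rds = boto3.client("rds", region_name=region)
    return rds.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user, Region=region
    )


def connection_settings(host: str, port: int, user: str,
                        token: str, ca_bundle: str) -> dict:
    """Connection keywords: the token goes where the password would,
    and TLS verification is required (in-transit encryption)."""
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": token,
        "sslmode": "verify-full",
        "sslrootcert": ca_bundle,
    }


if __name__ == "__main__":
    host = "demo.cluster-abc123.us-east-1.rds.amazonaws.com"  # placeholder
    token = fetch_iam_token(host, 5432, "app_user", "us-east-1")
    settings = connection_settings(host, 5432, "app_user", token,
                                   "/path/to/rds-ca-bundle.pem")
    print(sorted(settings))
```

The token expires after a short time, so long-running applications generally request a fresh one whenever they open a new connection.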
Decision Matrix: If-Then Guide
- If the requirement is “Fastest possible failover,” choose Aurora.
- If the requirement is “Unpredictable workloads or dev/test,” choose Aurora Serverless.
- If you need “Global Disaster Recovery with RPO < 1 min,” choose Aurora Global Database.
- If you need “Cost-effective small DB for a blog,” choose RDS MySQL (Aurora might be overkill).
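For revision, the if-then guide above can be written as a small lookup table. The requirement keys are invented labels for this sketch; the recommendations are exactly the ones listed in the guide.

```python
# Requirement label (hypothetical key) -> service recommended by the guide.
DECISION_GUIDE = {
    "fastest_failover": "Aurora",
    "unpredictable_or_dev_test": "Aurora Serverless",
    "global_dr_rpo_under_1_min": "Aurora Global Database",
    "small_cost_sensitive_db": "RDS MySQL",
}


def recommend(requirement: str) -> str:
    """Return the guide's pick, or fail loudly for an unknown requirement."""
    try:
        return DECISION_GUIDE[requirement]
    except KeyError:
        raise ValueError(f"no rule for requirement: {requirement!r}") from None


if __name__ == "__main__":
    for key in DECISION_GUIDE:
        print(f"{key}: {recommend(key)}")
```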
Exam Tips and Gotchas
- The 6-Copy Rule: Remember Aurora always stores 6 copies across 3 AZs. This is a common “durability” question.
- Endpoints: Understand the difference between the Cluster Endpoint (points to Writer) and Reader Endpoint (load balances across Replicas).
- Backtrack vs. Snapshots: Backtrack rewinds the existing cluster in place and is near-instant; restoring a snapshot (or doing a point-in-time restore) always creates a new cluster.
- Storage Auto-scaling: You don’t need to provision storage size in Aurora; it grows automatically.
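The endpoint distinction above is easiest to remember as a routing rule: writes go to the Cluster Endpoint, plain reads can go to the Reader Endpoint. The sketch below is a deliberately naive illustration; the hostnames are placeholders and classifying a statement by its first keyword is an assumption of this example, not how any AWS driver works.

```python
# Placeholder hostnames; real ones come from the cluster's configuration.
CLUSTER_ENDPOINT = "demo.cluster-abc123.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "demo.cluster-ro-abc123.us-east-1.rds.amazonaws.com"

READ_ONLY_KEYWORDS = ("select", "show", "explain")


def endpoint_for(sql: str) -> str:
    """Pick an endpoint from the statement's first keyword.

    Anything not clearly read-only goes to the writer. Reads that must
    see a write made moments earlier should also use the Cluster
    Endpoint, because replicas can lag slightly behind.
    """
    words = sql.split()
    first = words[0].lower() if words else ""
    if first in READ_ONLY_KEYWORDS:
        return READER_ENDPOINT
    return CLUSTER_ENDPOINT


if __name__ == "__main__":
    print(endpoint_for("SELECT * FROM posts"))
    print(endpoint_for("UPDATE posts SET title = 'x' WHERE id = 1"))
```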
Topics covered:
Summary of key subtopics covered in this guide:
- Aurora Shared Storage Architecture (6 copies/3 AZs)
- Aurora Replicas vs. RDS Read Replicas
- Aurora Serverless v2 for auto-scaling
- Global Database for multi-region DR
- Security (IAM Auth & KMS)
- High Availability and Failover mechanics
Amazon Aurora Architecture Overview
The storage layer is independent of the compute nodes. All replicas “see” the same data.
Performance Throughput
Up to 5x the throughput of standard MySQL and 3x that of standard PostgreSQL, according to AWS. Parallel query (Aurora MySQL only) speeds up analytical-style workloads.
Availability Failover
If the Primary fails, Aurora promotes a Replica; this failover is typically under 30 seconds. If no Replica exists, Aurora has to create a new Primary, which takes considerably longer (typically minutes).
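On the client side, a failover shows up as a brief window of failed connections. One common way to ride it out is to retry with exponential backoff for roughly the expected failover time. The sketch below is a generic pattern, not an AWS API: `connect` stands for whatever function opens your database connection, and the 30-second budget simply mirrors the typical figure quoted above.

```python
import time


def connect_with_retry(connect, budget_seconds=30.0, first_delay=0.5,
                       max_delay=5.0, sleep=time.sleep, clock=time.monotonic):
    """Call connect() until it succeeds or the time budget runs out.

    Returns (connection, attempts). Re-raises the last ConnectionError
    once another wait would exceed the budget.
    """
    deadline = clock() + budget_seconds
    delay = first_delay
    attempts = 0
    while True:
        attempts += 1
        try:
            return connect(), attempts
        except ConnectionError:
            if clock() + delay > deadline:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped


if __name__ == "__main__":
    outcomes = iter([ConnectionError("failing over"), "connection"])

    def flaky_connect():
        result = next(outcomes)
        if isinstance(result, Exception):
            raise result
        return result

    print(connect_with_retry(flaky_connect, first_delay=0.1))
```

Reconnecting through the Cluster Endpoint matters here, since it is the endpoint that follows the writer role to the promoted instance.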
Cost Optimization
Use Aurora Serverless v2 for variable workloads. Use Aurora I/O-Optimized for I/O intensive apps to make costs predictable.
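A quick way to reason about I/O-Optimized is to look at what share of the bill is I/O charges. The sketch below assumes AWS's published rule of thumb that I/O-Optimized tends to pay off once I/O exceeds about 25% of total Aurora spend; treat that threshold as an assumption to verify against current pricing. The function names and dollar figures are illustrative, while `aurora` and `aurora-iopt1` are the storage type values for Standard and I/O-Optimized.

```python
IO_SHARE_THRESHOLD = 0.25  # assumption: AWS's rule-of-thumb break-even point


def io_share(compute_cost: float, storage_cost: float, io_cost: float) -> float:
    """Fraction of the monthly Aurora bill that is I/O charges."""
    total = compute_cost + storage_cost + io_cost
    if total <= 0:
        raise ValueError("total cost must be positive")
    return io_cost / total


def suggest_storage_type(compute_cost: float, storage_cost: float,
                         io_cost: float) -> str:
    """Suggest 'aurora-iopt1' (I/O-Optimized) or 'aurora' (Standard)."""
    share = io_share(compute_cost, storage_cost, io_cost)
    return "aurora-iopt1" if share > IO_SHARE_THRESHOLD else "aurora"


if __name__ == "__main__":
    # Illustrative monthly figures in dollars, not real pricing.
    print(suggest_storage_type(600, 100, 300))
    print(suggest_storage_type(800, 100, 100))
```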