High Availability & Disaster Recovery: Backup and Restore
In the AWS ecosystem, Backup and Restore is the foundation of any Disaster Recovery (DR) strategy. It has the highest RTO (Recovery Time Objective, the maximum acceptable downtime) and RPO (Recovery Point Objective, the maximum acceptable data loss, measured in time) compared to Pilot Light, Warm Standby, or Multi-Site, but it is the most cost-effective method for non-critical workloads.
Core Concepts and Services
1. AWS Backup
A fully managed, policy-based service that centralizes and automates data protection across AWS services. It removes the need for custom scripts and manual processes.
- Backup Plans: Define how often to back up (frequency) and how long to keep them (retention).
- Backup Vaults: Logical containers where backups are stored. Use Vault Lock for WORM (Write Once Read Many) compliance.
- Cross-Region/Cross-Account Backup: Essential for DR and security isolation.
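As a rough illustration of how a backup plan ties frequency, retention, and a cross-Region copy together, the sketch below builds a dictionary in the shape accepted by the `BackupPlan` argument of the boto3 `backup` client's `create_backup_plan` call. It does not call AWS. The plan name, vault name, account ID, and ARN are hypothetical placeholders, and the helper `build_backup_plan` is an illustrative name, not part of any AWS SDK.

```python
import json


def build_backup_plan(plan_name, vault_name, cron, cold_after_days,
                      delete_after_days, copy_vault_arn=None):
    """Return a dict shaped like the BackupPlan argument of create_backup_plan."""
    # AWS Backup requires a backup to stay in cold storage for at least 90 days,
    # so deletion must come at least 90 days after the cold storage transition.
    if delete_after_days < cold_after_days + 90:
        raise ValueError(
            "DeleteAfterDays must be at least 90 days after MoveToColdStorageAfterDays"
        )
    lifecycle = {
        "MoveToColdStorageAfterDays": cold_after_days,
        "DeleteAfterDays": delete_after_days,
    }
    rule = {
        "RuleName": f"{plan_name}-rule",
        "TargetBackupVaultName": vault_name,   # where recovery points are stored
        "ScheduleExpression": cron,            # frequency
        "Lifecycle": lifecycle,                # retention
    }
    if copy_vault_arn:
        # Optional copy to a vault in another Region or account for DR.
        rule["CopyActions"] = [
            {"DestinationBackupVaultArn": copy_vault_arn, "Lifecycle": dict(lifecycle)}
        ]
    return {"BackupPlanName": plan_name, "Rules": [rule]}


# Hypothetical names and ARN, for illustration only.
plan = build_backup_plan(
    plan_name="daily-dr",
    vault_name="primary-vault",
    cron="cron(0 5 ? * * *)",  # every day at 05:00 UTC
    cold_after_days=30,
    delete_after_days=120,
    copy_vault_arn="arn:aws:backup:us-west-2:111122223333:backup-vault:dr-vault",
)
print(json.dumps(plan, indent=2))
```

With credentials configured, a dictionary like this could be passed as `boto3.client("backup").create_backup_plan(BackupPlan=plan)`; resources are then attached to the plan with a separate backup selection.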
2. Storage-Specific Backups
- Amazon EBS: Point-in-time, incremental snapshots stored in S3. Volumes restored from a snapshot load their blocks lazily, so enable Fast Snapshot Restore (FSR), per snapshot and per Availability Zone, to eliminate the latency penalty on the first read of each block.
- Amazon RDS: Automated backups (stored in S3, retention of 1-35 days; setting retention to 0 disables them) and manual snapshots (kept until you delete them).
- Amazon DynamoDB: On-demand backups and Point-in-Time Recovery (PITR), which allows restoration to any second within the last 35 days.
- Amazon S3: Use Versioning and Cross-Region Replication (CRR) for protection against accidental deletion or regional failure.
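The 35-day PITR window can be reasoned about with a small date calculation. The sketch below is a simplified model, not an AWS API: the function names are hypothetical, and the 5-minute lag between "now" and the latest restorable time is an assumption used for illustration.

```python
from datetime import datetime, timedelta, timezone

PITR_WINDOW = timedelta(days=35)     # maximum look-back stated above
ASSUMED_LAG = timedelta(minutes=5)   # assumption: latest restorable point trails "now"


def restorable_range(now, pitr_enabled_at):
    """Return (earliest, latest) timestamps a restore could target."""
    # You cannot restore to a time before PITR was switched on.
    earliest = max(pitr_enabled_at, now - PITR_WINDOW)
    latest = now - ASSUMED_LAG
    return earliest, latest


def can_restore_to(target, now, pitr_enabled_at):
    """True if target falls inside the restorable window."""
    earliest, latest = restorable_range(now, pitr_enabled_at)
    return earliest <= target <= latest


now = datetime(2024, 6, 30, 12, 0, tzinfo=timezone.utc)
enabled_at = datetime(2024, 1, 1, tzinfo=timezone.utc)

# A point 29 days back is inside the window; a point 60 days back is not.
print(can_restore_to(datetime(2024, 6, 1, tzinfo=timezone.utc), now, enabled_at))
print(can_restore_to(datetime(2024, 5, 1, tzinfo=timezone.utc), now, enabled_at))
```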
Disaster Recovery Strategy Comparison
| Strategy | RPO (Data Loss) | RTO (Downtime) | Cost |
|---|---|---|---|
| Backup & Restore | Hours | 24 Hours+ | $ (Lowest) |
| Pilot Light | Minutes | Hours | $$ |
| Warm Standby | Seconds/Minutes | Minutes | $$$ |
| Multi-Site (Active-Active) | Near Zero | Near Zero | $$$$ (Highest) |
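One way to use this table is to pick the cheapest strategy whose typical RPO and RTO still meet the business targets. The sketch below encodes that idea. The numeric values are illustrative assumptions chosen only to reflect the table's ordering (they are not AWS-published figures), and `cheapest_strategy` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    rpo_minutes: float  # assumed typical data-loss window
    rto_minutes: float  # assumed typical downtime


# Ordered cheapest first, mirroring the Cost column of the table.
# All numbers are rough, illustrative assumptions.
STRATEGIES = [
    Strategy("Backup & Restore", rpo_minutes=240, rto_minutes=1440),
    Strategy("Pilot Light", rpo_minutes=15, rto_minutes=120),
    Strategy("Warm Standby", rpo_minutes=1, rto_minutes=15),
    Strategy("Multi-Site (Active-Active)", rpo_minutes=0, rto_minutes=0),
]


def cheapest_strategy(rpo_target_minutes, rto_target_minutes):
    """Return the name of the cheapest strategy that meets both targets."""
    for strategy in STRATEGIES:
        if (strategy.rpo_minutes <= rpo_target_minutes
                and strategy.rto_minutes <= rto_target_minutes):
            return strategy.name
    raise ValueError("No strategy meets the given targets")


# A workload that tolerates a day of data loss and two days of downtime.
print(cheapest_strategy(rpo_target_minutes=24 * 60, rto_target_minutes=48 * 60))
# A workload that tolerates 5 minutes of data loss and 30 minutes of downtime.
print(cheapest_strategy(rpo_target_minutes=5, rto_target_minutes=30))
```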
Decision Matrix: If–Then Guide
| If the requirement is… | Then choose… |
|---|---|
| Compliance requires non-deletable backups | AWS Backup Vault Lock (Compliance Mode) |
| Restore EBS volumes instantly without “warming” | EBS Fast Snapshot Restore (FSR) |
| Protect against accidental ‘DELETE’ in S3 | S3 Versioning + MFA Delete |
| Centralized backup for RDS, EBS, and EFS | AWS Backup |
| Minimize RPO for DynamoDB to seconds | DynamoDB Point-in-Time Recovery (PITR) |
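For the first row of the matrix, the difference between governance and compliance mode comes down to whether a grace period is set. The sketch below assembles the keyword arguments for the boto3 `backup` client's `put_backup_vault_lock_configuration` call without contacting AWS. The vault name is a hypothetical placeholder and `vault_lock_kwargs` is an illustrative helper.

```python
def vault_lock_kwargs(vault_name, min_days, max_days, grace_days=None):
    """Build arguments for put_backup_vault_lock_configuration.

    If grace_days is given, the lock becomes immutable (compliance mode)
    once the grace period ends. If it is omitted, the lock stays in
    governance mode and can be removed by sufficiently privileged users.
    """
    if min_days > max_days:
        raise ValueError("min_days cannot exceed max_days")
    kwargs = {
        "BackupVaultName": vault_name,
        "MinRetentionDays": min_days,
        "MaxRetentionDays": max_days,
    }
    if grace_days is not None:
        # The grace period (ChangeableForDays) must be at least 3 days.
        if grace_days < 3:
            raise ValueError("grace_days must be at least 3")
        kwargs["ChangeableForDays"] = grace_days
    return kwargs


# Hypothetical vault name, for illustration only.
compliance = vault_lock_kwargs("audit-vault", min_days=365, max_days=2555, grace_days=3)
governance = vault_lock_kwargs("audit-vault", min_days=365, max_days=2555)
print(compliance)
print(governance)
```

With credentials configured, either dictionary could be unpacked into `boto3.client("backup").put_backup_vault_lock_configuration(**compliance)`. Test compliance mode carefully, since it cannot be undone after the grace period.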
Exam Tips and Gotchas
- S3 is the destination: Almost all AWS backup services (EBS Snapshots, RDS Backups) are stored in S3 behind the scenes, but you don’t see them in your S3 buckets.
- Cross-Region is Key: For a true DR scenario, always look for answers that involve “Cross-Region Snapshot Copy” or “Cross-Region Replication.”
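- Automated vs. Manual: By default, RDS automated backups are deleted when the DB instance is deleted (unless you choose to retain them or take a final snapshot). Manual snapshots persist until you delete them.
- EFS Backups: AWS Backup is the primary way to back up EFS, and its recovery points are stored in a backup vault. You cannot “snapshot” EFS like you do EBS.
- Cold Storage: Use AWS Backup lifecycle rules to transition backups to its cold storage tier (comparable to Glacier) for long-term cost savings. Backups must stay in cold storage for at least 90 days, and not every resource type supports the transition.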
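The automated-versus-manual distinction matters most at deletion time. The sketch below assembles the keyword arguments for the boto3 `rds` client's `delete_db_instance` call so that a final snapshot is taken and automated backups are retained. It makes no AWS call. The instance identifier is hypothetical and `safe_delete_kwargs` is an illustrative helper.

```python
def safe_delete_kwargs(db_instance_id, keep_automated_backups=True):
    """Build delete_db_instance arguments that preserve a recovery path."""
    return {
        "DBInstanceIdentifier": db_instance_id,
        # Take a final snapshot instead of skipping it.
        "SkipFinalSnapshot": False,
        # A name is required whenever a final snapshot is taken.
        "FinalDBSnapshotIdentifier": f"{db_instance_id}-final",
        # False keeps the automated backups after the instance is gone.
        "DeleteAutomatedBackups": not keep_automated_backups,
    }


# Hypothetical instance identifier, for illustration only.
kwargs = safe_delete_kwargs("orders-db")
print(kwargs)
```

With credentials configured, this could be unpacked into `boto3.client("rds").delete_db_instance(**kwargs)`.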
Topics covered:
- AWS Backup centralized management and Vault Lock.
- RPO vs. RTO definitions in a DR context.
- EBS Snapshotting and Fast Snapshot Restore (FSR).
- RDS Automated Backups vs. Manual Snapshots.
- DynamoDB PITR and S3 Versioning/Replication.
- Comparison of DR strategies (Backup & Restore, Pilot Light, Warm Standby, Multi-Site).
Backup & Restore Architecture
Service Ecosystem
- IAM: Control who can restore data.
- KMS: Encrypt backups at rest.
- Vault Lock: Prevent deletion (WORM).
- CloudWatch: Monitor backup success/failure.
Performance & Scaling
- EBS FSR: Eliminate “First-touch” penalty.
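- S3 CRR: Asynchronous background replication; add S3 Replication Time Control (RTC) when most objects must replicate within 15 minutes.
- Incremental: After the first full snapshot, EBS snapshots save only the blocks that changed.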
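The incremental idea can be shown with a toy model in which a volume is a mapping from block index to content. This is a deliberately simplified sketch: it assumes a fixed set of blocks and restores by replaying every earlier snapshot, whereas real EBS snapshots manage block references internally so that deleting an older snapshot does not lose data. Both function names are hypothetical.

```python
def incremental_snapshots(volume_states):
    """For each successive volume state, keep only blocks that changed."""
    stored = []
    previous = {}
    for state in volume_states:
        changed = {
            index: data
            for index, data in state.items()
            if previous.get(index) != data
        }
        stored.append(changed)
        previous = state
    return stored


def restore(stored, snapshot_number):
    """Rebuild the volume by layering snapshots 0..snapshot_number."""
    volume = {}
    for changed in stored[: snapshot_number + 1]:
        volume.update(changed)
    return volume


states = [
    {0: "a", 1: "b", 2: "c"},  # first snapshot: every block is new
    {0: "a", 1: "B", 2: "c"},  # only block 1 changed
    {0: "A", 1: "B", 2: "c"},  # only block 0 changed
]
stored = incremental_snapshots(states)
print([len(s) for s in stored])  # blocks saved per snapshot
print(restore(stored, 2) == states[2])
```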
Cost Optimization
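- Lifecycle Policies: Move old backups to cold storage with AWS Backup lifecycle rules; for S3 objects, S3 Lifecycle rules can transition data to S3 Glacier Deep Archive.
- Retention: Delete old manual snapshots.
- Cold Storage: Substantially cheaper per GB for long-term compliance retention; check current regional pricing, since the savings depend on the resource type.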
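To see how a cold storage transition affects the bill, the sketch below compares keeping a backup warm for its whole retention period against moving it to cold storage partway through. The per-GB prices are hypothetical placeholders, not AWS list prices, and the 30-day month is a simplification. `retention_cost` is an illustrative helper.

```python
def retention_cost(size_gb, total_days, warm_price, cold_price, cold_after_days=None):
    """Estimate storage cost for one backup kept for total_days.

    Prices are per GB-month; a month is approximated as 30 days.
    """
    if cold_after_days is None:
        return size_gb * warm_price * total_days / 30
    # Backups must remain in cold storage for at least 90 days.
    if total_days - cold_after_days < 90:
        raise ValueError("Backup must stay in cold storage for at least 90 days")
    warm = size_gb * warm_price * cold_after_days / 30
    cold = size_gb * cold_price * (total_days - cold_after_days) / 30
    return warm + cold


# Hypothetical prices per GB-month, for illustration only.
WARM_PRICE = 0.05
COLD_PRICE = 0.01

all_warm = retention_cost(1000, 365, WARM_PRICE, COLD_PRICE)
tiered = retention_cost(1000, 365, WARM_PRICE, COLD_PRICE, cold_after_days=30)
print(f"all warm: ${all_warm:.2f}, tiered: ${tiered:.2f}")
```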
Production Use Case
A healthcare company uses AWS Backup to take daily snapshots of their RDS SQL Server and EFS file systems. They enable Cross-Account Backup to a separate “Security Account” to protect against ransomware that might compromise their primary production account credentials.
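For a cross-account setup like this one, the destination vault in the security account needs an access policy that allows copies from the production account. The sketch below builds such a policy as a Python dictionary, scoped to an AWS Organization. The organization ID is a hypothetical placeholder and `cross_account_vault_policy` is an illustrative helper; treat the policy as a starting point to review, not a vetted security baseline.

```python
import json


def cross_account_vault_policy(org_id):
    """Return a vault access policy allowing copies from accounts in one org."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCopyFromOrganization",
                "Effect": "Allow",
                "Principal": "*",
                # Permission needed to copy recovery points into this vault.
                "Action": "backup:CopyIntoBackupVault",
                "Resource": "*",
                # Restrict callers to accounts inside the organization.
                "Condition": {"StringEquals": {"aws:PrincipalOrgID": org_id}},
            }
        ],
    }


# Hypothetical organization ID, for illustration only.
policy = cross_account_vault_policy("o-exampleorgid")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

With credentials for the security account, the JSON string could be applied through `boto3.client("backup").put_backup_vault_access_policy(BackupVaultName=..., Policy=policy_json)`.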