Backup & Archival Strategies
In the AWS ecosystem, backup and archival are distinct but related disciplines. Backup focuses on the recovery of active data to maintain business continuity, while archival focuses on the long-term preservation of data that is rarely accessed but must be kept for compliance or historical purposes.
The Attic Analogy
Think of your data like your household items. Backups are like keeping a spare set of car keys in a drawer downstairs; you need them quickly if you lose the first set (Low RTO). Archival is like putting your old tax records in a labeled box in the attic. You don’t expect to need them today, but if an auditor calls in three years, you know exactly where they are, even if it takes a few hours to climb up and get them (Lower Cost, Higher Latency).
Core Concepts: Reliability & Cost Optimization
According to the AWS Well-Architected Framework, backup and archival strategies directly support two pillars:
- Reliability: Ensuring the system can recover from infrastructure or service disruptions through automated backups (RPO/RTO).
- Cost Optimization: Using tiered storage (like S3 Glacier) to minimize spend on data that is rarely accessed and can tolerate slower retrieval.
Comparison: S3 Storage Classes for Archival
| Storage Class | Retrieval Time | Minimum Storage Duration | Storage Cost | Best Use Case |
|---|---|---|---|---|
| S3 Standard-IA | Milliseconds | 30 Days | Moderate | Backups accessed once a month. |
| S3 Glacier Instant Retrieval | Milliseconds | 90 Days | Low | Medical images, archived news media. |
| S3 Glacier Flexible Retrieval | 1 minute to 12 hours (depends on retrieval tier) | 90 Days | Very Low | Standard backups, yearly audits. |
| S3 Glacier Deep Archive | Within 12 hours (Standard) or 48 hours (Bulk) | 180 Days | Lowest | Regulatory/Legal compliance (7-10 yrs). |
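The table can be read as a selection rule: take the cheapest class whose worst-case retrieval time and minimum storage duration both fit your requirements. The sketch below is one way to encode that, not an official AWS tool. The helper names `pick_storage_class` and `billable_days` are hypothetical, the S3 Standard fallback is an assumption added for data that fits none of the rows, and millisecond retrieval is approximated as 0 hours. The class identifiers are the S3 API `StorageClass` values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveClass:
    api_name: str               # S3 API StorageClass value
    max_retrieval_hours: float  # worst-case retrieval time from the table
    min_storage_days: int       # minimum billable storage duration


# Ordered from lowest to highest storage cost, following the table.
CLASSES = [
    ArchiveClass("DEEP_ARCHIVE", 48, 180),
    ArchiveClass("GLACIER", 12, 90),      # S3 Glacier Flexible Retrieval
    ArchiveClass("GLACIER_IR", 0, 90),    # milliseconds, treated as 0 hours
    ArchiveClass("STANDARD_IA", 0, 30),   # milliseconds, treated as 0 hours
]


def pick_storage_class(max_wait_hours: float, retention_days: int) -> str:
    """Return the cheapest class that meets both constraints."""
    for sc in CLASSES:
        fast_enough = sc.max_retrieval_hours <= max_wait_hours
        kept_long_enough = sc.min_storage_days <= retention_days
        if fast_enough and kept_long_enough:
            return sc.api_name
    # Assumption: anything that fits no archival row stays in S3 Standard.
    return "STANDARD"


def billable_days(stored_days: int, min_storage_days: int) -> int:
    """Objects removed early are still billed for the minimum duration."""
    return max(stored_days, min_storage_days)


if __name__ == "__main__":
    print(pick_storage_class(max_wait_hours=48, retention_days=2555))
    print(pick_storage_class(max_wait_hours=0, retention_days=45))
    print(billable_days(stored_days=30, min_storage_days=180))
```

The `billable_days` helper shows why the minimum storage duration matters: an object deleted from Deep Archive after 30 days is still charged as if it had been stored for 180.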
Scenario-Based Decision Matrix
- If you need to manage backups across multiple AWS services (EBS, RDS, EFS) from a central console, Then use AWS Backup.
- If you need to prevent accidental deletion of backups for compliance, Then enable S3 Object Lock or AWS Backup Vault Lock.
- If you have on-premises servers that need to back up directly to the cloud, Then use AWS Storage Gateway (Tape Gateway).
- If you need to move data to archival based on its age, Then implement S3 Lifecycle Policies.
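As an illustration of the age-based rule, the sketch below builds a lifecycle configuration in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The function name, the rule ID, the `backups/` prefix, and the day thresholds are all hypothetical choices for this example. The code only builds and validates the dictionary; the actual API call is left as a comment so the example runs without AWS credentials.

```python
import json


def build_lifecycle_config(prefix: str,
                           ia_after_days: int = 30,
                           glacier_after_days: int = 90,
                           deep_archive_after_days: int = 365,
                           expire_after_days: int = 2555) -> dict:
    """Build an S3 lifecycle configuration that tiers objects down by age."""
    days = [ia_after_days, glacier_after_days,
            deep_archive_after_days, expire_after_days]
    # Each step must happen strictly later than the previous one.
    if any(later <= earlier for earlier, later in zip(days, days[1:])):
        raise ValueError("lifecycle day thresholds must be strictly increasing")
    # S3 requires objects to be at least 30 days old before moving to Standard-IA.
    if ia_after_days < 30:
        raise ValueError("transition to STANDARD_IA requires at least 30 days")

    return {
        "Rules": [{
            "ID": "tier-down-by-age",           # hypothetical rule name
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Transitions": [
                {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
            ],
            # 2555 days is roughly 7 years of retention.
            "Expiration": {"Days": expire_after_days},
        }]
    }


if __name__ == "__main__":
    config = build_lifecycle_config("backups/")
    print(json.dumps(config, indent=2))
    # To apply it (requires boto3 and credentials; bucket name is a placeholder):
    # boto3.client("s3").put_bucket_lifecycle_configuration(
    #     Bucket="example-bucket", LifecycleConfiguration=config)
```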
Exam Tips: Golden Nuggets
- RPO vs. RTO: Recovery Point Objective (RPO) is about data loss (time since last backup); Recovery Time Objective (RTO) is about downtime (how long it takes to restore).
- Cross-Region Copy: For disaster recovery (DR) scenarios, always look for “Cross-Region Backup” or “Cross-Region Replication” to protect against a total region failure.
- Glacier Retrieval Fees: Be careful! While storage is cheap, retrievals from Glacier Flexible or Deep Archive are billed per GB and per request, so restoring large amounts of data can be expensive and can take hours.
- Versioning: Enabling S3 Versioning is a prerequisite for Replication and a primary defense against accidental overwrites.
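The RPO/RTO distinction is easiest to see with timestamps. The sketch below uses made-up times for a single incident: the last good backup, the failure, and the moment service was restored. The function names and the target values are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_backup: datetime, failure: datetime) -> timedelta:
    """Data-loss window: everything written after the last backup is gone."""
    return failure - last_backup


def achieved_rto(failure: datetime, restored: datetime) -> timedelta:
    """Downtime: how long the service was unavailable."""
    return restored - failure


def meets_objectives(rpo: timedelta, rto: timedelta,
                     rpo_target: timedelta, rto_target: timedelta) -> bool:
    """True only if both the data-loss and downtime targets were met."""
    return rpo <= rpo_target and rto <= rto_target


if __name__ == "__main__":
    last_backup = datetime(2024, 1, 15, 2, 0)   # nightly backup at 02:00
    failure = datetime(2024, 1, 15, 9, 30)      # outage starts at 09:30
    restored = datetime(2024, 1, 15, 11, 0)     # service back at 11:00

    rpo = achieved_rpo(last_backup, failure)
    rto = achieved_rto(failure, restored)
    print("RPO achieved:", rpo)   # prints "RPO achieved: 7:30:00"
    print("RTO achieved:", rto)   # prints "RTO achieved: 1:30:00"
    print(meets_objectives(rpo, rto,
                           rpo_target=timedelta(hours=4),
                           rto_target=timedelta(hours=2)))  # prints "False"
```

Here the restore was fast enough (1.5 hours against a 2-hour RTO target), but 7.5 hours of data were lost against a 4-hour RPO target. Fixing that means backing up more often, not restoring faster.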
Visualizing Backup & Archival Flow
Architecting for Durability and Recovery
Key Services
- AWS Backup: Policy-based management.
- S3 Glacier: Secure, durable, low-cost archival.
- AWS DRS: Elastic Disaster Recovery; continuous block-level replication of servers into AWS.
Common Pitfalls
- Manual Snapshots: Hard to scale; use automated plans.
- No Testing: A backup is only good if the restore works.
- Ignoring KMS: Always encrypt your backup vaults.
Quick Patterns
- Daily: Snapshot to S3.
- Monthly: Move to Glacier via Lifecycle.
- Compliance: Enable MFA Delete and Vault Lock.
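The daily and monthly patterns can also be expressed as one AWS Backup plan. The sketch below builds a plan document in the shape that boto3's `create_backup_plan` expects: a daily schedule, a move to cold storage, and a copy to a vault in a second Region. The function name, plan name, rule name, vault name, and ARN (with a placeholder account ID) are hypothetical. The check reflects the AWS Backup rule that backups moved to cold storage must stay there at least 90 days. Cold-storage transition is not available for every resource type, so treat the lifecycle values as an assumption to verify for your workload.

```python
def build_backup_plan(vault_name: str,
                      dr_vault_arn: str,
                      cold_after_days: int = 30,
                      delete_after_days: int = 365) -> dict:
    """Build an AWS Backup plan document with a daily rule and a DR copy."""
    # AWS Backup requires at least 90 days in cold storage before deletion.
    if delete_after_days < cold_after_days + 90:
        raise ValueError(
            "DeleteAfterDays must be at least 90 days after "
            "MoveToColdStorageAfterDays")

    lifecycle = {
        "MoveToColdStorageAfterDays": cold_after_days,
        "DeleteAfterDays": delete_after_days,
    }
    return {
        "BackupPlanName": "daily-with-archive",       # hypothetical name
        "Rules": [{
            "RuleName": "daily",
            "TargetBackupVaultName": vault_name,
            # Every day at 05:00 UTC, in AWS cron syntax.
            "ScheduleExpression": "cron(0 5 ? * * *)",
            "Lifecycle": lifecycle,
            # Cross-Region copy for disaster recovery.
            "CopyActions": [{
                "DestinationBackupVaultArn": dr_vault_arn,
                "Lifecycle": dict(lifecycle),
            }],
        }],
    }


if __name__ == "__main__":
    plan = build_backup_plan(
        vault_name="primary-vault",
        dr_vault_arn=("arn:aws:backup:us-west-2:111122223333:"
                      "backup-vault:dr-vault"))
    print(plan["Rules"][0]["ScheduleExpression"])
    # To create it (requires boto3 and credentials):
    # boto3.client("backup").create_backup_plan(BackupPlan=plan)
```

Vault Lock and KMS encryption are configured on the vault itself rather than in the plan, so they are not shown here.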