Amazon S3 Architecture & Storage Classes
Amazon Simple Storage Service (S3) is an object storage service offering industry-leading scalability, data availability, security, and performance. Unlike block storage (EBS) or file storage (EFS), S3 treats data as “objects” stored in “buckets.”
The “Coat Check” Analogy
Think of Amazon S3 as a high-end coat check service. When you hand over your coat (the Object), the attendant gives you a ticket (the Key). You don’t know exactly which rack your coat is on or how the room is organized, but as long as you have that ticket, you can retrieve your coat exactly as you left it. If you want to change your coat, you don’t sew a new pocket onto it while it’s hanging; you take it out, modify it, and put it back (S3 objects are immutable: you replace the whole object rather than editing it in place).
Core Concepts & Well-Architected Framework
1. Reliability & Durability
S3 is designed for 99.999999999% (11 9’s) durability. This is achieved by automatically storing redundant copies of each object across a minimum of three physically separated Availability Zones (AZs) within an AWS Region (the exception is S3 One Zone-IA, which keeps data in a single AZ).
2. Cost Optimization
AWS provides various storage classes so you can pay only for what you need. By moving infrequently accessed data to lower-cost tiers via Lifecycle Policies, you align costs with data value.
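As a sketch, a lifecycle policy like the one described above can be expressed in the shape boto3’s `put_bucket_lifecycle_configuration` expects. The rule ID, prefix, and day counts below are illustrative assumptions, not recommendations:

```python
# Sketch of an S3 lifecycle configuration (the schema boto3's
# put_bucket_lifecycle_configuration accepts). Rule name, prefix,
# and day counts are illustrative assumptions.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-old-logs",          # hypothetical rule name
            "Filter": {"Prefix": "logs/"},     # hypothetical key prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 2555},      # ~7 years, then delete
        }
    ]
}

# With real credentials you would apply it like:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle_config)
```

Note that the transition days must respect the minimum-duration charges of the target classes (30 days for Standard-IA, for example).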
3. Security
S3 follows the “Least Privilege” model. By default, all buckets are private. Security is managed via IAM Policies, Bucket Policies, and Access Control Lists (ACLs). Encryption can be applied at rest (SSE-S3, SSE-KMS, SSE-C) and in transit (TLS).
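One common way to enforce encryption in transit is a bucket policy that denies any request not made over TLS, using the real `aws:SecureTransport` condition key. A minimal sketch (the bucket name is a placeholder):

```python
import json

# Sketch of a bucket policy denying non-TLS (plain HTTP) requests.
# "example-bucket" is a placeholder name.
bucket = "example-bucket"
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{bucket}",      # the bucket itself
                f"arn:aws:s3:::{bucket}/*",    # every object in it
            ],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}
policy_json = json.dumps(policy)

# Apply (requires credentials):
# boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=policy_json)
```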
Comparison: S3 Storage Classes
| Storage Class | Durability | Availability | Min. Duration | Use Case |
|---|---|---|---|---|
| S3 Standard | 11 9’s | 99.99% | None | Frequent access, active data |
| S3 Intelligent-Tiering | 11 9’s | 99.9% | None | Data with changing access patterns |
| S3 Standard-IA | 11 9’s | 99.9% | 30 Days | Infrequent access, but rapid retrieval |
| S3 One Zone-IA | 11 9’s | 99.5% | 30 Days | Non-critical, replaceable data |
| S3 Glacier Instant Retrieval | 11 9’s | 99.9% | 90 Days | Archived data, millisecond retrieval |
| S3 Glacier Deep Archive | 11 9’s | 99.99% | 180 Days | Long-term archive (retrieval 12-48 hrs) |
Scenario-Based Decision Matrix
- If you have dynamic data with unknown access patterns, then use S3 Intelligent-Tiering to automate cost savings.
- If you need to host a static website with high availability, then use S3 Standard.
- If you store secondary backup copies that can be recreated if a single AZ fails, then use S3 One Zone-IA to save roughly 20% compared to Standard-IA.
- If you must store compliance logs for 7 years and rarely look at them, then use S3 Glacier Deep Archive.
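The matrix above can be encoded as a small helper for self-testing. The function and its scenario labels are hypothetical (not an AWS API), but the returned strings are the real `StorageClass` values the S3 API accepts:

```python
# Hypothetical helper encoding the decision matrix above. The
# scenario labels are assumptions for illustration; the return
# values are real S3 StorageClass identifiers.
def pick_storage_class(access_pattern: str, criticality: str = "critical") -> str:
    if access_pattern == "unknown":
        return "INTELLIGENT_TIERING"       # changing/unknown access
    if access_pattern == "frequent":
        return "STANDARD"                  # active data, static sites
    if access_pattern == "infrequent":
        # Replaceable data can live in a single AZ at lower cost
        return "ONEZONE_IA" if criticality == "replaceable" else "STANDARD_IA"
    if access_pattern == "archive":
        return "DEEP_ARCHIVE"              # compliance logs, cold data
    raise ValueError(f"unknown access pattern: {access_pattern}")
```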
Exam Tips: Golden Nuggets
- Consistency: S3 provides strong read-after-write consistency for all applications (it used to be eventual consistency for overwrites, but that changed in 2020).
- Object Size: The maximum size of a single S3 object is 5 TB. However, the largest object you can upload in a single PUT is 5 GB; AWS recommends Multipart Upload for anything over 100 MB, and it is required above 5 GB.
- Not a File System: S3 is not a mountable drive for an OS. If the exam mentions “locking” files or “Linux permissions (POSIX),” the answer is likely Amazon EFS or FSx, not S3.
- Performance: Use S3 Select to retrieve only a subset of data from an object (e.g., specific rows in a CSV) to improve performance and reduce costs.
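The object-size limits above interact with the 10,000-part cap on Multipart Upload, which forces a minimum part size for very large objects. A quick worked calculation (treating TB/GB as binary units for simplicity):

```python
import math

MAX_OBJECT = 5 * 1024**4      # 5 TiB: largest S3 object
MAX_SINGLE_PUT = 5 * 1024**3  # 5 GiB: largest single PUT
MAX_PARTS = 10_000            # multipart upload part limit

# Smallest part size (bytes) that still fits a maximum-size
# object into 10,000 parts -- roughly 524 MiB:
min_part_size = math.ceil(MAX_OBJECT / MAX_PARTS)

# For typical files, a modest part size is plenty, e.g. a
# 1 GiB file split into 100 MiB parts:
parts_for_1gib = math.ceil(1 * 1024**3 / (100 * 1024**2))
```

So a maximum-size object cannot be uploaded with small parts: each part must be at least ~524 MiB (well under the 5 GiB per-part ceiling).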
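An S3 Select request takes a SQL expression plus input/output serialization settings. A sketch of the parameters boto3’s `select_object_content` accepts; the bucket, key, and query below are illustrative assumptions:

```python
# Sketch of parameters for boto3's select_object_content.
# Bucket, key, and the SQL expression are illustrative assumptions.
select_params = {
    "Bucket": "example-bucket",
    "Key": "users.csv",
    "ExpressionType": "SQL",
    # S3 Select SQL: return only matching rows, not the whole object
    "Expression": "SELECT s.name FROM S3Object s WHERE CAST(s.age AS INT) > 30",
    "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
    "OutputSerialization": {"CSV": {}},
}

# With credentials:
# response = boto3.client("s3").select_object_content(**select_params)
```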
S3 Architectural Flow
Key Services
Lifecycle Policies: Automate transitions between classes (e.g., move to Glacier after 30 days).
Replication: CRR (Cross-Region) or SRR (Same-Region) for compliance and latency.
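A replication rule needs an IAM role S3 can assume and a destination bucket ARN. A minimal sketch in the shape boto3’s `put_bucket_replication` expects; the role ARN, account number, and bucket names are placeholders (note replication also requires versioning enabled on both buckets):

```python
# Sketch of a replication configuration (CRR or SRR -- the
# destination bucket's region decides which). All ARNs and
# names below are placeholders.
replication_config = {
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
    "Rules": [
        {
            "ID": "replicate-everything",  # hypothetical rule name
            "Prefix": "",                  # empty prefix = all objects
            "Status": "Enabled",
            "Destination": {"Bucket": "arn:aws:s3:::example-backup-bucket"},
        }
    ],
}

# Apply (requires credentials; both buckets must have versioning on):
# boto3.client("s3").put_bucket_replication(
#     Bucket="example-source-bucket",
#     ReplicationConfiguration=replication_config)
```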
Common Pitfalls
Public Access: Forgetting to enable “Block Public Access” leads to data leaks.
One Zone-IA: Don’t use this for unique data; if the AZ fails, the data is gone.
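To avoid the public-access pitfall, the four Block Public Access settings (these are the real setting names) should all be on for any bucket that isn’t deliberately public. A sketch:

```python
# The four Block Public Access settings. Turning all four on is
# the safe default for non-public buckets.
public_access_block = {
    "BlockPublicAcls": True,        # reject new public ACLs
    "IgnorePublicAcls": True,       # ignore any existing public ACLs
    "BlockPublicPolicy": True,      # reject public bucket policies
    "RestrictPublicBuckets": True,  # limit access to AWS principals
}

# Apply (requires credentials; bucket name is a placeholder):
# boto3.client("s3").put_public_access_block(
#     Bucket="example-bucket",
#     PublicAccessBlockConfiguration=public_access_block)
```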
Quick Patterns
Static Web: Enable “Static Website Hosting” + CloudFront for global distribution.
Big Data: Use S3 as a “Data Lake” for Athena or Redshift Spectrum queries.
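For the static-web pattern above, enabling website hosting boils down to telling S3 which objects serve as the index and error pages. A sketch in the shape boto3’s `put_bucket_website` expects; the document names are assumptions:

```python
# Sketch of a static website hosting configuration. The document
# names are illustrative assumptions.
website_config = {
    "IndexDocument": {"Suffix": "index.html"},  # served for "/" requests
    "ErrorDocument": {"Key": "error.html"},     # served on 4xx errors
}

# Apply (requires credentials; the bucket must also allow public
# reads or sit behind CloudFront):
# boto3.client("s3").put_bucket_website(
#     Bucket="example-bucket", WebsiteConfiguration=website_config)
```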