Amazon S3 Architecture & Storage Classes

Amazon Simple Storage Service (S3) is an object storage service offering industry-leading scalability, data availability, security, and performance. Unlike block storage (EBS) or file storage (EFS), S3 treats data as “objects” stored in “buckets.”

The “Coat Check” Analogy

Think of Amazon S3 as a high-end coat check service. When you hand over your coat (the Object), the attendant gives you a ticket (the Key). You don’t know exactly which rack your coat is on or how the room is organized, but as long as you have that ticket, you can retrieve your coat exactly as you left it. If you want to change your coat, you don’t sew a new pocket onto it while it’s hanging; you take it out, modify it, and put it back (S3 objects are immutable — every write replaces the whole object).
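The coat-check semantics can be sketched with a plain dictionary. This is a hypothetical in-memory model to illustrate key/object semantics and whole-object writes — it is not the real S3 or boto3 API:

```python
# Hypothetical in-memory model of S3's key/object semantics.
# Illustration only -- not the boto3 API.

class CoatCheck:
    """A bucket maps keys (tickets) to whole objects (coats)."""

    def __init__(self):
        self._objects = {}  # key -> bytes

    def put(self, key: str, data: bytes) -> None:
        # Writes always replace the WHOLE object -- there is no
        # partial, in-place edit of a stored object.
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

bucket = CoatCheck()
bucket.put("coat-42", b"wool, two buttons")
print(bucket.get("coat-42"))   # b'wool, two buttons'

# "Modifying" an object means retrieving it and putting it back whole:
coat = bucket.get("coat-42") + b", new pocket"
bucket.put("coat-42", coat)
print(bucket.get("coat-42"))   # b'wool, two buttons, new pocket'
```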

Core Concepts & Well-Architected Framework

1. Reliability & Durability

S3 is designed for 99.999999999% (11 9’s) of durability. This is achieved by automatically replicating objects across a minimum of three physically separated Availability Zones (AZs) within an AWS Region.
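A quick back-of-the-envelope reading of what 11 9’s means in practice (this is the arithmetic behind AWS’s published design target, not a guarantee):

```python
# What 11 nines of durability implies, roughly.
durability = 0.99999999999          # 11 nines (design target)
annual_loss_prob = 1 - durability   # per-object, per-year

objects = 10_000_000
expected_losses_per_year = objects * annual_loss_prob
print(expected_losses_per_year)       # ~1e-4 objects lost per year
print(1 / expected_losses_per_year)   # i.e. ~one object per 10,000 years
```

In other words, storing 10 million objects, you would on average expect to lose a single object once every 10,000 years.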

2. Cost Optimization

AWS provides various storage classes so you can pay only for what you need. By moving infrequently accessed data to lower-cost tiers via Lifecycle Policies, you align costs with data value.
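A lifecycle policy can be expressed as plain JSON. Below is a sketch in the shape accepted by boto3’s `put_bucket_lifecycle_configuration`; the rule ID, prefix, and day counts are hypothetical examples:

```python
import json

# Sketch of a lifecycle configuration (prefix and day counts are
# illustrative). Applied for real via boto3's
# put_bucket_lifecycle_configuration.
lifecycle = {
    "Rules": [
        {
            "ID": "archive-old-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 2555},  # ~7 years, then delete
        }
    ]
}

# With real credentials this would be applied as:
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle)
print(json.dumps(lifecycle, indent=2))
```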

3. Security

S3 follows the “Least Privilege” model. By default, all buckets are private. Security is managed via IAM Policies, Bucket Policies, and Access Control Lists (ACLs). Encryption can be applied at rest (SSE-S3, SSE-KMS, SSE-C) and in transit (TLS).
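Encryption in transit is commonly enforced with a bucket policy that denies any request not made over TLS. A sketch of that well-known pattern follows (the bucket name is hypothetical):

```python
import json

# A common "encrypt in transit" guardrail: deny any request made
# without TLS. Bucket name is hypothetical.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::my-bucket",
                "arn:aws:s3:::my-bucket/*",
            ],
            # Matches requests where TLS was NOT used.
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}
print(json.dumps(policy, indent=2))
```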

Comparison: S3 Storage Classes

Storage Class                 Durability  Availability  Min. Duration  Use Case
S3 Standard                   11 9’s      99.99%        None           Frequent access, active data
S3 Intelligent-Tiering        11 9’s      99.9%         None           Data with changing access patterns
S3 Standard-IA                11 9’s      99.9%         30 Days        Infrequent access, but rapid retrieval
S3 One Zone-IA                11 9’s      99.5%         30 Days        Non-critical, replaceable data
S3 Glacier Instant Retrieval  11 9’s      99.9%         90 Days        Archived data, millisecond retrieval
S3 Glacier Deep Archive       11 9’s      99.99%        180 Days       Long-term archive (retrieval 12-48 hrs)

Scenario-Based Decision Matrix

  • If you have dynamic data with unknown access patterns, Then use S3 Intelligent-Tiering to automate cost savings.
  • If you need to host a static website with high availability, Then use S3 Standard.
  • If you have secondary backup copies that can be recreated (data you can afford to lose if a single AZ fails), Then use S3 One Zone-IA, which costs about 20% less than Standard-IA.
  • If you must store compliance logs for 7 years and rarely look at them, Then use S3 Glacier Deep Archive.
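The decision matrix above can be condensed into a small mnemonic function — a study aid for the exam patterns, not an official AWS selection algorithm:

```python
# Mnemonic sketch of the decision matrix above (study aid only,
# not an official AWS selection algorithm).
def pick_storage_class(known_access_pattern: bool,
                       frequent_access: bool,
                       recreatable: bool,
                       archival: bool) -> str:
    if archival:
        return "S3 Glacier Deep Archive"
    if not known_access_pattern:
        return "S3 Intelligent-Tiering"
    if frequent_access:
        return "S3 Standard"
    # Infrequent access: One Zone-IA only if the data can be recreated.
    return "S3 One Zone-IA" if recreatable else "S3 Standard-IA"

print(pick_storage_class(False, False, False, False))  # S3 Intelligent-Tiering
print(pick_storage_class(True, True, False, False))    # S3 Standard
print(pick_storage_class(True, False, True, False))    # S3 One Zone-IA
print(pick_storage_class(True, False, False, True))    # S3 Glacier Deep Archive
```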

Exam Tips: Golden Nuggets

  • Consistency: S3 provides strong read-after-write consistency for all GET, PUT, and LIST operations (overwrite PUTs and DELETEs were only eventually consistent until December 2020).
  • Object Size: The maximum size of a single S3 object is 5 TB. However, the maximum upload size in a single PUT is 5 GB (use Multipart Upload for anything over 100 MB).
  • Not a File System: S3 is not a mountable drive for an OS. If the exam mentions “locking” files or “Linux permissions (POSIX),” the answer is likely Amazon EFS or FSx, not S3.
  • Performance: Use S3 Select to retrieve only a subset of data from an object (e.g., specific rows in a CSV) to improve performance and reduce costs.
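The size limits above imply some useful multipart arithmetic: multipart uploads allow at most 10,000 parts with a 5 MB minimum part size (except the last part), so the part size must grow for very large objects. A sketch of that calculation:

```python
import math

# Multipart upload arithmetic from S3's published limits:
# at most 10,000 parts per upload, minimum part size 5 MiB
# (the last part may be smaller).
MAX_PARTS = 10_000
MIN_PART = 5 * 1024**2        # 5 MiB

def min_part_size(object_size: int) -> int:
    """Smallest legal part size (bytes) for an object of this size."""
    return max(MIN_PART, math.ceil(object_size / MAX_PARTS))

five_tib = 5 * 1024**4
print(min_part_size(five_tib))       # ~550 MB per part for a 5 TiB object
print(min_part_size(100 * 1024**2))  # a 100 MiB object keeps the 5 MiB floor
```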

S3 Architectural Flow

User/App → S3 Bucket → Standard (Active) → IA (Infrequent) → Glacier (Archive)

Key Services

Lifecycle Policies: Automate transitions between classes (e.g., move to Glacier after 30 days).

Replication: CRR (Cross-Region) or SRR (Same-Region) for compliance and latency.
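A replication rule is also just JSON. The sketch below follows the shape accepted by boto3’s `put_bucket_replication`; the bucket names and IAM role ARN are hypothetical, and note that the source bucket must have versioning enabled:

```python
# Sketch of a cross-region replication (CRR) configuration.
# Bucket names and the IAM role ARN are hypothetical; the source
# bucket must have versioning enabled for replication to work.
replication = {
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
    "Rules": [
        {
            "ID": "crr-to-dr-region",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},  # empty filter = replicate the whole bucket
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": "arn:aws:s3:::my-dr-bucket"},
        }
    ],
}

# With real credentials this would be applied as:
# boto3.client("s3").put_bucket_replication(
#     Bucket="my-source-bucket", ReplicationConfiguration=replication)
print(replication["Rules"][0]["Destination"]["Bucket"])
```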

Common Pitfalls

Public Access: Forgetting to enable “Block Public Access” leads to data leaks.

One Zone-IA: Don’t use this for unique data; if the AZ fails, the data is gone.

Quick Patterns

Static Web: Enable “Static Website Hosting” + CloudFront for global distribution.

Big Data: Use S3 as a “Data Lake” for Athena or Redshift Spectrum queries.
