AWS Storage Services: Cost Optimization Study Guide
Cost optimization is a pillar of the AWS Well-Architected Framework. In the SAA-C03 exam, you aren’t just asked how to store data, but how to store it most efficiently. This guide focuses on selecting the right storage class, automating data lifecycles, and choosing the correct volume types to minimize spend without sacrificing durability.
The Real-World Analogy
Think of AWS Storage like a Home Organization System. You keep your daily essentials (wallet, keys) on a tray by the door (S3 Standard). You keep seasonal clothes (winter coats) in the top of the closet where they are slightly harder to reach (S3 Standard-IA). Finally, you put old tax returns in a box in a rented storage unit across town (S3 Glacier). You pay less for the storage unit, but it takes more time and effort to get your items back.
1. Amazon S3 Cost Optimization
S3 is the most frequent target for cost-related questions. The key is matching the access pattern to the storage class.
- S3 Intelligent-Tiering: The “Magic” tier. It automatically moves objects between frequent and infrequent access tiers based on changing access patterns. Use this when access patterns are unknown or unpredictable.
- S3 Standard-IA & One Zone-IA: Lower storage price but higher retrieval fees. Use for data that is not accessed often but needs millisecond access when requested. Note: One Zone-IA is about 20% cheaper than Standard-IA but stores data in a single Availability Zone, so it lacks multi-AZ resilience.
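- S3 Glacier Tiers: For long-term archival.
  - Instant Retrieval: Millisecond access for rarely accessed data.
  - Flexible Retrieval: Minutes to hours (expedited in 1-5 minutes, standard in 3-5 hours, bulk in 5-12 hours).
  - Deep Archive: Retrieval within 12 hours (standard) or 48 hours (bulk). This is the lowest-cost S3 storage class.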
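The trade-off between a lower storage price and a higher retrieval fee can be sketched as a simple monthly cost comparison. The prices below are illustrative placeholders, not actual AWS pricing, and the names `StorageClassPrice`, `monthly_cost`, and `cheapest_class` are hypothetical helpers introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageClassPrice:
    """Per-GB prices in USD. Values used below are placeholders, not AWS pricing."""
    name: str
    storage_per_gb_month: float
    retrieval_per_gb: float


# Assumed example prices, chosen only to illustrate the trade-off.
STANDARD = StorageClassPrice("STANDARD", 0.023, 0.0)
STANDARD_IA = StorageClassPrice("STANDARD_IA", 0.0125, 0.01)


def monthly_cost(price: StorageClassPrice, stored_gb: float, retrieved_gb: float) -> float:
    """Storage charge plus retrieval charge for one month (request fees ignored)."""
    return price.storage_per_gb_month * stored_gb + price.retrieval_per_gb * retrieved_gb


def cheapest_class(classes, stored_gb: float, retrieved_gb: float) -> StorageClassPrice:
    """Return the class with the lowest monthly cost for this access pattern."""
    return min(classes, key=lambda p: monthly_cost(p, stored_gb, retrieved_gb))


if __name__ == "__main__":
    for retrieved in (100, 2000):
        best = cheapest_class([STANDARD, STANDARD_IA], stored_gb=1000, retrieved_gb=retrieved)
        print(f"{retrieved} GB retrieved per month -> {best.name}")
```

Under these assumed prices, data that is read rarely favors Standard-IA, while data that is read back more than about once a month favors S3 Standard, which is the access-pattern matching the exam expects.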
2. EBS Volume Optimization
Elastic Block Store (EBS) costs are driven by volume type, provisioned size, and IOPS/Throughput.
- gp3 vs gp2: Always look for gp3 in exam answers. It is up to 20% cheaper per GB than gp2, includes a baseline of 3,000 IOPS and 125 MiB/s at any volume size, and lets you scale IOPS and throughput independently of storage size.
- HDD Volumes (st1, sc1): These are much cheaper than SSDs. Use st1 (Throughput Optimized) for big data/log processing and sc1 (Cold HDD) for infrequently accessed large datasets. Remember: HDD volumes cannot be used as boot volumes.
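To see why decoupling performance from size matters, the sketch below compares gp2's size-based baseline (3 IOPS per GiB, with a floor of 100 and a ceiling of 16,000) against gp3's flat 3,000 IOPS baseline. It ignores gp2 burst credits, and the function names are hypothetical helpers, not part of any AWS SDK.

```python
import math

GP3_BASELINE_IOPS = 3000  # gp3 baseline, independent of volume size


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline: 3 IOPS per GiB, clamped to the range 100..16000."""
    return min(max(3 * size_gib, 100), 16000)


def gp2_size_for_iops(target_iops: int) -> int:
    """Smallest gp2 volume (GiB) whose baseline reaches target_iops."""
    return math.ceil(target_iops / 3)


if __name__ == "__main__":
    size = 100
    print(f"gp2 at {size} GiB: {gp2_baseline_iops(size)} baseline IOPS")
    print(f"gp3 at {size} GiB: {GP3_BASELINE_IOPS} baseline IOPS")
    print(f"gp2 size needed for 3000 IOPS: {gp2_size_for_iops(3000)} GiB")
```

With gp2, reaching a sustained 3,000 IOPS means paying for roughly 1,000 GiB whether or not you need the space. With gp3, a 100 GiB volume already has that baseline.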
3. EFS Lifecycle Management
Amazon EFS can be expensive if left on the “Standard” tier. Use Lifecycle Management to automatically move files that haven’t been accessed for a period (e.g., 30 days) to the EFS Infrequent Access (IA) or EFS Archive storage classes, which can reduce costs by up to 92%.
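The "up to 92%" figure applies only to the files that actually move to a colder tier, so the overall saving depends on how much of the file system is cold. The sketch below works through that arithmetic and shows a lifecycle policy list in the shape expected by the `LifecyclePolicies` parameter of boto3's EFS `put_lifecycle_configuration`. The 90-day Archive threshold is an assumption added for illustration, `blended_savings` is a hypothetical helper, and per-GB access charges on the colder tiers are ignored.

```python
# Policy list in the shape used by boto3's EFS put_lifecycle_configuration.
# The 30-day value comes from the guide; the 90-day Archive value is an assumption.
LIFECYCLE_POLICIES = [
    {"TransitionToIA": "AFTER_30_DAYS"},
    {"TransitionToArchive": "AFTER_90_DAYS"},
]


def blended_savings(cold_fraction: float, tier_discount: float = 0.92) -> float:
    """Fraction of the storage bill saved when cold_fraction of the data
    moves to a tier that is tier_discount cheaper (access fees ignored)."""
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    return cold_fraction * tier_discount


if __name__ == "__main__":
    for cold in (0.2, 0.5, 0.8):
        print(f"{cold:.0%} cold data -> {blended_savings(cold):.1%} saved")
```

For example, if 80% of the files go cold, the storage bill drops by about 73.6% under these assumptions, not the full 92%.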
Comparison Table: Storage Cost Drivers
| Service | Primary Cost Factor | Optimization Strategy | Durability |
|---|---|---|---|
| S3 | GB per month + Requests | Lifecycle Policies / Intelligent-Tiering | 11 9s (Standard) |
| EBS | Provisioned GB per month | Switch gp2 to gp3; Delete unattached volumes | 99.8% to 99.9% (99.999% for io2) |
| EFS | Stored GB per month | Lifecycle Management (IA and Archive tiers) | 11 9s |
| FSx | Provisioned SSD/HDD + Throughput | Data Deduplication (Windows) | Depends on deployment type (Single-AZ or Multi-AZ) |
Exam Tips and Gotchas
- S3 One Zone-IA: If the exam mentions “reproducible data” (like thumbnails or transcoded videos) and “lowest cost,” this is the winner.
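- S3 Lifecycle Policies: Objects must be stored for at least 30 days in S3 Standard before a lifecycle rule can transition them to Standard-IA or One Zone-IA. The IA classes also bill a minimum object size of 128 KB and a minimum storage duration of 30 days, so transitioning small or short-lived objects rarely saves money.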
- EBS Snapshots: These are stored in S3 and are incremental. To save costs, delete old snapshots that are no longer needed.
- S3 Select / Glacier Select: These allow you to use SQL to pull only the data you need from an object, reducing data transfer costs significantly.
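The 128 KB and 30-day minimums from the lifecycle tip above can be expressed as a small billing sketch. The helper names are hypothetical, and the code models only the minimum-charge rules, not actual prices.

```python
MIN_BILLABLE_BYTES = 128 * 1024  # Standard-IA / One Zone-IA minimum billable object size
MIN_STORAGE_DAYS = 30            # Standard-IA / One Zone-IA minimum storage duration


def ia_billable_bytes(object_bytes: int) -> int:
    """Bytes billed for one object in an IA class (small objects are rounded up)."""
    return max(object_bytes, MIN_BILLABLE_BYTES)


def ia_billable_days(days_stored: int) -> int:
    """Days billed for an object deleted or moved after days_stored days."""
    return max(days_stored, MIN_STORAGE_DAYS)


if __name__ == "__main__":
    small = 10 * 1024
    print(f"10 KB object is billed as {ia_billable_bytes(small) // 1024} KB")
    print(f"Object deleted after 10 days is billed for {ia_billable_days(10)} days")
```

A bucket full of 10 KB objects is billed as if each were 128 KB in an IA class, which is why small objects are usually better left in S3 Standard.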
Decision Matrix: If-Then Guide
- IF access patterns are unpredictable THEN use S3 Intelligent-Tiering.
- IF data is long-term archival and you can wait 12 hours THEN use S3 Glacier Deep Archive.
- IF you need a low-cost HDD boot volume THEN STOP (Trick question! HDD volumes cannot be used as boot volumes; use gp3).
- IF you have many EC2 instances sharing files THEN use EFS with Lifecycle Management.
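The If-Then guide above can be read as a small rule chain. The sketch below encodes it as a function; the parameter names and the order in which the rules are checked are assumptions made for illustration, since the guide does not rank the rules.

```python
def recommend_storage(
    *,
    needs_boot_volume: bool = False,
    shared_across_instances: bool = False,
    long_term_archive: bool = False,
    acceptable_wait_hours: float = 0,
    unpredictable_access: bool = False,
) -> str:
    """Return the guide's suggested option for the first rule that matches."""
    if needs_boot_volume:
        # HDD volumes (st1, sc1) cannot be boot volumes.
        return "EBS gp3"
    if shared_across_instances:
        return "EFS with Lifecycle Management"
    if long_term_archive and acceptable_wait_hours >= 12:
        return "S3 Glacier Deep Archive"
    if unpredictable_access:
        return "S3 Intelligent-Tiering"
    return "No rule matched"


if __name__ == "__main__":
    print(recommend_storage(unpredictable_access=True))
    print(recommend_storage(long_term_archive=True, acceptable_wait_hours=12))
    print(recommend_storage(needs_boot_volume=True))
```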
Topics Covered
Key subtopics covered in this guide:
- S3 Storage Classes and transition logic.
- EBS volume type selection (SSD vs HDD).
- EFS Lifecycle Management for shared file systems.
- S3 Select for reducing egress costs.
- Storage durability vs. cost trade-offs.
Infographic: The Storage Cost Optimizer
Automated Lifecycle Transitions: Moving data to cheaper tiers as it ages.
Performance & Scaling
- S3 Select: Retrieve only specific columns or rows from CSV, JSON, or Parquet objects. Reduces the data scanned and returned, and therefore cost.
- Intelligent-Tiering: No retrieval fees, though a small per-object monitoring charge applies. Best for data with unknown access patterns.
Block & File Savings
- gp3 vs gp2: gp3 provides better baseline performance at a lower price point.
- EFS IA: Move files to Infrequent Access automatically via policy to save up to 92%.
Log Archival Strategy
Scenario: Store application logs for 7 years for compliance.
Solution: Store in S3 Standard for 30 days, transition to Glacier Flexible Retrieval for 6 months, then to Glacier Deep Archive for the remainder.
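The solution above could be written as an S3 lifecycle configuration. The sketch below builds a dictionary in the shape accepted by the `LifecycleConfiguration` parameter of boto3's S3 `put_bucket_lifecycle_configuration`, without calling AWS. The `logs/` prefix, the rule ID, the reading of "6 months" as 180 days, and the choice to expire objects once the 7 years have passed are all assumptions for illustration.

```python
def log_archival_lifecycle(
    prefix: str = "logs/",        # hypothetical key prefix
    standard_days: int = 30,      # days in S3 Standard
    flexible_days: int = 180,     # "6 months" in Glacier Flexible Retrieval (assumed 180 days)
    retention_years: int = 7,     # compliance retention period
) -> dict:
    """Build a lifecycle configuration for the 7-year log scenario."""
    deep_archive_day = standard_days + flexible_days
    return {
        "Rules": [
            {
                "ID": "log-archival",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    # "GLACIER" is the API name for Glacier Flexible Retrieval.
                    {"Days": standard_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_day, "StorageClass": "DEEP_ARCHIVE"},
                ],
                # Assumption: delete logs once the retention period ends.
                "Expiration": {"Days": retention_years * 365},
            }
        ]
    }


if __name__ == "__main__":
    import json
    print(json.dumps(log_archival_lifecycle(), indent=2))
```

Transition days count from object creation, so the Deep Archive transition lands on day 210 rather than day 180.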