Amazon EFS: Elastic File System Study Guide
Amazon EFS provides a simple, serverless, set-and-forget elastic file system for use with AWS Cloud services and on-premises resources. It is built to scale on demand to petabytes without disrupting applications, growing and shrinking automatically as you add and remove files.
The Real-World Analogy
Think of Amazon EFS as a shared office network drive. Just as multiple employees in an office can open, edit, and save files to the same central folder simultaneously from different computers, EFS allows hundreds or thousands of EC2 instances to access the same data at once. It expands its storage capacity automatically as the office grows, so you never have to buy a “bigger hard drive.”
Core Concepts & Architecture
A Regional EFS file system stores data redundantly across multiple Availability Zones (AZs) for high durability and availability; One Zone file systems keep data in a single AZ. EFS is accessed over the NFSv4 protocol (versions 4.0 and 4.1) and is designed for Linux-based workloads.
Storage Classes
- Standard: Frequently accessed data. Multi-AZ resilience.
- Standard-IA (Infrequent Access): Cost-optimized for data not accessed daily. Multi-AZ resilience.
- One Zone: Stores data in a single AZ. Lower cost (approx. 47% less than Standard), but lower availability because the data does not survive the loss of that AZ.
- One Zone-IA: Cheapest option for infrequently accessed data in a single AZ.
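The four classes above differ along two axes: resilience (Multi-AZ or single AZ) and access frequency. As a rough study aid, that mapping can be sketched in a few lines of Python. The function name and its two boolean parameters are illustrative assumptions, not part of any AWS API.

```python
def choose_storage_class(needs_multi_az: bool, accessed_frequently: bool) -> str:
    """Map two yes/no requirements onto the four storage classes listed above.

    Hypothetical helper for study purposes only; it encodes the guide's
    summary, not an official AWS decision procedure.
    """
    if needs_multi_az:
        # Multi-AZ resilience: Standard or Standard-IA.
        return "Standard" if accessed_frequently else "Standard-IA"
    # Single-AZ storage: cheaper, lower availability.
    return "One Zone" if accessed_frequently else "One Zone-IA"


# Print the full two-by-two grid.
for multi_az in (True, False):
    for frequent in (True, False):
        print(multi_az, frequent, "->", choose_storage_class(multi_az, frequent))
```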
Performance & Throughput Modes
| Setting | Option | Characteristics |
|---|---|---|
| Performance Mode | General Purpose (default) | Lowest latency per operation. Best for web servers, CMS, and dev tools. |
| Performance Mode | Max I/O | Higher aggregate throughput and IOPS, at the cost of higher latency. Best for big data and parallel processing. |
| Throughput Mode | Bursting | Throughput scales with the amount of data stored. |
| Throughput Mode | Provisioned | Throughput is set manually, independent of the amount of data stored. |
| Throughput Mode | Elastic (recommended) | Throughput scales automatically with workload activity. Pay-per-use. |
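To make the two settings concrete, here is a minimal sketch that assembles the request parameters for creating a file system. The parameter names and values (`PerformanceMode`, `ThroughputMode`, `ProvisionedThroughputInMibps`, `Encrypted`, `CreationToken`) follow the EFS `CreateFileSystem` API as exposed by boto3; the helper function, its defaults, and the token value are assumptions made for illustration. The block only builds a dictionary and makes no AWS call.

```python
from typing import Any, Dict, Optional

PERFORMANCE_MODES = {"generalPurpose", "maxIO"}
THROUGHPUT_MODES = {"bursting", "provisioned", "elastic"}


def build_create_file_system_params(
    creation_token: str,
    performance_mode: str = "generalPurpose",
    throughput_mode: str = "elastic",
    provisioned_mibps: Optional[float] = None,
    encrypted: bool = True,
) -> Dict[str, Any]:
    """Build keyword arguments for an EFS CreateFileSystem request.

    Hypothetical helper: validates the mode names and returns a dict.
    """
    if performance_mode not in PERFORMANCE_MODES:
        raise ValueError(f"unknown performance mode: {performance_mode}")
    if throughput_mode not in THROUGHPUT_MODES:
        raise ValueError(f"unknown throughput mode: {throughput_mode}")
    # A throughput figure is needed for provisioned mode and meaningless otherwise.
    if (throughput_mode == "provisioned") != (provisioned_mibps is not None):
        raise ValueError("provisioned_mibps must be given only with provisioned mode")

    params: Dict[str, Any] = {
        "CreationToken": creation_token,
        "PerformanceMode": performance_mode,
        "ThroughputMode": throughput_mode,
        "Encrypted": encrypted,
    }
    if provisioned_mibps is not None:
        params["ProvisionedThroughputInMibps"] = provisioned_mibps
    return params


# Example: the defaults described above (General Purpose + Elastic).
print(build_create_file_system_params("study-guide-demo"))
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("efs").create_file_system(**params)`.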
Decision Matrix: EFS vs. EBS vs. S3
| Requirement | Amazon EBS | Amazon EFS | Amazon S3 |
|---|---|---|---|
| Access Pattern | Single Instance (usually) | Thousands of Instances | Global/Web Access |
| Protocol | Block (iSCSI-like) | File (NFS) | Object (REST API) |
| Scalability | Manual scaling | Auto-scaling | Virtually Infinite |
| OS Support | Linux & Windows | Linux only (POSIX) | Any |
Exam Tips and Gotchas
- Linux Only: If the exam mentions a Windows shared drive, choose Amazon FSx for Windows File Server, not EFS.
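- Mount Targets: To connect an EC2 instance to EFS, the file system needs a Mount Target in the instance's VPC. Create one Mount Target per AZ and have each instance use the one in its own AZ; mounting through another AZ's Mount Target works but adds latency and cross-AZ data transfer cost.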
- Security Groups: Ensure the EFS Security Group allows inbound traffic on TCP Port 2049 (NFS) from the EC2 Security Group.
- On-Premises: You can access EFS from on-premises servers via AWS Direct Connect or AWS VPN.
- Lifecycle Management: Use this to automatically move files to IA storage classes after a configurable period without access (for example 7, 14, 30, 60, or 90 days) to save costs.
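The Security Groups and Mount Targets tips can be sketched as follows. The ingress rule uses the parameter shape of the EC2 `AuthorizeSecurityGroupIngress` API, and the mount command uses the NFS options AWS documents for EFS. The helper names, the IDs, the Region, and the `/mnt/efs` mount point are placeholders assumed for illustration; nothing here contacts AWS or mounts anything.

```python
from typing import Any, Dict, List

NFS_PORT = 2049  # TCP port used by NFS, as noted in the tip above


def nfs_ingress_rule(efs_sg_id: str, ec2_sg_id: str) -> Dict[str, Any]:
    """Build an ingress rule letting the EC2 security group reach EFS on 2049.

    Hypothetical helper; returns kwargs shaped for
    ec2.authorize_security_group_ingress.
    """
    return {
        "GroupId": efs_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": NFS_PORT,
                "ToPort": NFS_PORT,
                # Reference the instances' security group instead of a CIDR.
                "UserIdGroupPairs": [{"GroupId": ec2_sg_id}],
            }
        ],
    }


def mount_command(file_system_id: str, region: str, mount_point: str = "/mnt/efs") -> List[str]:
    """Build the NFS mount command for an EFS file system (not executed here)."""
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    options = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
    return ["sudo", "mount", "-t", "nfs4", "-o", options, f"{dns_name}:/", mount_point]


# Placeholder IDs, for illustration only.
print(nfs_ingress_rule("sg-0aaaaaaaaaaaaaaaa", "sg-0bbbbbbbbbbbbbbbb"))
print(" ".join(mount_command("fs-0123456789abcdef0", "us-east-1")))
```

The file system's DNS name resolves to the Mount Target in the instance's own AZ, which is one reason to create a Mount Target in every AZ that hosts clients.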
Decision Matrix / If–Then Guide
- IF you need a shared file system for a fleet of Linux web servers THEN choose Amazon EFS.
- IF you need high-performance block storage attached to a single instance, such as for a database, THEN choose Amazon EBS.
- IF you need to store millions of images for a website with global access THEN choose Amazon S3.
- IF you need a shared file system for Windows instances (SMB protocol) THEN choose Amazon FSx for Windows File Server.
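The If–Then rules above can be written as a small lookup, which may help when drilling exam scenarios. The function name and the `access_type` / `os_family` labels are assumptions chosen for this sketch.

```python
def choose_storage_service(access_type: str, os_family: str = "linux") -> str:
    """Return the service the If-Then guide above points to.

    access_type: "block", "file", or "object" (hypothetical labels).
    os_family:   "linux" or "windows"; only matters for shared file storage.
    """
    access_type = access_type.lower()
    os_family = os_family.lower()

    if access_type == "object":
        return "Amazon S3"
    if access_type == "block":
        return "Amazon EBS"
    if access_type == "file":
        if os_family == "windows":
            return "Amazon FSx for Windows File Server"
        if os_family == "linux":
            return "Amazon EFS"
    raise ValueError(f"unsupported combination: {access_type!r}, {os_family!r}")


# The four scenarios from the guide.
print(choose_storage_service("file", "linux"))
print(choose_storage_service("block"))
print(choose_storage_service("object"))
print(choose_storage_service("file", "windows"))
```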
Topics Covered
Summary of key subtopics covered in this guide:
- EFS Regional vs. One Zone Deployment
- Standard vs. Infrequent Access (IA) Storage Classes
- General Purpose vs. Max I/O Performance Modes
- Bursting, Provisioned, and Elastic Throughput
- NFSv4 Protocol and Linux Compatibility
- Network security via Security Groups and Mount Targets
- Cost optimization using Lifecycle Management
Amazon EFS Architectural Infographic
IAM: Control who can mount/modify the file system.
KMS: Encryption at rest with AWS-managed or customer-managed keys.
CloudWatch: Monitor BurstCreditBalance and throughput metrics such as PermittedThroughput.
Elastic Throughput: The modern way to handle unpredictable workloads without over-provisioning.
Scales to Petabytes: No need to pre-allocate storage size.
Lifecycle Management: Set a policy to move files to EFS-IA if they have not been accessed for 30 days. IA storage costs up to 92% less than Standard.
One Zone: Use for non-critical data or dev/test environments.
Production Use Case: Content Management Systems (CMS) like WordPress. Multiple web servers share the same /wp-content/uploads directory, ensuring all users see the same images regardless of which server handles their request.
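For a shared uploads directory like the one above, where older files are rarely read, the 30-day lifecycle policy from the infographic could be expressed as the request parameters below. The `LifecyclePolicies` / `TransitionToIA` shape follows the EFS `PutLifecycleConfiguration` API; the helper name, the file system ID, and the list of accepted day counts are assumptions to verify against the current API reference. The block builds a dictionary only.

```python
from typing import Any, Dict

# Day counts believed to be accepted by the API at the time of writing (assumption).
ALLOWED_IA_DAYS = (1, 7, 14, 30, 60, 90, 180, 270, 365)


def build_lifecycle_params(file_system_id: str, days_to_ia: int = 30) -> Dict[str, Any]:
    """Build kwargs for an EFS PutLifecycleConfiguration request.

    Hypothetical helper: files not accessed for `days_to_ia` days would be
    transitioned to the Infrequent Access storage class.
    """
    if days_to_ia not in ALLOWED_IA_DAYS:
        raise ValueError(f"days_to_ia must be one of {ALLOWED_IA_DAYS}")
    unit = "DAY" if days_to_ia == 1 else "DAYS"
    return {
        "FileSystemId": file_system_id,
        "LifecyclePolicies": [{"TransitionToIA": f"AFTER_{days_to_ia}_{unit}"}],
    }


# Placeholder file system ID, for illustration only.
print(build_lifecycle_params("fs-0123456789abcdef0"))
```

With boto3 available, the dictionary could be passed as `boto3.client("efs").put_lifecycle_configuration(**params)`.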