Amazon DynamoDB: The Serverless NoSQL Powerhouse
Amazon DynamoDB is a fully managed, multi-region, multi-active, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications. It is a Key-Value and Document database that delivers single-digit millisecond performance at any scale.
Core Concepts for SAA-C03
1. Data Structure: Tables, Items, and Attributes
- Items: Similar to rows in RDBMS. Each item can have different attributes (schema-less).
- Attributes: Similar to columns.
- Primary Key: Can be a simple Partition Key (PK) or a composite key (PK + Sort Key).
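The composite-key model can be sketched with a hypothetical Orders table (partition key `customer_id`, sort key `order_date`); the table and attribute names are illustrative, not from the source:

```python
# Hypothetical "Orders" table: partition key = customer_id, sort key = order_date.
# Items sharing a partition key live together, ordered by the sort key, so one
# Query can fetch all orders for a customer. Attributes are schema-less: each
# item may carry different ones.

order_1 = {
    "customer_id": "CUST#1001",      # Partition Key (PK)
    "order_date": "2024-05-01",      # Sort Key (SK)
    "total": 59.99,
    "status": "SHIPPED",
}
order_2 = {
    "customer_id": "CUST#1001",
    "order_date": "2024-06-12",
    "gift_note": "Happy birthday!",  # attribute order_1 does not have
}

# The composite primary key is the (PK, SK) pair; it must be unique per item.
def primary_key(item):
    return (item["customer_id"], item["order_date"])

assert primary_key(order_1) != primary_key(order_2)
```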
2. Read/Write Capacity Modes
- Provisioned Mode: You specify Read Capacity Units (RCUs) and Write Capacity Units (WCUs) per second. Best for predictable traffic and cost control. Auto-scaling is available.
- On-Demand Mode: Scales instantly based on traffic. You pay per request. Best for unpredictable workloads or new applications.
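Capacity-unit sizing in Provisioned Mode follows a simple rule: 1 RCU = one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads); 1 WCU = one write per second of an item up to 1 KB. A quick sketch of the arithmetic (traffic numbers are illustrative):

```python
import math

def rcus_needed(reads_per_sec, item_kb, strongly_consistent=True):
    """RCUs for reads: each read consumes ceil(item_kb / 4) units when
    strongly consistent, and half that when eventually consistent."""
    per_read = math.ceil(item_kb / 4)
    if not strongly_consistent:
        per_read /= 2
    return reads_per_sec * per_read

def wcus_needed(writes_per_sec, item_kb):
    """WCUs for writes: each write consumes ceil(item_kb / 1) units."""
    return writes_per_sec * math.ceil(item_kb)

# 80 strongly consistent reads/sec of 6 KB items: ceil(6/4) = 2 RCUs each -> 160
print(rcus_needed(80, 6))                             # 160
# The same traffic eventually consistent costs half -> 80
print(rcus_needed(80, 6, strongly_consistent=False))  # 80.0
# 100 writes/sec of 2.5 KB items: ceil(2.5) = 3 WCUs each -> 300
print(wcus_needed(100, 2.5))                          # 300
```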
3. Consistency Models
- Eventually Consistent Reads (Default): Maximizes read throughput; might not reflect a very recent write.
- Strongly Consistent Reads: Returns the most up-to-date data. Costs double the RCUs compared to eventual consistency.
- ACID Transactions: Support for all-or-nothing operations across multiple tables.
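To illustrate the all-or-nothing semantics, here is a sketch of a `TransactWriteItems` request payload in the low-level boto3 client's wire shape; the `Accounts` table, attributes, and transfer scenario are hypothetical, and the request is only built, not sent:

```python
# Hypothetical all-or-nothing transfer: debit one account, credit another.
# Both updates share one TransactWriteItems request, so DynamoDB applies
# them atomically; if the condition on the debit fails, neither is written.

def build_transfer(from_id, to_id, amount):
    return {
        "TransactItems": [
            {
                "Update": {
                    "TableName": "Accounts",
                    "Key": {"account_id": {"S": from_id}},
                    "UpdateExpression": "SET balance = balance - :amt",
                    "ConditionExpression": "balance >= :amt",  # no overdraft
                    "ExpressionAttributeValues": {":amt": {"N": str(amount)}},
                }
            },
            {
                "Update": {
                    "TableName": "Accounts",
                    "Key": {"account_id": {"S": to_id}},
                    "UpdateExpression": "SET balance = balance + :amt",
                    "ExpressionAttributeValues": {":amt": {"N": str(amount)}},
                }
            },
        ]
    }

request = build_transfer("acct-1", "acct-2", 25)
# A real call would be: boto3.client("dynamodb").transact_write_items(**request)
```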
4. Secondary Indexes
| Feature | Local Secondary Index (LSI) | Global Secondary Index (GSI) |
|---|---|---|
| Partition Key | Must be the same as the base table. | Can be different from the base table. |
| Sort Key | Required; must differ from the base table's sort key. | Optional; can be any attribute. |
| Creation | Only at table creation time. | Created at any time. |
| Scope | Queries are limited to a single partition key value. | Spans across all partitions. |
| Consistency | Supports strongly consistent reads. | Eventually consistent reads only. |
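Querying a GSI looks like a normal Query with an `IndexName` added. A sketch of the keyword arguments for the low-level boto3 client's `query()` call; the `Users` table and `email-index` GSI are hypothetical, and the call itself is not executed here:

```python
# Find a user by a non-key attribute (email) via a hypothetical GSI named
# "email-index" whose partition key is "email". This avoids a full-table Scan.

def build_gsi_query(email):
    return {
        "TableName": "Users",
        "IndexName": "email-index",          # GSI; can be added after table creation
        "KeyConditionExpression": "email = :e",
        "ExpressionAttributeValues": {":e": {"S": email}},
    }

kwargs = build_gsi_query("ana@example.com")
# A real call would be: boto3.client("dynamodb").query(**kwargs)
```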
Advanced Features & Performance
- DAX (DynamoDB Accelerator): An in-memory cache for DynamoDB. Use this when you need microsecond latency for read-heavy workloads.
- DynamoDB Streams: Captures item-level changes (Insert, Update, Delete) in real-time. Perfect for triggering Lambda functions (e.g., sending a welcome email when a user signs up).
- Global Tables: Provides multi-region, multi-active replication. Useful for disaster recovery and local performance for global users.
- TTL (Time to Live): Automatically deletes items after a specific timestamp, reducing storage costs without using WCUs.
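TTL expects a Number attribute holding a Unix epoch timestamp; the attribute name (here `expires_at`) is whatever you configure on the table, and the 30-day window is illustrative:

```python
import time

TTL_SECONDS = 30 * 24 * 3600  # keep session items for 30 days (illustrative)

def session_item(session_id, now=None):
    """Build a session item whose `expires_at` attribute (a Unix epoch number)
    serves as the table's TTL attribute. DynamoDB deletes the item some time
    after this timestamp passes, consuming no WCUs."""
    now = now if now is not None else int(time.time())
    return {
        "session_id": session_id,
        "created_at": now,
        "expires_at": now + TTL_SECONDS,
    }

item = session_item("sess-42", now=1_700_000_000)
print(item["expires_at"] - item["created_at"])  # 2592000 seconds = 30 days
```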
Decision Matrix / If–Then Guide
- IF you need microsecond latency THEN use DAX.
- IF traffic is unpredictable or “spiky” THEN use On-Demand Mode.
- IF you need to search on a non-key attribute THEN use GSI (do not use Scan).
- IF you need to react to data changes in real-time THEN use DynamoDB Streams + Lambda.
- IF you need multi-region redundancy THEN use Global Tables.
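The Streams + Lambda pattern above can be sketched as a handler that reacts only to INSERT records (the welcome-email step is a hypothetical stub; the event shape follows the DynamoDB Streams record format):

```python
# Minimal sketch of a Lambda handler consuming a DynamoDB Stream.
# INSERT records correspond to new items (e.g., new user sign-ups);
# send_welcome_email is a stub standing in for real notification logic.

def send_welcome_email(email):
    print(f"Welcome email queued for {email}")

def handler(event, context=None):
    emailed = []
    for record in event.get("Records", []):
        if record["eventName"] != "INSERT":
            continue  # ignore MODIFY / REMOVE changes
        new_image = record["dynamodb"].get("NewImage", {})
        email = new_image.get("email", {}).get("S")
        if email:
            send_welcome_email(email)
            emailed.append(email)
    return {"emails_sent": len(emailed)}

# Synthetic stream event with one sign-up and one update:
event = {"Records": [
    {"eventName": "INSERT",
     "dynamodb": {"NewImage": {"email": {"S": "new@example.com"}}}},
    {"eventName": "MODIFY",
     "dynamodb": {"NewImage": {"email": {"S": "old@example.com"}}}},
]}
print(handler(event))  # {'emails_sent': 1}
```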
Exam Tips and Gotchas
- Scan vs. Query: A Query finds items based on the Primary Key. A Scan reads every item in the table. Always prefer Query for performance and cost.
- LSI Limitation: You cannot add an LSI to an existing table. If the exam asks how to add an index to a 3-year-old table, the answer is GSI.
- Hot Partitions: If your Partition Key is poorly designed (e.g., a "Status" field with only 2 values), one partition will get all the traffic, leading to ProvisionedThroughputExceededException.
- Large Items: DynamoDB has a 400 KB item size limit. For larger files, store them in S3 and save the S3 URL in DynamoDB.
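One common mitigation for the hot-partition problem is write sharding: append a suffix to a low-cardinality partition key so writes spread across several partitions, then fan out reads across all suffixes. A sketch; the shard count and key format are illustrative choices, not prescribed by DynamoDB:

```python
import random

NUM_SHARDS = 10  # illustrative; tune to the table's throughput needs

def sharded_key(status):
    """Spread a low-cardinality key (e.g., a 2-value "status") across
    NUM_SHARDS partitions by appending a random suffix on write."""
    return f"{status}#{random.randrange(NUM_SHARDS)}"

def all_shard_keys(status):
    """On read, issue one Query per shard key and merge the results."""
    return [f"{status}#{i}" for i in range(NUM_SHARDS)]

print(sharded_key("ACTIVE"))         # e.g. "ACTIVE#7"
print(all_shard_keys("ACTIVE")[:3])  # ['ACTIVE#0', 'ACTIVE#1', 'ACTIVE#2']
```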
Topics Covered
Summary of key subtopics in this guide:
- Key-Value vs. Document models.
- Provisioned vs. On-Demand capacity modes.
- RCU/WCU calculations and consistency levels.
- LSI vs. GSI differences and limitations.
- DAX for microsecond performance.
- DynamoDB Streams for event-driven architecture.
- Global Tables for multi-region availability.
- Security (IAM, KMS, VPC Endpoints).
Amazon DynamoDB Architecture & Ecosystem
SAA-C03 Visual Reference Guide
Service Ecosystem
IAM: Fine-grained access control down to the attribute level.
KMS: Encryption at rest is default and mandatory.
VPC Endpoints: Access DynamoDB privately without an IGW or NAT Gateway.
CloudWatch: Monitor RCU/WCU consumption and Throttling events.
Performance & Scaling
DAX Cache: Microsecond read latency via in-memory caching.
Adaptive Capacity: Automatically shifts throughput toward hot partitions.
Auto-Scaling: Adjusts provisioned RCUs/WCUs to match traffic.
Cost Optimization
TTL: Expire old logs/session data for free.
S3 Export: Export data to S3 for long-term cold storage or Athena analysis.
Reserved Capacity: 1 or 3-year commitment for significant savings on Provisioned throughput.