AWS Database Services: Streams (DynamoDB & Kinesis)
In modern cloud architectures, databases are no longer just static repositories of data; they are sources of continuous events. Database streams capture a time-ordered sequence of item-level modifications in a table, allowing you to react to changes in near real time.
The Analogy
Think of a Database Stream like a security camera feed for a high-security vault. While the database is the vault itself (storing the gold), the stream is the live video recording every time someone enters, adds something, or removes something. If a motion sensor (a Trigger) detects a change on the video feed, it can automatically sound an alarm or turn on the lights (Lambda functions).
1. DynamoDB Streams
DynamoDB Streams captures changes to items in a DynamoDB table in near real time and keeps them in a 24-hour rolling window.
- Ordering: For each modified item, stream records appear in the same order in which the modifications occurred. Ordering across different items is not guaranteed.
- Deduplication: Each modification appears exactly once in the stream.
- Retention: Data is automatically deleted after 24 hours. This is a hard limit.
- Shards: The stream is composed of shards. AWS manages shard rotation and scaling automatically.
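Consumers that do not use Lambda see the shard structure directly through the low-level DynamoDB Streams API. The following is a minimal sketch, assuming a boto3 `dynamodbstreams` client is passed in; `read_stream_once` is a hypothetical helper name, and the sketch ignores shard pagination and parent/child shard ordering.

```python
from typing import Any, Dict, List


def read_stream_once(streams_client: Any, stream_arn: str, limit: int = 100) -> List[Dict]:
    """Read one batch of records from every shard of a DynamoDB stream.

    `streams_client` is assumed to be boto3.client("dynamodbstreams").
    Simplifications: a single DescribeStream page, one GetRecords call per
    shard, and no attempt to process parent shards before their children.
    """
    description = streams_client.describe_stream(StreamArn=stream_arn)
    records: List[Dict] = []
    for shard in description["StreamDescription"]["Shards"]:
        iterator = streams_client.get_shard_iterator(
            StreamArn=stream_arn,
            ShardId=shard["ShardId"],
            ShardIteratorType="TRIM_HORIZON",  # start at the oldest retained record
        )["ShardIterator"]
        batch = streams_client.get_records(ShardIterator=iterator, Limit=limit)
        records.extend(batch["Records"])
    return records
```

A real consumer would keep following `NextShardIterator` in a loop; in most architectures Lambda or the Kinesis Client Library does this bookkeeping for you.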
Stream View Types (Crucial for Exam)
When enabling a stream, you must choose what information is written to it:
- KEYS_ONLY: Only the key attributes of the modified item.
- NEW_IMAGE: The entire item as it appears after the modification.
- OLD_IMAGE: The entire item as it appeared before the modification.
- NEW_AND_OLD_IMAGES: Both the new and the old images of the item.
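The view type is set when the stream is enabled on the table. Here is a minimal sketch of that call, assuming a boto3 `dynamodb` client is passed in; `enable_stream` is a hypothetical helper name and `PlayerScores` in the usage note is a made-up table name.

```python
from typing import Any, Dict

VALID_VIEW_TYPES = {"KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES"}


def enable_stream(dynamodb_client: Any, table_name: str, view_type: str) -> Dict:
    """Enable a DynamoDB stream on an existing table with the given view type.

    `dynamodb_client` is assumed to be boto3.client("dynamodb").
    """
    if view_type not in VALID_VIEW_TYPES:
        raise ValueError(f"Unknown stream view type: {view_type}")
    return dynamodb_client.update_table(
        TableName=table_name,
        StreamSpecification={
            "StreamEnabled": True,
            "StreamViewType": view_type,
        },
    )
```

Usage would look like `enable_stream(boto3.client("dynamodb"), "PlayerScores", "NEW_AND_OLD_IMAGES")`. Choosing `NEW_AND_OLD_IMAGES` lets a consumer compare the item before and after the change, at the cost of larger stream records.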
2. Kinesis Data Streams (KDS) for DynamoDB
Introduced as an alternative to native DynamoDB Streams, KDS offers more flexibility for high-volume or long-retention needs.
- Retention: Up to 365 days (default 24 hours).
- Multiple Consumers: Supports more consumers than native streams (which are limited to 2 simultaneous consumers per shard).
- Enhanced Fan-out: Gives each registered consumer its own dedicated read throughput (2 MB/s per shard) instead of sharing it with other consumers.
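Wiring a table to Kinesis Data Streams with a longer retention period could look like the sketch below. It assumes boto3 `dynamodb` and `kinesis` clients are passed in, that the Kinesis stream already exists with the default 24-hour retention, and that `stream_table_to_kinesis` is a hypothetical helper name.

```python
from typing import Any, Dict


def stream_table_to_kinesis(
    dynamodb_client: Any,
    kinesis_client: Any,
    table_name: str,
    stream_name: str,
    stream_arn: str,
    retention_days: int,
) -> Dict:
    """Raise the Kinesis retention period, then point the table at the stream.

    Assumes the stream currently uses the default 24-hour retention, so the
    retention call is skipped when one day is requested.
    """
    if not 1 <= retention_days <= 365:
        raise ValueError("retention_days must be between 1 and 365")
    if retention_days > 1:
        kinesis_client.increase_stream_retention_period(
            StreamName=stream_name,
            RetentionPeriodHours=retention_days * 24,
        )
    # Start sending the table's item-level changes to the Kinesis stream.
    return dynamodb_client.enable_kinesis_streaming_destination(
        TableName=table_name,
        StreamArn=stream_arn,
    )
```

For the 90-day compliance scenario, `retention_days=90` translates to 2,160 hours.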
3. Comparison Table: Choosing Your Stream
| Feature | DynamoDB Streams | Kinesis Data Streams |
|---|---|---|
| Retention | Fixed 24 Hours | 1 Day to 365 Days |
| Consumers | Max 2 simultaneous readers per shard | Up to 5 shared-throughput consumers per shard, or up to 20 with Enhanced Fan-out |
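| Ordering | Guaranteed per item, no duplicates | Not guaranteed; records can arrive out of order or duplicated (use the ApproximateCreationDateTime timestamp to reorder) |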
| Integration | Native with Lambda/Global Tables | Kinesis Data Analytics (now Amazon Managed Service for Apache Flink)/Firehose |
| Cost Model | Streams read request units (reads by Lambda triggers are free) | Shard hours + PUT payload units, plus DynamoDB change data capture units |
Exam Tips and Gotchas
- Global Tables: DynamoDB Global Tables require DynamoDB Streams to be enabled for cross-region replication.
- The 24-Hour Wall: If an exam scenario mentions needing to process changes from 3 days ago, DynamoDB Streams is the wrong answer; use Kinesis Data Streams.
- Lambda Triggers: This is the most common architectural pattern. DynamoDB Stream -> Lambda -> SNS/SQS/S3.
- Shard Exhaustion: If you have more than 2 processes reading from a DynamoDB Stream shard simultaneously, you may face throttling.
- Exactly Once: The stream records each change exactly once, but Lambda delivers batches at least once, so your Lambda function must be idempotent in case of retries.
Decision Matrix / If–Then Guide
| If the requirement is… | Then choose… |
|---|---|
| Triggering a Lambda for real-time notifications | DynamoDB Streams |
| Storing change logs for 90 days for compliance | Kinesis Data Streams |
| Replicating data across AWS Regions (Global Tables) | DynamoDB Streams |
| Processing changes with Kinesis Data Analytics (SQL) | Kinesis Data Streams |
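The matrix can also be read as a small rule set. The sketch below is only a study aid: `choose_streams` and its parameters are hypothetical names, and it assumes a table may have both stream types enabled at once when the requirements call for it.

```python
from typing import List


def choose_streams(
    retention_hours: int = 24,
    readers_per_shard: int = 1,
    global_tables: bool = False,
    kinesis_tooling: bool = False,
) -> List[str]:
    """Return the stream option(s) suggested by the decision matrix.

    kinesis_tooling: True when the changes must feed Kinesis-side services
    such as Kinesis Data Analytics or Firehose.
    """
    needs_kinesis = (
        retention_hours > 24        # beyond the 24-hour wall
        or readers_per_shard > 2    # more readers than a native shard allows
        or kinesis_tooling
    )
    choices: List[str] = []
    if needs_kinesis:
        choices.append("Kinesis Data Streams")
    if global_tables or not needs_kinesis:
        choices.append("DynamoDB Streams")
    return choices
```

For example, `choose_streams(retention_hours=90 * 24)` suggests Kinesis Data Streams, while the default call (a Lambda trigger with no special requirements) suggests DynamoDB Streams.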
Topics covered in this guide:
- Difference between DynamoDB Streams and Kinesis Data Streams.
- Stream View Types (New Image, Old Image, etc.).
- Data retention limits and scaling behaviors.
- Integration patterns with AWS Lambda and Global Tables.
- Cost and performance trade-offs for SAA-C03 scenarios.
Infographic: Stream Architecture Flow
Integrations
- IAM: Control who can describe or read the stream.
- EventBridge: Use EventBridge Pipes to route stream events to supported targets.
- Global Tables: Automatic multi-region sync.
Scaling
DynamoDB Streams scales shards automatically based on table throughput; one shard per table partition is common. Latency is near real time (typically under 1 second).
Optimization
Native DynamoDB Streams is free to enable. You pay for stream read requests (GetRecords calls) made by your own consumers; reads made by Lambda triggers are not charged.
Production Use Case: Real-time Analytics
A gaming company uses a DynamoDB table to store player scores. They enable DynamoDB Streams to trigger a Lambda function every time a score changes. The Lambda updates a Redis (ElastiCache) leaderboard and sends high-score notifications via SNS to the player’s mobile device.
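The core of such a Lambda could be sketched as below. Everything specific here is an assumption for illustration: the `player_id` key and `score` attribute names, integer scores, the `leaderboard` sorted-set name, the `HIGH_SCORE_THRESHOLD` value, and the stream being enabled with NEW_AND_OLD_IMAGES. The `leaderboard` argument is assumed to be a redis-py client and `sns` a boto3 SNS client.

```python
from typing import Any, Dict, Optional

HIGH_SCORE_THRESHOLD = 10_000  # hypothetical cut-off for a notification


def _score(image: Optional[Dict[str, Any]]) -> Optional[int]:
    """Extract an integer score from a DynamoDB-JSON item image, if present."""
    if not image or "score" not in image:
        return None
    return int(image["score"]["N"])  # numbers arrive as strings under "N"


def process_score_changes(
    event: Dict[str, Any], leaderboard: Any, sns: Any, topic_arn: str
) -> int:
    """Update the leaderboard for every record whose score actually changed."""
    updated = 0
    for record in event.get("Records", []):
        if record["eventName"] == "REMOVE":
            continue  # deleted players are ignored in this sketch
        change = record["dynamodb"]
        new_score = _score(change.get("NewImage"))
        old_score = _score(change.get("OldImage"))
        if new_score is None or new_score == old_score:
            continue
        player_id = change["Keys"]["player_id"]["S"]
        # Sorted set keyed by player, scored by the latest value.
        leaderboard.zadd("leaderboard", {player_id: new_score})
        updated += 1
        if new_score >= HIGH_SCORE_THRESHOLD:
            sns.publish(
                TopicArn=topic_arn,
                Message=f"Player {player_id} reached a new high score: {new_score}",
            )
    return updated
```

Comparing the old and new images avoids leaderboard writes when an update touched other attributes but left the score unchanged.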