AWS SAA-C03 Study Guide: Chapter 6 – Databases

AWS Certified Solutions Architect (SAA-C03)

Study Guide: Chapter 6 – Deep Dive Into AWS Databases

Quick Reference Infographic

📊

Relational (SQL)

Service: Amazon RDS / Aurora

Best For: Complex queries, transactional integrity (ACID), structured data.

Engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server (RDS); Aurora is MySQL- and PostgreSQL-compatible.

⚡

Non-Relational (NoSQL)

Service: DynamoDB / DocumentDB

Best For: High scale, flexible schema, low-latency access, big data (BASE consistency model).

Scaling: Horizontal (Scaling Out).

🚀

Optimization

Caching: ElastiCache (Redis/Memcached).

Read Scaling: Read Replicas (up to 15 for RDS MySQL, MariaDB, and PostgreSQL; up to 5 for Oracle and SQL Server).

HA: Multi-AZ Deployments.

🔍

Big Data & Analytics

Warehouse: Redshift.

Serverless Query: Athena (SQL on S3).

Streaming: Kinesis.

1. Database Fundamentals

OLTP vs. OLAP

Feature	OLTP (Online Transaction Processing)	OLAP (Online Analytical Processing)
Data	Current, operational data	Historical data from various sources
Use Case	Business tasks (Order entry)	Data mining/Analytics
Speed	Fast queries	Slower, complex queries
Normalization	Highly normalized	Denormalized or partly normalized (star/snowflake schemas)
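
The contrast in the table can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `orders` table and its rows are hypothetical, and SQLite stands in for both kinds of system. The OLTP-style query touches one current row by key; the OLAP-style query scans and aggregates many rows.

```python
import sqlite3

# Hypothetical sample data; an in-memory SQLite database stands in for a real engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("us-east", 20.0), ("us-east", 35.0), ("eu-west", 50.0)],
)
conn.commit()

# OLTP-style: fetch a single operational row by primary key (narrow and fast).
oltp_row = conn.execute(
    "SELECT id, region, amount FROM orders WHERE id = ?", (2,)
).fetchone()

# OLAP-style: scan all rows and aggregate per region (wide and comparatively slow).
olap_rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

print(oltp_row)
print(olap_rows)
```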

The ACID vs. BASE Properties

ACID (SQL): Atomicity, Consistency, Isolation, Durability. Ensures “all-or-nothing” reliability.
BASE (NoSQL): Basically Available, Soft State, Eventual Consistency. Prioritizes availability and scale over immediate consistency.
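
A minimal sketch of the "all-or-nothing" (atomicity) property, using Python's built-in sqlite3 module as a stand-in for a relational engine. The `accounts` table, the `transfer` helper, and the balances are hypothetical. The transfer credits the destination first, so a failed debit shows the earlier credit being rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint rejects any update that would overdraw an account.
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds atomically. Returns True on commit, False on rollback."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling `transfer(conn, "alice", "bob", 150)` fails on the debit, and the credit to `bob` is undone with it, so neither balance changes.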

2. Amazon RDS (Relational Database Service)

RDS is a managed service that handles provisioning, patching, and backups.

Key Features

Multi-AZ: Synchronous replication to a standby instance in a different AZ, with automatic failover. Used for High Availability, not primarily for read scaling.
Read Replicas: Asynchronous replication. Used for Scaling Read Performance. A replica can be promoted to a standalone DB instance.
Storage Types:
- General Purpose SSD (Balanced)
- Provisioned IOPS (High performance)
- Magnetic (Legacy, previous generation; kept for backward compatibility)

Exam Tip: If the question asks for High Availability, think Multi-AZ. If it asks for Read Performance, think Read Replicas.
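
One way to picture the read-replica side of this tip is an application-level router that sends writes to the primary endpoint and spreads reads across replica endpoints. This is a rough sketch, not an AWS API: the `EndpointRouter` class and the hostnames in the usage note are hypothetical, and the `SELECT` check is deliberately naive. Because replication is asynchronous, reads routed this way may be slightly stale.

```python
from itertools import cycle


class EndpointRouter:
    """Send writes to the primary and rotate reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas, reads fall back to the primary.
        self._readers = cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Naive classification: anything starting with SELECT counts as a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary
```

For example, `EndpointRouter("primary.example.internal", ["replica-1.example.internal", "replica-2.example.internal"])` would alternate `SELECT` statements between the two replicas. A Multi-AZ failover needs no such routing, since it happens behind the same database endpoint.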

3. Amazon Aurora

AWS’s proprietary cloud-native relational database, compatible with MySQL and PostgreSQL. AWS claims up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL.

Aurora Serverless: Automatically scales capacity based on demand. Best for infrequent or unpredictable workloads.
Global Database: One primary region and up to 5 read-only secondary regions. Cross-region replication lag is typically under 1 second.
Storage: Self-healing storage that replicates 6 copies of data across 3 AZs.
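
The "6 copies across 3 AZs" layout can be worked through as quorum arithmetic. The quorum sizes below (4 of 6 for writes, 3 of 6 for reads) are not stated in this guide; they are taken from AWS's published description of Aurora's storage design and should be treated as an assumption here. The `availability` helper is hypothetical.

```python
COPIES_PER_AZ = 2
AZS = 3
TOTAL_COPIES = COPIES_PER_AZ * AZS  # 6 copies in total

# Assumed quorum sizes (from AWS's published Aurora design, not from this guide).
WRITE_QUORUM = 4
READ_QUORUM = 3


def availability(failed_copies):
    """Report whether writes and reads can still reach quorum."""
    healthy = TOTAL_COPIES - failed_copies
    return {"writes": healthy >= WRITE_QUORUM, "reads": healthy >= READ_QUORUM}


# Losing one whole AZ removes 2 copies; losing an AZ plus one more copy removes 3.
print(availability(COPIES_PER_AZ))
print(availability(COPIES_PER_AZ + 1))
```

Under these assumptions, losing an entire AZ still leaves both reads and writes available, and losing an AZ plus one more copy still leaves reads available.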

4. Amazon DynamoDB (NoSQL Deep Dive)

A serverless key-value and document database providing single-digit millisecond latency.

DAX (DynamoDB Accelerator): In-memory cache for DynamoDB; reduces latency from milliseconds to microseconds.
Global Tables: Multi-region, multi-active replication.
Capacity Modes:
- Provisioned: You specify Read/Write Capacity Units (RCUs/WCUs).
- On-Demand: Pay per request. Best for unpredictable traffic.
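
For Provisioned mode, capacity sizing is simple arithmetic. The sketch below assumes the standard (non-transactional) rules: one RCU covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one WCU covers one write per second of an item up to 1 KB. The function names are hypothetical helpers, not part of any AWS SDK.

```python
import math

RCU_BLOCK_BYTES = 4 * 1024  # reads are billed in 4 KB blocks
WCU_BLOCK_BYTES = 1 * 1024  # writes are billed in 1 KB blocks


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCUs needed for a steady read rate of equally sized items."""
    units_per_read = math.ceil(item_size_bytes / RCU_BLOCK_BYTES)
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much as strongly consistent reads.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed for a steady write rate of equally sized items."""
    return math.ceil(item_size_bytes / WCU_BLOCK_BYTES) * writes_per_second
```

For example, 100 strongly consistent reads per second of 6 KB items need 200 RCUs (each read rounds up to two 4 KB blocks), or 100 RCUs if eventual consistency is acceptable.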

5. Specialized AWS Databases

Amazon Neptune: Graph database (Social networks, fraud detection).
Amazon DocumentDB: Managed MongoDB-compatible document database.
Amazon Keyspaces: Managed Apache Cassandra-compatible service.
Amazon QLDB: Ledger database (Immutable, cryptographically verifiable logs).
Amazon ElastiCache:
- Redis: Complex data types, persistence, pub/sub.
- Memcached: Simple key-value, no persistence.
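
ElastiCache is commonly used with a cache-aside (lazy loading) pattern: check the cache first, fall back to the database on a miss, then store the result with a TTL. The sketch below uses an in-process dict as a stand-in for Redis or Memcached; the `CacheAside` class and its parameters are hypothetical and only illustrate the control flow.

```python
import time


class CacheAside:
    """Cache-aside lookups with a TTL, backed by a plain dict."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db  # called only on a cache miss
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]  # cache hit, entry not yet expired
        value = self._load(key)  # miss or expired: read from the database
        self._store[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, for example after the underlying row is updated."""
        self._store.pop(key, None)
```

The injectable `clock` is a design choice that makes expiry testable without sleeping; with a real cache the TTL would be set on the cache entry itself.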

6. Big Data & Analytics Tools

Amazon Redshift: SQL-based Data Warehouse. Uses columnar storage for OLAP.
Amazon Athena: Serverless query service to analyze data in S3 using standard SQL.
AWS Glue: Serverless ETL (Extract, Transform, Load) service.
Amazon EMR: Big data platform for Hadoop, Spark, and other distributed frameworks.
Amazon Kinesis: Real-time streaming data collection and processing.
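
To see why columnar storage (as used by Redshift) suits OLAP, compare a row-oriented and a column-oriented layout of the same data. This is a toy illustration with hypothetical sample rows, not a model of Redshift internals: an aggregate over one column only needs that column's values, while a row layout forces a pass over every full record.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "us-east", "amount": 20.0},
    {"order_id": 2, "region": "us-east", "amount": 35.0},
    {"order_id": 3, "region": "eu-west", "amount": 50.0},
]

# Column-oriented layout: one list per column, built from the same records.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_amount_row_store(rows):
    """Visits every full record to pick out a single field."""
    return sum(row["amount"] for row in rows)


def total_amount_column_store(columns):
    """Reads only the 'amount' column and ignores the others."""
    return sum(columns["amount"])
```

Both functions return the same total; the difference is how much data each layout has to touch to compute it.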