Multi-AZ vs. Multi-Region Design

Architecting for Resilience and Global Reach

1. Study Guide: Understanding the Scope of Failure

In the AWS ecosystem, the fundamental goal of a Solutions Architect is to design systems that can withstand failures. This involves a strategic choice between Multi-AZ (Availability Zone) deployments for high availability and Multi-Region deployments for disaster recovery and global performance.

The Spare Tire Analogy:

Multi-AZ is like having a spare tire inside your car. If one tire goes flat (an AZ failure), you can swap it immediately and keep driving without much delay. Multi-Region is like having a second car parked in a different city. If your entire city experiences a massive flood (a Regional failure), you can travel to the other city and use the second car to continue your journey.

Core Concepts: The Well-Architected View

Under the Reliability Pillar of the AWS Well-Architected Framework, AWS emphasizes that everything fails all the time.

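  • Multi-AZ: Focuses on High Availability (HA). It protects against data center outages, power failures, or localized networking issues within one Region. It typically uses synchronous replication, so a committed write already exists in a second AZ before it is acknowledged and is not lost on failover (RPO near zero).
  • Multi-Region: Focuses on Disaster Recovery (DR) and Business Continuity. It protects against rare but catastrophic events affecting an entire geographic area. It typically uses asynchronous replication, so the most recent writes may not have reached the second Region at the moment of failover (non-zero RPO).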
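
The difference between the two replication modes can be sketched with a toy model. This is an illustration only, not AWS behavior: the `ReplicatedStore` class, the `lag` parameter (how many writes the replica trails by), and the order names are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicatedStore:
    """Toy primary/replica pair (hypothetical model, not an AWS API).

    `lag` is how many acknowledged writes the replica may trail behind
    the primary. A lag of 0 models synchronous replication.
    """
    lag: int = 0
    primary: list = field(default_factory=list)
    replica: list = field(default_factory=list)

    def write(self, item: str) -> None:
        self.primary.append(item)
        # The replica holds everything except the most recent `lag` writes.
        caught_up = max(len(self.primary) - self.lag, 0)
        self.replica = self.primary[:caught_up]

    def lost_on_failover(self) -> list:
        """Writes acknowledged by the primary but missing on the replica."""
        return self.primary[len(self.replica):]


sync_store = ReplicatedStore(lag=0)   # Multi-AZ style: synchronous
async_store = ReplicatedStore(lag=2)  # Multi-Region style: assumed lag of 2 writes

for order in ["order-1", "order-2", "order-3", "order-4", "order-5"]:
    sync_store.write(order)
    async_store.write(order)

print(sync_store.lost_on_failover())   # []
print(async_store.lost_on_failover())  # ['order-4', 'order-5']
```

In this model the writes returned by `lost_on_failover()` are what RPO measures: the synchronous pair loses nothing, while the asynchronous pair loses whatever was still in flight.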

Comparison Table: Design Trade-offs

| Feature | Multi-AZ Design | Multi-Region Design |
|---|---|---|
| Primary Goal | High Availability (HA) & Fault Tolerance | Disaster Recovery (DR) & Low Latency for global users |
| Replication | Synchronous (usually) | Asynchronous (usually) |
| Latency between sites | Low (single-digit ms) | High (tens to hundreds of ms) |
| Cost | Moderate (inter-AZ data transfer is often free or low-cost) | High (duplicate stacks + cross-Region transfer) |
| Complexity | Low (managed by AWS services) | High (requires DNS/traffic routing logic) |

Scenario-Based Decision Matrix

  • If you need to survive a single data center failure with zero data loss: Use Multi-AZ.
  • If your RTO/RPO targets for an AZ-level failure are measured in seconds to minutes: Use Multi-AZ.
  • If you need to serve users in Europe and Asia with sub-100ms latency: Use Multi-Region.
  • If you must comply with data sovereignty laws requiring data to stay in a specific country: Use Multi-AZ (within that Region).
  • If you are protecting against a Region-wide AWS service outage: Use Multi-Region.
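
The matrix above can be read as a small decision function. The sketch below is one possible encoding, not an official rule set: the `Requirements` fields, the `choose_design` name, and the choice to treat data sovereignty as the overriding constraint are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical requirement flags taken from the scenarios above."""
    survive_regional_outage: bool = False
    global_low_latency: bool = False
    data_must_stay_in_one_region: bool = False


def choose_design(req: Requirements) -> str:
    # Assumption: sovereignty is a hard constraint, so it is checked first.
    if req.data_must_stay_in_one_region:
        return "Multi-AZ"
    # A Regional failure or a globally distributed user base needs a
    # second Region.
    if req.survive_regional_outage or req.global_low_latency:
        return "Multi-Region"
    # Default: AZ-level redundancy covers a single data center failure.
    return "Multi-AZ"


print(choose_design(Requirements()))                                   # Multi-AZ
print(choose_design(Requirements(global_low_latency=True)))            # Multi-Region
print(choose_design(Requirements(survive_regional_outage=True)))       # Multi-Region
print(choose_design(Requirements(data_must_stay_in_one_region=True)))  # Multi-AZ
```

On the exam, the same ordering is a useful habit: look for a hard constraint in the scenario first, then for the scope of failure being asked about.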

Exam Tips: Golden Nuggets

  • RDS Multi-AZ vs. Read Replicas: Multi-AZ is for HA (synchronous, automatic failover); Read Replicas are for scaling (asynchronous, can be cross-region).
  • S3 Durability: S3 Standard is Multi-AZ by default (replicated across ≥3 AZs). S3 One Zone-IA is not; it stores data in a single AZ.
  • Route 53: Use Health Checks and Failover Routing to transition traffic between Regions.
  • Aurora Global Database: A strong choice for cross-Region disaster recovery of relational data (cross-Region replication lag is typically under 1 second, so typical RPO < 1 sec).
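
The Route 53 tip can be made concrete by building a pair of failover records. The sketch below only constructs the request body as a plain dictionary. The domain `app.example.com.`, the documentation-range IP addresses, the health check ID, and the helper name `failover_change` are all placeholders assumed for the example.

```python
import json

VALID_ROLES = {"PRIMARY", "SECONDARY"}


def failover_change(name, ip, role, health_check_id=None, ttl=60):
    """Build one UPSERT change for a Route 53 failover A record."""
    if role not in VALID_ROLES:
        raise ValueError(f"role must be one of {sorted(VALID_ROLES)}")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{role.lower()}-endpoint",  # must be unique per record
        "Failover": role,
        "TTL": ttl,  # a short TTL lets clients pick up a failover sooner
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id is not None:
        # Route 53 stops answering with this record when the check fails.
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


change_batch = {
    "Comment": "Active-passive failover between two Regions",
    "Changes": [
        failover_change("app.example.com.", "192.0.2.10", "PRIMARY",
                        health_check_id="hypothetical-health-check-id"),
        failover_change("app.example.com.", "198.51.100.10", "SECONDARY"),
    ],
}

print(json.dumps(change_batch, indent=2))
```

With boto3, a dictionary of this shape is what the `ChangeBatch` argument of the Route 53 client's `change_resource_record_sets` call expects, alongside the `HostedZoneId` of your own zone. That call is not made here, so the sketch runs without AWS credentials.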

2. Architectural Infographic

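[Infographic: Region us-east-1 contains App instances in AZ-A and AZ-B, linked by synchronous (Sync) replication. Region eu-west-1 contains an App instance, linked to us-east-1 by asynchronous (Async) replication.]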

Key Services

  • ELB: Distributes traffic across AZs.
  • ASG: Spans multiple AZs.
  • DynamoDB: Global Tables for Multi-Region.

Common Pitfalls

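  • Hardcoding AZ names: the same name (e.g., us-east-1a) can map to a different physical AZ in each AWS account, so use AZ IDs (e.g., use1-az1) when coordinating across accounts.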
  • Ignoring Cross-Region Data Transfer costs.
  • Assuming Multi-AZ solves application bugs.
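
The AZ name pitfall is easiest to see side by side. The sketch below works on dictionaries shaped like the response of the EC2 `DescribeAvailabilityZones` API (`ZoneName`, `ZoneId`, `State`). The two sample responses are invented for illustration, and the helper name `zone_ids_by_name` is an assumption.

```python
def zone_ids_by_name(response: dict) -> dict:
    """Map each available AZ name to its AZ ID."""
    return {
        zone["ZoneName"]: zone["ZoneId"]
        for zone in response["AvailabilityZones"]
        if zone.get("State") == "available"
    }


# Hypothetical responses from two different AWS accounts in us-east-1.
account_a = {"AvailabilityZones": [
    {"ZoneName": "us-east-1a", "ZoneId": "use1-az1", "State": "available"},
    {"ZoneName": "us-east-1b", "ZoneId": "use1-az2", "State": "available"},
]}
account_b = {"AvailabilityZones": [
    {"ZoneName": "us-east-1a", "ZoneId": "use1-az2", "State": "available"},
    {"ZoneName": "us-east-1b", "ZoneId": "use1-az1", "State": "available"},
]}

map_a = zone_ids_by_name(account_a)
map_b = zone_ids_by_name(account_b)

# The same name points at different physical zones in the two accounts.
print(map_a["us-east-1a"])  # use1-az1
print(map_b["us-east-1a"])  # use1-az2
```

In a real account the input would come from boto3's `ec2.describe_availability_zones()`. Comparing on `ZoneId` rather than `ZoneName` is what keeps two accounts talking about the same physical location.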

Quick Patterns

  • Active-Passive: One Region serves traffic; fail over to the secondary Region when the primary is unhealthy.
  • Active-Active: Serve traffic from both Regions at once via Route 53.
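
As a rough mental model, the two patterns differ only in which healthy Regions receive traffic. The function below is a simplified sketch of that decision, not how Route 53 is implemented. The name `serving_regions` and the health map are assumptions for the example.

```python
def serving_regions(pattern: str, health: dict) -> list:
    """Return the Regions that should receive traffic.

    `health` maps Region name -> healthy flag, listed in priority order
    (primary first). Python dicts preserve insertion order.
    """
    healthy = [region for region, ok in health.items() if ok]
    if pattern == "active-active":
        return healthy       # every healthy Region serves traffic
    if pattern == "active-passive":
        return healthy[:1]   # only the highest-priority healthy Region
    raise ValueError(f"unknown pattern: {pattern}")


all_up = {"us-east-1": True, "eu-west-1": True}
primary_down = {"us-east-1": False, "eu-west-1": True}

print(serving_regions("active-passive", all_up))        # ['us-east-1']
print(serving_regions("active-passive", primary_down))  # ['eu-west-1']
print(serving_regions("active-active", all_up))         # ['us-east-1', 'eu-west-1']
print(serving_regions("active-active", primary_down))   # ['eu-west-1']
```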
