Persistent Disks & Local SSDs

In Google Cloud, most block storage is decoupled from compute. Understanding the differences between network-attached Persistent Disks (PD) and physically attached Local SSDs is critical for passing the Associate Cloud Engineer exam and designing performant architectures.

The “External Drive” vs. “Internal Drive” Analogy

Think of a Persistent Disk like a high-performance external hard drive connected via a 100Gbps fiber connection. You can unplug it from one computer and plug it into another without losing your files. If the computer breaks, the data on the drive remains safe.

Think of a Local SSD like an NVMe drive soldered directly onto the computer’s motherboard. It is incredibly fast because there is no cable or network in the way, but if the computer is thrown away or loses power for too long, the data is gone forever.

Detailed Breakdown: Block Storage Options

1. Persistent Disks (PD)

PDs are network-attached but appear as block devices to your VM. They provide high durability and are distributed across multiple physical drives to ensure data integrity.

  • Standard (pd-standard): Backed by HDDs, best for large sequential read/write workloads (backups, logs).
  • Balanced (pd-balanced): The default choice. A mix of performance and cost-efficiency using SSDs.
  • SSD (pd-ssd): High performance for databases and low-latency applications.
  • Extreme (pd-extreme): Highest IOPS and throughput, allowing you to provision performance independently of capacity.
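
To make the four types concrete, here is a minimal sketch that assembles a `gcloud compute disks create` command for a zonal disk. The helper name `disk_create_cmd` and the example disk name, size, and zone are hypothetical; the function only builds the argument list and does not call gcloud.

```python
from typing import List, Optional

# Disk type names as listed above.
PD_TYPES = {"pd-standard", "pd-balanced", "pd-ssd", "pd-extreme"}


def disk_create_cmd(name: str, disk_type: str, size_gb: int, zone: str,
                    provisioned_iops: Optional[int] = None) -> List[str]:
    """Return the argument list for creating a zonal Persistent Disk."""
    if disk_type not in PD_TYPES:
        raise ValueError(f"unknown disk type: {disk_type}")
    cmd = ["gcloud", "compute", "disks", "create", name,
           f"--type={disk_type}", f"--size={size_gb}GB", f"--zone={zone}"]
    if provisioned_iops is not None:
        # Among the types listed here, only pd-extreme lets you set IOPS
        # independently of capacity.
        if disk_type != "pd-extreme":
            raise ValueError("provisioned IOPS applies to pd-extreme only")
        cmd.append(f"--provisioned-iops={provisioned_iops}")
    return cmd


# Hypothetical example values; nothing is executed here.
print(" ".join(disk_create_cmd("db-data", "pd-ssd", 500, "us-central1-a")))
```

Passing the returned list to `subprocess.run` would execute it, assuming gcloud is installed and authenticated.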

2. Local SSDs

These are physically attached to the server hosting the VM instance. They offer the lowest latency and highest IOPS of the block storage options in GCP. However, they are ephemeral: data survives a guest OS reboot, but by default it is lost if the instance is stopped or deleted, or if a host error occurs that prevents data migration.
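
Local SSDs are requested when the VM is created; they cannot be added to an existing instance later. As a rough sketch, the helper below (a hypothetical name, `instance_with_local_ssd_cmd`) builds a `gcloud compute instances create` command with one `--local-ssd` flag per device. The machine type in the example is an assumption, and the number of Local SSDs a VM may have depends on its machine type.

```python
from typing import List


def instance_with_local_ssd_cmd(name: str, zone: str, machine_type: str,
                                ssd_count: int) -> List[str]:
    """Return the argument list for creating a VM with NVMe Local SSDs."""
    if ssd_count < 1:
        raise ValueError("ssd_count must be at least 1")
    cmd = ["gcloud", "compute", "instances", "create", name,
           f"--zone={zone}", f"--machine-type={machine_type}"]
    # gcloud takes one --local-ssd flag per attached device.
    cmd += ["--local-ssd=interface=NVME"] * ssd_count
    return cmd


# Hypothetical example values; nothing is executed here.
print(" ".join(instance_with_local_ssd_cmd("cache-1", "us-central1-a",
                                           "n2-standard-8", 2)))
```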

Core Concepts & Best Practices

  • Reliability: Use Regional Persistent Disks for high-availability clusters. They synchronously replicate data across two zones in a single region.
  • Scalability: You can resize Persistent Disks on-the-fly without downtime. However, you cannot shrink them.
  • Security: All data at rest in GCP is encrypted by default. You can use Customer-Managed Encryption Keys (CMEK) when you need more control over the keys.
  • Cost Optimization: Use snapshots to back up PDs. Snapshots are incremental and stored in Cloud Storage, making them cheaper than keeping large disks active.
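
The reliability and scalability points above can be sketched as two small command builders. This is an illustration under stated assumptions: the helper names are hypothetical, the default disk type is a choice made for the example, and the shrink check is a local guard that mirrors the grow-only rule rather than anything gcloud exposes as an API.

```python
from typing import List, Sequence


def regional_disk_create_cmd(name: str, region: str,
                             replica_zones: Sequence[str], size_gb: int,
                             disk_type: str = "pd-balanced") -> List[str]:
    """Return the argument list for a Regional PD replicated across two zones."""
    if len(set(replica_zones)) != 2:
        raise ValueError("a Regional PD needs exactly two distinct zones")
    return ["gcloud", "compute", "disks", "create", name,
            f"--region={region}",
            f"--replica-zones={','.join(replica_zones)}",
            f"--type={disk_type}", f"--size={size_gb}GB"]


def disk_resize_cmd(name: str, zone: str, current_gb: int,
                    new_gb: int) -> List[str]:
    """Return the argument list for growing a zonal PD; refuse to shrink."""
    if new_gb <= current_gb:
        raise ValueError("Persistent Disks can only grow, never shrink")
    return ["gcloud", "compute", "disks", "resize", name,
            f"--size={new_gb}GB", f"--zone={zone}"]


# Hypothetical example values; nothing is executed here.
print(" ".join(regional_disk_create_cmd(
    "ha-data", "us-central1", ["us-central1-a", "us-central1-b"], 200)))
print(" ".join(disk_resize_cmd("db-data", "us-central1-a", 500, 750)))
```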

Comparison Table: Storage Performance & Characteristics

Feature     | Standard PD           | SSD / Balanced PD    | Local SSD
Connection  | Network-attached      | Network-attached     | Physically attached
Persistence | Survives VM deletion  | Survives VM deletion | Lost on VM deletion
Max IOPS    | Low (~3k – 15k)       | High (~80k – 100k)   | Ultra-High (Up to 2.4M)
Redundancy  | Built-in              | Built-in             | None (User-managed)
Use Case    | Large data processing | Web servers, DBs     | Cache, Scratch space

Scenario-Based Decision Matrix

IF the requirement is to ensure data survives a Zonal failure…
THEN use Regional Persistent Disk.

IF you need the absolute lowest possible latency for a NoSQL cache…
THEN use Local SSD (and ensure your app handles replication).

IF you need to share a read-only disk across multiple VMs…
THEN use Persistent Disk in Read-Only mode.

IF you need to increase disk size without stopping the VM…
THEN use Persistent Disk (Edit disk > Expand > Resize FS in OS).
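
The IF/THEN rules above can be written as a small lookup function, which is a handy way to rehearse them. This is only a sketch: the `Requirements` fields and the function name are hypothetical, and the precedence (durability before speed) is an assumption made for the example, not an official rule.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    survive_zonal_failure: bool = False
    lowest_latency_scratch: bool = False
    shared_read_only: bool = False
    grow_without_downtime: bool = False


def choose_block_storage(req: Requirements) -> str:
    """Map a set of requirements to the storage option from the matrix."""
    # Assumption: durability requirements outrank raw speed when both are set.
    if req.survive_zonal_failure:
        return "Regional Persistent Disk"
    if req.lowest_latency_scratch:
        return "Local SSD (application handles replication)"
    if req.shared_read_only:
        return "Persistent Disk attached in read-only mode"
    if req.grow_without_downtime:
        return "Persistent Disk (resize the disk, then grow the file system)"
    # Fallback: Balanced PD is described earlier as the default choice.
    return "Balanced Persistent Disk (pd-balanced)"


print(choose_block_storage(Requirements(lowest_latency_scratch=True)))
```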

Exam Tips: Golden Nuggets

  • Local SSD Data Loss: On the exam, if a scenario mentions “ephemeral” or “temporary” data with “extreme performance,” Local SSD is the answer. If it mentions “database data” that must be “highly available,” Local SSD is a distractor—use PD.
  • Resizing: Remember: You can increase the size of a PD while it’s attached and in use, but you can never decrease the size.
  • Multi-Writer: A Persistent Disk can be attached to multiple VMs in read-only mode. Sharing a disk in read/write mode requires multi-writer mode on an SSD PD plus a cluster-aware file system (rarely the answer for ACE).
  • Snapshots: Snapshots are the primary way to migrate disks between regions or to create backups.
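
For the read-only sharing case, a minimal sketch of the commands involved: one `gcloud compute instances attach-disk` call per VM, each with `--mode=ro`. The helper name and the VM and disk names are hypothetical, and the sketch assumes the disk is not attached in read/write mode anywhere.

```python
from typing import List


def attach_read_only_cmds(disk: str, zone: str,
                          instances: List[str]) -> List[List[str]]:
    """Return one attach-disk argument list per VM, all in read-only mode."""
    return [["gcloud", "compute", "instances", "attach-disk", vm,
             f"--disk={disk}", "--mode=ro", f"--zone={zone}"]
            for vm in instances]


# Hypothetical example values; nothing is executed here.
for cmd in attach_read_only_cmds("shared-assets", "us-central1-a",
                                 ["web-1", "web-2"]):
    print(" ".join(cmd))
```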

Architecture Visual: Block Storage Flow

Compute VM attaches to:

  • Local SSD (Physical Link): ⚡ Ultra Low Latency
  • Zonal PD (Network Link): 🛡️ Default Choice
  • Regional PD (Replicated x2 Zones): 🌍 High Availability

Key GCP Services

  • Snapshots: Global resources for backups.
  • Images: Boot disk templates.
  • Cloud Console: Manual disk management.
  • gcloud compute: CLI for automation.

Common Pitfalls

  • Assuming Local SSDs are persistent (they are not!).
  • Trying to move a Zonal PD to another zone without a snapshot.
  • Forgetting to resize the file system after resizing the PD.
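
The last two pitfalls can be sketched as command sequences. This is an illustration, not a complete procedure: the helper names are hypothetical, and the file system step assumes an unpartitioned data disk formatted as ext4 or XFS (a partitioned disk would need its partition grown first).

```python
from typing import List


def move_disk_via_snapshot_cmds(disk: str, src_zone: str, dst_zone: str,
                                snapshot: str,
                                new_disk: str) -> List[List[str]]:
    """Snapshot a zonal PD, then create a new disk from it in another zone."""
    return [
        ["gcloud", "compute", "disks", "snapshot", disk,
         f"--snapshot-names={snapshot}", f"--zone={src_zone}"],
        ["gcloud", "compute", "disks", "create", new_disk,
         f"--source-snapshot={snapshot}", f"--zone={dst_zone}"],
    ]


def grow_filesystem_cmd(fs_type: str, device: str,
                        mount_point: str) -> List[str]:
    """Return the in-guest command that grows the file system after a resize."""
    if fs_type == "ext4":
        return ["sudo", "resize2fs", device]       # ext4 grows by device
    if fs_type == "xfs":
        return ["sudo", "xfs_growfs", mount_point]  # XFS grows by mount point
    raise ValueError(f"unsupported file system: {fs_type}")


# Hypothetical example values; nothing is executed here.
for cmd in move_disk_via_snapshot_cmds("db-data", "us-central1-a",
                                       "us-central1-b", "db-data-snap",
                                       "db-data-b"):
    print(" ".join(cmd))
print(" ".join(grow_filesystem_cmd("ext4", "/dev/sdb", "/mnt/data")))
```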

Architecture Patterns

  • High Perf: Local SSD for RAID 0 scratch space.
  • DR: Regional PD for zero-RPO failover.
  • Cold Storage: Snapshots of Standard PDs for long-term retention.
