VM Images & Snapshots: Persistence and Portability
In Google Cloud Compute Engine, managing the lifecycle of your data and operating systems is critical for both reliability and scalability. While Images serve as the blueprint for creating new virtual machine instances, Snapshots act as point-in-time backups of your existing persistent disks.
The Analogy: The Bakery Concept
Imagine you are running a professional bakery:
- A VM Image is the Recipe: It’s a standardized set of instructions (OS, drivers, pre-installed software) used to bake a fresh batch of cookies (VMs). Every cookie from that recipe starts exactly the same.
- A Snapshot is a Polaroid Photo: Imagine you are halfway through decorating a specific cake. You take a photo of it. If you drop the cake later, you can use that photo to recreate it exactly as it was at that specific moment.
Core Concepts & Detail Elaboration
1. VM Images (The Blueprints)
Images are the primary way to package an operating system and application stack. They are used to create boot disks for your instances.
- Public Images: Provided and maintained by Google or open-source communities (e.g., Debian, Ubuntu, Windows Server).
- Custom Images: Created by you from an existing disk or image. These are essential for Immutable Infrastructure, where you bake your app directly into the image to ensure fast scaling.
- Image Families: A way to group related images. Referencing a family always points to the latest non-deprecated version, simplifying automation.
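The family rule above (a family name resolves to its newest non-deprecated image) can be sketched as a toy model. This is not the Compute Engine API: the `Image` class, its fields, and `resolve_family` are hypothetical names chosen for illustration, and deprecation is reduced to a single boolean.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Image:
    """Hypothetical stand-in for an image record; not the real API object."""
    name: str
    family: str
    created: date
    deprecated: bool = False


def resolve_family(images: List[Image], family: str) -> Optional[Image]:
    """Return the newest non-deprecated image in `family`, or None."""
    candidates = [i for i in images if i.family == family and not i.deprecated]
    return max(candidates, key=lambda i: i.created, default=None)


images = [
    Image("web-v1", "web", date(2024, 1, 10)),
    Image("web-v2", "web", date(2024, 2, 10)),
    Image("db-v1", "db", date(2024, 3, 1)),
]
latest = resolve_family(images, "web")  # picks web-v2

# Rolling back: mark web-v2 deprecated and the family falls back to web-v1.
rolled_back = [replace(i, deprecated=True) if i.name == "web-v2" else i
               for i in images]
previous = resolve_family(rolled_back, "web")
```

Automation that references the family rather than a specific image name picks up the rollback without any change on its side.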
2. Snapshots (The Backups)
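Snapshots are incremental: the first snapshot of a disk is a full copy, and each later snapshot stores only the blocks that have changed since the previous one. This keeps storage costs down and makes later snapshots faster to create.
- Standard Snapshots: Used for backup and disaster recovery. They are stored in a regional or multi-regional location (by default, the multi-region closest to the source disk), so they survive the loss of a single zone.
- Instant Snapshots: Provide very low recovery time objectives (RTO) because the data stays in the same zone or region as the source disk. That locality also makes them less resilient than standard snapshots: they do not protect against the loss of that location.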
- Snapshot Schedules: Automation is key. You can define when and how often snapshots are taken to meet your Recovery Point Objectives (RPO).
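To make the incremental idea concrete, here is a minimal sketch in which a disk is a mapping from block index to contents and each snapshot records only the blocks that differ from the previous one. `SnapshotChain` and its methods are hypothetical names, and the model ignores compression, block sizes, and everything else about how Compute Engine stores snapshots internally.

```python
from typing import Dict, List

Disk = Dict[int, str]  # block index -> block contents


class SnapshotChain:
    """Toy model: each snapshot keeps only blocks changed since the last one."""

    def __init__(self) -> None:
        self.deltas: List[Disk] = []

    def restore(self, index: int) -> Disk:
        """Rebuild the full disk as of snapshot `index` by replaying deltas."""
        disk: Disk = {}
        for delta in self.deltas[: index + 1]:
            disk.update(delta)
        return disk

    def take(self, disk: Disk) -> int:
        """Store only blocks that differ from the latest snapshot's view."""
        previous = self.restore(len(self.deltas) - 1) if self.deltas else {}
        delta = {i: data for i, data in disk.items() if previous.get(i) != data}
        self.deltas.append(delta)
        return len(self.deltas) - 1


chain = SnapshotChain()
disk = {0: "boot", 1: "app-v1", 2: "data-a"}
chain.take(disk)       # first snapshot: every block is new
disk[1] = "app-v2"
chain.take(disk)       # only block 1 changed
disk[2] = "data-b"
chain.take(disk)       # only block 2 changed

stored = [len(delta) for delta in chain.deltas]  # blocks kept per snapshot
```

The first snapshot carries the whole disk and the later ones carry a single block each, yet any of the three can still be restored in full.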
Comparison Table: Images vs. Snapshots vs. Machine Images
| Feature | Custom Image | Standard Snapshot | Machine Image |
|---|---|---|---|
| Primary Purpose | Instance creation & scaling (Golden Image) | Backup, DR, and disk cloning | Full VM cloning (multiple disks + metadata) |
| Storage Format | Compressed, global resource | Incremental, regional/multi-regional | Comprehensive (all disks + config) |
| Cost | Higher (storage per GB) | Lower (incremental storage) | Variable (includes metadata storage) |
| Best Use Case | Managed Instance Groups (MIGs) | Daily database backups | Moving a VM between projects |
Scenario-Based Decision Matrix
If the requirement is… → use this service:
- “I need to create 100 identical web servers in a MIG.” → Use Custom Image.
- “I need a cost-effective backup of my data disk every 24 hours.” → Use Snapshot Schedule.
- “I want to move a VM with 3 attached disks to another region.” → Use Machine Image.
- “I need to version my OS patches so I can roll back easily.” → Use Image Families.
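The matrix above can also be read as a small lookup. The sketch below encodes it as a function; the flag names, the `Resource` enum, and the order in which the checks run are assumptions made for illustration, not an official decision procedure.

```python
from enum import Enum


class Resource(Enum):
    CUSTOM_IMAGE = "Custom Image"
    SNAPSHOT_SCHEDULE = "Snapshot Schedule"
    MACHINE_IMAGE = "Machine Image"
    IMAGE_FAMILY = "Image Family"


def recommend(*, whole_vm: bool = False, recurring_backup: bool = False,
              versioned_rollback: bool = False) -> Resource:
    """Map a requirement to the resource suggested in the matrix."""
    if whole_vm:              # all attached disks plus instance configuration
        return Resource.MACHINE_IMAGE
    if recurring_backup:      # periodic, incremental backup of a disk
        return Resource.SNAPSHOT_SCHEDULE
    if versioned_rollback:    # track patch levels and roll back by family
        return Resource.IMAGE_FAMILY
    return Resource.CUSTOM_IMAGE  # identical boot disks, e.g. for a MIG


mig_choice = recommend()
backup_choice = recommend(recurring_backup=True)
move_choice = recommend(whole_vm=True)
patch_choice = recommend(versioned_rollback=True)
```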
Exam Tips: ACE Golden Nuggets
- Snapshot Consistency: A snapshot of a running disk is only crash-consistent. For database integrity, “quiesce” the application (pause writes and flush buffers to disk) or shut down the VM before taking the snapshot so that it is application-consistent.
- Cross-Project Sharing: Images can be shared across projects using IAM roles (roles/compute.imageUser). This is a common exam scenario for centralized “Image Projects.”
- Incremental Logic: Remember that deleting an intermediate snapshot does not break the chain; GCP automatically moves the data that later snapshots still need into the next snapshot.
- Storage Location: Standard snapshots are stored in a Regional or Multi-regional location, NOT a Zonal one. This provides protection against zonal failures.
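The Incremental Logic tip can be checked with a small model. Here a chain is a list of per-snapshot deltas (block index to contents), and deleting a snapshot hands any block the next snapshot still depends on to that successor. The function names are hypothetical and the model only illustrates the bookkeeping, not how Compute Engine implements it.

```python
from typing import Dict, List

Delta = Dict[int, str]  # blocks changed since the previous snapshot


def restore(deltas: List[Delta], index: int) -> Dict[int, str]:
    """Rebuild the full disk as of snapshot `index`."""
    disk: Dict[int, str] = {}
    for delta in deltas[: index + 1]:
        disk.update(delta)
    return disk


def delete_snapshot(deltas: List[Delta], index: int) -> List[Delta]:
    """Return a new chain without snapshot `index`, keeping later ones intact."""
    remaining = [dict(d) for d in deltas]   # copy so the input is untouched
    removed = remaining.pop(index)
    if index < len(remaining):              # a successor exists
        successor = remaining[index]
        for block, data in removed.items():
            # Keep the block only if the successor did not overwrite it.
            successor.setdefault(block, data)
    return remaining


chain = [
    {0: "boot", 1: "app-v1", 2: "data-a"},  # full first snapshot
    {1: "app-v2"},                          # intermediate snapshot
    {2: "data-b"},                          # latest snapshot
]
latest_before = restore(chain, 2)
shorter = delete_snapshot(chain, 1)         # drop the intermediate snapshot
latest_after = restore(shorter, 1)          # latest is now at index 1
```

The latest snapshot restores to the same disk before and after the deletion, but it has grown by the block it inherited, which is why deleting a snapshot does not always free as much storage as its size suggests.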
Visualizing VM Lifecycle Management
Key GCP Services
- Compute Engine: Core VM hosting.
- Cloud Storage: Backend for image/snapshot data.
- IAM: Controls who can create/use images.
- Resource Manager: Organization-wide image sharing.
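As an illustration of how IAM controls image use, the sketch below adds a member to the `roles/compute.imageUser` binding of a policy held as a plain dictionary. The `bindings` / `role` / `members` layout follows the IAM policy JSON shape; the `grant` helper, the group, and the service account address are hypothetical, and a real policy also carries fields such as `etag` that are omitted here. Applying the result to a project or image would go through gcloud or the API, which this sketch does not attempt.

```python
from copy import deepcopy
from typing import Any, Dict

Policy = Dict[str, Any]


def grant(policy: Policy, role: str, member: str) -> Policy:
    """Return a copy of `policy` in which `member` holds `role`."""
    updated = deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            return updated
    # No binding for this role yet: create one.
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical policy on a central image project.
policy = {
    "bindings": [
        {"role": "roles/compute.imageUser",
         "members": ["group:web-team@example.com"]},
    ]
}
builder = "serviceAccount:builder@example-project.iam.gserviceaccount.com"
new_policy = grant(policy, "roles/compute.imageUser", builder)
```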
Common Pitfalls
- Snapshotting a “busy” database without freezing.
- Not using Image Families for CI/CD pipelines.
- Storing all snapshots in a single region for DR.
- Forgetting to delete old snapshots (cost accumulation).
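For the last pitfall, a retention check is easy to sketch: given snapshot creation times, list the ones older than a retention window. The function, the snapshot names, and the 14-day window are illustrative assumptions; the sketch only selects candidates and leaves the actual deletion to whatever tooling you use (a snapshot schedule's own retention policy can do this for you).

```python
from datetime import datetime, timedelta
from typing import Dict, List


def expired_snapshots(created_at: Dict[str, datetime], now: datetime,
                      max_retention_days: int) -> List[str]:
    """Return names of snapshots created before the retention cutoff."""
    cutoff = now - timedelta(days=max_retention_days)
    return sorted(name for name, ts in created_at.items() if ts < cutoff)


# Hypothetical inventory of daily snapshots.
inventory = {
    "daily-2024-06-01": datetime(2024, 6, 1),
    "daily-2024-06-20": datetime(2024, 6, 20),
    "daily-2024-06-29": datetime(2024, 6, 29),
}
to_delete = expired_snapshots(inventory, now=datetime(2024, 6, 30),
                              max_retention_days=14)
```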
Architecture Patterns
- Golden Image: Pre-configure OS + Security agents.
- Auto-Healing: MIGs use images to recreate failed VMs.
- Cross-Project: Centralized security-hardened image repo.
- Regional DR: Snapshot replication across regions.