Choosing Your NoSQL Path: Firestore vs. Cloud Bigtable
In the world of Google Cloud, “NoSQL” isn’t a one-size-fits-all label. When designing high-scale applications, architects often find themselves at a crossroads between Cloud Firestore and Cloud Bigtable. While both are managed NoSQL databases, they serve radically different purposes.
Firestore is the evolution of Firebase’s Realtime Database. It’s a document-oriented database designed for mobile, web, and server-side development. Its magic lies in its ability to sync data in real-time and provide offline support, making it the go-to for “live” applications like chat apps, user profiles, and inventory management.
On the other hand, Cloud Bigtable is the powerhouse behind Google Search and Maps. It is a wide-column, high-throughput database built for massive analytical and operational workloads. If you are dealing with petabytes of time-series data or need sub-10ms latency for millions of reads and writes per second, Bigtable is your engine. However, with great power comes great responsibility: Bigtable requires careful schema design, particularly around “Row Keys,” to avoid performance bottlenecks.
Understanding the nuance between “Document” and “Wide-Column” is the first step in passing the Professional Cloud Architect exam and, more importantly, building scalable systems in the real world.
Study Guide: NoSQL Mastery
The Analogy
Firestore is like a Digital Filing Cabinet. You have folders (collections) containing documents. Each document can have different fields, and you can easily search for a specific document or a group of documents based on their contents.
Cloud Bigtable is like a Massive Industrial Conveyor Belt. It’s designed for speed and volume. Items (rows) fly past at incredible speeds. To grab the right item, you need to know exactly where it is on the belt (the Row Key). It’s not built for browsing; it’s built for high-speed processing.
Detailed Explanation
- Firestore:
  - Data Model: Documents (JSON-like) stored in Collections.
  - Scaling: Fully serverless. Scales automatically.
  - Features: ACID transactions, multi-region replication, real-time listeners.
  - Queries: Flexible querying with automatic single-field indexing (composite indexes are defined explicitly).
- Cloud Bigtable:
  - Data Model: Sparse, distributed, multi-dimensional sorted map (Wide-column).
  - Scaling: Cluster-based. You provision nodes manually or enable autoscaling; throughput scales roughly linearly with node count.
  - Features: High throughput at low latency. Integrates with Hadoop (through the HBase-compatible API), Dataflow, and Spark.
  - Queries: Primary key (Row Key) lookups and range scans only. No secondary indexes.
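
To make the "sorted map" idea concrete, here is a minimal in-memory sketch of a store that keeps rows ordered by key and only supports key lookups and prefix scans. It is a toy stand-in for illustration, not the Bigtable client API; the class name `SortedRowStore` and the `sensor-a#0001` key layout are assumptions invented for this example.

```python
from bisect import bisect_left


class SortedRowStore:
    """Toy stand-in for a wide-column table whose rows are sorted by key."""

    def __init__(self):
        self._keys = []  # row keys, always kept in sorted order
        self._rows = {}  # row key -> {column: value}

    def put(self, key, columns):
        # Insert new keys at their sorted position; rows are sparse,
        # so only the columns actually supplied are stored.
        if key not in self._rows:
            self._keys.insert(bisect_left(self._keys, key), key)
            self._rows[key] = {}
        self._rows[key].update(columns)

    def get(self, key):
        # Point lookup by exact row key.
        return self._rows.get(key)

    def scan_prefix(self, prefix):
        # Rows sharing a prefix are contiguous in sorted order, so a scan
        # jumps to the first candidate and stops at the first non-match.
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._rows[self._keys[i]]
            i += 1


store = SortedRowStore()
store.put("sensor-b#0002", {"temp": 19.5})
store.put("sensor-a#0001", {"temp": 21.0})
store.put("sensor-a#0002", {"temp": 21.4, "humidity": 40})

sensor_a_rows = list(store.scan_prefix("sensor-a#"))
```

Because there is no index on `temp` or `humidity`, a question like "which rows have humidity above 35" would require reading every row. That is the practical meaning of "no secondary indexes", and it is why the row key has to encode the access pattern.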
Real-World Scenarios
Scenario A: A global retail app needs to store user shopping carts and sync them across devices instantly.
Solution: Firestore (Real-time sync and document structure fit perfectly).
Scenario B: A financial institution needs to ingest billions of stock price ticks per day and keep them available for algorithmic analysis.
Solution: Cloud Bigtable (High write throughput and time-series optimization).
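
For a tick store like Scenario B, most of the design effort goes into the row key. The sketch below shows one possible layout, assuming reads are mostly "latest ticks for one symbol": the symbol comes first so writes spread across symbols instead of piling onto the newest timestamp, and the timestamp is reversed so the most recent tick sorts first. The function name `tick_row_key`, the `#` separator, and the 13-digit bound are assumptions for this example, not a prescribed Bigtable format.

```python
# Largest 13-digit number; assumed to exceed any epoch-millisecond
# timestamp the system will ever store.
MAX_TS_MILLIS = 9_999_999_999_999


def tick_row_key(symbol: str, ts_millis: int) -> str:
    """Build a row key of the form SYMBOL#<reversed timestamp>."""
    if not 0 <= ts_millis <= MAX_TS_MILLIS:
        raise ValueError("timestamp out of supported range")
    reversed_ts = MAX_TS_MILLIS - ts_millis
    # Zero-pad so that lexicographic order matches numeric order.
    return f"{symbol}#{reversed_ts:013d}"


timestamps = [1_700_000_000_000, 1_700_000_001_000, 1_700_000_002_000]
keys = sorted(tick_row_key("GOOG", t) for t in timestamps)
newest_first = keys[0]
```

With this layout, a prefix scan on `GOOG#` returns ticks from newest to oldest. If a single symbol is itself hot enough to overload one node, a salt prefix (for example a small hash bucket) can be added in front, at the cost of fanning each read out over all buckets.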
Comparison Table
| Feature | Firestore | Cloud Bigtable | AWS Equivalent |
|---|---|---|---|
| Type | Document (NoSQL) | Wide-column (NoSQL) | DynamoDB / DocumentDB |
| Transactions | Multi-document ACID | Single-row only | DynamoDB Transactions |
| Latency | Low (ms) | Ultra-low (<10ms) | DynamoDB DAX |
| Scaling | Serverless / Auto | Provisioned Nodes | DynamoDB (Provisioned/On-demand) |
Interview Questions & Answers
- Q: When should I choose Firestore over Bigtable?
A: Choose Firestore for mobile/web apps, small to medium datasets, and when you need complex queries or ACID transactions.
- Q: What is a “Hotspot” in Bigtable?
A: A hotspot occurs when too many reads/writes hit a single node, usually caused by a poorly designed Row Key (e.g., using a timestamp as the prefix).
- Q: Does Firestore support offline data?
A: Yes, Firestore has built-in SDK support for local data persistence and synchronization when the device reconnects.
- Q: How do you scale Bigtable?
A: By adding more nodes to the cluster. Each node provides approximately 10,000 QPS (for writes).
- Q: Can Bigtable be used for global apps?
A: Yes, via Multi-cluster replication, which provides high availability and eventual consistency across regions.
- Q: What is the maximum size of a Firestore document?
A: 1 MiB.
- Q: Does Bigtable support SQL?
A: No native SQL, but you can use the Bigtable HBase-compatible API or query it via BigQuery as an external data source.
- Q: What is Firestore “Native Mode” vs “Datastore Mode”?
A: Native Mode is for mobile/real-time apps (Firebase features). Datastore Mode is for high-concurrency server-side workloads (backwards compatible with App Engine Datastore).
- Q: How are costs calculated in Firestore?
A: Based on the number of document reads, writes, and deletes, plus storage.
- Q: Why would you use Bigtable for IoT data?
A: Because IoT generates massive streams of time-series data that require high-speed ingestion and low-latency retrieval for analysis.
Interview Golden Nuggets
- The Row Key is King: In Bigtable interviews, always mention that Row Key design is the most critical task. Avoid sequential keys; use salted or reversed timestamps.
- Firestore Limits: Remember the sustained write limit of roughly 1 write per second to a single document. It is a soft limit that has been relaxed over time, but it remains a common “gotcha” for high-frequency updates such as counters.
- Cost Pivot: If a customer complains about Firestore costs for a high-throughput analytical load, suggest Bigtable or BigQuery. Firestore bills per document read, so “scanning” millions of documents gets expensive quickly.
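
A common way around the per-document write limit is to shard a hot value across several documents and sum the shards on read. The sketch below simulates that pattern in memory: each dictionary entry stands in for one Firestore document, and no Firestore SDK calls are made. The class name `ShardedCounter` and the default of 10 shards are assumptions for illustration.

```python
import random


class ShardedCounter:
    """In-memory simulation of a counter split across several documents."""

    def __init__(self, num_shards: int = 10):
        # Each entry plays the role of one shard document.
        self.shards = {f"shard_{i}": 0 for i in range(num_shards)}

    def increment(self, amount: int = 1) -> None:
        # Pick a random shard so that writes spread across documents
        # instead of contending on a single one.
        shard_id = random.choice(list(self.shards))
        self.shards[shard_id] += amount

    def total(self) -> int:
        # Reading the counter means reading and summing every shard.
        return sum(self.shards.values())


counter = ShardedCounter(num_shards=5)
for _ in range(1000):
    counter.increment()
page_views = counter.total()
```

The trade-off is that writes get cheaper to sustain while reads get more expensive: with N shards, the sustainable write rate grows roughly N times, but every read of the total touches N documents.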
NoSQL Architectural Overview
Firestore: Integrates with Firebase Auth, Cloud Functions, and App Check.
Bigtable: Integrates with Dataflow for ETL, Dataproc for Hadoop, and BigQuery for federated queries.
Firestore: Automatic horizontal scaling. 10,000 writes/sec per database (soft limit).
Bigtable: Linear scaling. Add nodes to increase throughput. Handles millions of requests/sec.
Firestore: Pay-per-op. Use “TTL” (Time to Live) to auto-delete old docs and save on storage.
Bigtable: Pay-per-node/hour + storage. Use SSD for most production workloads; HDD suits large, latency-tolerant datasets that are read infrequently.
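
Because Bigtable throughput scales roughly linearly with node count, a first-pass cluster size is simple arithmetic. The sketch below uses the approximate figure of 10,000 QPS per node quoted in this guide and assumes you want each node to run at no more than 70% of that capacity. Both numbers are planning assumptions to be replaced with current documentation and your own load tests, and `estimate_nodes` is a hypothetical helper, not part of any Google Cloud library.

```python
def estimate_nodes(target_qps: int,
                   qps_per_node: int = 10_000,
                   target_utilization_pct: int = 70) -> int:
    """Return the node count needed to serve target_qps with headroom."""
    if target_qps <= 0:
        return 1
    # Usable capacity per node is qps_per_node * pct / 100.
    # Integer ceiling division avoids floating-point rounding surprises.
    numerator = target_qps * 100
    denominator = qps_per_node * target_utilization_pct
    return max(1, -(-numerator // denominator))


nodes_for_one_million_qps = estimate_nodes(1_000_000)  # 143
nodes_for_small_load = estimate_nodes(5_000)           # 1
```

Since billing is per node-hour, the same number feeds directly into a cost estimate: multiply the node count by the hourly node price for your region and add storage.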
Decision Tree: Which NoSQL?
- Need multi-document ACID transactions? → Firestore
- Need Offline Sync? → Firestore
- Need > 10TB of data? → Bigtable
- Need < 10ms latency for huge writes? → Bigtable
- Need flexible, ad-hoc queries? → Firestore
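
The decision tree can also be written down as a small function, which makes conflicting requirements explicit. This is a sketch of the checklist above, not an official selection tool; the `Workload` fields and the return strings are names made up for this example, and the 10 TB threshold is taken directly from the list.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    needs_acid_transactions: bool = False
    needs_offline_sync: bool = False
    needs_flexible_queries: bool = False
    needs_low_latency_heavy_writes: bool = False
    data_size_tb: float = 0.0


def choose_nosql(w: Workload) -> str:
    """Apply the decision tree and report conflicts instead of guessing."""
    wants_firestore = (w.needs_acid_transactions
                       or w.needs_offline_sync
                       or w.needs_flexible_queries)
    wants_bigtable = (w.data_size_tb > 10
                      or w.needs_low_latency_heavy_writes)

    if wants_firestore and wants_bigtable:
        return "Conflict: consider splitting the workload"
    if wants_firestore:
        return "Firestore"
    if wants_bigtable:
        return "Bigtable"
    return "Undecided: gather more requirements"


shopping_cart = Workload(needs_offline_sync=True, needs_acid_transactions=True)
tick_store = Workload(needs_low_latency_heavy_writes=True, data_size_tb=500)
cart_choice = choose_nosql(shopping_cart)
tick_choice = choose_nosql(tick_store)
```

A conflict result is useful in its own right: it usually points to an architecture where Firestore serves the user-facing data and Bigtable (or BigQuery) takes the high-volume analytical stream.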