
URL: https://dev.to/humzakt/the-firestore-default-database-trap-why-your-data-is-going-to-the-wrong-place-2kea

The Firestore Default Database Trap: Why Your Data Is Going to the Wrong Place


Firestore has a (default) database. If you don't explicitly specify which database to use, everything routes there. We had multiple Firestore databases in production, but several code paths were accidentally hitting the default.

This guide covers how Firestore's default database works, how to detect misrouting, and how to fix it in Python and JavaScript/React.

The Problem: Silent Data Misrouting

We use multiple Firestore databases for tenant isolation. Each evaluation tenant has its own database:

evaluations-db-prod-tenant-1
evaluations-db-prod-tenant-2
evaluations-db-prod-tenant-3
...
evaluations-db-prod-tenant-12

But some code paths were missing explicit database references:

# ❌ BAD: Routes to (default) database
from google.cloud import firestore

db = firestore.Client() # No database specified!
db.collection("evaluations").add({"score": 0.95})

This code writes to (default), not the tenant-specific database. The bug was silent: no errors, just data in the wrong place.

How Firestore Defaults Work

A Firestore client ends up pointed at a database in one of two ways:

1. Explicit Database Reference (Correct)

from google.cloud import firestore

# Specify database ID explicitly
db = firestore.Client(database="evaluations-db-prod-tenant-1")
db.collection("evaluations").add({"score": 0.95})

2. Default Database (What Happens If You Don't Specify)

# No database specified → routes to (default)
db = firestore.Client() # Uses (default) database!

The (default) database is usually created when Firestore is first set up in a project. Once it exists, any client that omits a database ID uses it, whether or not you ever intended to store data there.
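The fallback can be illustrated without touching Firestore at all. The sketch below is not the client library's code; `resolve_database_id` and `require_database_id` are hypothetical helpers that mimic the permissive behaviour and show a strict alternative.

```python
from typing import Optional

DEFAULT_DATABASE = "(default)"  # the ID used when no database is given


def resolve_database_id(database: Optional[str] = None) -> str:
    """Mimic the permissive behaviour: a missing ID falls back to (default)."""
    return database or DEFAULT_DATABASE


def require_database_id(database: Optional[str]) -> str:
    """Strict variant (hypothetical helper): refuse missing or default IDs."""
    if not database or database == DEFAULT_DATABASE:
        raise ValueError("an explicit, non-default database ID is required")
    return database
```

Calling `require_database_id` before constructing a client turns the silent fallback into an immediate error.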

Detecting Misrouting

Method 1: Check Database Usage in Console

Go to Firebase Console → Firestore → Databases. Check if (default) has unexpected collections:

(default) database:
 - evaluations (shouldn't be here!)
 - scores (shouldn't be here!)

evaluations-db-prod-tenant-1:
 - evaluations (correct)
 - scores (correct)

Method 2: Query Default Database Explicitly

from google.cloud import firestore

# Check what's in (default)
default_db = firestore.Client(database="(default)")
evaluations = default_db.collection("evaluations").stream()

for doc in evaluations:
    print(f"Found evaluation in default DB: {doc.id}")
    # This shouldn't exist!
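Checking one collection name at a time is easy to get wrong, so a broader sweep may help. The sketch below lists every top-level collection a client can see and reports those outside an allowlist. `find_unexpected_collections` is a hypothetical helper; it assumes only that the client exposes `collections()` yielding objects with an `id` attribute, as `firestore.Client` does.

```python
from typing import Iterable, List


def find_unexpected_collections(client, allowed: Iterable[str] = ()) -> List[str]:
    """Return top-level collection IDs that are not in the allowlist.

    `client` is anything with a collections() method whose items have an
    `id` attribute, e.g. firestore.Client(database="(default)").
    """
    allowed_set = set(allowed)
    return sorted(
        col.id for col in client.collections() if col.id not in allowed_set
    )
```

A non-empty result for the (default) database suggests that some code path is still writing there.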

Method 3: Add Logging

import logging

from google.cloud import firestore

def get_firestore_client(database_id: str):
    """Get Firestore client with explicit database."""
    if not database_id:
        logging.error("Database ID is required!")
        raise ValueError("Database ID must be specified")

    client = firestore.Client(database=database_id)
    logging.info(f"Using Firestore database: {database_id}")
    return client

# Usage
db = get_firestore_client("evaluations-db-prod-tenant-1")

Fixing Python Code

Pattern 1: Environment Variable

import os
from google.cloud import firestore

# Get database ID from environment
DATABASE_ID = os.getenv("FIRESTORE_DATABASE_ID")
if not DATABASE_ID:
    raise ValueError("FIRESTORE_DATABASE_ID environment variable must be set")

db = firestore.Client(database=DATABASE_ID)
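An empty check catches a missing variable but not a malformed one. The sketch below adds a format check as well. The regular expression reflects my understanding of Firestore's naming rules for named databases (4 to 63 characters, lowercase letters, digits and hyphens, starting with a letter, ending with a letter or digit), so treat it as an assumption and verify it against the current documentation. `database_id_from_env` is a hypothetical helper.

```python
import os
import re
from typing import Mapping, Optional

# Assumed naming rule for named databases; (default) deliberately fails it.
_DATABASE_ID_RE = re.compile(r"^[a-z][a-z0-9-]{2,61}[a-z0-9]$")


def database_id_from_env(
    env: Optional[Mapping[str, str]] = None,
    var: str = "FIRESTORE_DATABASE_ID",
) -> str:
    """Read and validate the database ID from an environment mapping."""
    env = os.environ if env is None else env
    value = (env.get(var) or "").strip()
    if not value:
        raise ValueError(f"{var} environment variable must be set")
    if not _DATABASE_ID_RE.match(value):
        raise ValueError(f"{var}={value!r} does not look like a named database ID")
    return value
```

Passing the mapping in as a parameter keeps the function easy to test without mutating `os.environ`.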

Pattern 2: Configuration Class

from dataclasses import dataclass
from typing import Optional

from google.cloud import firestore

@dataclass
class FirestoreConfig:
    database_id: str
    project_id: Optional[str] = None

    def get_client(self):
        """Get Firestore client with explicit database."""
        kwargs = {"database": self.database_id}
        if self.project_id:
            kwargs["project"] = self.project_id
        return firestore.Client(**kwargs)

# Usage
config = FirestoreConfig(
    database_id="evaluations-db-prod-tenant-1",
    project_id="my-project"
)
db = config.get_client()

Pattern 3: Factory Function

from google.cloud import firestore
from functools import lru_cache

@lru_cache(maxsize=12)  # Cache clients for each database
def get_firestore_client(database_id: str) -> firestore.Client:
    """Get Firestore client for a specific database."""
    if not database_id:
        raise ValueError("database_id is required")

    return firestore.Client(database=database_id)

# Usage
db = get_firestore_client("evaluations-db-prod-tenant-1")
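One caveat with `lru_cache(maxsize=12)`: once more than 12 distinct database IDs are requested, the least recently used client is evicted and later rebuilt. A possible alternative, sketched below, is a small registry that keeps one client per ID for the life of the process. `ClientRegistry` is a hypothetical class, not part of any Firestore SDK, and the factory is injected so the sketch runs without the library installed.

```python
import threading
from typing import Callable, Dict, Generic, TypeVar

C = TypeVar("C")


class ClientRegistry(Generic[C]):
    """Keep one client per database ID (hypothetical helper)."""

    def __init__(self, factory: Callable[[str], C]):
        # In real use: lambda db_id: firestore.Client(database=db_id)
        self._factory = factory
        self._clients: Dict[str, C] = {}
        self._lock = threading.Lock()

    def get(self, database_id: str) -> C:
        """Return the cached client for database_id, creating it on first use."""
        if not database_id or database_id == "(default)":
            raise ValueError("an explicit, non-default database ID is required")
        with self._lock:
            if database_id not in self._clients:
                self._clients[database_id] = self._factory(database_id)
            return self._clients[database_id]
```

The lock only guards the dictionary, so two threads asking for the same ID at startup still end up sharing a single client.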

Fixing JavaScript/React Code

Frontend: Firebase SDK

// ❌ BAD: Uses default database
import { getFirestore } from 'firebase/firestore';

const db = getFirestore(app); // No database specified!

// ✅ GOOD: Explicit database reference
import { getFirestore } from 'firebase/firestore';

const db = getFirestore(app, 'evaluations-db-prod-tenant-1');

React Hook Pattern

import { useState, useEffect } from 'react';
import { getFirestore, collection, onSnapshot } from 'firebase/firestore';
import { app } from './firebase-config';

function useFirestoreCollection(collectionName, databaseId) {
  const [data, setData] = useState([]);
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState(null);

  useEffect(() => {
    if (!databaseId) {
      setError('Database ID is required');
      setLoading(false); // Otherwise the caller shows "Loading..." forever
      return;
    }

    // Explicit database reference
    const db = getFirestore(app, databaseId);
    const colRef = collection(db, collectionName);

    const unsubscribe = onSnapshot(
      colRef,
      (snapshot) => {
        const docs = snapshot.docs.map(doc => ({
          id: doc.id,
          ...doc.data()
        }));
        setData(docs);
        setLoading(false);
      },
      (err) => {
        setError(err.message);
        setLoading(false);
      }
    );

    return () => unsubscribe();
  }, [collectionName, databaseId]);

  return { data, loading, error };
}

// Usage
function EvaluationsList({ tenantId }) {
  const databaseId = `evaluations-db-prod-tenant-${tenantId}`;
  const { data, loading, error } = useFirestoreCollection('evaluations', databaseId);

  if (loading) return <div>Loading...</div>;
  if (error) return <div>Error: {error}</div>;

  return (
    <ul>
      {/* `eval` cannot be used as a binding name in module (strict) code */}
      {data.map(evaluation => (
        <li key={evaluation.id}>{evaluation.id}</li>
      ))}
    </ul>
  );
}

Backend: Admin SDK

// ❌ BAD: Uses default database
const admin = require('firebase-admin');
const db = admin.firestore();

// ✅ GOOD: Explicit database reference
// (needs an initialized app and a firebase-admin version with multi-database support)
const { getFirestore } = require('firebase-admin/firestore');
const db = getFirestore('evaluations-db-prod-tenant-1');

Real-Time Listeners on Non-Default Databases

Real-time listeners must also specify the database:

// ❌ BAD: Listens to default database
import { getFirestore, collection, onSnapshot } from 'firebase/firestore';

const db = getFirestore(app); // Missing database ID!
const colRef = collection(db, 'evaluations');

onSnapshot(colRef, (snapshot) => {
  // This listens to (default) database, not tenant-specific!
});

// ✅ GOOD: Explicit database in listener
const db = getFirestore(app, 'evaluations-db-prod-tenant-1');
const colRef = collection(db, 'evaluations');

onSnapshot(colRef, (snapshot) => {
  // Now listening to the correct database
});

Auto-Export Configuration

If you run scheduled Firestore exports (for backups or analytics), make sure each one targets the correct database:

# Export specific database (not default)
gcloud firestore export gs://BUCKET_NAME/export \
  --database=evaluations-db-prod-tenant-1 \
  --collection-ids=evaluations,scores

Gotcha: The flag is --database and it takes a single database ID. If you omit it, the export targets (default), so a tenant database can go unexported without any error. Each database needs its own export command.
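With a dozen tenant databases, generating the export commands is less error-prone than typing them. The sketch below only builds argument lists and does not run anything. It assumes the `--database` flag of `gcloud firestore export`; the per-database output prefix and both function names are hypothetical choices, not anything from the original setup.

```python
from typing import Iterable, List, Sequence


def build_export_command(
    output_uri: str, database_id: str, collection_ids: Sequence[str] = ()
) -> List[str]:
    """Build one `gcloud firestore export` invocation for a single database."""
    if not database_id:
        raise ValueError("database_id is required")
    cmd = ["gcloud", "firestore", "export", output_uri, f"--database={database_id}"]
    if collection_ids:
        cmd.append("--collection-ids=" + ",".join(collection_ids))
    return cmd


def build_export_commands(
    bucket_prefix: str, database_ids: Iterable[str], collection_ids: Sequence[str] = ()
) -> List[List[str]]:
    """One command per database, each writing under its own prefix (assumed layout)."""
    return [
        build_export_command(f"{bucket_prefix}/{db_id}", db_id, collection_ids)
        for db_id in database_ids
    ]
```

Each list can be handed to `subprocess.run(cmd, check=True)` once the output looks right.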

Migration: Moving Data from Default

If you've already written data to (default), you need to migrate it:

from google.cloud import firestore

def migrate_from_default(target_database_id: str, collection_name: str):
    """Migrate data from (default) to target database."""
    default_db = firestore.Client(database="(default)")
    target_db = firestore.Client(database=target_database_id)

    # Read from default
    docs = default_db.collection(collection_name).stream()

    # Write to target
    batch = target_db.batch()
    count = 0

    for doc in docs:
        batch.set(
            target_db.collection(collection_name).document(doc.id),
            doc.to_dict()
        )
        count += 1

        # Commit every 500 writes (the historical per-batch limit, still a safe chunk size)
        if count % 500 == 0:
            batch.commit()
            batch = target_db.batch()
            print(f"Migrated {count} documents...")

    # Commit remaining
    if count % 500 != 0:
        batch.commit()

    print(f"Migration complete: {count} documents migrated")

# Usage
migrate_from_default("evaluations-db-prod-tenant-1", "evaluations")
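Before deleting anything from (default), it is worth confirming that the copy is complete. The sketch below compares document IDs between the two databases for one collection. `missing_document_ids` is a hypothetical helper that relies only on `collection(name).stream()` and `doc.id`. Streaming a collection returns only that collection's documents, so subcollections are neither copied by the migration above nor checked here.

```python
from typing import List


def missing_document_ids(source_client, target_client, collection_name: str) -> List[str]:
    """Return IDs present in the source collection but absent from the target.

    Both clients are expected to behave like firestore.Client for
    collection(name).stream(); only document IDs are compared, not contents.
    """
    source_ids = {doc.id for doc in source_client.collection(collection_name).stream()}
    target_ids = {doc.id for doc in target_client.collection(collection_name).stream()}
    return sorted(source_ids - target_ids)
```

An empty list means every source ID exists in the target; it does not prove the field values match.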

Prevention: Code Review Checklist

Add this to your PR template:

## Firestore Database Checklist
- [ ] All Firestore clients specify explicit `database` parameter
- [ ] No `firestore.Client()` calls without database ID
- [ ] Environment variables set for database IDs
- [ ] Frontend real-time listeners specify database
- [ ] Tests use explicit database IDs (not default)
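The first two checklist items can be partly automated. The sketch below uses Python's `ast` module to flag `firestore.Client(...)` calls that pass no `database=` keyword. `find_default_client_calls` is a hypothetical helper with known blind spots: it only recognises the literal `firestore.Client` spelling, ignores a database passed positionally, and skips calls using `**kwargs` because they cannot be judged statically.

```python
import ast
from typing import List


def find_default_client_calls(source: str) -> List[int]:
    """Return line numbers of firestore.Client(...) calls lacking database=."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_client = (
            isinstance(func, ast.Attribute)
            and func.attr == "Client"
            and isinstance(func.value, ast.Name)
            and func.value.id == "firestore"
        )
        if not is_client:
            continue
        has_database = any(kw.arg == "database" for kw in node.keywords)
        has_splat = any(kw.arg is None for kw in node.keywords)  # **kwargs
        if not has_database and not has_splat:
            hits.append(node.lineno)
    return sorted(hits)
```

Running it over each changed file in CI and failing on a non-empty result would back up the manual review.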

Complete Example: Production-Ready Pattern

import os
from typing import Optional
from google.cloud import firestore
from functools import lru_cache
import logging

class FirestoreManager:
    """Manages Firestore clients with explicit database references."""

    def __init__(self, project_id: Optional[str] = None):
        self.project_id = project_id or os.getenv("GCP_PROJECT_ID")
        if not self.project_id:
            raise ValueError("GCP_PROJECT_ID must be set")

    @lru_cache(maxsize=20)
    def get_client(self, database_id: str) -> firestore.Client:
        """Get Firestore client for a specific database."""
        if not database_id:
            raise ValueError("database_id is required")

        if database_id == "(default)":
            logging.warning("Using (default) database - is this intentional?")

        client = firestore.Client(
            project=self.project_id,
            database=database_id
        )
        logging.info(f"Created Firestore client for database: {database_id}")
        return client

    def get_tenant_database(self, tenant_id: str) -> firestore.Client:
        """Get Firestore client for a tenant-specific database."""
        database_id = f"evaluations-db-prod-tenant-{tenant_id}"
        return self.get_client(database_id)

# Usage
manager = FirestoreManager()

# Get tenant-specific database
db = manager.get_tenant_database("1")
db.collection("evaluations").add({"score": 0.95})

# Explicit database
db = manager.get_client("evaluations-db-prod-tenant-2")
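`get_tenant_database` formats whatever string it receives, so a typo in the tenant ID yields a database ID that does not exist. One way to guard against that, sketched below, is to validate the tenant against a known set before building the ID. `tenant_database_id` is a hypothetical helper, and the set of valid tenants is passed in by the caller because it has to come from your own tenant registry.

```python
from typing import Collection, Union

DATABASE_TEMPLATE = "evaluations-db-prod-tenant-{tenant}"


def tenant_database_id(tenant_id: Union[int, str], known_tenants: Collection[int]) -> str:
    """Build the tenant database ID, rejecting unknown or malformed tenants."""
    try:
        tenant = int(str(tenant_id).strip())
    except ValueError:
        raise ValueError(f"tenant_id must be an integer, got {tenant_id!r}") from None
    if tenant not in known_tenants:
        raise ValueError(f"unknown tenant: {tenant}")
    return DATABASE_TEMPLATE.format(tenant=tenant)
```

For example, `tenant_database_id("1", {1, 2, 3, 12})` returns `evaluations-db-prod-tenant-1`, while an unlisted or non-numeric tenant raises `ValueError` before any client is created.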

TL;DR

| Problem | Solution |
| --- | --- |
| Code routes to (default) database | Always specify the database parameter |
| Silent data misrouting | Check console, add logging, query default DB |
| Frontend uses wrong database | Pass database ID to getFirestore() |
| Real-time listeners on wrong DB | Specify database in listener setup |
| Export targets the wrong database | Pass --database to gcloud firestore export |

Key takeaway: Never rely on Firestore defaults. Always specify the database ID explicitly. The (default) database is a trap: any client created without a database ID quietly uses it.


More production GCP articles on my blog. I write about patterns from real infrastructure — find me at humzakt.github.io.