Firestore Restore Process Taking Very Long

I am testing out the new backup process as detailed here: https://cloud.google.com/firestore/docs/backups

I am testing this on a rather large firestore database. The backup creation was successful, but when I test out the restore into a different db, it is taking more than 24 hours. When I go to the database browse page, I get "400: Cannot serve requests when the database is undergoing a restore."

Is this to be expected? Taking that long for a restore isn't acceptable in an emergency.

Firestore restores can indeed take a considerable amount of time, particularly for large databases. The duration is influenced by several factors, and understanding these can help in planning more effective recovery strategies.

Factors Influencing Restore Time:

  • Database Size: The primary factor affecting restore time is the volume of data being copied and re-indexed.

  • Complexity: Databases with numerous collections, documents, complex relationships, and extensive indexing require more processing time during restoration.

  • Resource Competition: Restoration operations might compete with other activities in your database or be limited by the overall resource availability within Google Cloud.

Challenges for Quick Emergency Response

Firestore's backup and restore mechanism is optimized for disaster recovery rather than immediate emergency failover. This distinction is crucial for planning your data recovery strategy.

Strategies for Faster Recovery

  • Proactive Planning: Establish your Recovery Time Objective (RTO) and Recovery Point Objective (RPO) well in advance. These metrics are essential for guiding your disaster recovery strategy and should include regular testing of restore processes.

  • Regular Testing: Conduct tests on smaller subsets of your data or within a staging environment to get realistic estimates of recovery times and identify potential bottlenecks.

  • Database Sharding: Splitting your database across multiple Firestore instances can facilitate faster, parallel restoration of smaller segments.

  • Complementary Real-time Replication: Implementing a system for continuous data replication to a secondary database can ensure data is readily available for quick failover, albeit at the cost of added complexity (a minimal sketch of this approach follows this list).

  • Custom Export/Import Solutions: Developing custom scripts tailored to your data structure may offer speed advantages in extremely time-sensitive recovery scenarios, though this requires a significant development effort.

  • Change Data Capture (CDC): For scenarios where near-zero data loss is imperative, CDC systems that continuously stream database changes to a replica can provide a near-real-time failover option, though they are complex to implement.
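
As a minimal illustration of the real-time replication idea (and of the change streaming that underlies CDC), the Python sketch below mirrors a few collections from a primary project into a secondary one using snapshot listeners. The service-account file names and collection names are placeholders, and this is a sketch rather than a production pipeline: listeners do not replay changes missed while the process is down and provide no ordering or exactly-once guarantees.

import threading

import firebase_admin
from firebase_admin import credentials, firestore

# Placeholder service-account files for the primary and secondary projects
primary_app = firebase_admin.initialize_app(
    credentials.Certificate('primary-sa.json'), name='primary')
secondary_app = firebase_admin.initialize_app(
    credentials.Certificate('secondary-sa.json'), name='secondary')

primary_db = firestore.client(app=primary_app)
secondary_db = firestore.client(app=secondary_app)

collections_to_mirror = ['users', 'products']  # placeholder collection names

def make_callback(collection_id):
    def on_snapshot(col_snapshot, changes, read_time):
        for change in changes:
            target = secondary_db.collection(collection_id).document(change.document.id)
            if change.type.name in ('ADDED', 'MODIFIED'):
                target.set(change.document.to_dict())  # copy the latest version
            elif change.type.name == 'REMOVED':
                target.delete()
    return on_snapshot

watches = [
    primary_db.collection(cid).on_snapshot(make_callback(cid))
    for cid in collections_to_mirror
]

threading.Event().wait()  # keep the process alive so listeners keep streaming

Gaps like these are why a listener-based mirror complements, rather than replaces, the managed backup and restore feature.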

Got it, that makes sense. I'm also doing a manual import/export, and that seems to be taking about the same amount of time.

Is there a way to back up only a subset of Firestore collections using the Back up and restore data method?

Unfortunately, Firestore's built-in "Back up and restore data" method does not support backing up only a subset of collections directly. This design choice ensures that Firestore's backup system can provide complete database snapshots, which are crucial for maintaining consistency and enabling reliable full restoration when needed.

Alternative Strategies for Partial Backups

Manual Export/Import at Collection Level: Leverage the gcloud firestore export and gcloud firestore import commands with the --collection-ids flag to target specific collections. This approach allows for selective backups and restorations.

gcloud firestore export gs://[BUCKET_NAME] --collection-ids=[COLLECTION_ID_1],[COLLECTION_ID_2]
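
The corresponding selective restore uses gcloud firestore import with the same flag; here [EXPORT_PREFIX] stands for the timestamped folder that the export operation created in the bucket:

gcloud firestore import gs://[BUCKET_NAME]/[EXPORT_PREFIX] --collection-ids=[COLLECTION_ID_1],[COLLECTION_ID_2]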

Custom Scripting: Gain more tailored control by utilizing Firestore client libraries to selectively fetch and serialize data from specific collections. Below is an illustrative Python example:

 
import json

import firebase_admin
from firebase_admin import credentials, firestore

# Initialize the Admin SDK and obtain a Firestore client
cred = credentials.Certificate('path/to/your/serviceAccountKey.json')
firebase_admin.initialize_app(cred)
db = firestore.client()

collections_to_backup = ['users', 'products']
for collection_id in collections_to_backup:
    # stream() returns the documents in this collection (not their subcollections)
    docs = db.collection(collection_id).stream()
    data = {doc.id: doc.to_dict() for doc in docs}
    # Store 'data' in your preferred format and location; here, one JSON file
    # per collection (default=str handles timestamps and other non-JSON types)
    with open(f'{collection_id}.backup.json', 'w') as f:
        json.dump(data, f, default=str)
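
Going the other way, here is a minimal sketch of loading one of those JSON dumps back into its collection. The file name matches the per-collection files written above and the same service account is assumed; Firestore batched writes are capped at 500 operations, hence the chunking, and values serialized with default=str (such as timestamps) come back as plain strings.

import json

import firebase_admin
from firebase_admin import credentials, firestore

cred = credentials.Certificate('path/to/your/serviceAccountKey.json')
firebase_admin.initialize_app(cred)
db = firestore.client()

collection_id = 'users'  # placeholder: restore one collection per run
with open(f'{collection_id}.backup.json') as f:
    data = json.load(f)  # {document_id: document_fields}

batch = db.batch()
pending = 0
for doc_id, fields in data.items():
    batch.set(db.collection(collection_id).document(doc_id), fields)
    pending += 1
    if pending == 500:  # Firestore allows at most 500 writes per batch
        batch.commit()
        batch = db.batch()
        pending = 0
if pending:
    batch.commit()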