Description:
I am working on a project where I need to insert more than 1 million records into Google Firestore. My current approach is far too slow for this volume of data, and I am looking for a way to optimize the process.
What I’ve tried:
- Individual inserts: I tried inserting the records one by one using a loop, but this is very slow.
- Batch writes: I attempted to use batch writes, but there is a limit on the number of operations allowed in a single batch (500, as far as I can tell). My attempt is sketched right after this list.
- Firestore SDK for Node.js: I have been using the Firestore SDK for Node.js to manage the inserts.
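For reference, my batched-write attempt looked roughly like the sketch below. Treat it as a sketch only: the chunk size of 500 reflects the batch limit as I understand it, and 'my_collection' is just a placeholder collection name.

const { Firestore } = require('@google-cloud/firestore');

const db = new Firestore();

// Write items in chunks so that no single batch exceeds the operation limit.
async function insertInBatches(items, chunkSize = 500) {
  for (let i = 0; i < items.length; i += chunkSize) {
    const batch = db.batch();
    for (const item of items.slice(i, i + chunkSize)) {
      // Queue the write locally against an auto-generated document ID;
      // nothing is sent to Firestore until commit() is called.
      batch.set(db.collection('my_collection').doc(), item);
    }
    // One network round trip per chunk, but the chunks still run sequentially.
    await batch.commit();
  }
}

Even with batching, this still means roughly 2,000 sequential commits for 1 million records, which is why I suspect the sequential awaiting is the real bottleneck.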
Current code:
const { Firestore } = require('@google-cloud/firestore');
// Initialize Firestore
const db = new Firestore();
// Data to insert (example)
const data = Array.from({ length: 1000000 }, (_, i) => ({
  field1: `value${i}`,
  field2: `value${i}`,
}));
// Individual insert
async function insertData() {
  for (const item of data) {
    await db.collection('my_collection').add(item);
  }
}
insertData().then(() => {
  console.log('Inserts completed');
}).catch(error => {
  console.error('Error inserting data:', error);
});
Problem:
The above code is extremely slow for such a large number of records. I understand that Firestore has limitations regarding the number of operations per second and per batch, and I would like to know the best way to handle this situation.
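To give a concrete idea of what I mean by optimizing, I also experimented with firing a limited number of individual writes concurrently instead of awaiting them one by one. This is only a rough sketch; the wave size of 500 is an arbitrary number on my side rather than a documented limit, and I am not sure whether this approach is safe with respect to Firestore's write-rate limits, which is part of what I am asking.

// Send writes in waves of `concurrency` parallel requests.
async function insertConcurrently(items, concurrency = 500) {
  for (let i = 0; i < items.length; i += concurrency) {
    const chunk = items.slice(i, i + concurrency);
    // Start every write in this wave, then wait for all of them to finish.
    await Promise.all(
      chunk.map(item => db.collection('my_collection').add(item))
    );
  }
}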
Questions:
- What is the best practice for inserting a large number of records into Firestore?
- How can I optimize the process to be more efficient?
- Are there specific limits I need to be aware of, and how can I work around them?
- Is it possible to use other Google Cloud services, such as Pub/Sub or Dataflow, to solve this problem, and how could I integrate them into the bulk insert process?
I appreciate any suggestions or code examples that can help improve the performance of bulk inserts into Firestore.