New to Mongo so bare with me, but cannot figure out best (fast) solution.
I have three different collections: Shipments, Tracking, Messages
I’m trying to join two simple fields from the Tracking and Messages collections to a list of shipments.
Collections:
Tracking
{
shipment: {
id: XXX
},
enabled: true,
...
}
Messages
{
shipment: {
id: XXX
},
messages: [
0: {
isRead: false,
origin: 'driver'
},
1: {
isRead: false,
origin: 'driver'
},
2: {
isRead: true,
origin: 'driver'
}
],
...
}
Shipments
{
_id: XXX,
...
}
When I pull list of shipments I need to be able to add the ‘enabled’ field from the Tracking collection and a ‘unRead’ count from the Messages collection from the messages array of subdocuments to each shipment based on the shipments ID so each shipment record looks something like:
{
_id: XXX,
trackingEnabled: true,
unreadCount: 2,
...
}
I’ve tried every aggregate feature that I could find and I’ve accomplished it pulling what I need except it’s incredibly slow. Pulling 50-60 records takes 8-9 seconds which is painful.
Here’s a pipeline I’ve used that works but is insanely slow.
I’ve tried many variations including simple projection and filtering, and I cannot get anything speedy to filter out the messages array. Even just pulling just the Tracking with $lookup causes 2-3 extra seconds.
Any suggestions on another way to accomplish this or am I doing this wrong?
I would prefer to do this in one query vs looping the shipments and querying each collection separately, as that also takes too long.
[
{
$match: {
active: true,
deleted: false
}
},
{
$lookup: {
from: 'messages',
localField: '_id',
foreignField: 'shipment.id',
let: {
shipmentID: '$_id'
},
pipeline: [
{
$unwind: '$messages'
},
{
$match: {
'messages.isRead': false,
'messages.origin': 'driver'
}
},
{
$sortByCount: '$messages.isRead'
}
],
as: 'unreadCount'
}
},
{
$lookup: {
from: 'tracking',
localField: '_id',
foreignField: 'shipment.id',
pipeline: [
{
$project: {
'enabled': 1
}
}
],
as: 'trackingEnabled'
}
}
]
In the above-provided aggregate query, there seems to be lot of $lookup
stage, which is generally a loop that takes an _id from the current collection and compares it with the foreign collection. These are not as efficient as SQL joins. To avoid performance overload, use proper index
, pagination use $skip
& $take
.
Then, if your collection have huge data with each document, consider using $project
stage after each $match
or $lookup
to take only necessary data, which will reduce the amount of data being carried to each stage.
Below are the reference link to do the above optimization:
Index – Index type in mongoDB
$project – MongoDB Query optimization
Note
: MongoDB also generally suggests duplicating the data, which is frequently required by some collections, to avoid more lookups.