$lookup in MongoDB

What is $lookup in MongoDB?

$lookup is an aggregation pipeline stage in MongoDB.

$lookup allows you to perform a left outer join in MongoDB. This means you can return all rows from collection A with any corresponding rows from collection B.

If collection A represents blog posts and collection B represents comments for those posts, you can perform a $lookup to return every post with it's comments as an array...

Post Collection

{
    "title": "My first post"
}

Comment Collection

{
  "postTitle": "My first post",
  "comment": "I liked this post"
}

MongoDB $lookup query

db.posts.aggregate([
{
  $lookup: {
      from:"comments",
      localField:"title",
      foreignField:"postTitle",
      as:"comments"
  }
}
])

Results

{
  title: "My first post"
  comments: [
    {
      postTitle: "My first post",
      comment: "I liked this post"
    }
  ]
}

You can find more detailed examples here.

$lookup Performance

$lookup allows you to perform joins directly on the server. This is typically a lot faster than performing multiple DB operations to combine results on client applications.

$lookup will perform a complete scan on collection B for every single record in A. This isn't a concern with smaller collections but indexing on collection B is encouraged if collection B is large.

Using $match can also significantly reduce the number of results in A that are compared to B. For these reasons, $match should always come before any $lookup operations in the aggregation pipeline.

Considerations with $lookup

$lookup only works on unsharded foreign collections. If you run a $lookup aggregation on collection A with foreign collection B then collection B must be sharded.

This is not to say that collection A must be sharded. You can still run $lookup aggregations on a sharded collection (A). Only the foreign collection (B) must be sharded.

Performing joins contradicts the very nature of non relational data stores like MongoDB. MongoDB leverages the document model to perform fast lookups and avoid expensive join operations characteristic of relational databases.

This contradiction is so apparent that $lookup was originally released only with enterprise versions of Mongo to discourage its use.

While $lookup has since been released with community edition, it's still important to remember why you are using Mongo in the first place...

For more $lookup examples, check out $lookup examples | MongoDB.

Your thoughts?