Mastering MongoDB Atlas: Running $lookup with an Atlas Search $search Query

Since MongoDB v6.0, an Atlas Search $search stage can run inside a $lookup pipeline. This lets us join collections on shared fields while applying Atlas Search’s full-text search to the joined collection, all within a single aggregation.

Picture this: you’ve got interconnected datasets, and you need to dig out specific information efficiently. Enter $lookup and $search. By combining these features, MongoDB lets you navigate complex datasets, whether you’re filtering, sorting, or running text-based queries. It’s more than a new feature: it simplifies data manipulation and helps keep your applications responsive and insightful.

A practical example

In this tutorial, we’ll delve into an advanced use case of MongoDB’s $lookup aggregation stage and Atlas Search’s $search functionality. We’ll be working with two collections: streaming_platforms and movies_series. The streaming_platforms collection contains information about various streaming platforms, including subscription plans, while the movies_series collection stores data about movies and series, including extensive details such as casting and full synopsis.

Prerequisites

  • MongoDB Atlas cluster running v6.0 or later.
  • Project Data Access Admin or higher access to the MongoDB Atlas project.

Collections Overview

streaming_platforms Collection (JSON Format)

```json
{
  "platform_id": 1,
  "platform_name": "Example Streaming",
  "subscription_plans": [
    {
      "plan_id": 101,
      "plan_name": "Basic",
      "monthly_price": 9.99,
      "max_streams": 1
    },
    {
      "plan_id": 102,
      "plan_name": "Premium",
      "monthly_price": 14.99,
      "max_streams": 4
    }
  ]
}
```

movies_series Collection (JSON Format)

```json
{
  "content_id": 101,
  "title": "Example Movie",
  "genre": ["Action", "Adventure"],
  "release_date": "2023-01-15",
  "casting": [
    {
      "actor_name": "John Doe",
      "role": "Protagonist"
    },
    {
      "actor_name": "Jane Smith",
      "role": "Antagonist"
    }
  ],
  "platforms_available": ["Example Streaming","CineFlix", "FilmHub"],
  "synopsis": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla eu justo nec libero interdum ultricies at a metus."
}
```
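To follow along, the two sample documents above could be loaded with a few lines of Python. This is only a sketch: the helper name seed_sample_data is hypothetical, and it assumes db is a pymongo Database object for the database used later in this tutorial.

```python
# Sample documents copied from the collection overview above.
SAMPLE_PLATFORM = {
    "platform_id": 1,
    "platform_name": "Example Streaming",
    "subscription_plans": [
        {"plan_id": 101, "plan_name": "Basic", "monthly_price": 9.99, "max_streams": 1},
        {"plan_id": 102, "plan_name": "Premium", "monthly_price": 14.99, "max_streams": 4},
    ],
}

SAMPLE_TITLE = {
    "content_id": 101,
    "title": "Example Movie",
    "genre": ["Action", "Adventure"],
    "release_date": "2023-01-15",
    "casting": [
        {"actor_name": "John Doe", "role": "Protagonist"},
        {"actor_name": "Jane Smith", "role": "Antagonist"},
    ],
    "platforms_available": ["Example Streaming", "CineFlix", "FilmHub"],
    "synopsis": (
        "Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
        "Nulla eu justo nec libero interdum ultricies at a metus."
    ),
}


def seed_sample_data(db):
    """Insert one sample document into each collection.

    `db` is assumed to be a pymongo Database (hypothetical helper).
    Shallow copies are inserted so the constants above do not gain an `_id` key.
    """
    db["streaming_platforms"].insert_one(dict(SAMPLE_PLATFORM))
    db["movies_series"].insert_one(dict(SAMPLE_TITLE))
```

Note that the platform_name of the platform document appears in the platforms_available array of the title document; this is the shared value the $lookup stage joins on later.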

Step 1: Setting Up an Atlas Search Index

Create an Atlas Search index named media-search-index for the movies_series collection to index the title, genre, casting.actor_name, and synopsis fields.

To create the index, log in to MongoDB Atlas, navigate to your cluster, and click Browse Collections.

Select the collection you want to create the Atlas Search index for, in our case movies_series.

Then go to Search Indexes and click Create Search Index.

Next, use the Visual Editor to create the index.

Enter the index name (media-search-index) and select the database and collection it will be created on.

Here we use dynamic mappings to index all fields automatically. In a real deployment, you should click “Refine Your Index” and define the field mappings manually to keep the index from growing too large.

Then click Create and you are all set.
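As an alternative to the Atlas UI, the index could also be created from Python. The following is a sketch under a few assumptions: it uses static mappings for the four fields named at the start of this step (instead of the dynamic mappings chosen in the UI), it assumes PyMongo 4.5 or later (which provides SearchIndexModel and Collection.create_search_index), and the helper names are hypothetical. Indexing casting with the document type is also an assumption; depending on how you want to query array elements, embeddedDocuments may be a better fit.

```python
INDEX_NAME = "media-search-index"


def build_index_definition() -> dict:
    """Static mappings limited to title, genre, casting.actor_name, and synopsis."""
    return {
        "mappings": {
            "dynamic": False,  # index only the fields listed below
            "fields": {
                "title": {"type": "string"},
                "genre": {"type": "string"},  # array of strings, indexed by element type
                "synopsis": {"type": "string"},
                "casting": {
                    "type": "document",  # assumption: see the note above
                    "fields": {"actor_name": {"type": "string"}},
                },
            },
        }
    }


def create_media_index(collection) -> str:
    """Create the index on `collection` (a pymongo Collection on an Atlas cluster)."""
    # Imported here so the definition can be inspected without PyMongo installed.
    from pymongo.operations import SearchIndexModel

    model = SearchIndexModel(definition=build_index_definition(), name=INDEX_NAME)
    return collection.create_search_index(model)
```

Index builds are asynchronous on Atlas, so a freshly created index may take a short while before queries can use it.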

Step 2: Running $lookup with $search in Python

Create a Python script named media_lookup_search.py and use the following code:

```python
from pymongo import MongoClient

# MongoDB Connection String
connection_string = "<connection-string>"

# MongoDB Database and Collections
database_name = "sample_database"
platforms_collection = "streaming_platforms"
movies_series_collection = "movies_series"

# Atlas Search Pipeline with $lookup and Atlas Search inside the pipeline
aggregation_pipeline = [
    {
        '$match': {
            'subscription_plans.max_streams': {'$gt': 3}
        }
    },
    {
        '$lookup': {
            'from': 'movies_series',
            'localField': 'platform_name',
            'foreignField': 'platforms_available',
            'as': 'contentDetails',
            'pipeline': [
                {
                    '$search': {
                        'index': 'media-search-index',
                        'compound': {
                            'must': [
                                {
                                    'queryString': {
                                        'defaultPath': 'synopsis',
                                        'query': 'Lorem ipsum'
                                    }
                                }
                            ]
                        }
                    }
                },
                {
                    '$limit': 5
                },
                {
                    '$project': {
                        '_id': 0,
                        'title': 1,
                        'synopsis': 1
                    }
                }
            ]
        }
    },
    {
        '$project': {
            '_id': 0,
            'platform_name': 1,
            'subscription_plans': 1,
            'contentDetails': 1
        }
    }
]

# Connect to MongoDB
client = MongoClient(connection_string)
db = client[database_name]
platforms = db[platforms_collection]

# Execute Atlas Search Pipeline
results = list(platforms.aggregate(aggregation_pipeline))

# Print Results
for result in results:
    print(result)

```

Replace <connection-string> with your MongoDB connection string, including user credentials.
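Because the connection string contains credentials, you may prefer not to hardcode it in the script. A minimal sketch, assuming the string is stored in an environment variable (the name MONGODB_URI and the helper get_connection_string are hypothetical choices, not part of the original script):

```python
import os


def get_connection_string(env_var: str = "MONGODB_URI") -> str:
    """Read the connection string from the environment instead of the source file."""
    uri = os.environ.get(env_var)
    if not uri:
        raise RuntimeError(f"Set the {env_var} environment variable to your connection string.")
    return uri
```

In the script, connection_string = get_connection_string() would then take the place of the hardcoded value.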

Step 3: Run the Python Script

Run the script with a Python interpreter, for example python media_lookup_search.py, or from your code editor. Make sure the required Python libraries, especially pymongo, are installed first (pip install pymongo).

The result we got on our example database:

```json
{
    "platform_name": "Example Streaming",
    "subscription_plans": {
        "plan_id": 102,
        "plan_name": "Premium",
        "monthly_price": 14.99,
        "max_streams": 4
    },
    "contentDetails": [
        {
            "title":  "Example Movie",
            "synopsis": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla eu justo nec libero interdum ultricies at a metus."
        },
        {
            "title": "Lorem Ipsum Movie 2",
            "synopsis": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus fringilla justo ut nisl varius, at volutpat ligula vulputate."
        }
    ]
}

```
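One detail worth knowing: a $match on an array field selects whole documents and leaves the array as stored, so projecting subscription_plans with 1 returns every plan of a matching platform. If you only want the plans that satisfy the condition, one option is a $filter expression in the final $project stage. The sketch below builds such a stage; the helper names are hypothetical, and the second function is just a plain-Python illustration of what the filter keeps.

```python
def plans_filter_stage(min_streams: int = 3) -> dict:
    """Final $project stage that keeps only plans with max_streams > min_streams."""
    return {
        "$project": {
            "_id": 0,
            "platform_name": 1,
            "contentDetails": 1,
            "subscription_plans": {
                "$filter": {
                    "input": "$subscription_plans",
                    "as": "plan",
                    "cond": {"$gt": ["$$plan.max_streams", min_streams]},
                }
            },
        }
    }


def filter_plans_locally(plans: list, min_streams: int = 3) -> list:
    """Plain-Python equivalent of the $filter condition, for illustration only."""
    return [plan for plan in plans if plan["max_streams"] > min_streams]
```

With the sample platform document, only the Premium plan (max_streams of 4) passes the condition. The filtered field is still an array, here with a single element.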

As you can see, the script first selects streaming platforms that offer at least one subscription plan allowing more than 3 streams. For each of these platforms, $lookup joins the titles from movies_series whose platforms_available array includes the platform name, and $search narrows them to titles whose synopsis matches the query “Lorem ipsum”. The results are then limited to the first 5 matching titles for each platform.
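The same pipeline can be wrapped in a small function so the stream threshold, search text, and result limit become parameters. This is a sketch, not part of the original script: build_pipeline and its parameter names are hypothetical, while the stage contents mirror the pipeline used in Step 2.

```python
def build_pipeline(min_streams=3, query="Lorem ipsum", limit=5,
                   index="media-search-index"):
    """Return the $match / $lookup-with-$search / $project pipeline as a list."""
    return [
        # Keep platforms with at least one plan above the stream threshold.
        {"$match": {"subscription_plans.max_streams": {"$gt": min_streams}}},
        {
            "$lookup": {
                "from": "movies_series",
                "localField": "platform_name",
                "foreignField": "platforms_available",
                "as": "contentDetails",
                "pipeline": [
                    # $search must be the first stage of the $lookup pipeline.
                    {
                        "$search": {
                            "index": index,
                            "compound": {
                                "must": [
                                    {
                                        "queryString": {
                                            "defaultPath": "synopsis",
                                            "query": query,
                                        }
                                    }
                                ]
                            },
                        }
                    },
                    {"$limit": limit},
                    {"$project": {"_id": 0, "title": 1, "synopsis": 1}},
                ],
            }
        },
        {
            "$project": {
                "_id": 0,
                "platform_name": 1,
                "subscription_plans": 1,
                "contentDetails": 1,
            }
        },
    ]
```

Calling platforms.aggregate(build_pipeline(query="space adventure", limit=3)) would then run the same join with a different search text and a smaller limit.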

By mastering this combination of $lookup and Atlas Search’s $search query in Python, you can perform complex searches within your MongoDB data, enhancing your applications with intelligent and meaningful interactions.
