Free Wins in Storage and Read Speed: Flat Schemas vs Structured Schemas

When developers or admins who grew up with the “relational way” move to MongoDB, they often design flat documents. That habit makes sense, since relational modeling trains you to think in two dimensions, with data spread across tables.

MongoDB stores data as BSON documents, a binary serialization modeled on JSON with a few differences, such as additional data types. Because a BSON value can itself be a document, schemas can have multiple levels. The BSON specification describes the format and how it differs from JSON.

A MongoDB document is a set of key-value pairs. A value can be any BSON type, including nested documents, arrays, or arrays of documents.

Using nested documents or arrays lets you model a structured schema, where one field groups related details. This is an alternative to a flat schema.

Consider the same user record expressed both ways:

Both versions hold identical information. In flatUser everything sits on one level. In structuredUser fields are nested to reflect related data.
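A minimal sketch of the two shapes, written as Python dictionaries. The field names follow the ones this article mentions (name_first, address_work_zip, address.home with street, number, zip, state, and country). The _id field, the work address fields beyond zip, and every value other than the name are assumptions made for illustration, and flat_user and structured_user stand in for flatUser and structuredUser.

```python
# Illustrative documents only: _id, the address values, and the full set of
# work-address fields are assumptions, not the article's original listing.
flat_user = {
    "_id": 1,
    "name_first": "john",
    "name_last": "smith",
    "name_middle": "oliver",
    "address_home_street": "1st Avenue",
    "address_home_number": "100",
    "address_home_zip": "10001",
    "address_home_state": "NY",
    "address_home_country": "US",
    "address_work_street": "2nd Avenue",
    "address_work_number": "200",
    "address_work_zip": "10002",
    "address_work_state": "NY",
    "address_work_country": "US",
}

structured_user = {
    "_id": 1,
    "name": {"first": "john", "last": "smith", "middle": "oliver"},
    "address": {
        "home": {
            "street": "1st Avenue",
            "number": "100",
            "zip": "10001",
            "state": "NY",
            "country": "US",
        },
        "work": {
            "street": "2nd Avenue",
            "number": "200",
            "zip": "10002",
            "state": "NY",
            "country": "US",
        },
    },
}


def flatten(doc, prefix=""):
    """Join nested keys with "_" so a structured document can be compared to a flat one."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}_"))
        else:
            flat[path] = value
    return flat


# Flattening the structured document yields exactly the flat one.
print(flatten(structured_user) == flat_user)
```

The flatten helper is hypothetical and exists only to show that the two layouts carry the same data; nothing in MongoDB requires it.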

Why pick structured instead of flat? The short answer is that structured schemas can use less disk space and they can be quicker to traverse. To see why, it helps to recall how BSON is laid out.

For our purpose, think of a BSON document as a list of items, one per field-value pair. Each item consists of a type byte, the field name as a string, a four-byte length for variable-sized values, and the serialized value bytes, laid out in that order: type (1 byte), field name, length (4 bytes), value.
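To make the layout concrete, here is a small sketch that hand-encodes a single string field the way the BSON specification describes it. The helper names are hypothetical. Note that real BSON also null-terminates both the field name and the string value, and wraps each document in a four-byte size and a trailing null byte; the simplified model used in this article leaves those extra bytes out.

```python
import struct


def encode_string_element(name: str, value: str) -> bytes:
    """Encode one BSON string element: type byte, field name, length, value."""
    key = name.encode("utf-8") + b"\x00"    # field name is a null-terminated cstring
    data = value.encode("utf-8") + b"\x00"  # string payload is null-terminated too
    length = struct.pack("<i", len(data))   # little-endian int32, counts the terminator
    return b"\x02" + key + length + data    # 0x02 is the BSON type code for a string


def encode_document(elements: bytes) -> bytes:
    """Wrap encoded elements in a document: int32 total size, elements, trailing null."""
    total = 4 + len(elements) + 1
    return struct.pack("<i", total) + elements + b"\x00"


element = encode_string_element("first", "john")
document = encode_document(element)

print(len(element))   # 16: 1 type + 6 name + 4 length + 5 value
print(len(document))  # 21: 4 size + 16 element + 1 terminator
print(document)
```

This only handles string values, which is enough to see where each byte of an item goes.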

Now let’s compare storage for the user’s name.

In flatUser, the storage table looks like this:

field-and-value	Type	Field Name	Field Length	Field Data	Total Size
name_first: “john”	1 byte	10 bytes	4 bytes	4 bytes	19 bytes
name_last: “smith”	1 byte	9 bytes	4 bytes	5 bytes	19 bytes
name_middle: “oliver”	1 byte	11 bytes	4 bytes	6 bytes	22 bytes

Summing the totals, the flat approach spends 60 bytes on the three name fields and their values.

For structuredUser, split the accounting into two tables. The first table covers the fields inside the nested name document. The second table covers the name field itself, whose value is that nested document.

First table, the value of the field name:

field-and-value	Type	Field Name	Field Length	Field Data	Total Size
first: “john”	1 byte	5 bytes	4 bytes	4 bytes	14 bytes
last: “smith”	1 byte	4 bytes	4 bytes	5 bytes	14 bytes
middle: “oliver”	1 byte	6 bytes	4 bytes	6 bytes	17 bytes

Those entries add up to 45 bytes for the value of name. Now the second table:

field-and-value	Type	Field Name	Field Length	Field Data	Total Size
name: { … }	1 byte	4 bytes	4 bytes	45 bytes	54 bytes

Together, the structured approach uses 54 bytes for the user name.

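The gap comes from the field name bytes. The flat design spends 30 bytes on field names, while the structured design spends 19 bytes, because the flat fields repeat the prefix “name_” three times. Nesting is not free: the name field adds its own type byte and four-byte length (5 bytes), so the net saving is 11 - 5 = 6 bytes, which is the difference between 60 and 54.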
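The accounting in the tables can be reproduced with a few lines of Python. This sketch implements the simplified item model from this article (type byte, field name bytes, four-byte length, value bytes), not the exact BSON encoding; item_size and name_bytes are hypothetical helper names.

```python
TYPE_BYTE = 1      # one byte for the value type
LENGTH_PREFIX = 4  # four-byte length for variable-sized values


def item_size(field_name: str, value) -> int:
    """Size of one field under the simplified model: type + name + length + value."""
    if isinstance(value, dict):
        # A nested document's value is the sum of the items it contains.
        value_bytes = sum(item_size(k, v) for k, v in value.items())
    else:
        value_bytes = len(value.encode("utf-8"))
    return TYPE_BYTE + len(field_name.encode("utf-8")) + LENGTH_PREFIX + value_bytes


def name_bytes(doc: dict) -> int:
    """Total bytes spent on field names, including names inside nested documents."""
    total = 0
    for key, value in doc.items():
        total += len(key.encode("utf-8"))
        if isinstance(value, dict):
            total += name_bytes(value)
    return total


flat_name = {"name_first": "john", "name_last": "smith", "name_middle": "oliver"}
structured_name = {"name": {"first": "john", "last": "smith", "middle": "oliver"}}

flat_total = sum(item_size(k, v) for k, v in flat_name.items())
structured_total = sum(item_size(k, v) for k, v in structured_name.items())

print(flat_total, structured_total)                        # 60 54
print(name_bytes(flat_name), name_bytes(structured_name))  # 30 19
```

The same helpers can be pointed at other candidate layouts to compare how many bytes a prefix convention costs before committing to a schema.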

When these two full documents are stored in MongoDB, the flat version is 403 bytes and the structured version is 307 bytes. That is about a 24 percent space reduction with only a schema refactor, and the structured document is also easier to read.

Next, consider traversal speed for a lookup like the work address zip code.

In flatUser, reaching address_work_zip from the document start requires 12 field name comparisons.

In structuredUser, reaching address.work.zip takes 8 comparisons. The reduction happens because some values are documents. When the traversal reads a field such as name, the key does not match the first path segment, address, so the whole nested document can be skipped without comparing first, last, or middle. The same applies inside address: once home fails to match work, its street, number, zip, state, and country fields are skipped.
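The counting can be sketched as a small model. It assumes the fields appear in the order _id, name, address, with home before work and street, number, zip first within each address; the _id field and that ordering are assumptions that are consistent with the 12 and 8 figures, and the functions model the comparison count only, not MongoDB's internal implementation.

```python
def flat_comparisons(doc: dict, target: str) -> int:
    """Count key comparisons while scanning a flat document in field order."""
    count = 0
    for key in doc:
        count += 1
        if key == target:
            return count
    return count  # target not found: every key was compared


def nested_comparisons(doc: dict, path: list) -> int:
    """Count key comparisons for a dotted path, never descending into non-matching sub-documents."""
    count = 0
    for key, value in doc.items():
        count += 1
        if key == path[0]:
            if len(path) > 1 and isinstance(value, dict):
                count += nested_comparisons(value, path[1:])
            return count
    return count  # segment not found at this level


# Field order is an assumption; values are irrelevant to the count.
address_fields = ["street", "number", "zip", "state", "country"]

structured_user = {
    "_id": 1,
    "name": {"first": "john", "last": "smith", "middle": "oliver"},
    "address": {
        "home": dict.fromkeys(address_fields, ""),
        "work": dict.fromkeys(address_fields, ""),
    },
}

flat_user = {"_id": 1, "name_first": "john", "name_last": "smith", "name_middle": "oliver"}
for place in ("home", "work"):
    for field in address_fields:
        flat_user[f"address_{place}_{field}"] = ""

print(flat_comparisons(flat_user, "address_work_zip"))                   # 12
print(nested_comparisons(structured_user, ["address", "work", "zip"]))  # 8
```

In the structured case the 8 comparisons break down as 3 at the top level (_id, name, address), 2 inside address (home, work), and 3 inside work (street, number, zip).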

To measure the effect, we ran a focused test with this setup:

The MongoDB instance used in-memory storage to isolate document traversal.
Flat schemas used documents with 10, 25, 50, and 100 fields.
Structured schemas used 2×5, 5×5, 10×5, and 20×5 layouts, where 2×5 means two document fields with five fields each.
Each collection contained 10,000 documents generated with the faker npm package.
Queries searched for a field and value that did not exist, which forced a full scan of every document and field.
Each query was run 100 times for every document size and schema.
No concurrent workload ran during the tests.
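The two families of layouts in this setup can be sketched as follows. This is not the test code from the repository: it swaps faker for Python's standard random module, and the field_N and group_N names are hypothetical. The example builds only 100 documents per collection to stay quick; the test described above used 10,000.

```python
import random
import string


def random_word(rng: random.Random, length: int = 8) -> str:
    """Stand-in for faker output: a random lowercase string."""
    return "".join(rng.choices(string.ascii_lowercase, k=length))


def make_flat(field_count: int, rng: random.Random) -> dict:
    """Flat layout: field_count top-level fields."""
    return {f"field_{i}": random_word(rng) for i in range(field_count)}


def make_structured(groups: int, per_group: int, rng: random.Random) -> dict:
    """Structured layout: `groups` sub-documents with `per_group` fields each."""
    return {
        f"group_{g}": {f"field_{i}": random_word(rng) for i in range(per_group)}
        for g in range(groups)
    }


def build_collections(docs_per_collection: int, rng: random.Random) -> dict:
    """Build one flat and one structured collection per size: 10/2x5 up to 100/20x5."""
    collections = {}
    for flat_fields, groups in [(10, 2), (25, 5), (50, 10), (100, 20)]:
        collections[f"flat_{flat_fields}"] = [
            make_flat(flat_fields, rng) for _ in range(docs_per_collection)
        ]
        collections[f"structured_{groups}x5"] = [
            make_structured(groups, 5, rng) for _ in range(docs_per_collection)
        ]
    return collections


rng = random.Random(42)  # fixed seed so runs are repeatable
collections = build_collections(100, rng)
print(sorted(collections))
```

Each flat collection and its structured counterpart hold the same number of leaf fields per document, so any timing difference comes from the layout rather than from the amount of data.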

Results:

Fields (flat / structured layout)	Flat	Structured	Difference	Improvement (relative to structured)
10 / 2×5	487 ms	376 ms	111 ms	29.5%
25 / 5×5	624 ms	434 ms	190 ms	43.8%
50 / 10×5	915 ms	617 ms	298 ms	48.3%
100 / 20×5	1384 ms	891 ms	493 ms	55.4%

As expected, structured documents were faster to traverse in this scenario, and the gap widened as documents grew, from 111 ms at 10 fields to 493 ms at 100 fields. Keep in mind that gains vary with how you nest and organize fields.
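The derived columns of the results table can be recomputed from the two timing columns. In this sketch the improvement figure is the difference divided by the structured time, which reproduces the published percentages to within a tenth of a point (the published times are presumably rounded). The share of the flat time that is saved is shown alongside as a second way to read the same numbers; the function and variable names are hypothetical.

```python
# (layout, flat ms, structured ms) copied from the results table
results = [
    ("10 / 2x5", 487, 376),
    ("25 / 5x5", 624, 434),
    ("50 / 10x5", 915, 617),
    ("100 / 20x5", 1384, 891),
]


def summarize(flat_ms: int, structured_ms: int) -> tuple:
    """Return (difference in ms, % relative to structured, % of flat time saved)."""
    difference = flat_ms - structured_ms
    vs_structured = 100 * difference / structured_ms
    vs_flat = 100 * difference / flat_ms
    return difference, vs_structured, vs_flat


for layout, flat_ms, structured_ms in results:
    diff, vs_structured, vs_flat = summarize(flat_ms, structured_ms)
    print(f"{layout:<11} {diff:>4} ms  {vs_structured:5.1f}%  {vs_flat:5.1f}%")
```

Read the second way, the structured layout cut query time by roughly a quarter to a third across these sizes.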

This walkthrough showed how to get more from your MongoDB deployment by reshaping the schema while keeping the same information. You can also apply common MongoDB schema patterns to decide what belongs in each document. The article Building with Patterns covers widely used approaches and is a strong next step.

All test code is available in the GitHub repository.
