Model Tree Structures with Materialized Paths
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how you model data can affect application performance and database capacity. See Data Modeling Concepts for a high-level overview of data modeling in MongoDB.
This document presents a data model that describes a tree-like structure in MongoDB documents by storing full relationship paths between documents.
The Materialized Paths pattern stores each tree node in a document; in addition to the tree node, the document stores the id(s) of the node’s ancestors, or path, as a string. Although the Materialized Paths pattern requires the additional steps of working with strings and regular expressions, it also provides more flexibility in working with the path, such as finding nodes by partial paths.
Consider the following hierarchy of categories:
The following example models the tree using Materialized Paths, storing the path in the path field; the path string uses a comma (,) as the delimiter:
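As a sketch of the document shape described here, the snippet below builds the nodes in plain JavaScript. The collection name categories, the helper childOf, the use of null for the root's path, and any node names that do not appear in the queries on this page are assumptions made for illustration.

```javascript
// Hypothetical helper: derive a child's materialized path from its parent.
// A root node has path null; a child appends the parent's _id plus a comma.
function childOf(parent, id) {
  const parentPath = parent.path === null ? "," : parent.path;
  return { _id: id, path: parentPath + parent._id + "," };
}

const books = { _id: "Books", path: null };
const programming = childOf(books, "Programming");
const databases = childOf(programming, "Databases");
const languages = childOf(programming, "Languages");
const mongodb = childOf(databases, "MongoDB");
const dbm = childOf(databases, "dbm");

const categories = [books, programming, databases, languages, mongodb, dbm];

// In mongosh, the same documents could be stored with:
//   db.categories.insertMany(categories)
console.log(categories);
```

Each path both starts and ends with the delimiter, which lets later queries match a whole node name rather than a fragment of one.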
You can query to retrieve the whole tree, sorting by the path field.
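Sorting by path can be sketched in memory as follows. The sample documents and the comparator byPath are assumptions for illustration; the mongosh call in the final comment assumes a collection named categories.

```javascript
// Assumed sample documents, deliberately out of order.
const categories = [
  { _id: "MongoDB", path: ",Books,Programming,Databases," },
  { _id: "Languages", path: ",Books,Programming," },
  { _id: "Books", path: null },
  { _id: "dbm", path: ",Books,Programming,Databases," },
  { _id: "Programming", path: ",Books," },
  { _id: "Databases", path: ",Books,Programming," },
];

// In-memory stand-in for an ascending sort on path, with null ordered first
// (MongoDB's ascending sort also places null before strings).
function byPath(a, b) {
  if (a.path === b.path) return 0;
  if (a.path === null) return -1;
  if (b.path === null) return 1;
  return a.path < b.path ? -1 : 1;
}

const sorted = [...categories].sort(byPath);
console.log(sorted.map((doc) => doc._id));

// mongosh equivalent: db.categories.find().sort({ path: 1 })
```

Because a node's path is a prefix of the paths of its descendants, every node sorts ahead of the nodes beneath it.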
You can use regular expressions on the path field to find the descendants of a given node.
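A minimal in-memory sketch of such a query, using Programming as the example node. The sample documents are assumptions for illustration, and the mongosh call in the final comment assumes a collection named categories.

```javascript
// Assumed sample documents.
const categories = [
  { _id: "Books", path: null },
  { _id: "Programming", path: ",Books," },
  { _id: "Databases", path: ",Books,Programming," },
  { _id: "Languages", path: ",Books,Programming," },
  { _id: "MongoDB", path: ",Books,Programming,Databases," },
  { _id: "dbm", path: ",Books,Programming,Databases," },
];

// Descendants of Programming: any path that contains ",Programming,".
const pattern = /,Programming,/;
const descendants = categories.filter(
  (doc) => doc.path !== null && pattern.test(doc.path)
);
console.log(descendants.map((doc) => doc._id));

// mongosh equivalent: db.categories.find({ path: /,Programming,/ })
```

Including the delimiter on both sides of the name keeps the pattern from matching ids that merely begin with or contain the same text, such as a hypothetical ProgrammingTools node.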
You can also retrieve the descendants of Books by anchoring the regular expression to the start of the path, since Books is also at the topmost level of the hierarchy.
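The anchored form can be sketched by building the pattern from the root's id. The helpers escapeRegExp and descendantsOfRoot, and the sample documents, are hypothetical names and data introduced for illustration.

```javascript
// Assumed sample documents.
const categories = [
  { _id: "Books", path: null },
  { _id: "Programming", path: ",Books," },
  { _id: "Databases", path: ",Books,Programming," },
  { _id: "Languages", path: ",Books,Programming," },
  { _id: "MongoDB", path: ",Books,Programming,Databases," },
  { _id: "dbm", path: ",Books,Programming,Databases," },
];

// Escape characters that have special meaning in a regular expression,
// so an id is always matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: match paths that START with ",<rootId>,".
function descendantsOfRoot(docs, rootId) {
  const anchored = new RegExp("^," + escapeRegExp(rootId) + ",");
  return docs.filter((doc) => doc.path !== null && anchored.test(doc.path));
}

const underBooks = descendantsOfRoot(categories, "Books");
console.log(underBooks.map((doc) => doc._id));

// mongosh equivalent: db.categories.find({ path: /^,Books,/ })
```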
To create an index on the path field, use db.collection.createIndex() with the key specification { path: 1 }.
This index may improve performance depending on the query:
For queries from the root (e.g. /^,Books,Programming,/), an index on the path field improves the query performance significantly.
For queries of sub-trees where the path from the root is not provided in the query (e.g. /,Databases,/), or similar queries of sub-trees where the node might be in the middle of the indexed string, the query must inspect the entire index.
For these queries an index may provide some performance improvement if the index is significantly smaller than the entire collection.
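The difference between the two cases can be illustrated with a deliberately simplified model in which a sorted array of path strings stands in for the index. This is not how MongoDB implements its indexes; the ,Music, entries and the function names prefixScan and substringScan are hypothetical.

```javascript
// Simplified stand-in for an index on path: non-null paths in sorted order.
const indexEntries = [
  ",Books,",
  ",Books,Programming,",
  ",Books,Programming,",
  ",Books,Programming,Databases,",
  ",Books,Programming,Databases,",
  ",Music,",
  ",Music,Jazz,",
];

// Anchored query: binary-search to the first candidate, then read
// entries only until one no longer starts with the prefix.
function prefixScan(entries, prefix) {
  let lo = 0;
  let hi = entries.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (entries[mid] < prefix) lo = mid + 1;
    else hi = mid;
  }
  const matches = [];
  let scanned = 0;
  for (let i = lo; i < entries.length; i++) {
    scanned++;
    if (!entries[i].startsWith(prefix)) break;
    matches.push(entries[i]);
  }
  return { matches, scanned };
}

// Unanchored query: the fragment may sit anywhere in the string,
// so every entry has to be checked.
function substringScan(entries, fragment) {
  let scanned = 0;
  const matches = entries.filter((entry) => {
    scanned++;
    return entry.includes(fragment);
  });
  return { matches, scanned };
}

const anchoredResult = prefixScan(indexEntries, ",Books,Programming,Databases,");
const unanchoredResult = substringScan(indexEntries, ",Databases,");
console.log(anchoredResult.scanned, unanchoredResult.scanned);
```

In this model both calls find the same two entries, but the anchored scan reads only the matching range plus one entry past it, while the unanchored scan reads all seven entries.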