Model Tree Structures with an Array of Ancestors

On this page

Overview

Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how you model data can affect application performance and database capacity. See Data Modeling Concepts for a full high level overview of data modeling in MongoDB.

This document describes a data model that describes a tree-like structure in MongoDB documents using references to parent nodes and an array that stores all ancestors.

Pattern

The Array of Ancestors pattern stores each tree node in a document; in addition to the tree node, document stores in an array the id(s) of the node’s ancestors or path.

Consider the following hierarchy of categories:

Tree data model for a sample hierarchy of categories.

The following example models the tree using Array of Ancestors. In addition to the ancestors field, these documents also store the reference to the immediate parent category in the parent field:

db.categories.insertMany( [
  { _id: "MongoDB", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" },
  { _id: "dbm", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" },
  { _id: "Databases", ancestors: [ "Books", "Programming" ], parent: "Programming" },
  { _id: "Languages", ancestors: [ "Books", "Programming" ], parent: "Programming" },
  { _id: "Programming", ancestors: [ "Books" ], parent: "Books" },
  { _id: "Books", ancestors: [ ], parent: null }
] )
  • The query to retrieve the ancestors or path of a node is fast and straightforward:

    db.categories.findOne( { _id: "MongoDB" } ).ancestors
    
  • You can create an index on the field ancestors to enable fast search by the ancestors nodes:

    db.categories.createIndex( { ancestors: 1 } )
    
  • You can query by the field ancestors to find all its descendants:

    db.categories.find( { ancestors: "Programming" } )
    

The Array of Ancestors pattern provides a fast and efficient solution to find the descendants and the ancestors of a node by creating an index on the elements of the ancestors field. This makes Array of Ancestors a good choice for working with subtrees.

The Array of Ancestors pattern is slightly slower than the Materialized Paths pattern but is more straightforward to use.