Data Model Design
With MongoDB, you may embed related data in a single structure or document. These schema are generally known as “denormalized” models, and take advantage of MongoDB’s rich documents. Consider the following diagram:
Embedded data models allow applications to store related pieces of information in the same database record. As a result, applications may need to issue fewer queries and updates to complete common operations.
In general, use embedded data models when:
- you have “contains” relationships between entities. See Model One-to-One Relationships with Embedded Documents.
- you have one-to-many relationships between entities. In these relationships the “many” or child documents always appear with or are viewed in the context of the “one” or parent documents. See Model One-to-Many Relationships with Embedded Documents.
In general, embedding provides better performance for read operations, as well as the ability to request and retrieve related data in a single database operation. Embedded data models make it possible to update related data in a single atomic write operation.
However, embedding related data in documents may lead to situations where documents grow after creation. With the MMAPv1 storage engine, document growth can impact write performance and lead to data fragmentation.
In version 3.0.0, MongoDB uses Power of 2 Sized Allocations as the default allocation strategy for MMAPv1 in order to account for document growth, minimizing the likelihood of data fragmentation. See Power of 2 Sized Allocations for details. Furthermore, documents in MongoDB must be smaller than the
maximum BSON document size. For bulk binary data, consider GridFS.
To interact with embedded documents, use dot notation to “reach into” embedded documents. See query for data in arrays and query data in embedded documents for more examples on accessing data in arrays and embedded documents.
Normalized data models describe relationships using references between documents.
In general, use normalized data models:
- when embedding would result in duplication of data but would not provide sufficient read performance advantages to outweigh the implications of the duplication.
- to represent more complex many-to-many relationships.
- to model large hierarchical data sets.
To join collections, MongoDB provides the aggregation stages:
MongoDB also provided referencing to join data across collections.
For an example of normalized data models, see Model One-to-Many Relationships with Document References.
For examples of various tree models, see Model Tree Structures.