
MongoDB Notes Part IV

Data Modeling Introduction

By default, MongoDB’s collections do not enforce document structure.

Tools for representing relationships:
  • References
    • store the relationships between data by including links or references from one document to another.
    • normalized data models.

  • Embedded Data
    • store the relationships between data by storing related data in a single document structure.
    • denormalized data model.
    • allows related data to be updated atomically, since a write to a single document is atomic.
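
The two approaches above can be sketched with plain Python dictionaries standing in for documents. This does not talk to MongoDB; the collections, field names, and the resolve_addresses helper are hypothetical and only illustrate the shape of each model.

```python
# Plain-Python sketch: dicts stand in for MongoDB documents (no server involved).

# Normalized model: each address document refers to its user by _id.
users = [{"_id": 1, "name": "Ada"}]
addresses = [
    {"_id": 10, "user_id": 1, "city": "London"},
    {"_id": 11, "user_id": 1, "city": "Turin"},
]

# Denormalized model: the addresses live inside the user document.
user_embedded = {
    "_id": 1,
    "name": "Ada",
    "addresses": [{"city": "London"}, {"city": "Turin"}],
}


def resolve_addresses(user, address_docs):
    """Follow references by hand: this is the extra lookup that references cost."""
    return [a for a in address_docs if a["user_id"] == user["_id"]]


referenced_cities = [a["city"] for a in resolve_addresses(users[0], addresses)]
embedded_cities = [a["city"] for a in user_embedded["addresses"]]
```

Both models hold the same information. The embedded form answers the question from one document, while the referenced form needs a second lookup.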

In general, use embedded data models when:
  • you have “contains” relationships between entities.
  • you have one-to-many relationships between entities.

Embedding provides better performance for read operations, as well as the ability to request and retrieve related data in a single database operation. However, this may lead to situations where documents grow after creation.

To access or update fields inside an embedded document, use “dot notation”.
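
As a rough illustration of how a dotted path addresses nested data, here is a minimal plain-Python sketch. The get_path helper is hypothetical and covers only the simplest cases (nested fields and array positions); it is not how MongoDB itself evaluates paths.

```python
def get_path(document, path):
    """Resolve a dotted path such as 'address.city' or 'tags.1' in nested dicts/lists."""
    current = document
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # a numeric part selects an array position
        else:
            current = current[part]       # otherwise descend into the embedded document
    return current


doc = {
    "name": "Ada",
    "address": {"city": "London", "zip": "N1"},
    "tags": ["math", "engines"],
}
city = get_path(doc, "address.city")
second_tag = get_path(doc, "tags.1")
```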

In general, use references when:
  • embedding would duplicate data without bringing any advantage.
  • to represent more complex many-to-many relationships.
  • to model large hierarchical data sets.

GridFS stores files in two collections:
  • chunks stores the binary chunks.
  • files stores the file’s metadata. 
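
The split between the two collections can be sketched in plain Python. This is only a sketch: the helper names are hypothetical, the tiny chunk size is chosen for readability, and only a few of the fields a real GridFS document carries are shown.

```python
def split_into_chunks(file_id, data, chunk_size):
    """Return one metadata document plus the list of chunk documents for data."""
    files_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    chunk_docs = [
        # n is the chunk's sequence number; files_id points back at the metadata document
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(chunk_docs):
    """Rebuild the original bytes by concatenating chunks in sequence order."""
    ordered = sorted(chunk_docs, key=lambda c: c["n"])
    return b"".join(c["data"] for c in ordered)


payload = b"0123456789"
files_doc, chunk_docs = split_into_chunks("file-1", payload, chunk_size=4)
```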

Model Tree Structures
  • use Parent References: store a reference to the parent category in the field parent.
  • use Child References: store references to all child categories in the field children.
  • use Array of Ancestors: store all ancestors of a node in the field ancestors; an index on that field provides a fast and efficient way to find the descendants and ancestors of a node.
  • use Materialized Paths: store the path in the field path; the path string uses the comma as a delimiter.
  • use Nested Sets: best for static trees that do not change.
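
To make the patterns concrete, the sketch below builds the same small category tree with parent references, then derives the ancestors array and the materialized path from it. It is plain Python over dicts, not MongoDB queries; the category names and helper functions are illustrative assumptions.

```python
import re

# Category tree: Books > Programming > {Languages, Databases > {MongoDB, dbm}}
parent_refs = [
    {"_id": "Books", "parent": None},
    {"_id": "Programming", "parent": "Books"},
    {"_id": "Languages", "parent": "Programming"},
    {"_id": "Databases", "parent": "Programming"},
    {"_id": "MongoDB", "parent": "Databases"},
    {"_id": "dbm", "parent": "Databases"},
]


def ancestors_of(node_id, docs):
    """Walk parent references up to the root; returns ancestors root-first."""
    by_id = {d["_id"]: d for d in docs}
    chain = []
    parent = by_id[node_id]["parent"]
    while parent is not None:
        chain.append(parent)
        parent = by_id[parent]["parent"]
    return list(reversed(chain))


def materialized_path(ancestors):
    """Comma-delimited path with leading and trailing commas; None for the root."""
    return "," + ",".join(ancestors) + "," if ancestors else None


# Array of Ancestors and Materialized Paths, derived from the parent references.
ancestor_docs = [
    {"_id": d["_id"], "ancestors": ancestors_of(d["_id"], parent_refs)}
    for d in parent_refs
]
path_docs = [
    {"_id": d["_id"], "path": materialized_path(d["ancestors"])}
    for d in ancestor_docs
]

# Descendants of "Programming": membership test vs. pattern match on the path.
by_ancestors = [d["_id"] for d in ancestor_docs if "Programming" in d["ancestors"]]
pattern = re.compile(",Programming,")
by_path = [d["_id"] for d in path_docs if d["path"] and pattern.search(d["path"])]
```

With parent references, finding ancestors takes one lookup per level. The ancestors array and the materialized path answer the same question from a single field.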

To support keyword search, store the keywords in an array field of the document and create a multikey index on that field.
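
The idea behind indexing a keyword array can be sketched as an inverted index with one entry per array element. This is plain Python, not MongoDB's index implementation; the sample documents, the topics field, and build_multikey_index are hypothetical.

```python
from collections import defaultdict

books = [
    {"_id": 1, "title": "Moby-Dick", "topics": ["whaling", "allegory", "revenge"]},
    {"_id": 2, "title": "Frankenstein", "topics": ["science", "revenge"]},
    {"_id": 3, "title": "Untitled"},  # no topics field, so no index entries
]


def build_multikey_index(docs, field):
    """Map each distinct array element to the _id values of documents containing it."""
    index = defaultdict(list)
    for doc in docs:
        # dict.fromkeys drops repeated keywords within one document, keeping order
        for value in dict.fromkeys(doc.get(field, [])):
            index[value].append(doc["_id"])
    return index


topics_index = build_multikey_index(books, "topics")
revenge_ids = topics_index["revenge"]
```

A keyword lookup then becomes a single dictionary access instead of a scan over every document's array.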

Index Introduction

Index types
  • Single Field Indexes
    • A default index is created on the _id field.
    • The indexed field can be a field inside an embedded document or an entire embedded document.
  • Compound Indexes
    • The order of the fields matters.
    • Supports queries on any prefix of the index fields.
  • Multikey Indexes
    • index a field that holds an array; MongoDB adds an index entry for each element in the array.
    • the index on a shard key cannot be a multikey index.

  • Geospatial Indexes
  • Text Indexes
    • to support text search of string content in documents of a collection

  • TTL indexes
    • special indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time.

  • Unique indexes
    • cause MongoDB to reject all documents that contain a duplicate value for the indexed field.

  • Sparse indexes
    • only contain entries for documents that have the indexed field.
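
Several of the index behaviours listed above can be imitated in a few lines of plain Python. This is a sketch of the rules as described in these notes, not MongoDB's implementation; all function names, field names, and sample documents are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def index_prefixes(index_fields):
    """Leading subsets of a compound index: (a, b, c) -> (a), (a, b), (a, b, c)."""
    return [tuple(index_fields[:i]) for i in range(1, len(index_fields) + 1)]


def sparse_entries(docs, field):
    """A sparse index holds entries only for documents that have the field."""
    return [(doc[field], doc["_id"]) for doc in docs if field in doc]


def violates_unique(docs, field, candidate):
    """True if candidate repeats an existing value of field.

    Assumption: a missing field is compared as None, so two documents that
    both lack the field are treated as duplicates in this sketch.
    """
    return any(doc.get(field) == candidate.get(field) for doc in docs)


def ttl_expired(doc, field, expire_after_seconds, now):
    """True once the document's date field is older than the TTL limit."""
    return field in doc and now - doc[field] >= timedelta(seconds=expire_after_seconds)


prefixes = index_prefixes(["item", "location", "stock"])

docs = [
    {"_id": 1, "email": "a@example.com"},
    {"_id": 2},  # no email field
]
entries = sparse_entries(docs, "email")
duplicate = violates_unique(docs, "email", {"email": "a@example.com"})

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
session = {"_id": 9, "createdAt": now - timedelta(hours=2)}
expired = ttl_expired(session, "createdAt", 3600, now)
```

For example, ("item", "location") is a prefix of the compound index, while ("location", "stock") is not.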

Remove a Specific Index

Modify an Index
  • An existing index cannot be changed in place: first drop the index, then rebuild it with the new definition.

List all Indexes on a Collection
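
The three index-management tasks above map onto the database commands dropIndexes, createIndexes, and listIndexes. The sketch below only builds the command documents as Python dicts and does not send them anywhere. The helper names and the products collection are hypothetical, and passing these dicts to a driver call (for example pymongo's Database.command) is an assumption that is not exercised here.

```python
def default_index_name(keys):
    """Name built from field/direction pairs, e.g. [("item", 1)] -> "item_1"."""
    return "_".join(f"{field}_{direction}" for field, direction in keys)


def create_index_command(collection, keys):
    """Command document that builds one index on the given (field, direction) pairs."""
    return {
        "createIndexes": collection,
        "indexes": [{"key": dict(keys), "name": default_index_name(keys)}],
    }


def drop_index_command(collection, index_name):
    """Command document that removes one specific index by name."""
    return {"dropIndexes": collection, "index": index_name}


def list_indexes_command(collection):
    """Command document that lists all indexes on a collection."""
    return {"listIndexes": collection}


def modify_index_commands(collection, old_keys, new_keys):
    """Modifying an index means dropping the old one, then building the new one."""
    return [
        drop_index_command(collection, default_index_name(old_keys)),
        create_index_command(collection, new_keys),
    ]


steps = modify_index_commands(
    "products",
    old_keys=[("item", 1)],
    new_keys=[("item", 1), ("stock", -1)],
)
```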

