M320: Chapter 4: Patterns Part 2

Computed Pattern

The computed pattern is used when you need to perform the same computations many times. Instead of recalculating values on every read, pre-compute and store them in the document.

When to Use the Computed Pattern

Use this pattern when:

The same calculation runs repeatedly on the same data
The source data changes infrequently while reads are frequent
The computation is expensive (multiple documents, complex formulas)
Read latency is critical and must be predictable

Common use cases include:

Math operations — Running totals, averages, percentages
Fan-out operations — Distributing computed results to related documents
Roll-up operations — Aggregating data across time periods or categories

Pattern Mechanics

Instead of computing on every read:

// Bad: compute on every read
const pipeline = [
  { $match: { productId: "prod123" } },
  { $group: { _id: null, total: { $sum: "$amount" }, count: { $sum: 1 } } },
];
const result = await db.collection("sales").aggregate(pipeline).next();

Store the pre-computed result in the document:

// Good: store the computed value
{
  _id: ObjectId("..."),
  productId: "prod123",
  productName: "Wireless Headphones",
  dailySales: {
    date: ISODate("2026-03-15"),
    totalAmount: 3199.60,
    orderCount: 42,
    averageOrderValue: 76.18
  }
}

Pre-computation Strategies

Strategy 1: Update on Write

Compute the value when the source data changes.

async function recordSale(sale) {
  // Assumes `client` is the connected MongoClient that `db` was obtained from
  const session = client.startSession();
  session.startTransaction();

  try {
    // Insert the sale record
    await db.collection("sales").insertOne(sale, { session });

    // Update the pre-computed daily summary
    await db.collection("daily_summaries").updateOne(
      {
        productId: sale.productId,
        "dailySales.date": {
          $eq: new Date(sale.createdAt.toISOString().split("T")[0]),
        },
      },
      {
        $inc: {
          "dailySales.totalAmount": sale.amount,
          "dailySales.orderCount": 1,
        },
      },
      { upsert: true, session },
    );

    await session.commitTransaction();
  } catch (err) {
    await session.abortTransaction();
    throw err;
  } finally {
    await session.endSession();
  }
}
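The update above increments totalAmount and orderCount, but the averageOrderValue field from the earlier sample document is not maintained. A minimal in-memory sketch of how the average could be derived from the two running values follows. The helper name applySaleToSummary is hypothetical, and rounding to cents is an assumption.

```javascript
// Hypothetical helper: applies one sale to an in-memory copy of a daily summary.
// Field names follow the dailySales sub-document shown earlier.
function applySaleToSummary(dailySales, sale) {
  const totalAmount = dailySales.totalAmount + sale.amount;
  const orderCount = dailySales.orderCount + 1;
  return {
    ...dailySales,
    totalAmount,
    orderCount,
    // Derive the average from the two running values, rounded to cents (assumption)
    averageOrderValue: Math.round((totalAmount / orderCount) * 100) / 100,
  };
}

const updated = applySaleToSummary(
  { totalAmount: 300, orderCount: 3, averageOrderValue: 100 },
  { amount: 60 },
);
console.log(updated.averageOrderValue); // 90
```

MongoDB 4.2 and later also accept an aggregation pipeline as the update document, which may allow the same division to happen server-side; that variant is not shown here.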

Strategy 2: Scheduled Batch Computation

Periodically recompute values using the aggregation pipeline and $merge.

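// Hourly rollup: batch job run by a scheduler (e.g., cron, node-cron)
// Align the window to the last complete UTC hour so that a summary is never
// replaced with a partial total.
const HOUR_MS = 3600000;
const windowEnd = new Date(Math.floor(Date.now() / HOUR_MS) * HOUR_MS);
const windowStart = new Date(windowEnd.getTime() - HOUR_MS);

const pipeline = [
  {
    $match: {
      createdAt: { $gte: windowStart, $lt: windowEnd },
    },
  },
  {
    $group: {
      _id: {
        productId: "$productId",
        hour: { $dateToString: { format: "%Y-%m-%dT%H:00:00Z", date: "$createdAt" } },
      },
      totalAmount: { $sum: "$amount" },
      orderCount: { $sum: 1 },
      avgOrderValue: { $avg: "$amount" },
    },
  },
  {
    // `on` defaults to _id, which already holds { productId, hour }
    $merge: {
      into: "hourly_summaries",
      whenMatched: "replace",
      whenNotMatched: "insert",
    },
  },
];

// $merge writes to hourly_summaries; the returned array is empty
await db.collection("sales").aggregate(pipeline).toArray();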

Aggregation Pipeline for Computation

The aggregation framework is the primary tool for computing values in MongoDB.

// Compute product category statistics and store them in category_stats
await db.collection("products").aggregate([
  {
    $group: {
      _id: "$category",
      productCount: { $sum: 1 },
      averagePrice: { $avg: "$price" },
      minPrice: { $min: "$price" },
      maxPrice: { $max: "$price" },
      totalStock: { $sum: "$inventory.stock" },
      // Sum price * stock per product; averagePrice * totalStock would only approximate this
      inventoryValue: { $sum: { $multiply: ["$price", "$inventory.stock"] } },
    },
  },
  {
    $merge: {
      into: "category_stats",
      whenMatched: "replace",
      whenNotMatched: "insert",
    },
  },
]).toArray(); // $merge writes to category_stats; the returned array is empty

Real-Time vs. Batch Computation

Aspect	Real-Time	Batch
Latency	Immediate	Delayed (minutes to hours)
Accuracy	Always reflects the latest write	Stale until the next run
System load	Higher per-operation cost	Lower overall cost
Implementation	Change streams, triggers	Scheduled jobs (cron)
Use case	Dashboards, alerts	Reports, analytics

When to use real-time:

Live dashboards that must reflect the latest data
Alerting systems that trigger on thresholds
User-facing features like current balance or view count

When to use batch:

Historical analytics and trend reports
End-of-day reconciliation
Large-scale data processing where real-time is too expensive

Lab: Apply the Computed Pattern

// Result document after applying the computed pattern
{
  "_id": ObjectId("5c9414f25e6aff2b8870a2d0"),
  "zone": 13,
  "date": ISODate("2019-03-21T00:00:00.000Z"),
  "kW per day": {
    "consumption": 9756,
    "self-produced": 2059,
    "city-supplemented": 7700
  }
}

This document stores pre-computed daily energy metrics. Instead of summing thousands of hourly readings every time a dashboard loads, the application reads one document per zone per day. The computation runs once when the day ends or incrementally as data arrives.
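As a rough illustration of the roll-up described above, the following sketch sums hourly readings into one daily document per zone. The shape of the hourly readings (consumption, selfProduced, citySupplemented) and the function name rollUpDay are assumptions made for this example; only the output field names come from the lab document.

```javascript
// Hypothetical hourly reading shape (an assumption, not from the lab):
//   { consumption, selfProduced, citySupplemented }
function rollUpDay(zone, date, hourlyReadings) {
  const totals = { consumption: 0, "self-produced": 0, "city-supplemented": 0 };
  for (const reading of hourlyReadings) {
    totals.consumption += reading.consumption;
    totals["self-produced"] += reading.selfProduced;
    totals["city-supplemented"] += reading.citySupplemented;
  }
  // Same layout as the lab's result document
  return { zone, date, "kW per day": totals };
}

const daily = rollUpDay(13, new Date("2019-03-21T00:00:00.000Z"), [
  { consumption: 400, selfProduced: 100, citySupplemented: 300 },
  { consumption: 500, selfProduced: 50, citySupplemented: 450 },
]);
console.log(daily["kW per day"]);
```

The dashboard then reads this single document instead of re-summing the hourly readings on every load.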

Which one of the following scenarios is best suited for the computed pattern?

We need to calculate a value that is displayed 100 times a minute and is based on a field that updates once a minute. The computation is expensive (aggregating data across 10,000 sensors) and the result is read far more often than the source data changes. The computed pattern caches the result and only recomputes when the source field actually updates.
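The "recompute only when the source updates" idea can be sketched in memory as follows. The createComputedValue wrapper and the version field on the source are hypothetical names introduced for illustration; in a real system the stored value would live in the document rather than in a closure.

```javascript
// Hypothetical wrapper: recomputes only when the source version changes.
function createComputedValue(compute) {
  let cachedVersion = null;
  let cachedValue;
  let computeCount = 0;
  return {
    read(source) {
      if (source.version !== cachedVersion) {
        cachedValue = compute(source.data); // the expensive step
        cachedVersion = source.version;
        computeCount += 1;
      }
      return cachedValue;
    },
    get computeCount() {
      return computeCount;
    },
  };
}

const total = createComputedValue((values) => values.reduce((a, b) => a + b, 0));
const source = { version: 1, data: [1, 2, 3] };
for (let i = 0; i < 100; i++) total.read(source); // 100 reads, same source version
console.log(total.computeCount); // 1
```

With 100 reads per minute and one source update per minute, the expensive computation runs once per minute instead of 100 times.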

Bucket Pattern (Used in IoT)

The bucket pattern groups related data into time-based or count-based buckets. This is especially useful for IoT applications, time-series data, and event logging.

// Sensor data stored with bucket pattern — one document per hour
{
  _id: ObjectId("..."),
  sensorId: "sensor-temp-a1",
  startDate: ISODate("2026-03-15T10:00:00Z"),
  endDate: ISODate("2026-03-15T10:59:59Z"),
  readings: [
    { time: ISODate("2026-03-15T10:00:00Z"), value: 22.1 },
    { time: ISODate("2026-03-15T10:05:00Z"), value: 22.3 },
    { time: ISODate("2026-03-15T10:10:00Z"), value: 22.0 },
    // ... 12 readings per hour
  ],
  metadata: {
    unit: "celsius",
    location: "Server Room A",
  },
  stats: {
    min: 22.0,
    max: 22.7,
    avg: 22.3,
    count: 12,
  },
}

Benefits of the Bucket Pattern

Reduces the total number of documents by roughly the number of readings stored per bucket (12x in the hourly example above)
Single read retrieves many data points
Enables efficient time-range queries
Keeps document size under control by limiting bucket size

// Compute the bucket key: one bucket per sensor per UTC hour
function getBucketKey(sensorId, timestamp) {
  const startOfHour = new Date(timestamp);
  // Truncate in UTC so the key matches the ISODate values stored in the bucket
  startOfHour.setUTCMinutes(0, 0, 0);
  return { sensorId, startDate: startOfHour };
}

async function insertSensorReading(sensorId, timestamp, value) {
  const bucketKey = getBucketKey(sensorId, timestamp);

  const result = await db.collection("sensor_data").updateOne(
    {
      ...bucketKey,
      "readings": { $not: { $size: 60 } }, // prevent overflow
    },
    {
      $push: { readings: { time: timestamp, value } },
      $setOnInsert: {
        sensorId,
        startDate: bucketKey.startDate,
        endDate: new Date(bucketKey.startDate.getTime() + 3599999),
      },
      $inc: { "stats.count": 1 },
      $min: { "stats.min": value },
      $max: { "stats.max": value },
    },
    { upsert: true },
  );

  return result;
}
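The update above maintains stats.count, stats.min, and stats.max, but not the stats.avg field that appears in the sample bucket. One way to keep an average available is to also track a running sum, sketched here in memory. The stats.sum field and the addReading helper are assumptions added for this example and are not part of the original schema.

```javascript
// Hypothetical in-memory version of the bucket update.
// stats.sum is an added field (assumption) so the average can be derived.
function addReading(bucket, reading) {
  const stats = bucket.stats ?? { min: Infinity, max: -Infinity, sum: 0, count: 0 };
  const sum = stats.sum + reading.value;
  const count = stats.count + 1;
  return {
    ...bucket,
    readings: [...(bucket.readings ?? []), reading],
    stats: {
      min: Math.min(stats.min, reading.value),
      max: Math.max(stats.max, reading.value),
      sum,
      count,
      avg: sum / count,
    },
  };
}

let bucket = { sensorId: "sensor-temp-a1" };
for (const value of [20, 24, 22]) {
  bucket = addReading(bucket, { time: new Date(), value });
}
console.log(bucket.stats.avg); // 22
```

In the database, the same effect could be approached by adding the reading's value to a stats.sum field with $inc and dividing by stats.count at read time.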

Which one of the following requirements in our system is the best candidate to use the bucket pattern?

Collecting temperature readings from 10,000 sensors every 5 seconds and displaying hourly trend charts. The bucket pattern groups these readings into hour-long documents, reducing document count from 7.2 million per hour to 10,000.
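The document counts quoted above follow from simple arithmetic, worked through here. The variable names are illustrative only.

```javascript
const sensors = 10000;
const secondsBetweenReadings = 5;

// One document per reading
const readingsPerSensorPerHour = 3600 / secondsBetweenReadings; // 720
const unbucketedDocsPerHour = sensors * readingsPerSensorPerHour; // 7,200,000

// One document per sensor per hour
const bucketedDocsPerHour = sensors; // 10,000
const reductionFactor = unbucketedDocsPerHour / bucketedDocsPerHour; // 720

console.log({ unbucketedDocsPerHour, bucketedDocsPerHour, reductionFactor });
```

Note that an hourly bucket in this scenario holds 720 readings, which is more than the 60-reading cap used in the earlier insert function, so that cap would need to be raised or the bucket window shortened.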

Schema Versioning Pattern

The schema versioning pattern avoids downtime during schema upgrades by allowing multiple document versions to coexist.

// Version 1 document
{
  _id: ObjectId("..."),
  schema_version: 1,
  name: "John Doe",
  email: "john.doe@example.com"
}

// Version 2 document: renames name to displayName and nests email and phone under contact
{
  _id: ObjectId("..."),
  schema_version: 2,
  displayName: "Jane Smith",
  contact: {
    email: "jane.smith@example.com",
    phone: "+1-555-0123"
  }
}

Application-Level Migration

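function normalizeUserDocument(doc) {
  if (!doc) return null; // findOne returns null when nothing matches

  if (!doc.schema_version || doc.schema_version === 1) {
    // Move the version 1 fields into the version 2 shape,
    // dropping the legacy top-level name, email, and phone
    const { name, email, phone, ...rest } = doc;
    return {
      ...rest,
      displayName: name,
      contact: {
        email,
        phone: phone ?? null,
      },
      schema_version: 2,
    };
  }
  return doc;
}

// Reads never touch the stored document; the upgrade happens in memory only
async function getUser(userId) {
  const doc = await db.collection("users").findOne({ _id: userId });
  return normalizeUserDocument(doc);
}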

Lazy Migration

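async function migrateDocument(doc) {
  if (doc.schema_version >= 2) return doc;

  // Build the version 2 shape without the legacy top-level fields,
  // keeping any phone number the old document already had
  const { name, email, phone, ...rest } = doc;
  const updated = {
    ...rest,
    displayName: name,
    contact: { email, phone: phone ?? null },
    schema_version: 2,
  };

  // Persist the upgraded document so the next read finds version 2
  await db.collection("users").replaceOne({ _id: doc._id }, updated);
  return updated;
}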

async function findAndMigrate(query) {
  const cursor = db.collection("users").find(query);

  const results = [];
  for await (const doc of cursor) {
    results.push(await migrateDocument(doc));
  }

  return results;
}

Benefits of Schema Versioning

Zero-downtime schema evolution — old documents remain readable
Gradual migration — update documents as they are accessed
Rollback capability — old applications still work with new documents
No lock-in — migrate at your own pace
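When more than two versions coexist, the migration logic can be organized as a chain of single-version upgrade steps. The sketch below is one possible arrangement, not something prescribed by the pattern; upgradeSteps, LATEST_VERSION, and upgradeToLatest are hypothetical names, and documents without a schema_version are assumed to be version 1.

```javascript
// Hypothetical registry: each entry upgrades a document by exactly one version.
const upgradeSteps = {
  1: (doc) => {
    const { name, email, phone, ...rest } = doc;
    return {
      ...rest,
      displayName: name,
      contact: { email, phone: phone ?? null },
      schema_version: 2,
    };
  },
};
const LATEST_VERSION = 2;

function upgradeToLatest(doc) {
  // Documents written before versioning was introduced are treated as version 1
  let current = { ...doc, schema_version: doc.schema_version ?? 1 };
  while (current.schema_version < LATEST_VERSION) {
    const step = upgradeSteps[current.schema_version];
    if (!step) {
      throw new Error(`No upgrade step from version ${current.schema_version}`);
    }
    current = step(current);
  }
  return current;
}

console.log(upgradeToLatest({ _id: 1, name: "John Doe", email: "john.doe@example.com" }));
```

Adding a version 3 later would mean registering one more step and raising LATEST_VERSION, while older documents pass through every step in order.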

Tree Patterns

This chapter covers four tree modeling patterns for hierarchical data such as organization charts, book subjects, and product categories.

Parent References — Each node stores a reference to its parent
Child References — Each node stores references to its children
Array of Ancestors — Each node stores an array of all ancestor IDs
Materialized Paths — Each node stores a string path from root to node

Each pattern addresses a different trade-off between read and write efficiency:

// Parent References — each node stores its parent ID
{ _id: ObjectId("..."), name: "Engineering", parentId: ObjectId("...") }

// Child References — each node stores its children IDs
{ _id: ObjectId("..."), name: "Engineering", children: [ObjectId("c1"), ObjectId("c2")] }

// Array of Ancestors — each node stores all ancestor IDs
{ _id: ObjectId("..."), name: "Frontend Team", ancestors: [ObjectId("root"), ObjectId("eng")], parentId: ObjectId("eng") }

// Materialized Paths — each node stores a path string
{ _id: ObjectId("..."), name: "Frontend Team", path: ",engineering,software,frontend," }
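With materialized paths, descendants are usually found with a prefix match on the path string. The sketch below builds such a regular expression; the helper names escapeRegex and descendantsRegex are hypothetical, and the comma-delimited path format is taken from the example above.

```javascript
// Escape regex metacharacters so a path is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Matches every path that starts with nodePath and continues past it,
// that is, the descendants of the node but not the node itself.
function descendantsRegex(nodePath) {
  return new RegExp("^" + escapeRegex(nodePath) + ".");
}

const engineeringDescendants = descendantsRegex(",engineering,");
console.log(engineeringDescendants.test(",engineering,software,frontend,")); // true
console.log(engineeringDescendants.test(",engineering,")); // false
```

The resulting RegExp could then be passed as the value of path in a find filter. Because the expression is anchored at the start of the string and case-sensitive, an index on path can help such a query.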

Tree Pattern Comparison

Pattern	Read Ancestors	Read Descendants	Write Cost	Implementation
Parent References	Recursive query	Recursive query	Low	Simple
Child References	Complex	One query (direct children only)	High (large arrays)	Medium
Array of Ancestors	One query	One query	Medium	Medium
Materialized Paths	Regex query	Regex query	Medium	Medium

Ancestor array combined with parent reference provides a good balance for most tree use cases.
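A small in-memory sketch of that combination: each node keeps both parentId and ancestors, so a child's ancestor list is derived from its parent at insert time and descendants are found with a single membership test. String IDs are used instead of ObjectId values to keep the example self-contained, and makeChild and findDescendants are hypothetical helper names.

```javascript
// A child's ancestors are the parent's ancestors plus the parent itself.
function makeChild(parent, id, name) {
  return {
    _id: id,
    name,
    parentId: parent._id,
    ancestors: [...parent.ancestors, parent._id],
  };
}

// In-memory equivalent of a query that matches nodeId inside the ancestors array
function findDescendants(nodes, nodeId) {
  return nodes.filter((node) => node.ancestors.includes(nodeId));
}

const root = { _id: "root", name: "Company", parentId: null, ancestors: [] };
const eng = makeChild(root, "eng", "Engineering");
const frontend = makeChild(eng, "fe", "Frontend Team");

console.log(frontend.ancestors); // [ 'root', 'eng' ]
console.log(findDescendants([root, eng, frontend], "eng").map((n) => n.name)); // [ 'Frontend Team' ]
```

The write cost shows up when a subtree moves: every descendant's ancestors array has to be rewritten, which is why the comparison rates this pattern's write cost as medium.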

Polymorphic Pattern

The polymorphic pattern handles documents with varying structures within the same collection. This is useful when different entities share common fields but have unique attributes.

// Collection with polymorphic documents
{
  _id: ObjectId("..."),
  type: "vehicle",
  make: "Tesla",
  model: "Model 3",
  year: 2026,
  // vehicle-specific fields
  doors: 4,
  range_km: 500
},
{
  _id: ObjectId("..."),
  type: "property",
  address: "123 Main St",
  city: "San Francisco",
  price: 1200000,
  // property-specific fields
  bedrooms: 3,
  bathrooms: 2,
  squareFeet: 1500
}

Single View Solution with Polymorphic Pattern

The polymorphic pattern enables a single view across different entity types, ideal for search systems, inventory management, and CRM consolidation.

// Search across all entity types
async function searchAll(searchTerm) {
  // Escape regex metacharacters so user input is matched literally
  const escaped = searchTerm.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

  return await db.collection("assets").find({
    $or: [
      { make: { $regex: escaped, $options: "i" } },
      { model: { $regex: escaped, $options: "i" } },
      { address: { $regex: escaped, $options: "i" } },
    ],
  }).toArray();
}

Which one of the following scenarios is best suited for the polymorphic pattern?

An organization acquired different companies over the years, serving the same markets with the same customers. There is a requirement to merge all systems into one. Each company has similar but not identical data structures. The polymorphic pattern allows all entities to coexist in a single collection while preserving their unique fields.
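On the application side, polymorphic documents are typically handled by branching on a discriminator field. The following sketch assumes the type field and the vehicle and property shapes shown earlier; describeAsset is a hypothetical helper.

```javascript
// Hypothetical formatter: picks a code path based on the document's type field.
function describeAsset(doc) {
  switch (doc.type) {
    case "vehicle":
      return `${doc.year} ${doc.make} ${doc.model}`;
    case "property":
      return `${doc.address}, ${doc.city}`;
    default:
      // Unknown shapes are reported rather than silently dropped
      return `Unknown asset type: ${doc.type}`;
  }
}

console.log(describeAsset({ type: "vehicle", make: "Tesla", model: "Model 3", year: 2026 }));
console.log(describeAsset({ type: "property", address: "123 Main St", city: "San Francisco" }));
```

Shared fields can be queried uniformly across the collection, while type-specific fields are only read inside the matching branch.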

Other Patterns

Approximation Pattern

The approximation pattern avoids performing an expensive operation too often by using approximate values instead of exact calculations.

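// Track page views with approximation
{
  _id: ObjectId("..."),
  pageId: "article-123",
  // Written only every 10th view, so it can trail the true count by up to 9
  viewCount: 4200
}

// In application code: count views in memory and flush every 10th view
const FLUSH_EVERY = 10;
const pendingViews = new Map(); // pageId -> views not yet written

async function recordPageView(pageId) {
  const pending = (pendingViews.get(pageId) ?? 0) + 1;
  if (pending < FLUSH_EVERY) {
    pendingViews.set(pageId, pending);
    return;
  }

  // Flush the accumulated views in a single write
  pendingViews.set(pageId, 0);
  await db.collection("page_views").updateOne(
    { pageId },
    { $inc: { viewCount: pending } },
    { upsert: true },
  );
}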

Using a counter that aggregates multiple updates before writing to the database reduces write load dramatically. For a popular page with 10,000 views per minute, writing every 10 views reduces database writes by 90%.
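A counter keeps state in the application. A stateless variant, sketched here as an alternative and not as the method described above, writes on a random 1-in-N basis and increments by N when it does, so the stored count stays statistically close to the true count. The function name approximateIncrement is hypothetical, and the random source is injectable only to make the sketch testable.

```javascript
// Returns the increment to write for one page view: usually 0 (skip the write),
// and batchSize with probability 1 / batchSize.
function approximateIncrement(batchSize, random = Math.random) {
  return random() < 1 / batchSize ? batchSize : 0;
}

// Simulate 10,000 views and count how many would reach the database
let writes = 0;
let storedCount = 0;
for (let i = 0; i < 10000; i++) {
  const increment = approximateIncrement(10);
  if (increment > 0) {
    writes += 1; // this is where an update with $inc would be issued
    storedCount += increment;
  }
}
console.log({ writes, storedCount });
```

On average about one view in ten triggers a write, giving the same 90% reduction, at the cost of a count that is only approximately right.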

Outlier Pattern

The outlier pattern keeps the focus on the most frequent use cases while handling edge cases separately.

// Normal document — most users have a few favorite products
{
  _id: ObjectId("..."),
  username: "jdoe",
  favorites: ["prod123", "prod456", "prod789"]
}

// Outlier document — a power user with thousands of favorites
{
  _id: ObjectId("..."),
  username: "poweruser",
  favorites: ["prod001", "prod002", /* ... thousands of items */],
  favorites_outlier: true, // flag this as an outlier
  favorites_count: 3500
}

When a query needs a user’s favorites, the application checks the favorites_outlier flag. For most users the flag is absent and the embedded array is the complete answer. For flagged power users, the application queries a separate collection designed for large lists, so the rare large case is handled without slowing down the common one.
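The read path for this check might look like the sketch below. It assumes that a flagged document embeds only part of the list and that the remainder lives in an overflow collection; the loadOverflow callback stands in for that query and is injected so the example runs without a database. Both getFavorites and loadOverflow are hypothetical names.

```javascript
// Hypothetical read path for the outlier pattern.
// loadOverflow(userId) stands in for a query against a separate overflow collection.
async function getFavorites(user, loadOverflow) {
  if (!user.favorites_outlier) {
    return user.favorites; // common case: everything is embedded
  }
  // Rare case: combine the embedded part with the overflow
  const overflow = await loadOverflow(user._id);
  return [...user.favorites, ...overflow];
}

// Usage with an in-memory stand-in for the overflow collection
const overflowStore = { u2: ["prod003", "prod004"] };
const fakeLoadOverflow = async (userId) => overflowStore[userId] ?? [];

getFavorites(
  { _id: "u2", favorites: ["prod001", "prod002"], favorites_outlier: true },
  fakeLoadOverflow,
).then((favorites) => console.log(favorites.length));
```

Only flagged users pay for the second query, which keeps the common read a single document fetch.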

Summary of Patterns

Pattern	Problem Solved	Best For
Computed	Repeated expensive calculations	Dashboards, stats, metrics
Bucket	Managing large time-series datasets	IoT, logging, analytics
Schema Versioning	Zero-downtime schema evolution	Live production systems
Tree (Ancestor Array)	Hierarchical data queries	Organization charts, categories
Polymorphic	Diverse entity types in one collection	Mergers, product catalogs
Approximation	Reducing expensive write operations	Page views, counters
Outlier	Handling edge cases without impacting the norm	Social media, user-generated data
