M320: Chapter 3: Patterns

MongoDB Data Modeling: Patterns

Introduction to Patterns

Patterns (design patterns) are used for data modeling and schema design.

  • Use the Computed Pattern to avoid repetitive computations
  • Structure similar fields with the Attribute Pattern
  • Handle changes to your deployment without downtime with the Schema Versioning Pattern

Guide to Homework Validation

Download MacOSX Validator

To help verify the syntax of your JSON files, use a JSON validator tool.

To learn more about JSON Schema, see the JSON Schema validation reference.

Handling Duplication, Staleness and Integrity

Handling Duplication

Duplication can cause inconsistency when one copy of a piece of data is changed while the other copies are not. Keeping the copies in sync is the responsibility of the application.

Duplication Examples

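  • A client's address should be embedded in an order and stay fixed: the order keeps the address as it was when the order was placed.
  • A movie's cast should be the names of the actors; these are fixed, so they should be embedded in the movie.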
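The first example above can be sketched in plain Python, with dictionaries standing in for documents. This is only an illustration under stated assumptions: the field names (`client_id`, `shipping_address`, `items`) and the helper `create_order` are hypothetical and do not come from the course material.

```python
import copy


def create_order(client, items):
    """Build an order that embeds a snapshot of the client's current address.

    Hypothetical helper: field names are assumptions made for illustration.
    """
    return {
        "client_id": client["_id"],
        # Deep copy, so later edits to the client never reach into this order.
        "shipping_address": copy.deepcopy(client["address"]),
        "items": list(items),
    }


client = {"_id": 1, "name": "Ada", "address": {"street": "1 Main St", "city": "Paris"}}
order = create_order(client, ["book"])

# The client moves; the existing order keeps the address it was shipped to.
client["address"]["street"] = "9 New Rd"
```

Here the duplicated address is intentional: the copy in the order is a historical fact, so it does not need to follow later changes to the client.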

Handling Staleness

Staleness means out-of-date data.

Handling Referential Integrity

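Referential integrity is about keeping links between documents (or between tables, in relational terms) valid. MongoDB has no support for cascading deletes, so keeping references correct and consistent is a challenge left to the application.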
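Because cascading deletes are not available, the application has to remove dependent documents itself. A minimal sketch of that idea follows, using in-memory lists in place of collections. The `authors` and `books` collections, the `author_id` field, and both helper functions are hypothetical names chosen for illustration.

```python
def delete_author(db, author_id):
    """Delete an author and, by hand, every book that references that author."""
    # Remove the dependents first so no book is left pointing at a missing author.
    db["books"] = [b for b in db["books"] if b["author_id"] != author_id]
    db["authors"] = [a for a in db["authors"] if a["_id"] != author_id]


def find_orphans(db):
    """Return the books whose author_id matches no existing author."""
    author_ids = {a["_id"] for a in db["authors"]}
    return [b for b in db["books"] if b["author_id"] not in author_ids]


db = {
    "authors": [{"_id": 1, "name": "A"}, {"_id": 2, "name": "B"}],
    "books": [
        {"_id": 10, "title": "X", "author_id": 1},
        {"_id": 11, "title": "Y", "author_id": 2},
    ],
}
delete_author(db, 1)
remaining_titles = [b["title"] for b in db["books"]]
orphans = find_orphans(db)
```

In a real deployment the two deletes are separate operations, so a failure between them could still leave orphans; a periodic check like `find_orphans` is one way to detect that.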

Which of the following are valid concerns regarding duplication, staleness and referential integrity management in a MongoDB database and appropriate resolution techniques?

  • Data integrity issues can be minimized by using multi-document transactions.
  • Data staleness issues can be minimized with frequent batch updates.

Attribute Pattern

Which one of the following scenarios is best suited for the application of the Attribute Pattern?

  • Some fields share a number of characteristics, and we want to search across those fields.

Lab

From

{
  "_id": "<objectId>",
  "title": "<string>",
  "artist": "<string>",
  "date_acquisition": "<date>",
  "location": "<string>",
  "on_display": "<bool>",
  "in_house": "<bool>",
  "events": [{
    "moma": "<date>",
    "louvres": "<date>"
  }]
}

to

{
  "_id": "<objectId>",
  "title": "<string>",
  "location": "<string>",
  "artist": "<string>",  
  "on_display": "<bool>",
  "in_house": "<bool>",
  "events": [
    {"k": "<string>", "v": "<date>"},
    {"k": "<string>", "v": "<date>"}
  ]
}
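The transformation between the two schemas above might be sketched as follows, with a plain dictionary standing in for a document. The function name `to_attribute_pattern` and the sample values are hypothetical. The target schema no longer has `date_acquisition`; this sketch assumes it becomes one more key/value entry in `events`, which the notes do not state explicitly.

```python
def to_attribute_pattern(piece):
    """Return a copy of `piece` with its events stored as {"k": ..., "v": ...} pairs."""
    # Copy every field except the ones that are reshaped below.
    result = {k: v for k, v in piece.items() if k not in ("events", "date_acquisition")}

    pairs = []
    # Assumption: the acquisition date is folded into events as one more entry.
    if "date_acquisition" in piece:
        pairs.append({"k": "date_acquisition", "v": piece["date_acquisition"]})
    # The original schema holds events as a list of sub-documents keyed by venue.
    for event in piece.get("events", []):
        for venue, date in event.items():
            pairs.append({"k": venue, "v": date})

    result["events"] = pairs
    return result


piece = {
    "_id": 1,
    "title": "Untitled",
    "artist": "Unknown",
    "date_acquisition": "1999-01-01",
    "location": "Room 4",
    "on_display": True,
    "in_house": True,
    "events": [{"moma": "2001-05-01", "louvres": "2003-09-15"}],
}
converted = to_attribute_pattern(piece)
```

With every event in the same `k`/`v` shape, a search across all events only has to look at two field names instead of one field per venue.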

Extended Reference Pattern

How joins are performed in MongoDB

  • Application side
  • Lookups
    • $lookup
    • $graphLookup
  • Avoid a join by embedding the joined data!

Which one of the following scenarios is the best candidate to use the Extended Reference Pattern to avoid doing additional reads through joins/lookups?

  • An app needs to retrieve a product and information about its supplier.

This is a good scenario for the Extended Reference Pattern. It is likely that we want to carry some information about a supplier with the product, however not all of it. Having fields like the supplier’s name, a reference number, and the supplier’s phone number should provide all the information we need when looking at a product. Additional information like the complete address and billing contact should be left within the suppliers collection.
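The product and supplier scenario can be sketched in plain Python. The field names (`name`, `reference_number`, `phone`, `address`, `billing_contact`) follow the description above, but their exact spelling, the helper `embed_supplier`, and the sample data are assumptions made for illustration.

```python
# Fields copied from the supplier into each product (an assumed selection).
SUPPLIER_FIELDS = ("name", "reference_number", "phone")


def embed_supplier(product, supplier):
    """Return a copy of `product` carrying an extended reference to `supplier`."""
    # Keep the supplier's _id so the full document can still be fetched when needed.
    extended = {"_id": supplier["_id"]}
    for field in SUPPLIER_FIELDS:
        extended[field] = supplier[field]
    return {**product, "supplier": extended}


supplier = {
    "_id": 7,
    "name": "Acme",
    "reference_number": "S-001",
    "phone": "555-0100",
    "address": {"street": "1 Main St", "city": "Paris"},
    "billing_contact": "billing@example.com",
}
product = {"_id": 100, "name": "Widget"}
product_doc = embed_supplier(product, supplier)
```

Reading `product_doc` alone is enough to display the product with its supplier's name and phone number; the address and billing contact stay in the supplier document and cost an extra read only when they are actually needed.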

Subset Pattern

When the working set is too big:

  • Add RAM
  • Scale with sharding
  • Reduce the size of the working set

Which one of the following scenarios is the best candidate for use of the Subset Pattern?

  • The working set does not fit in memory and it is difficult to scale the hardware.
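One way to reduce the size of the working set is to keep only a frequently used subset of a large array in the main document and move the complete array elsewhere. The sketch below assumes a hypothetical movie document with a `reviews` array; the function `split_reviews`, the field names, and the choice of keeping the most recent reviews are all illustrative assumptions, not taken from the notes.

```python
def split_reviews(movie, keep=2):
    """Split a movie into a small main document and a full list of review documents."""
    # ISO-8601 date strings sort correctly as plain strings; newest first.
    reviews = sorted(movie["reviews"], key=lambda r: r["date"], reverse=True)

    # The main document keeps only the `keep` most recent reviews.
    main = {**movie, "reviews": reviews[:keep]}
    # Every review also goes to a separate collection, linked back by movie_id.
    all_reviews = [{"movie_id": movie["_id"], **r} for r in reviews]
    return main, all_reviews


movie = {
    "_id": 1,
    "title": "Example",
    "reviews": [
        {"date": "2020-01-01", "text": "old"},
        {"date": "2022-06-01", "text": "newest"},
        {"date": "2021-03-01", "text": "middle"},
    ],
}
main_doc, review_docs = split_reviews(movie, keep=2)
```

The retained reviews are duplicated between the two results. That is a deliberate trade-off: the small main document serves the common read, and the full list is loaded only on demand.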