M121: Chapter 5: Miscellaneous Aggregation

The MongoDB Aggregation Framework(Chapter 5 of 6)

The $redact Stage

Restricts the contents of the documents based on information stored in the documents themselves.

// creating a variable to refer against
var userAccess = "Management"

// comparing whether the value/s in the userAccess variable are in the array
// referenced by the $acl field path
db.employees.aggregate([
    {
      "$redact": {
        "$cond": [{ "$in": [userAccess, "$acl"] }, "$$DESCEND", "$$PRUNE"]
      }
    }
  ]).pretty()
// prune 修剪

The $out Stage

Takes the documents returned by the aggregation pipeline and writes them to a specified collection.

The following aggregation operation pivots the data in the books collection in the test database to have titles grouped by authors and then writes the results to the authors collection, also in the test database.

db.getSiblingDB("test").books.aggregate( [
    { $group : { _id : "$author", books: { $push: "$title" } } },
    { $out : "authors" }
] )

$merge Overview

Writes the results of the aggregation pipeline to a specified collection. The $merge operator must be the last stage in the pipeline.

$merge Syntax

{ $merge: {
     into: <collection> -or- { db: <db>, coll: <collection> },
     on: <identifier field> -or- [ <identifier field1>, ...],  // Optional
     let: <variables>,                                         // Optional
     whenMatched: <replace|keepExisting|merge|fail|pipeline>,  // Optional
     whenNotMatched: <insert|discard|fail>                     // Optional
} }
db.getSiblingDB("zoo").salaries.aggregate( [
   { $group: { _id: { fiscal_year: "$fiscal_year", dept: "$dept" }, salaries: { $sum: "$salary" } } },
   { $merge : { into: { db: "reporting", coll: "budgets" }, on: "_id",  whenMatched: "replace", whenNotMatched: "insert" } }
] )

Problem:

Consider an Aggregation Pipeline using the new $merge stage that outputs to the employee_data collection.

If we are not expecting to find any matching documents in the employee_data collection, which of the following stages should we use?

{
  $merge: {
    into: "employee_data",
    whenNotMatched: "insert",
    whenMatched: "fail"
  }
}

If we are not expecting to find any matching documents in the employee_data collection, then finding a matching document should raise a red flag. One way to deal with this is to throw an error, which is denoted by whenMatched: “fail”.

Additionally, it would not make sense to replace or merge the documents in employee_data with the output documents, because we weren’t expecting to find any matches.

Using $merge for Single View

Using $merge for a Temporary Collection

Using $merge for Rollups

Rollups 汇总

Problem:

Consider a potential $merge stage that:

  • outputs results to a collection called analytics.
  • merges the results of the $merge stage with current analytics documents using the value of the name field.
  • updates existing analytics documents to include any modified information from the resulting $merge documents.
  • creates a new analytics document if an existing document with the resulting document’s name does not exist.

Which of the following $merge stages will perform all of the above functionality?

{
  $merge: {
      into: "analytics",
      on: "name",
      whenMatched: "merge",
      whenNotMatched: "insert"
  }
}

Views

  • horizontally slice a document: $match, rows reduced(number of documents)
  • vertically slice a document: $project, columns reduced(shape)

To remove some sensitive imformation such as accountType, address, for the call center employees.

// this is the command we used to create the bronze_banking view in the database
// identical commands were used to create the silver and gold views, the only
// change was in the $match stage
db.createView("bronze_banking", "customers", [
  {
    "$match": { "accountType": "bronze" }
  },
  {
    "$project": {
      "_id": 0,
      "name": {
        "$concat": [
          { "$cond": [{ "$eq": ["$gender", "female"] }, "Miss", "Mr."] },
          " ",
          "$name.first",
          " ",
          "$name.last"
        ]
      },
      "phone": 1,
      "email": 1,
      "address": 1,
      "account_ending": { "$substr": ["$accountNumber", 7, -1] }
    }
  }
])

// getting all collections in a database and seeing their information
db.getCollectionInfos()

// getting information on views only
db.system.views.find()
db.view.find()
db.view.findOne()
db.view.count()
db.view.distinct()
db.view.aggregate()

  • view definitions are public
  • avoid referring to sensitive fields within the pipeline that defines a view.

Summary

  • Views contain no data themselves. They are created on demand and reflect the data in the source collection
  • Views are read only. Write operations to views will error
  • Views have some restrictions. They must abide by the rules of the Aggregation Framework and cannot contain find() projection operators
  • Horizontal slicing is performed with the match stage, reducing the number of documents that are returned.
  • Vertical slicing is performed with a $proiect or other shaping stage, modifving individual documents.