Introduction to Data Modeling
- Good performance
- Maximizing the productivity of your developers
- Minimizing the overall costs of your solution
- Gather the requirements to create a “Data Model”
- Turn those requirements into a basic “Model”
- Powerful transformation patterns to optimize your “Data Model”
- Understand how to evolve your “Data Model” over time
Course Prerequisites
Here are some of the terms and references for your benefit:
MongoDB Concepts and Vocabulary
Relational Database Concepts and Vocabulary
- Table (Wikipedia Definition)
- Table (Textbook Definition)
- Entity Relationship Model (Wikipedia Definition)
- The Entity Relationship Data Model
- Crow’s Foot Notation and ERD
- Crow’s Foot Notation Definition
General Database Concepts and Definitions
- Database (Wikipedia Definition)
- Schema (Wikipedia Definition)
- Schema Short Definition
- Database Transactions (Wikipedia Definition)
- Database Transactions Short Description
- Throughput vs Latency
- NoSQL Databases
MongoDB Compass and Atlas
Data Modeling in MongoDB
MongoDB is not schemaless: it has a flexible schema. A schema is the structure of your data.
ERD and UML tooling.
- Usage pattern
- How you access your data
- Which queries are critical to your application
- Ratios between reads and writes
Document validation (enforces rules on the documents in a collection)
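A minimal sketch of what a validation rule can look like, written as a Python dict in the `$jsonSchema` form. The `users` collection and its field names are hypothetical and chosen only for illustration.

```python
# Hypothetical validator for a "users" collection (names are illustrative).
user_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        # Every document must contain these fields.
        "required": ["name", "email"],
        "properties": {
            "name": {"bsonType": "string"},
            "email": {"bsonType": "string"},
            # Optional field, but if present it must be a non-negative integer.
            "age": {"bsonType": "int", "minimum": 0},
        },
    }
}

# With pymongo installed and a server running, the rule could be attached
# when the collection is created (not executed here):
#   db.create_collection("users", validator=user_validator)
```

Fields that the validator does not mention are still allowed, so the schema stays flexible while the listed rules are enforced.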
To join collections in MongoDB, use $lookup.
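As a rough illustration, a `$lookup` stage is one step of an aggregation pipeline. The sketch below assumes two hypothetical collections, `products` and `reviews`, where each review stores the product's `_id` in a `product_id` field.

```python
# Aggregation pipeline with a single $lookup stage (collection and field
# names are assumptions made for this example).
lookup_pipeline = [
    {
        "$lookup": {
            "from": "reviews",             # collection to join with
            "localField": "_id",           # field in the products documents
            "foreignField": "product_id",  # matching field in reviews
            "as": "reviews",               # output array added to each product
        }
    }
]

# With pymongo and a running server, it would be run as (not executed here):
#   db.products.aggregate(lookup_pipeline)
```

Each product document in the result carries a `reviews` array holding the matching review documents.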
The Document Model in MongoDB
BSON is a binary representation of JSON documents, which is used to store data in MongoDB.
- MongoDB stores data as Documents
- Document fields can be values, embedded documents, or arrays of values and documents
- MongoDB is a Flexible Schema database
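The points above can be sketched with a single document, shown here as a Python dict. All field names and values are made up for illustration.

```python
import json

# One document mixing the field kinds listed above.
person = {
    "name": "Example User",                              # scalar value
    "age": 30,                                           # scalar value
    "address": {"city": "Springfield", "zip": "00000"},  # embedded document
    "skills": ["sql", "python"],                         # array of values
    "jobs": [                                            # array of documents
        {"title": "Analyst", "years": 2},
        {"title": "Engineer", "years": 5},
    ],
}

# The same structure round-trips through JSON text unchanged.
as_json = json.dumps(person)
restored = json.loads(as_json)
```

Because the schema is flexible, another document in the same collection could omit `skills` or add new fields without any change to the collection.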
Supported Datatypes in MongoDB
Constraints in Data Modeling
MongoDB does support transactions.
- Keep the frequently used Documents in RAM
- Keep the Indexes in RAM
- Prefer Solid State Drives to Hard Disk Drives
- Infrequently accessed data can use Hard Disk Drives
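A back-of-the-envelope check of the RAM guidance above could look like the following. It is only a sketch: the sizes are invented, and `cache_bytes` stands for whatever memory you assume is available for data and indexes on your deployment.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def working_set_bytes(hot_document_bytes: int, index_bytes: int) -> int:
    """Frequently used documents plus indexes, in bytes."""
    return hot_document_bytes + index_bytes


def fits_in_cache(hot_document_bytes: int, index_bytes: int, cache_bytes: int) -> bool:
    """True when the working set is no larger than the assumed cache size."""
    return working_set_bytes(hot_document_bytes, index_bytes) <= cache_bytes


# Hypothetical numbers: 20 GiB of hot documents and 6 GiB of indexes.
comfortable = fits_in_cache(20 * GIB, 6 * GIB, cache_bytes=32 * GIB)
# Same indexes, but 40 GiB of hot documents no longer fit.
too_large = fits_in_cache(40 * GIB, 6 * GIB, cache_bytes=32 * GIB)
```

When the check fails, the options are more RAM, a smaller working set (for example through a different model), or accepting more reads from disk.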
Recap:
- The nature of your dataset and hardware define the need to model your data
- It is important to identify those exact constraints and their impact to create a better model
- As your software and the technological landscape change, your model should be re-evaluated and updated accordingly
When working with MongoDB, security features, network performance, disk drive speed, and amount of RAM are all aspects you need to keep in mind. As for the operating system your deployment will be running on, MongoDB and other systems usually hide the differences from you.
The Data Modeling Methodology
Model for Simplicity or Performance
Modeling for Simplicity Diagram
Modeling for Performance Diagram
Modeling for a Mix of Simplicity and Performance Diagram
Summary of Modeling Approaches
Identifying the Workload
Case Study: IoT
- The organization has 100 million weather sensors
- It needs to:
  - collect the data from all devices
  - analyze the data trends with a team of 10 data scientists
- Quantify and Qualify the queries as much as you can
- Few CRUD operations will drive the design
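To show what quantifying a query can mean for this case study, here is a small arithmetic sketch. The sensor count comes from the notes above; the rate of one reading per sensor per minute is an assumption made for the example.

```python
SENSORS = 100_000_000               # from the case study
READINGS_PER_SENSOR_PER_MINUTE = 1  # assumption, not stated in the notes

# Sustained write rate if every sensor reports once per minute.
writes_per_second = SENSORS * READINGS_PER_SENSOR_PER_MINUTE / 60

# Number of readings written over one full day.
writes_per_day = SENSORS * READINGS_PER_SENSOR_PER_MINUTE * 60 * 24

print(f"{writes_per_second:,.0f} writes/second")
print(f"{writes_per_day:,} writes/day")
```

Under that assumption the insert of sensor readings dominates the workload by several orders of magnitude compared with the queries of 10 data scientists, which is why a few operations end up driving the design.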