Introduction
Engineers frequently move data between CSV files, JSON APIs, in-memory objects, and databases. Bugs happen when we treat all these representations as equivalent without understanding their structural differences. See Go Installation Guide, Go Ecosystem Overview, Go Best Practices for more context.
This article explains tables, matrices, and data types as modeling choices and shows how those choices map across Go, Python, JavaScript, and storage systems.
Table vs Matrix: Not the Same Thing
Matrix
A matrix is usually a dense 2D structure with uniform element type, commonly numeric.
Properties:
- Fixed conceptual shape (m x n).
- Same type in every cell.
- Optimized for math operations.
Table
A table is row/column data where columns may have different types.
Properties:
- Rows represent entities/records.
- Columns represent attributes.
- Column types differ (string, int, timestamp, etc.).
Tables are better for business data; matrices are better for numerical computation.
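The distinction can be made concrete with a small structural check. The sketch below is only an illustration: `is_matrix` is a hypothetical helper, not part of any library, and it assumes that "matrix" means a non-empty rectangular grid of real numbers.

```python
from numbers import Real


def is_matrix(rows):
    """Return True if rows is a non-empty rectangular grid of real numbers."""
    if not rows or not all(isinstance(r, (list, tuple)) for r in rows):
        return False
    width = len(rows[0])
    if width == 0:
        return False
    return all(
        len(r) == width
        # bool is a subclass of int, so exclude it explicitly
        and all(isinstance(x, Real) and not isinstance(x, bool) for x in r)
        for r in rows
    )


grid = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # uniform numeric cells, fixed shape
records = [["Alice", 30], ["Bob", 25]]     # mixed column types: a table, not a matrix
ragged = [[1.0, 2.0], [3.0]]               # rows of different length
```

Here `is_matrix(grid)` holds, while `records` and `ragged` both fail the check, for different reasons: one mixes types, the other has no fixed shape.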
Row-Oriented vs Column-Oriented Thinking
Row-oriented model
Best for transactional operations and record-level access.
Examples:
- JSON arrays of objects.
- Go slices of structs.
- SQL row inserts.
Column-oriented model
Best for analytics and aggregate scans.
Examples:
- DataFrame columns.
- Arrow columnar buffers.
- OLAP engines.
Choosing row vs column affects memory layout and query performance.
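A minimal sketch of the two layouts over the same data, using plain Python containers. `to_columns` is a hypothetical helper and assumes every row has the same keys as the first one.

```python
def to_columns(rows):
    """Pivot a list of dict records into a dict of column lists."""
    if not rows:
        return {}
    keys = list(rows[0])
    # Raises KeyError if a later row lacks one of the keys.
    return {k: [row[k] for row in rows] for k in keys}


rows = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
columns = to_columns(rows)

# Record-level access is natural in the row layout.
first_person = rows[0]

# An aggregate touches only one column in the column layout.
mean_age = sum(columns["age"]) / len(columns["age"])
```

The row layout keeps each record together, which suits lookups and updates; the column layout keeps each attribute together, which suits scans such as the mean above.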
Language Mapping Overview
| Concept | Go | Python | JavaScript | Database |
|---|---|---|---|---|
| Record (row) | struct | dict / dataclass | object | row/document |
| Table (records) | []struct | list[dict] / DataFrame | array of objects | table/collection |
| Matrix | [][]float64 | numpy.ndarray | nested arrays / typed arrays | numeric array columns |
Go Modeling Patterns
Row model with struct slice
type Person struct {
    Name string
    Age  int
}

// Each element is one record (row); the struct fields are the columns.
people := []Person{
    {Name: "Alice", Age: 30},
    {Name: "Bob", Age: 25},
}
Benefits:
- Compile-time type safety.
- Explicit schema.
- Safer refactoring.
Matrix model in Go
// A 2 x 3 matrix: every row has the same length and element type.
matrix := [][]float64{
    {1.0, 2.0, 3.0},
    {4.0, 5.0, 6.0},
}
Use this for numeric operations, not mixed business records.
Python Modeling Patterns
Table as list of dict
rows = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
]
Table as DataFrame
import pandas as pd
df = pd.DataFrame(rows)
A DataFrame is table-first and analytics-friendly, while a list of dicts is API-friendly.
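The mapping table also lists a dataclass as a Python record type. As a rough sketch, a `Person` dataclass (mirroring the Go struct, but an assumption on the Python side) gives a list of dicts an explicit schema while staying easy to convert back for APIs.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Person:
    name: str
    age: int


rows = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
]

# Unknown or missing keys raise TypeError here, which surfaces shape problems early.
people = [Person(**row) for row in rows]

# Convert back to plain dicts when an API-friendly form is needed.
round_trip = [asdict(p) for p in people]
```

Note that dataclasses do not check field types at runtime: `Person(name="Alice", age="30")` is accepted, so type validation still has to happen separately.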
Matrix in Python
import numpy as np
mat = np.array([[1, 2], [3, 4]], dtype=np.float64)
JavaScript and JSON Patterns
Common API payload format:
[
  { "name": "Alice", "age": 30 },
  { "name": "Bob", "age": 25 }
]
This is table-like row data. It is not an efficient matrix for heavy numeric operations.
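When numeric work is needed, one option is to pull the numeric fields out of the row payload into a separate matrix. The sketch below assumes a payload like the one above with an extra `score` field added purely for illustration; `numeric_matrix` is a hypothetical helper.

```python
import json

# Illustrative payload; the "score" field is an assumption, not part of the original example.
payload = """
[
  { "name": "Alice", "age": 30, "score": 91.5 },
  { "name": "Bob", "age": 25, "score": 78.0 }
]
"""


def numeric_matrix(records, fields):
    """Build a list-of-lists matrix of floats from the chosen numeric fields."""
    return [[float(rec[f]) for f in fields] for rec in records]


records = json.loads(payload)
matrix = numeric_matrix(records, ["age", "score"])
```

The records remain the source of truth for mixed business data; the matrix is a derived, uniformly typed view that can be handed to numeric code.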
Data Type Discipline and Schema Drift
Schema drift happens when fields change shape over time without explicit contracts.
Examples:
- age becomes a string in one source.
- Missing keys in some rows.
- Null handling differs by service.
Prevent drift with:
- JSON Schema or protobuf contracts.
- Validation at ingest time.
- Typed models in service boundaries.
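Validation at ingest time can be sketched with the standard library alone. `EXPECTED` and `validate_row` below are hypothetical names, and the schema is an assumption based on the `name`/`age` examples used in this article; a production system would more likely rely on JSON Schema or protobuf tooling.

```python
EXPECTED = {"name": str, "age": int}


def validate_row(row, schema=EXPECTED):
    """Return a list of problems; an empty list means the row conforms."""
    problems = []
    for field, expected_type in schema.items():
        if field not in row:
            problems.append(f"missing field: {field}")
        elif row[field] is None:
            problems.append(f"null field: {field}")
        elif not isinstance(row[field], expected_type) or isinstance(row[field], bool):
            # bool is a subclass of int, so reject it explicitly
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(row[field]).__name__}"
            )
    return problems


ok = validate_row({"name": "Alice", "age": 30})
drifted = validate_row({"name": "Bob", "age": "25"})
incomplete = validate_row({"name": "Carol"})
```

Returning a list of problems instead of raising on the first one lets the caller decide whether to reject the row, quarantine it, or log and continue.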
Normalization and Missing Data
Real datasets are messy. Normalize before downstream usage.
Checklist:
- Convert numeric strings.
- Standardize timestamps and timezones.
- Handle null/missing explicitly.
- Deduplicate by stable key.
Bad modeling decisions usually appear as data quality incidents later.
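The checklist can be sketched as a single normalization pass. The field names `id` and `updated_at`, and the rule that timestamps without an offset are treated as UTC, are assumptions made for this example only.

```python
from datetime import datetime, timezone


def normalize(rows, key="id"):
    """Normalize rows: dedupe by key, coerce age, convert timestamps to UTC."""
    seen = set()
    out = []
    for row in rows:
        if row.get(key) is None or row[key] in seen:
            continue  # drop rows without a key and later duplicates
        seen.add(row[key])

        age = row.get("age")
        if isinstance(age, str):
            # Convert numeric strings; anything else becomes an explicit None.
            age = int(age) if age.strip().isdigit() else None

        ts = row.get("updated_at")
        if ts is not None:
            parsed = datetime.fromisoformat(ts)
            if parsed.tzinfo is None:
                parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
            ts = parsed.astimezone(timezone.utc).isoformat()

        out.append({key: row[key], "age": age, "updated_at": ts})
    return out


raw = [
    {"id": 1, "age": "30", "updated_at": "2024-01-01T12:00:00+02:00"},
    {"id": 1, "age": 31, "updated_at": "2024-01-02T00:00:00"},
    {"id": 2, "age": None, "updated_at": "2024-01-01T09:30:00"},
    {"id": 3, "age": "n/a"},
]
clean = normalize(raw)
```

Every output row has the same three keys, so downstream code never has to distinguish between a missing key and a null value.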
Performance Perspective
Row model strengths
- Fast per-record read/write.
- Natural for APIs and CRUD.
Column model strengths
- Faster scans and aggregates on selected columns.
- Better compression in analytics workloads.
Matrix strengths
- Efficient vectorized numerical operations.
- Better cache behavior for linear algebra workloads.
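The matrix layout can be illustrated with a flat, row-major buffer of doubles from the standard `array` module. This is only a sketch of the storage idea: `matvec` is a hypothetical helper, and the pure-Python loop is not itself fast; libraries such as NumPy apply vectorized routines to the same kind of contiguous layout.

```python
from array import array


def matvec(data, n_rows, n_cols, vec):
    """Multiply a row-major flat matrix by a vector."""
    if len(data) != n_rows * n_cols or len(vec) != n_cols:
        raise ValueError("shape mismatch")
    out = array("d", [0.0]) * n_rows
    for i in range(n_rows):
        base = i * n_cols  # offset of row i in the flat buffer
        out[i] = sum(data[base + j] * vec[j] for j in range(n_cols))
    return out


# A 2 x 3 matrix stored contiguously, one row after another.
flat = array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
result = matvec(flat, 2, 3, array("d", [1.0, 0.0, -1.0]))
```

Because every element has the same type and size, the position of any cell is plain arithmetic on the row and column indices, which is what makes this layout friendly to numeric kernels.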
Common Modeling Mistakes
- Using matrix structures for heterogeneous business records.
- Using dynamic objects everywhere without schema checks.
- Ignoring type conversion cost during ETL.
- Treating JSON arrays as inherently scalable analytics storage.
Practical Design Rules
- For business entities: row model with schema.
- For analytics: column model.
- For scientific compute: matrix model.
- At boundaries: validate and normalize aggressively.
Conclusion
Tables and matrices represent different data intentions. Modeling them correctly across languages improves correctness, performance, and maintainability.
Choose representation based on workload semantics, not convenience alone.