Introduction
Good documentation is crucial for software maintenance and team collaboration. Architecture Decision Records (ADRs) and RFCs help teams make and communicate technical decisions. This guide covers documenting architecture effectively, from ADR templates to RFC workflows, C4 modeling, and documentation-as-code automation.
Documentation is a love letter to your future self. Future developers—including you six months from now—will thank you for clear, contextual records of why decisions were made.
Architecture Decision Records
What is an ADR?
An Architecture Decision Record captures a specific architectural decision, its context, and its consequences. ADRs are lightweight documents stored in version control alongside the code they describe.
ADR Template
# ADR-001: Use PostgreSQL as Primary Database
## Status
Accepted
## Context
We need to choose a primary database for our new e-commerce platform.
The system requires:
- ACID compliance for transactions
- Complex queries for reporting
- JSON support for flexible schemas
- Horizontal scaling capability
## Decision
We will use PostgreSQL 16 as our primary database.
## Rationale
PostgreSQL was chosen over alternatives because:
1. ACID compliance is required for financial transactions
2. JSONB supports flexible product schemas without a separate document store
3. Native partitioning supports future horizontal scaling
4. Mature ecosystem with strong community support
5. Team has existing PostgreSQL experience
## Consequences
### Positive
- ACID compliance out of the box
- Excellent JSON support via JSONB
- Rich ecosystem of tools and extensions
- Strong community support
### Negative
- Horizontal scaling more complex than NoSQL solutions
- Requires more DBA expertise for performance tuning
- Connection management requires pool configuration
## Alternatives Considered
1. **MySQL**: Less feature-rich, weaker JSON support than PostgreSQL's JSONB
2. **MongoDB**: Multi-document ACID transactions only available since 4.0 and with added overhead, limited join capability
3. **CockroachDB**: Newer, less mature ecosystem, higher operational complexity
4. **PlanetScale (MySQL-compatible)**: Serverless, but lacks PostgreSQL features
## Notes
- Decision made after proof-of-concept with PostgreSQL and MySQL
- POC results available in /docs/poc/database-comparison.md
- Review decision annually as PostgreSQL version evolves
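A template is easier to enforce when a script can check it. The following Python sketch looks for the level-2 headings used in the sample ADR above. Treating `Status`, `Context`, `Decision`, and `Consequences` as mandatory is an assumption made for illustration, and the function name `missing_sections` is hypothetical.

```python
import re

# Section headings taken from the sample ADR above. Treating these four as
# mandatory is an assumption of this sketch, not a rule from the template.
REQUIRED_SECTIONS = ("Status", "Context", "Decision", "Consequences")


def missing_sections(adr_text: str) -> list[str]:
    """Return the required level-2 headings that do not appear in the ADR."""
    found = {
        match.group(1).strip()
        for match in re.finditer(r"^## +(.+)$", adr_text, re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name not in found]


sample = """# ADR-001: Use PostgreSQL as Primary Database
## Status
Accepted
## Context
We need to choose a primary database.
## Decision
We will use PostgreSQL 16.
"""

print(missing_sections(sample))  # ['Consequences']
```

A check like this could run in a pre-commit hook or CI job so that incomplete ADRs are caught during review.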
ADR Lifecycle States
| State | Description | Typical Duration |
|---|---|---|
| Proposed | Initial draft, under review | 1-5 days |
| Accepted | Decision approved and implemented | Indefinite |
| Deprecated | No longer recommended, with no direct replacement | N/A |
| Superseded | Replaced by a newer ADR | N/A |
| Amended | Modified by a follow-up ADR | N/A |
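The lifecycle states can be modeled as a small state machine so that tooling rejects invalid status changes. The sketch below is in Python. The set of allowed transitions is an assumption made for illustration, since the table lists the states but does not prescribe how an ADR moves between them.

```python
from enum import Enum


class Status(Enum):
    PROPOSED = "Proposed"
    ACCEPTED = "Accepted"
    DEPRECATED = "Deprecated"
    SUPERSEDED = "Superseded"
    AMENDED = "Amended"


# Allowed moves between states. This mapping is an illustrative assumption;
# adjust it to match your team's process.
ALLOWED = {
    Status.PROPOSED: {Status.ACCEPTED},
    Status.ACCEPTED: {Status.AMENDED, Status.DEPRECATED, Status.SUPERSEDED},
    Status.AMENDED: {Status.DEPRECATED, Status.SUPERSEDED},
    Status.DEPRECATED: set(),   # terminal state
    Status.SUPERSEDED: set(),   # terminal state
}


def can_transition(current: Status, new: Status) -> bool:
    """Return True if an ADR may move from `current` to `new`."""
    return new in ALLOWED[current]


print(can_transition(Status.PROPOSED, Status.ACCEPTED))    # True
print(can_transition(Status.SUPERSEDED, Status.ACCEPTED))  # False
```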
ADR Index
# Architecture Decision Records Index
## Current Decisions
| ADR | Title | Status | Date |
|-----|-------|--------|------|
| [001](adr-001-database.md) | Use PostgreSQL as Primary Database | Accepted | 2026-01-15 |
| [002](adr-002-messaging.md) | Use Kafka for Event Streaming | Accepted | 2026-02-01 |
| [003](adr-003-auth.md) | Use OAuth 2.0 / OIDC for Authentication | Accepted | 2026-03-10 |
| [004](adr-004-api.md) | Use GraphQL for User-Facing API | Superseded | 2026-01-20 |
| [005](adr-005-deployment.md) | Use Kubernetes for Container Orchestration | Accepted | 2026-04-05 |
| [006](adr-006-cache.md) | Use Redis for Session Storage and Caching | Accepted | 2026-04-12 |
## Superseded Decisions
| ADR | Title | Superseded By | Date |
|-----|-------|---------------|------|
| [004](adr-004-api.md) | Use GraphQL for User-Facing API | ADR-007 | 2026-03-15 |
## Categories
| Category | ADRs |
|---|---|
| Database | 001 |
| Messaging | 002 |
| Authentication | 003 |
| API Design | 004, 007 |
| Infrastructure | 005 |
| Caching | 006 |
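An index like this is easy to let drift, so it is worth generating from the ADR files themselves. The following Python sketch assumes each file is named `adr-*.md`, starts with a `# ADR-NNN: Title` line, and has a `## Status` section as in the sample ADR. The function names are hypothetical, and the Date column is omitted because the sample template does not record a date.

```python
import re
from pathlib import Path

# Matches the title line used in the sample ADR, e.g. "# ADR-001: Use PostgreSQL..."
HEADER = re.compile(r"^# ADR-(\d+): (.+)$")


def parse_adr(path: Path) -> tuple[str, str, str]:
    """Extract (number, title, status) from one ADR file."""
    lines = path.read_text(encoding="utf-8").splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        raise ValueError(f"{path}: first line is not an ADR header")
    status = "Unknown"
    for i, line in enumerate(lines):
        if line.strip() == "## Status":
            # The status is the first non-empty line after the heading.
            status = next(
                (text.strip() for text in lines[i + 1:] if text.strip()),
                "Unknown",
            )
            break
    return match.group(1), match.group(2).strip(), status


def build_index(adr_dir: Path) -> str:
    """Build a Markdown index table for every adr-*.md file in adr_dir."""
    rows = ["| ADR | Title | Status |", "|-----|-------|--------|"]
    for path in sorted(adr_dir.glob("adr-*.md")):
        number, title, status = parse_adr(path)
        rows.append(f"| [{number}]({path.name}) | {title} | {status} |")
    return "\n".join(rows)
```

Calling `build_index(Path("adr"))` returns the table as a string, which a CI job could write to the index file or compare against the committed version.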
Multiple ADR Templates
# Template 2: Problem-Oriented ADR
## Problem
[What problem are we solving?]
## Constraints
[What constraints exist? Budget, time, team expertise, compliance]
## Options
### Option A: [Name]
- Cost: [Development, operational, migration]
- Complexity: [Low/Medium/High]
- Risk: [What could go wrong?]
- Pros: [List]
- Cons: [List]
### Option B: [Name]
...
## Recommendation
[Which option is chosen and why]
---
# Template 3: Lightweight ADR (for small decisions)
## Decision
[One-line description]
## Why
[Brief rationale, 2-3 sentences]
## Trade-offs
[List of accepted trade-offs]
ADR Tooling
| Tool | Storage | Collaboration | Review Workflow |
|---|---|---|---|
| Markdown in Git | Code repository | PR-based | Yes (via PR review) |
| ADR Tools (npm) | Markdown + CLI | Git-based | Manual |
| Log4brains | Markdown + Web UI | Git-based | PR + visual explorer |
| Architecture Center | Database + Web UI | Collaborative | Built-in workflow |
| GitHub Issues | GitHub Issues | Thread-based | Issue comments |
RFC Process
What is an RFC?
A Request for Comments is a structured proposal for a significant technical change. RFCs invite discussion before a decision is made, capturing the design process and alternatives considered.
RFC Template
# RFC-015: Implement Feature Flags System
## Summary
Implement a feature flag system to enable gradual rollouts and A/B testing.
## Motivation
Currently, all code deployments are risky because they affect all users
immediately. We need the ability to:
- Roll out features gradually (canary releases)
- A/B test features with statistical significance
- Kill switch problematic features instantly
- Enable features for internal teams before public release
## Detailed Design
### Components
1. **Flag Service**: Stores feature flag configuration with real-time updates
2. **Client SDKs**: Libraries for accessing flags (JavaScript, Python, Go)
3. **Management UI**: Dashboard for managing flags with audit logging
### Data Model
```json
{
  "flag": "new-checkout",
  "enabled": true,
  "rollout": 10,
  "targeting": {
    "users": ["user-1", "user-2"],
    "segments": ["beta-testers"],
    "percentage": 10
  },
  "metadata": {
    "owner": "checkout-team",
    "created": "2026-05-01",
    "expires": "2026-07-01"
  }
}
```
### API Endpoints
- `GET /flags` — List all flags
- `POST /flags` — Create flag
- `PUT /flags/:name` — Update flag configuration
- `DELETE /flags/:name` — Delete flag (soft delete with audit)
- `GET /flags/:name/evaluate` — Evaluate flag for current context
### SDK Usage
```typescript
import { FeatureFlagClient } from '@company/feature-flags';

const client = new FeatureFlagClient({
  apiKey: process.env.FLAG_API_KEY,
  refreshInterval: 30000, // Poll every 30s
});

if (client.isEnabled('new-checkout', { user: currentUser })) {
  renderNewCheckout();
} else {
  renderLegacyCheckout();
}
```
## Risks
- Feature flag technical debt if flags are not cleaned up after rollout
- Performance overhead from flag evaluation on every request
- Cognitive load on team to manage flag lifecycles
## Mitigations
- Mandatory expiry dates on all flags
- Automated flag cleanup for expired flags
- Monitoring dashboard for active flag count
- Code review requirement for permanent flags
## Alternatives
- **LaunchDarkly**: External SaaS, feature-rich but costly at scale
- **Unleash**: Open-source self-hosted, good for mid-scale
- **Build custom**: Full control, but ongoing maintenance cost
## Implementation Plan
### Phase 1 (Week 1-2)
- Core flag service with file-based configuration
- Basic evaluation API
- JavaScript SDK
### Phase 2 (Week 3-4)
- Management UI
- Advanced targeting (percentage, segments)
- Audit logging
### Phase 3 (Week 5-6)
- A/B testing integration
- Analytics pipeline
- Automated flag cleanup
## Open Questions
- Should we use polling or WebSockets for flag updates?
- What is the maximum acceptable latency for flag evaluation?
- How do we handle flag evaluation during service degradation?
## Comments
> **Question**: Should we use percentage-based rollout or user-ID based?
> **Answer**: Both. Percentage for initial rollout, user-ID for targeted testing.
>
> **Question**: What about caching?
> **Answer**: Flags should be cached locally with 30s TTL. Cache invalidation via polling.
## Decision
[To be filled after review]
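The targeting rules in the sample RFC's data model can be turned into an evaluation function. The Python sketch below is one possible reading. It assumes explicit user and segment matches take precedence over the percentage rollout, and it hashes the flag name together with the user ID so that a user lands in the same bucket on every call. The names `bucket` and `is_enabled` are hypothetical and are not part of the SDK described in the RFC.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(config: dict, user_id: str,
               segments: frozenset[str] = frozenset()) -> bool:
    """Evaluate a flag config shaped like the RFC's data model.

    Assumed precedence: kill switch, then explicit users, then segments,
    then the percentage rollout.
    """
    if not config.get("enabled", False):
        return False
    targeting = config.get("targeting", {})
    if user_id in targeting.get("users", []):
        return True
    if segments & set(targeting.get("segments", [])):
        return True
    return bucket(config["flag"], user_id) < targeting.get("percentage", 0)
```

Hashing instead of random sampling keeps a user's experience stable across requests, which matters for the A/B testing goal stated in the RFC's motivation.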
RFC Workflow
RFC Lifecycle:
┌────────────────────────────────────────────────────────────────────┐
│                                                                    │
│  Draft ──→ Review ──→ Discussion ──→ Decision ──→ Implementation   │
│    ↑                                    │                          │
│    └────────────────────────────────────┘                          │
│              Revise & Resubmit                                     │
│                                                                    │
│  States:                                                           │
│  ┌───────┐   ┌────────┐   ┌───────┐   ┌───────────────────┐        │
│  │ Draft │ → │ Review │ → │ Final │ → │ Accepted/Rejected │        │
│  └───────┘   └────────┘   └───────┘   └───────────────────┘        │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘
| Stage | Duration | Participants | Artifacts |
|---|---|---|---|
| Draft | 1-3 days | Author | Initial RFC document |
| Review | 3-7 days | Engineering team | Comments, questions |
| Discussion | 1-3 days | Stakeholders | Meeting notes, decisions |
| Final Comment | 2 days | All | Final feedback window |
| Decision | 1 day | Tech lead/Architect | Accepted or Rejected with rationale |
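Adding up the stage durations gives a rough end-to-end estimate for one RFC. The Python sketch below copies the ranges from the table and assumes the stages run strictly in sequence with no revise-and-resubmit loop, which is a simplification.

```python
# (stage, min_days, max_days), copied from the table above.
STAGES = [
    ("Draft", 1, 3),
    ("Review", 3, 7),
    ("Discussion", 1, 3),
    ("Final Comment", 2, 2),
    ("Decision", 1, 1),
]


def total_duration(stages: list[tuple[str, int, int]]) -> tuple[int, int]:
    """Return the (shortest, longest) total duration in days."""
    shortest = sum(low for _, low, _ in stages)
    longest = sum(high for _, _, high in stages)
    return shortest, longest


print(total_duration(STAGES))  # (8, 16)
```

Under these assumptions an RFC takes between 8 and 16 days from first draft to decision, which is a useful number to state up front when asking for reviewers' time.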
RFC Review Checklist
checklist:
  problem_definition:
    - "Is the problem clearly stated?"
    - "Are success criteria defined?"
    - "Is the scope well-defined?"
  solution:
    - "Is the proposed solution technically sound?"
    - "Are alternatives documented?"
    - "Are trade-offs acknowledged?"
  risks:
    - "Are security concerns addressed?"
    - "Is scalability considered?"
    - "Are failure modes documented?"
    - "Is there a rollback plan?"
  implementation:
    - "Is the implementation plan realistic?"
    - "Are dependencies identified?"
    - "Is the timeline reasonable?"
    - "Are testing strategies defined?"
Technical Writing
Writing Style Guide
# Clear vs unclear examples
# ❌ Unclear — vague, passive, no specifics
"The system processes data in a timely manner."
# ✅ Clear — specific, measurable
"The system processes data within 100ms at p95 latency."
# ❌ Unclear — no definition of "efficiently"
"This should be done efficiently."
# ✅ Clear — specific target
"This should complete within 5 seconds for datasets under 1GB."
# ❌ Passive voice — unclear who is responsible
"The configuration should be updated."
# ✅ Active voice — clear responsibility
"Operators must update the configuration file before deployment."
# ❌ Jargon without context — assumes reader knowledge
"Use eventual consistency for the CQRS read model."
# ✅ With context — explains the trade-off
"Use eventual consistency for the read model (CQRS) to improve write
throughput at the cost of temporarily stale reads. This is acceptable
because dashboard data does not require real-time accuracy."
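Some of these problems can be caught mechanically. The Python sketch below flags vague qualifiers such as "timely" and "efficiently" so that an author can replace them with a measurable target. The word list is illustrative only and the function name is hypothetical; a real style guide would maintain its own list.

```python
import re

# Illustrative word list only; extend it to match your team's style guide.
VAGUE_TERMS = ("timely", "efficiently", "fast", "soon", "scalable", "robust")
PATTERN = re.compile(r"\b(" + "|".join(VAGUE_TERMS) + r")\b", re.IGNORECASE)


def find_vague_terms(text: str) -> list[tuple[int, str]]:
    """Return (line_number, term) for each vague term, with 1-indexed lines."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((number, match.group(1).lower()))
    return hits


doc = (
    "The system processes data in a timely manner.\n"
    "This should be done efficiently."
)
print(find_vague_terms(doc))  # [(1, 'timely'), (2, 'efficiently')]
```

A linter like this cannot judge whether a sentence is clear, but it gives reviewers a cheap first pass before the editorial review.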
Document Structure
# Document Title
## Status
[Review, Approved, Deprecated]
## Summary
Brief overview (1-3 sentences)
## Background
Why this document exists, what problem it addresses
## Current State
How things work today
## Proposed Change
What will change
## Detailed Design
Technical specifics, data models, API contracts
## Migration Plan
How to transition from current to proposed state
## Rollback Plan
How to undo if things go wrong
## Testing Strategy
How to validate the change
## Security Considerations
Auth, data privacy, compliance
## Open Questions
Decisions still to be made
## References
Related documents, ADRs, RFCs
Documentation Tone Matrix
| Audience | Tone | Example |
|---|---|---|
| Junior developers | Explanatory, includes background | “An API gateway sits between clients and services…” |
| Senior developers | Concise, references patterns | “Use a BFF pattern to aggregate data per client type.” |
| Operations | Procedural, precise | “Run kubectl apply -f deployment.yaml against the us-east-1 cluster.” |
| Product managers | Outcome-focused, minimal tech | “This change reduces checkout time by 40%.” |
| External contributors | Guarded, explicit conventions | “All PRs must include tests and follow CONTRIBUTING.md.” |
C4 Model for Architecture Diagrams
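The C4 model describes a software architecture as a hierarchy of diagrams at four levels of detail: system context, containers, components, and code. The first three levels are illustrated here.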
Level 1: System Context
[User] ──→ [E-Commerce System] ──→ [Payment Gateway]
                    ↓
             [Email Service]
Level 2: Containers
┌──────────────────────────────────────────────────────────┐
│                    E-Commerce System                     │
│                                                          │
│  ┌────────────┐   ┌────────────┐   ┌─────────────┐       │
│  │  Web App   │   │ API Server │   │ Admin Panel │       │
│  │  (React)   │   │    (Go)    │   │   (React)   │       │
│  └─────┬──────┘   └─────┬──────┘   └──────┬──────┘       │
│        │                │                 │              │
│        └────────────────┼─────────────────┘              │
│                         │                                │
│             ┌───────────▼────────────┐                   │
│             │       PostgreSQL       │                   │
│             │ (Primary + 2 replicas) │                   │
│             └────────────────────────┘                   │
│             ┌────────────────────────┐                   │
│             │         Redis          │                   │
│             │   (Cache + Session)    │                   │
│             └────────────────────────┘                   │
└──────────────────────────────────────────────────────────┘
Level 3: Components
┌────────────────────────────────────────────┐
│              API Server (Go)               │
│                                            │
│  ┌────────────┐       ┌─────────────────┐  │
│  │   Router   │       │ Auth Middleware │  │
│  └─────┬──────┘       └────────┬────────┘  │
│        │                       │           │
│  ┌─────▼───────────────────────▼────────┐  │
│  │               Handlers               │  │
│  │  ┌────────┐  ┌────────┐  ┌────────┐  │  │
│  │  │ Users  │  │ Orders │  │Products│  │  │
│  │  └───┬────┘  └───┬────┘  └───┬────┘  │  │
│  └──────┼───────────┼───────────┼───────┘  │
│         │           │           │          │
│  ┌──────▼───────────▼───────────▼───────┐  │
│  │             Repositories             │  │
│  │  ┌────────┐  ┌────────┐  ┌────────┐  │  │
│  │  │ Users  │  │ Orders │  │Products│  │  │
│  │  └───┬────┘  └───┬────┘  └───┬────┘  │  │
│  └──────┼───────────┼───────────┼───────┘  │
│         │           │           │          │
│  ┌──────▼───────────▼───────────▼───────┐  │
│  │       Database Connection Pool       │  │
│  └──────────────────────────────────────┘  │
└────────────────────────────────────────────┘
C4 Tooling
| Tool | Format | Collaboration | Diagram as Code |
|---|---|---|---|
| Structurizr | DSL | Web UI + Workspaces | Yes |
| PlantUML | Text | Git-based | Yes |
| Mermaid.js | Text (Markdown) | Git + PR review | Yes |
| Diagrams.net | XML/DrawIO | Shared files | No |
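With a diagram-as-code tool, the container diagram lives in the repository as text and changes go through the same review as code. The Python sketch below emits Mermaid flowchart source for the Level 2 containers shown earlier. The identifiers (`web`, `api`, `db`, and so on) and the helper name `to_mermaid` are made up for illustration, and the edges mirror the lines drawn in that diagram.

```python
# Container labels from the Level 2 diagram; the short keys are hypothetical.
CONTAINERS = {
    "web": "Web App (React)",
    "api": "API Server (Go)",
    "admin": "Admin Panel (React)",
    "db": "PostgreSQL",
    "cache": "Redis",
}

# Edges as drawn in the Level 2 diagram: each application connects to PostgreSQL.
RELATIONSHIPS = [("web", "db"), ("api", "db"), ("admin", "db")]


def to_mermaid(containers: dict[str, str],
               relationships: list[tuple[str, str]]) -> str:
    """Render containers and relationships as Mermaid flowchart source."""
    lines = ["flowchart TD"]
    for key, label in containers.items():
        # Quote the label so parentheses are treated as text, not node shapes.
        lines.append(f'    {key}["{label}"]')
    for source, target in relationships:
        lines.append(f"    {source} --> {target}")
    return "\n".join(lines)


print(to_mermaid(CONTAINERS, RELATIONSHIPS))
```

The printed text can be pasted into a Markdown file inside a `mermaid` code fence on platforms that render Mermaid.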
Documentation as Code
CI/CD for Documentation
# .github/workflows/docs.yml
name: Documentation Validation
on:
  pull_request:
    paths:
      - 'docs/**'
      - 'adr/**'
      - '**/*.md'
jobs:
  validate-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Check ADR format
        run: |
          status=0
          for adr in adr/*.md; do
            # Verify that the first line is an ADR title header
            if ! head -1 "$adr" | grep -q "^# ADR-"; then
              echo "Missing ADR header in $adr"
              status=1
            fi
          done
          # Fail the job if any ADR is malformed
          exit "$status"
      - name: Validate markdown links
        uses: gaurav-nelson/github-action-markdown-link-check@v1
        with:
          use-quiet-mode: 'yes'
          config-file: '.mlc-config.json'
      - name: Check for broken internal references
        run: |
          python scripts/validate_adr_references.py
      - name: Generate ADR index
        run: |
          python scripts/generate_adr_index.py
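The workflow calls `scripts/validate_adr_references.py` without showing it. The Python sketch below is one guess at what such a script might contain, not the actual file. It assumes ADRs link to each other with relative Markdown links of the form `(adr-NNN-name.md)`, as in the ADR index, and the function name `broken_references` is hypothetical.

```python
import re
from pathlib import Path

# Captures the file name of a relative ADR link, ignoring any #anchor suffix.
LINK = re.compile(r"\]\((adr-[^)#\s]+\.md)(?:#[^)]*)?\)")


def broken_references(adr_dir: Path) -> list[tuple[str, str]]:
    """Return (source_file, missing_target) pairs for relative ADR links."""
    existing = {path.name for path in adr_dir.glob("*.md")}
    broken = []
    for path in sorted(adr_dir.glob("*.md")):
        for target in LINK.findall(path.read_text(encoding="utf-8")):
            if target not in existing:
                broken.append((path.name, target))
    return broken
```

In CI, a wrapper would call `broken_references(Path("adr"))`, print each pair, and exit with a non-zero status when the list is not empty so that the pull request check fails.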
OpenAPI as Documentation
OpenAPI specifications serve as living API documentation.
openapi: 3.1.0
info:
  title: E-Commerce API
  version: 2.0.0
  description: |
    API for the e-commerce platform.
    See [ADR-001](../adr/adr-001-database.md) for database decisions.
    See [RFC-015](../rfc/rfc-015-feature-flags.md) for feature flag design.
servers:
  - url: https://api.example.com/v2
    description: Production
paths:
  /orders:
    get:
      summary: List orders
      parameters:
        - name: status
          in: query
          schema:
            type: string
            enum: [pending, completed, cancelled]
      responses:
        '200':
          description: List of orders
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Order'
components:
  schemas:
    Order:
      type: object
      required: [id, userId, total, status]
      properties:
        id:
          type: string
          format: uuid
        userId:
          type: string
        total:
          type: number
          minimum: 0
        status:
          type: string
          enum: [pending, completed, cancelled]
Documentation Reviews
Review Types
| Type | Focus | Participants | Duration |
|---|---|---|---|
| Technical review | Correctness, feasibility | Engineers, architects | 1 hour |
| Editorial review | Clarity, grammar, structure | Tech writer | 30 min |
| Security review | Security implications | Security team | 1 hour |
| Stakeholder review | Business alignment | PM, stakeholders | 30 min |
| User review | Understandability | Target audience | 15 min |
Best Practices
- Keep ADRs small: One decision per ADR. A single ADR should fit in a 15-minute read.
- Write them early: Document decisions when made, not months later when context is lost.
- Include context: Explain the why, not just the what. Context is the most valuable part.
- Review regularly: Update or deprecate stale decisions during architecture reviews.
- Use templates: Consistency helps readability and ensures nothing is forgotten.
- Store with code: ADRs and RFCs belong in the repository alongside the implementation.
- Link documents: Cross-reference ADRs, RFCs, and code for traceability.
- Prefer diagrams: A good diagram communicates more than paragraphs of text.
- Write for your audience: Different readers need different levels of detail.
- Make it findable: Use indexes, tags, and consistent naming conventions.
Conclusion
Good architecture documentation helps teams make better decisions and onboard new members faster. ADRs and RFCs are essential tools for capturing and communicating technical choices. Combined with the C4 model, diagram-as-code tools such as Structurizr, PlantUML, and Mermaid, and documentation-as-code automation, they let teams maintain living documentation that evolves with the system.
The key is consistency: document decisions when they are made, use templates to ensure completeness, and store documentation alongside code so it stays connected to the implementation.
Resources
- ADR GitHub Organization — ADR tools and templates
- C4 Model — Simon Brown’s C4 model for software architecture
- Structurizr — Diagram as code for C4 models
- Mermaid.js Documentation — Markdown-native diagramming
- RFC 2119: Key Words — MUST/SHOULD/MAY conventions
- Documenting Architecture Decisions — Original ADR article
- OpenAPI Specification — API documentation standard
- Write the Docs — Technical writing best practices
- Google Documentation Style Guide — Technical writing reference
- PlantUML — UML diagram as code