ADR-002: Markdown Model
Status
Accepted
Context
A note-taking app must decide how to represent notes internally. The key question:
What is the source of truth for a note's content?
Options:
- Markdown text - Store raw markdown, parse on demand
- AST (Abstract Syntax Tree) - Store parsed structure
- Hybrid - Store both, keep them in sync
This decision affects:
- Data portability
- Export fidelity
- Parse error handling
- Feature implementation complexity
Decision
Markdown text is the canonical source of truth.
Preservation Invariants
| Invariant | Rule |
|---|---|
| Raw text is sacred | User-typed markdown is NEVER auto-modified |
| AST is ephemeral | Exists only for parsing, never persisted as authority |
| Parse errors don't block | If parse fails, save raw markdown anyway |
| No normalization | Never reformat whitespace, headings, lists |
| No serialization from AST | Never reconstruct markdown from parsed AST |
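The invariants above can be illustrated with a minimal save path. This is a sketch under stated assumptions, not the project's implementation: `parseMarkdown`, `StoredNote`, `store`, and `saveNote` are hypothetical names, and the parser is a stand-in that fails on an unclosed `[[` purely to simulate a parse error.

```typescript
// Hypothetical parser stand-in: throws on an unclosed [[wikilink to simulate a parse failure.
function parseMarkdown(content: string): { linkCount: number } {
  const opens = content.split("[[").length - 1;
  const closes = content.split("]]").length - 1;
  if (opens !== closes) {
    throw new Error("unbalanced wikilink brackets");
  }
  return { linkCount: opens };
}

interface StoredNote {
  content: string;           // raw markdown, stored exactly as typed
  parseError: string | null; // kept for diagnostics only, never used to reject a save
}

// In-memory map standing in for real persistence (an assumption of this sketch).
const store = new Map<string, StoredNote>();

function saveNote(id: string, content: string): StoredNote {
  let parseError: string | null = null;
  try {
    parseMarkdown(content); // result is ephemeral; only success or failure matters here
  } catch (err) {
    parseError = err instanceof Error ? err.message : String(err);
  }
  // No trimming, no normalization, no rebuild from the parse result.
  const stored: StoredNote = { content, parseError };
  store.set(id, stored);
  return stored;
}
```

Saving a note with irregular spacing and a broken link should leave `content` identical to the input while recording the parse error alongside it.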
Implementation
```typescript
interface Note {
  id: NoteId;
  content: string; // RAW MARKDOWN - canonical source
  metadata: NoteMetadata; // Derived from content, computed on save
}

interface NoteMetadata {
  title: string; // Extracted from first H1 or filename
  createdAt: Timestamp;
  updatedAt: Timestamp;
  tags: readonly Tag[]; // Extracted from #tags in content
  wordCount: number;
}
```
Derived Data
These are computed from markdown on save, stored for queries:
- Title (first H1 or "Untitled")
- Tags (from #hashtags in content)
- Word count
- Backlinks (from [[wikilinks]])
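As a rough illustration of how these values might be derived, the sketch below uses plain regular expressions rather than a full markdown parser. `DerivedData` and `deriveData` are hypothetical names, and the matching rules (ATX-style `# ` headings only, tags made of letters, digits, `_` and `-`, whitespace-separated word count, no wikilink alias handling) are assumptions, not decisions recorded in this ADR.

```typescript
interface DerivedData {
  title: string;
  tags: readonly string[];
  wordCount: number;
  backlinks: readonly string[];
}

// Collect the first capture group of every match, without duplicates.
function collect(re: RegExp, text: string): string[] {
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    const value = m[1].trim();
    if (found.indexOf(value) === -1) found.push(value);
  }
  return found;
}

function deriveData(content: string): DerivedData {
  // First ATX-style H1 ("# Heading"); falls back to "Untitled".
  const h1 = content.match(/^# +(.+?)\s*$/m);
  const title = h1 ? h1[1] : "Untitled";

  // "#tag" preceded by start of text or whitespace. "# Heading" does not match
  // because the "#" is followed by a space rather than a tag character.
  const tags = collect(/(?:^|\s)#([A-Za-z0-9_-]+)/g, content);

  // [[wikilink]] targets, taken verbatim from between the brackets.
  const backlinks = collect(/\[\[([^\[\]]+)\]\]/g, content);

  // Naive count: whitespace-separated tokens, markdown punctuation included.
  const wordCount = content.split(/\s+/).filter((w) => w.length > 0).length;

  return { title, tags, wordCount, backlinks };
}
```

The function only reads `content` and never writes back to it, so recomputing on every save leaves the raw markdown untouched.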
Golden Rule
"If the user typed it, we keep it."
In practice, this means:
- No "prettify markdown" feature
- No auto-fix for broken links
- No whitespace normalization
- Export = exact copy of stored markdown
Consequences
Positive
- Portable: Export is always valid, standalone markdown
- Predictable: User knows exactly what's stored
- Recoverable: Parse errors can't corrupt data
- Simple: One source of truth, no sync issues
- Compatible: Works with any markdown ecosystem
Negative
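- Performance: Any feature that needs document structure must parse the markdown on read
- Consistency: Stored derived data can drift from the content if it is not recomputed on every save
- Limited features: No AST-based transformations that write back to the note, since that would require serializing from the AST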
Risks
- Performance with large notes (>100KB)
  - Mitigation: Cache parsed results, lazy parsing
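One way the mitigation could look is a per-note cache that parses only when a feature first asks for structure and reuses the result until the content changes. This is a sketch under assumptions: `ParsedNote`, `parseMarkdown`, `ParseCache`, and the `parseCalls` counter are hypothetical, and the parser is a toy that only extracts headings.

```typescript
// Hypothetical parse result and parser, for illustration only.
interface ParsedNote {
  headings: string[];
}

let parseCalls = 0; // instrumentation so the effect of caching is observable

function parseMarkdown(content: string): ParsedNote {
  parseCalls += 1;
  const headings = content
    .split("\n")
    .filter((line) => /^#{1,6} /.test(line))
    .map((line) => line.replace(/^#{1,6} +/, "").trim());
  return { headings };
}

class ParseCache {
  private entries = new Map<string, { content: string; parsed: ParsedNote }>();

  // Lazy: nothing is parsed until a caller asks for the structure.
  get(noteId: string, content: string): ParsedNote {
    const hit = this.entries.get(noteId);
    if (hit !== undefined && hit.content === content) {
      return hit.parsed; // content unchanged, reuse the earlier parse
    }
    const parsed = parseMarkdown(content);
    this.entries.set(noteId, { content, parsed });
    return parsed;
  }

  // Drop a cached parse, for example when a note is deleted.
  invalidate(noteId: string): void {
    this.entries.delete(noteId);
  }
}
```

The cache holds only derived, disposable data, so losing it costs a re-parse and nothing else. Comparing full content strings keeps the sketch simple; keying on a content hash or on `updatedAt` would avoid holding a second copy of large notes in memory.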
Alternatives Considered
1. AST as Source of Truth
Store the parsed AST, serialize to markdown on export.
Rejected because:
- Information loss during parsing (whitespace, formatting choices)
- Different markdown flavors produce different ASTs
- Reconstruction never produces identical output
- Lock-in to specific parser implementation
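The information-loss argument can be shown with a deliberately tiny round trip. The `AstNode` type, `parse`, and `serialize` below are a hypothetical toy that handles only bullet items and paragraphs; real parsers are far richer, but they have to make the same kind of choices when writing markdown back out.

```typescript
// Toy AST: only bullet items and paragraphs (hypothetical, for illustration).
type AstNode =
  | { type: "listItem"; text: string }
  | { type: "paragraph"; text: string };

function parse(markdown: string): AstNode[] {
  const nodes: AstNode[] = [];
  for (const line of markdown.split("\n")) {
    const bullet = line.match(/^\s*[-*+]\s+(.*)$/);
    if (bullet) {
      // The marker character and the spacing after it are discarded here.
      nodes.push({ type: "listItem", text: bullet[1].trim() });
    } else if (line.trim() !== "") {
      // Blank lines and trailing spaces are discarded here.
      nodes.push({ type: "paragraph", text: line.trim() });
    }
  }
  return nodes;
}

function serialize(nodes: AstNode[]): string {
  // The serializer has to pick one marker and one spacing style for everyone.
  return nodes
    .map((n) => (n.type === "listItem" ? `- ${n.text}` : n.text))
    .join("\n");
}

const original = "* apples\n*   pears\n\n\nSome   text  ";
const roundTripped = serialize(parse(original));
console.log(roundTripped === original); // false
```

The `*` markers, the extra spaces after them, the blank lines, and the trailing whitespace are all gone after the round trip, which is the outcome the preservation invariants rule out.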
2. Hybrid Storage
Store both raw markdown and AST, keep in sync.
Rejected because:
- Complexity of keeping them synchronized
- Which one wins on conflict?
- Double storage space
- More code to maintain
3. Block-Based (Notion-style)
Use structured blocks as the model.
Rejected because:
- Not markdown-first
- Export requires serialization
- Different product identity
- Higher complexity
Related Decisions
- ADR-003: Storage - How markdown is persisted