ADR-002: Markdown Model

Status

Accepted

Context

A note-taking app must decide how to represent notes internally. The key question:

What is the source of truth for a note's content?

Options:

Markdown text - Store raw markdown, parse on demand
AST (Abstract Syntax Tree) - Store parsed structure
Hybrid - Store both, keep them in sync

This decision affects:

Data portability
Export fidelity
Parse error handling
Feature implementation complexity

Decision

Markdown text is the canonical source of truth.

Preservation Invariants

Invariant	Rule
Raw text is sacred	User-typed markdown is NEVER auto-modified
AST is ephemeral	Exists only for parsing, never persisted as authority
Parse errors don't block	If parse fails, save raw markdown anyway
No normalization	Never reformat whitespace, headings, lists
No serialization from AST	Never reconstruct markdown from parsed AST
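
The "parse errors don't block" and "raw text is sacred" invariants can be sketched as a save path that persists the raw text first and treats parsing as a best-effort step. This is only an illustration: `parseMarkdown`, `saveNote`, and the in-memory `store` are hypothetical names, and the toy parser fails on unbalanced wikilink brackets purely to simulate a parse error.

```typescript
// Hypothetical in-memory store standing in for real persistence.
const store = new Map<string, string>();

// Toy parser (assumption): returns heading lines and throws on unbalanced
// [[ ]] brackets so that a parse failure can be simulated.
function parseMarkdown(content: string): string[] {
  const opens = (content.match(/\[\[/g) ?? []).length;
  const closes = (content.match(/\]\]/g) ?? []).length;
  if (opens !== closes) {
    throw new Error("unbalanced wikilink brackets");
  }
  return content.split("\n").filter((line) => /^#{1,6}\s/.test(line));
}

// Saves the raw markdown unchanged, then attempts to parse it.
// A parse failure is reported to the caller but never blocks the save.
function saveNote(id: string, content: string): { saved: true; parseError?: string } {
  store.set(id, content); // exact text, no trimming or normalization
  try {
    parseMarkdown(content);
    return { saved: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { saved: true, parseError: message };
  }
}

// Usage: the note is stored even though the toy parser rejects it.
const brokenNote = "# Title  \n\nSee [[broken link\n";
const outcome = saveNote("note-1", brokenNote);
console.log(outcome.parseError, store.get("note-1") === brokenNote);
```

Writing to the store before parsing is the design choice that matters here: whatever the parser does, the user's text has already been kept.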

Implementation

```typescript
interface Note {
  id: NoteId;
  content: string; // RAW MARKDOWN - canonical source
  metadata: NoteMetadata; // Computed on save; title, tags, and wordCount are derived from content
}

interface NoteMetadata {
  title: string; // Extracted from first H1 or filename
  createdAt: Timestamp;
  updatedAt: Timestamp;
  tags: readonly Tag[]; // Extracted from #tags in content
  wordCount: number;
}
```

Derived Data

These are computed from markdown on save, stored for queries:

Title (first H1 or "Untitled")
Tags (from #hashtags in content)
Word count
Backlinks (from [[wikilinks]])
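
As a rough sketch of how these values might be computed from the raw text, the function below uses plain regular expressions. All names (`DerivedData`, `deriveData`, `collectGroups`) are hypothetical, and the patterns are simplifying assumptions: they do not skip code blocks, and the word count is a naive whitespace split. The content string is only read, never modified.

```typescript
interface DerivedData {
  title: string;
  tags: readonly string[];
  wordCount: number;
  wikilinks: readonly string[]; // link targets; backlinks are built from these
}

// Collects capture group 1 of every match. The pattern must have the g flag.
function collectGroups(pattern: RegExp, text: string): string[] {
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    if (m[1] !== undefined) found.push(m[1]);
  }
  return found;
}

function deriveData(content: string): DerivedData {
  // Title: text of the first line that starts with a single "#", else "Untitled".
  const h1 = content.match(/^#[ \t]+(.+?)[ \t]*$/m);
  const title = h1?.[1] ?? "Untitled";

  // Tags: "#word" at the start of the text or after whitespace.
  // "# Heading" is not matched because a space follows the "#".
  const allTags = collectGroups(/(?:^|\s)#([A-Za-z][\w-]*)/g, content);
  const tags = allTags.filter((tag, i) => allTags.indexOf(tag) === i);

  // Wikilinks: text between [[ and ]].
  const wikilinks = collectGroups(/\[\[([^\[\]]+)\]\]/g, content);

  // Word count: naive count of whitespace-separated tokens, markup included.
  const wordCount = content.split(/\s+/).filter((w) => w.length > 0).length;

  return { title, tags, wordCount, wikilinks };
}
```

Because the function is pure, derived data can be recomputed from `content` at any time, which is one way to repair metadata that has drifted.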

Golden Rule

"If the user typed it, we keep it."

In practice:

No "prettify markdown" feature
No auto-fix for broken links
No whitespace normalization
Export = exact copy of stored markdown

Consequences

Positive

Portable: Export is always valid, standalone markdown
Predictable: User knows exactly what's stored
Recoverable: Parse errors can't corrupt data
Simple: One source of truth, no sync issues
Compatible: Works with any markdown ecosystem

Negative

Performance: Features that need structure must parse the markdown on read
Consistency: Stored derived data can drift from the content if it is not recomputed on every save
Limited features: No AST-based transformations that write back to the content

Risks

Performance with large notes (>100KB)
Mitigation: Cache parsed results, lazy parsing
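
One possible shape for this mitigation is a cache that parses only when a feature first asks for structure and reuses the result until the content changes. This is a sketch under assumptions: `ParsedNote`, `parseHeadings`, and `ParseCache` are hypothetical names, the parser is a stand-in, and invalidation compares the stored content string where a real implementation might compare `updatedAt` or a hash instead.

```typescript
interface ParsedNote {
  headings: string[];
}

// Counter only to make cache hits observable in this sketch.
let parseCalls = 0;

// Stand-in parser (assumption): extracts heading lines.
function parseHeadings(content: string): ParsedNote {
  parseCalls += 1;
  return { headings: content.split("\n").filter((line) => /^#{1,6}\s/.test(line)) };
}

class ParseCache {
  private readonly entries = new Map<string, { content: string; parsed: ParsedNote }>();

  // Lazy: nothing is parsed until a caller needs the structure.
  // Cached: the previous result is reused while the content is unchanged.
  get(noteId: string, content: string): ParsedNote {
    const hit = this.entries.get(noteId);
    if (hit !== undefined && hit.content === content) {
      return hit.parsed;
    }
    const parsed = parseHeadings(content);
    this.entries.set(noteId, { content, parsed });
    return parsed;
  }

  // Drops a cached entry, for example when a note is deleted.
  evict(noteId: string): void {
    this.entries.delete(noteId);
  }
}
```

The cache holds only ephemeral parse results, so discarding it at any time loses nothing: the markdown text remains the sole authority.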

Alternatives Considered

1. AST as Source of Truth

Store the parsed AST, serialize to markdown on export.

Rejected because:

Information loss during parsing (whitespace, formatting choices)
Different markdown flavors produce different ASTs
Reconstruction never produces identical output
Lock-in to specific parser implementation
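
The information-loss argument can be illustrated with a deliberately tiny parser and serializer. Everything here is hypothetical (`MdNode`, `toyParse`, `toySerialize`) and far simpler than a real markdown AST, but the effect is the same in kind: details the AST does not record, such as the list marker or extra whitespace, cannot be reproduced on export.

```typescript
type MdNode =
  | { kind: "heading"; level: number; text: string }
  | { kind: "listItem"; text: string }
  | { kind: "paragraph"; text: string };

// Toy parser: keeps structure, discards marker choice and spacing.
function toyParse(markdown: string): MdNode[] {
  const nodes: MdNode[] = [];
  for (const line of markdown.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // blank lines are not represented
    const heading = trimmed.match(/^(#{1,6})\s+(.*)$/);
    const item = trimmed.match(/^[-*+]\s+(.*)$/);
    if (heading) {
      nodes.push({ kind: "heading", level: heading[1].length, text: heading[2] });
    } else if (item) {
      nodes.push({ kind: "listItem", text: item[1] });
    } else {
      nodes.push({ kind: "paragraph", text: trimmed });
    }
  }
  return nodes;
}

// Toy serializer: has to pick one canonical style for everything.
function toySerialize(nodes: MdNode[]): string {
  return nodes
    .map((node) => {
      switch (node.kind) {
        case "heading":
          return "#".repeat(node.level) + " " + node.text;
        case "listItem":
          return "- " + node.text;
        case "paragraph":
          return node.text;
      }
    })
    .join("\n\n");
}

const original = "#  Title\n* one\n*   two\n\n\n\nDone.  ";
const roundTripped = toySerialize(toyParse(original));
console.log(roundTripped === original); // false: markers and spacing changed
```

Both strings parse to the same toy AST, which is exactly why the AST cannot tell them apart and cannot restore the user's original text.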

2. Hybrid Storage

Store both raw markdown and AST, keep in sync.

Rejected because:

Complexity of keeping them synchronized
Which one wins on conflict?
Double storage space
More code to maintain

3. Block-Based (Notion-style)

Use structured blocks as the model.

Rejected because:

Not markdown-first
Export requires serialization
Different product identity
Higher complexity

Related Decisions

ADR-003: Storage - How markdown is persisted
