
# Architecture

tilstream is a static site generator with a very narrow remit: take
a directory of short markdown files, produce HTML, RSS, and a JSON
search index. This document is the tour I would give if you wanted
to change how any of that works.

## The pipeline

    content/*.md
         |
         v
    +----------+    +----------+    +----------+
    |  parse   | -> |  build   | -> |  write   |
    +----------+    +----------+    +----------+
         |               |               |
         |               |               +-> public/*.html
         |               |               +-> public/feed.xml
         |               |               +-> public/search.json
         |               |               +-> public/index.html
         |               v
         |         templates (html/template)
         v
    front-matter + body AST

- **parse**: walk `--src`, read each `.md` file, extract YAML
  front-matter, parse markdown body via goldmark.
- **build**: combine each post's metadata, its rendered HTML body,
  and a shared context into a `Post` struct; build aggregate
  structures for feed and index.
- **write**: render templates, write files under `--out`.
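
To make the parse step concrete, here is a minimal sketch of splitting front-matter from the body. It assumes the front-matter is fenced by `---` lines with Unix newlines; `splitFrontMatter` is a hypothetical name, not the real function, and the YAML decoding and goldmark call that would follow are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontMatter separates a leading "---"-delimited block from the
// markdown body. It returns ok=false when no complete front-matter
// block is found, in which case body is the whole input.
// Assumption: "\n" line endings and a closing delimiter on its own line.
func splitFrontMatter(src string) (meta, body string, ok bool) {
	const delim = "---\n"
	if !strings.HasPrefix(src, delim) {
		return "", src, false
	}
	rest := src[len(delim):]
	// Empty front-matter: the closing delimiter follows immediately.
	if strings.HasPrefix(rest, delim) {
		return "", rest[len(delim):], true
	}
	end := strings.Index(rest, "\n"+delim)
	if end < 0 {
		return "", src, false
	}
	// Keep the newline that terminates the last metadata line.
	return rest[:end+1], rest[end+1+len(delim):], true
}

func main() {
	src := "---\ntitle: ripgrep --iglob\ndraft: false\n---\nBody text here.\n"
	meta, body, ok := splitFrontMatter(src)
	fmt.Printf("ok=%v\nmeta=%q\nbody=%q\n", ok, meta, body)
}
```

The `meta` string would then go to a YAML decoder and `body` to the markdown parser.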

Everything is synchronous except the watch loop (see
[`internal/watch/watch.go`](/src/tilstream/internal-watch-watch-go/)).

## Why goldmark

Switched in `11a0f6e`. The previous renderer was a tiny in-tree
CommonMark implementation that I wrote because I wanted zero
dependencies; it didn't handle tables, strikethrough, or footnotes
correctly once I started needing them. goldmark is:

- CommonMark-compliant by default
- extensible via its `extension` package (we use tables, footnotes,
  strikethrough, task lists)
- fast enough that building the whole site is dominated by I/O

Swapping out goldmark for another parser would be a one-file change
in [`internal/render/render.go`](/src/tilstream/internal-render-render-go/).

## Post structure

    type Post struct {
        Slug       string            // derived from filename
        Title      string            // from front-matter
        Date       time.Time         // from front-matter
        Tags       []string          // from front-matter
        Draft      bool              // from front-matter
        Source     string            // original filename
        BodyHTML   template.HTML     // goldmark output
        BodyText   string            // plain text for search index
        Categories []string          // from front-matter, used in feed
    }

`BodyHTML` is `template.HTML` so `html/template` trusts it and emits
it without escaping. `BodyText` is the flattened form that the search
index serializes: we walk the AST and keep only text nodes, dropping
code blocks and raw HTML.
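
A small sketch of what the `template.HTML` type buys us. The `post` struct here is a trimmed-down stand-in for the real one, and the inline template is made up for illustration. Under `html/template`, a plain `string` field is escaped while a `template.HTML` field passes through as-is.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// post is a trimmed-down stand-in for the Post struct above.
type post struct {
	Title    string        // plain string: escaped on output
	BodyHTML template.HTML // marked as trusted: emitted verbatim
}

// Illustrative inline template, not one of the shipped .tmpl files.
var postTmpl = template.Must(template.New("post").Parse(
	`<h1>{{.Title}}</h1>{{.BodyHTML}}`))

func renderPost(p post) (string, error) {
	var sb strings.Builder
	if err := postTmpl.Execute(&sb, p); err != nil {
		return "", fmt.Errorf("render %q: %w", p.Title, err)
	}
	return sb.String(), nil
}

func main() {
	out, err := renderPost(post{Title: "a < b", BodyHTML: "<p>hi</p>"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // <h1>a &lt; b</h1><p>hi</p>
}
```

The flip side is that anything stored in `BodyHTML` must already be safe, which is why only renderer output should ever land there.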

## Rendering

`internal/render/render.go` owns goldmark configuration and
template execution. The template set is:

- `base.tmpl`: outer HTML shell (doctype, head, nav, footer)
- `post.tmpl`: per-post page, extends base
- `index.tmpl`: list of posts, extends base
- `tag.tmpl`: a filtered list for a single tag

Templates are parsed once at startup from `templates/*.tmpl`. Users
override by placing their own file with the same name in
`--templates`. See
[docs/template-guide.md](/src/tilstream/docs-template-guide-md/) for
variables available inside each template.

Footnotes: commit `a3b1e52` fixed a crash that occurred when goldmark
emitted a footnote reference pointing at a footnote block nested
inside a list item. The fix walks the AST and normalises footnote IDs
before rendering.

## Feed generation

`internal/feed/rss.go` generates RSS 2.0. Not Atom - I checked, most
feed readers take either, and RSS has fewer ways to be wrong. The
channel metadata comes from a config file at the site root; item
fields come from posts.

GUID stability matters. We include the post's categories in the GUID
(`d9c4f01`), so re-tagging a post gives it a new GUID: subscribers
see it as a new entry, which is desirable when the tagging actually
changes what the post is about. That was a deliberate trade.

Atom generation is a future possibility and would be parallel to
`rss.go`. No current plan.

## Search index

`internal/index/index.go` produces `search.json`:

    [
      { "slug": "ripgrep-iglob",
        "title": "ripgrep --iglob for case-insensitive globs",
        "date": "2025-01-18",
        "tags": ["cli","ripgrep"],
        "text": "Full plain-text body..."
      },
      ...
    ]

Drafts are skipped (`7c218ba`). The output is loaded by a small
JS snippet on the site that does a linear scan; with ~200 TILs,
that's well under 10 ms, and there's no server dependency. Larger
sites would want a proper index (minisearch, lunr); this tool is
scoped below that threshold.

## CLI

`main.go` exposes three subcommands:

- `tilstream build --src notes --out public [--drafts]`
- `tilstream serve --src notes --port 8080` - runs `build` then
  starts a watcher and a dev http server
- `tilstream clean --out public` - wipes the output dir

Flags come from Go's standard `flag` package. No cobra, no kong.
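
Roughly what subcommand dispatch with the standard `flag` package looks like. This is a sketch covering `build` only; the defaults (`notes`, `public`) are borrowed from the usage lines above and are an assumption, as are the names `buildOpts` and `parseBuild`.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// buildOpts holds the flags of the build subcommand.
type buildOpts struct {
	Src    string
	Out    string
	Drafts bool
}

// parseBuild parses the arguments that follow "build".
func parseBuild(args []string) (buildOpts, error) {
	var o buildOpts
	fs := flag.NewFlagSet("build", flag.ContinueOnError)
	fs.StringVar(&o.Src, "src", "notes", "directory of markdown sources")
	fs.StringVar(&o.Out, "out", "public", "output directory")
	fs.BoolVar(&o.Drafts, "drafts", false, "include draft posts")
	err := fs.Parse(args)
	return o, err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: tilstream <build|serve|clean> [flags]")
		return
	}
	switch os.Args[1] {
	case "build":
		opts, err := parseBuild(os.Args[2:])
		if err != nil {
			os.Exit(2) // flag has already printed the error
		}
		fmt.Printf("would build %s -> %s (drafts=%v)\n", opts.Src, opts.Out, opts.Drafts)
	default:
		// serve and clean would get their own FlagSet in the same way.
		fmt.Fprintf(os.Stderr, "unknown subcommand %q\n", os.Args[1])
		os.Exit(2)
	}
}
```

One `FlagSet` per subcommand keeps each command's flags separate, which is most of what the heavier CLI libraries would add here.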

## Watcher

`internal/watch/watch.go` wraps fsnotify. On a change, it debounces
by 150 ms (`4ed509d`) because editors typically produce a burst of
file events on save. After the debounce window, we re-run build,
then broadcast a reload message to any connected browsers via a
small SSE endpoint.

The watcher also re-reads the template dir. If you edit a template,
all posts are re-rendered; if you edit a single post, only that post
is re-rendered. The granularity matters because rebuilding the full
site for a 200-post blog takes ~350 ms on my machine, and we want
the `serve` mode snappy.

## Concurrency

Build runs sequentially. I have a `parallel-build` branch that
fans out parse + render across goroutines, which cut wall-clock
time from 350 ms to 90 ms. I haven't merged it: on my HDD-based
personal box, reading files is the bottleneck anyway, and the SSD
laptop is already fast enough that parallel made no visible
difference. It's parked.

The watcher runs the build in a goroutine to keep the watcher loop
responsive. Only one build runs at a time; any burst of events during
a build sets a single "dirty" flag, which triggers one re-run after
the current build finishes.

## File layout

| Path                                                                                   | Purpose                                         |
|---------------------------------------------------------------------------------------|-------------------------------------------------|
| [`main.go`](/src/tilstream/main-go/)                                                  | CLI dispatch                                    |
| [`internal/render/render.go`](/src/tilstream/internal-render-render-go/)              | goldmark config + template execution            |
| [`internal/feed/rss.go`](/src/tilstream/internal-feed-rss-go/)                        | RSS 2.0 generator                               |
| [`internal/index/index.go`](/src/tilstream/internal-index-index-go/)                  | search.json builder                             |
| [`internal/watch/watch.go`](/src/tilstream/internal-watch-watch-go/)                  | fsnotify debounce + live reload                 |
| [`templates/post.tmpl`](/src/tilstream/templates-post-tmpl/)                          | default post template                           |

## Non-goals

- Themes. Not a supported concept. Override templates instead.
- Comments. Static site; use Mastodon/links out.
- Taxonomies beyond tags. Hugo's five-tier taxonomy system is not
  coming.
- Internationalisation. I write in one language; if you write in
  two, use Hugo.

## Extension points

Three pluggable-ish places, in order of how often I've used them:

1. **Template overrides.** Drop a file under `--templates` to
   replace any of the defaults. Variables and partials are
   documented in the template guide.
2. **Front-matter keys.** The parser preserves every top-level YAML
   key in a `.Meta` map inside the template context, reachable as
   `{{ index .Meta "key" }}`. Add your own without code changes.
3. **goldmark extensions.** Editing
   [`internal/render/render.go`](/src/tilstream/internal-render-render-go/)
   to add a goldmark extension is one line. No plugin system.
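
For point 2, a sketch of reading a custom front-matter key from a template. It assumes `.Meta` is a `map[string]any`; the `mood` key, the `pageContext` type, and the inline template are made up for illustration. It is shown with `html/template`, and the field-access syntax is the same in `text/template`.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// pageContext is a hypothetical slice of the template context.
type pageContext struct {
	Title string
	Meta  map[string]any // every top-level front-matter key
}

// {{with}} skips the block when the key is absent, so posts without
// a "mood" key render just the title.
var page = template.Must(template.New("page").Parse(
	`{{.Title}}{{with .Meta.mood}} ({{.}}){{end}}`))

func render(ctx pageContext) (string, error) {
	var sb strings.Builder
	if err := page.Execute(&sb, ctx); err != nil {
		return "", fmt.Errorf("render: %w", err)
	}
	return sb.String(), nil
}

func main() {
	out, err := render(pageContext{
		Title: "ripgrep --iglob",
		Meta:  map[string]any{"mood": "calm"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // ripgrep --iglob (calm)
}
```

Keys that are not valid Go identifiers, such as ones containing a hyphen, need the `index` form (`{{index .Meta "reading-time"}}`) instead of dot access.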

## Performance

- 200 posts, cold cache, my laptop: ~350 ms
- 200 posts, warm cache (watch rebuild): ~90 ms
- Output is ~3.5 MB uncompressed; gzip brings it to ~600 KB

Memory is a flat ~25 MB during build. No streaming anywhere;
everything fits in RAM easily at the sizes I care about.

## Tests

`go test ./...` covers the parse, render, feed, and index packages.
The watcher is hand-tested; I would accept a patch that adds
coverage using fsnotify fakes.