# Architecture
This document describes how httptap is put together. If you only want to
run the tool, the [README](/src/httptap/readme-md/) is all you need. If
you are trying to extend it, fix a verifier rejection, or understand why
the TUI sometimes freezes for a few hundred milliseconds during a
fork storm, read on.
## The shape of the problem
httptap tries to answer "what HTTP did this process just send?" without
a proxy and without holding a private key. The only place the plaintext
is reliably visible is at the boundary between the application's TLS
library and the kernel: just before the library encrypts what it hands
to `send()`, and just after it decrypts what `recv()` returned.
Attaching probes there gives us the bytes we want without any in-path
rewriting.
That constraint bounds the design; everything else grew out of it.
## Components
```
+-------------------------------------------------+
|                       TUI                       |
|           bubbletea program + renderer          |
+-----------------------^-------------------------+
                        | events (flow updates)
+-----------------------+-------------------------+
|                      parser                     |
|   http/1 line+header parser, pairs req/resp,    |
|   decodes chunked TE, normalises header casing  |
+-----------------------^-------------------------+
                        | raw plaintext buffers
+-----------------------+-------------------------+
|                      tracer                     |
|    loads BPF object, manages probes, drains     |
|  the ring buffer, attributes bytes to a pid+fd  |
+-----------------------^-------------------------+
                        | perf ring buffer
+-----------------------+-------------------------+
|                   kernel (BPF)                  |
|   uretprobes on SSL_read / SSL_write / Go TLS   |
|  helpers copy userspace buffers into the ring   |
+-------------------------------------------------+
```
Each horizontal line is an interface I can swap independently. The TUI
doesn't know whether the data came from OpenSSL or Go's crypto/tls. The
parser doesn't know whether the bytes arrived over a ring buffer or
from a test fixture. Most of the test suite exploits this by feeding
canned byte streams into the parser with no tracer attached.
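For a feel of what those tests look like, here is a sketch; the
`parser.New` / `Feed` / `CompletedFlows` names and the import path are
illustrative stand-ins, not the real API in `internal/parser`:

```go
package parser_test

import (
	"testing"

	"httptap/internal/parser" // module path illustrative
)

// A tracer-free parser test: feed canned bytes for both directions and
// assert that the parser pairs them into one flow.
func TestPairsCannedExchange(t *testing.T) {
	p := parser.New()
	p.Feed(parser.DirOut, []byte("GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"))
	p.Feed(parser.DirIn, []byte("HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"))

	if got := len(p.CompletedFlows()); got != 1 {
		t.Fatalf("want 1 paired flow, got %d", got)
	}
}
```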
## The kernel side
`internal/tracer/tracer.bpf.c` contains every BPF program. There are
three families:
1. **OpenSSL/GnuTLS uretprobes.** `SSL_read`, `SSL_write`,
`gnutls_record_send`, `gnutls_record_recv`. These all have the shape
`fn(ssl, buf, count) -> int`, so one attribute macro expands into
four probes.
2. **Go crypto/tls.** Go's TLS lives inside the binary. The userspace
tracer walks the target's ELF for `crypto/tls.(*Conn).Write` and
`crypto/tls.(*Conn).Read` and attaches uretprobes at those symbols.
Arguments come out of registers rather than the stack on amd64; the
`go_tls_ctx` struct in the BPF program reflects that.
3. **Per-fd bookkeeping.** A small map from `(pid, ssl_ptr)` to an
integer flow id. The userspace side populates `(ssl_ptr -> fd)` by
hooking `SSL_set_fd`. That lets us stamp events with a stable flow
identifier across a connection.
The BPF programs intentionally do as little as possible. They never
touch TCP state, never take locks, and never emit more than one record
per hit. Keeping the hot path that short keeps us well under the
verifier's instruction limit and makes the per-hit overhead negligible
even when a probe sits on a busy code path.
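Userspace has to agree with the kernel on what that one record looks
like. A sketch of the Go-side mirror follows; field names, ordering,
and padding here are illustrative, and the authoritative layout is the
C struct in `tracer.bpf.c`:

```go
import (
	"bytes"
	"encoding/binary"
)

// Illustrative mirror of the BPF event record. Fixed-size fields only,
// so a single binary.Read decodes a ring buffer sample.
type rawEvent struct {
	FlowID    uint64     // hash(pid, ssl) stamped in the kernel
	PID       uint32
	Len       uint32     // bytes actually captured
	Direction uint8      // 0 = out (pre-encrypt), 1 = in (post-decrypt)
	_         [7]byte    // explicit padding to match C alignment
	Data      [1024]byte // capped copy of the plaintext buffer
}

func decodeEvent(sample []byte) (rawEvent, error) {
	var ev rawEvent
	err := binary.Read(bytes.NewReader(sample), binary.LittleEndian, &ev)
	return ev, err
}
```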
## The userspace tracer
`internal/tracer/tracer.go` owns:
- loading the pre-compiled BPF object (embedded via `go:embed`)
- attaching probes (uretprobe attach goes through cilium/ebpf's `link`
  package)
- symbol resolution for Go binaries (ELF, plus DWARF when available;
  sketched below)
- draining the perf ring buffer into a Go channel of `rawEvent`
- attributing `rawEvent` to the right `(pid, flow)` tuple
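The symbol-resolution step, sketched with only the standard library;
the real resolver also consults DWARF and converts the virtual address
into the file offset that uprobe attachment wants:

```go
import "debug/elf"

// Minimal happy-path ELF walk for the Go TLS symbols.
func findGoTLSSymbols(path string) (map[string]uint64, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	want := map[string]bool{
		"crypto/tls.(*Conn).Write": true,
		"crypto/tls.(*Conn).Read":  true,
	}
	found := make(map[string]uint64)
	syms, err := f.Symbols()
	if err != nil {
		return nil, err // stripped binary: fall back to DWARF
	}
	for _, s := range syms {
		if want[s.Name] {
			found[s.Name] = s.Value // virtual address, not yet a file offset
		}
	}
	return found, nil
}
```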
Draining runs on one goroutine per online CPU. cilium/ebpf's perf
reader reports lost samples; on overflow we increment a counter and
surface it in the TUI footer as "lost events: N". That is the
worst-case behaviour I care about: we never block the kernel side on
our own backpressure.
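A single-reader simplification of that drain; `eventsMap` stands in
for the `BPF_MAP_TYPE_PERF_EVENT_ARRAY` in the loaded object:

```go
import (
	"sync/atomic"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/perf"
)

// drain reads raw samples and counts losses instead of blocking, which
// is what keeps the kernel side free of our backpressure.
func drain(eventsMap *ebpf.Map, out chan<- []byte, lost *atomic.Uint64) error {
	rd, err := perf.NewReader(eventsMap, 64*1024) // per-CPU buffer size
	if err != nil {
		return err
	}
	defer rd.Close()

	for {
		rec, err := rd.Read()
		if err != nil {
			return err // reader closed during shutdown
		}
		if rec.LostSamples > 0 {
			lost.Add(rec.LostSamples) // shown as "lost events: N" in the footer
			continue
		}
		out <- rec.RawSample
	}
}
```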
## The parser
`internal/parser/http.go` is a hand-written HTTP/1 parser. Not using
`net/http` was deliberate: the stream we get from the kernel is not a
well-formed `io.Reader` connection but a sequence of fragments, each
labelled with direction. The parser's state machine mirrors that.
State per flow:
```
enum state { REQUEST_LINE, REQUEST_HEADERS, REQUEST_BODY,
             RESPONSE_LINE, RESPONSE_HEADERS, RESPONSE_BODY,
             DONE }
```
A fragment arriving on flow F is appended to the flow's buffer, and
the parser advances as far as the current state allows. Bodies framed
by `Content-Length` and by `Transfer-Encoding: chunked` both work;
HTTP/2 and HTTP/3 traffic is emitted as "HTTP2 frame" events with
minimal decoding (commit `c0912ad` added header-case normalisation;
`a3f8b2c` fixed an off-by-one in chunk sizing).
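The advance loop for the request side looks roughly like this; it is a
sketch, with state constants mirroring the enum above and header/body
handling elided:

```go
import (
	"bytes"
	"strings"
)

type state int

const (
	stateRequestLine state = iota
	stateRequestHeaders
	stateRequestBody
	// ...response states and DONE, mirroring the enum above
)

type flow struct {
	buf            []byte
	state          state
	method, target string
}

// feed appends a fragment and consumes as much as the state allows.
func (f *flow) feed(frag []byte) {
	f.buf = append(f.buf, frag...)
	for {
		switch f.state {
		case stateRequestLine:
			line, rest, ok := bytes.Cut(f.buf, []byte("\r\n"))
			if !ok {
				return // fragment ended mid-line; wait for more bytes
			}
			if parts := strings.SplitN(string(line), " ", 3); len(parts) == 3 {
				f.method, f.target = parts[0], parts[1]
			}
			f.buf, f.state = rest, stateRequestHeaders
		case stateRequestHeaders:
			// a blank line terminates the header block (header parsing elided)
			if _, rest, ok := bytes.Cut(f.buf, []byte("\r\n\r\n")); ok {
				f.buf, f.state = rest, stateRequestBody
				continue
			}
			return
		default:
			return // body and response handling elided
		}
	}
}
```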
Paired requests and responses get promoted to a `Flow` struct which
then travels to the TUI.
## The TUI
`internal/tui/model.go` is a bubbletea `Model`. Three main pieces:
- a **list viewport** of flows, sorted by last-updated
- a **detail view** shown on Enter, with request/response bodies
- a **filter bar** bound to `/`
The renderer subscribes to the tracer's event channel via a
`tea.Cmd` that blocks on the channel and returns a `flowsUpdated`
message. bubbletea redraws at around 60 Hz when there's pending work
and idles otherwise.
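The wiring is the standard bubbletea pattern. A sketch, where `apply`
and the `model` fields are stand-ins for the real ones in
`internal/tui/model.go`:

```go
import tea "github.com/charmbracelet/bubbletea"

// flowsUpdated is the message the tracer-facing code produces.
type flowsUpdated struct{ /* elided */ }

type model struct {
	updates chan flowsUpdated
	// ...list viewport, detail view, filter bar
}

func (m *model) apply(flowsUpdated) { /* stand-in: merge into the flow list */ }

// waitForUpdate blocks until the next update; tea.Cmd is just
// func() tea.Msg, and bubbletea runs each command on its own goroutine.
func waitForUpdate(ch <-chan flowsUpdated) tea.Cmd {
	return func() tea.Msg {
		return <-ch
	}
}

func (m model) Init() tea.Cmd { return waitForUpdate(m.updates) }

func (m model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
	switch msg := msg.(type) {
	case flowsUpdated:
		m.apply(msg)
		return m, waitForUpdate(m.updates) // re-arm for the next batch
	}
	return m, nil
}

func (m model) View() string { return "" /* renderer elided */ }
```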
Scroll performance is the feature that gets the most attention. On a
target handling thousands of RPS, the list mutates faster than the TUI
can redraw; I keep a linked list of flows and render only the window
visible on screen. Anything off-screen exists as a struct but never
gets its headers formatted.
## Concurrency model
There are only a handful of goroutines:
- **main** (runs the bubbletea loop)
- **ringbuf drain**: one per online CPU, reads `rawEvent`s and pushes
to a single `events` channel
- **parser**: one goroutine per flow, owns the flow's parser state
- **probe lifecycle**: one goroutine that watches `/proc/<pid>` for
fork/exit when `--follow-forks` is set
Flow goroutines exit on connection close or tracer shutdown. The
parser goroutine does not touch the TUI directly; it sends update
messages on a channel the TUI receives.
All shared state is either channel-passed or protected by small
mutexes. There are no `sync.Map`s because the hot paths are either
lock-free (the ringbuf) or single-owner (the per-flow parser).
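The per-flow piece, sketched; the `Fragment` and `Update` types and the
`flowParser` interface are stand-ins, and the point is the ownership,
not the API:

```go
// Stand-in types for the sketch.
type Fragment struct {
	Dir  int
	Data []byte
}
type Update struct{ /* elided */ }

type flowParser interface {
	feed(Fragment) []Update
}

// flowWorker is the only code that ever touches its flow's parser
// state, so the hot path needs no locks. It exits when frags closes
// (connection close or tracer shutdown).
func flowWorker(p flowParser, frags <-chan Fragment, updates chan<- Update) {
	for frag := range frags {
		for _, u := range p.feed(frag) {
			updates <- u // the TUI receives these; no direct TUI calls here
		}
	}
}
```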
## Boundaries: trust and safety
httptap reads plaintext from other processes. That means:
- it must be root or have `CAP_BPF` + `CAP_PERFMON`
- it should not write the captured bytes to disk unless the user asks
(the `y` yank-to-clipboard command is an explicit action)
- it should never modify the target process; we only attach
  uretprobes, which observe execution without changing its behaviour
I treat the captured buffers as untrusted for display purposes: the
TUI strips control characters before rendering, since otherwise a log
line from a target process could inject ANSI escape codes. See
[`internal/tui/model.go`](/src/httptap/internal-tui-model-go/) for the
sanitisation function.
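That function is roughly this shape (a sketch, not the exact code):

```go
import (
	"strings"
	"unicode"
)

// sanitize drops control characters so a captured buffer cannot smuggle
// ANSI escape sequences into the terminal.
func sanitize(s string) string {
	return strings.Map(func(r rune) rune {
		if r == '\n' || r == '\t' {
			return r // keep layout characters the viewport understands
		}
		if unicode.IsControl(r) {
			return -1 // strings.Map drops runes mapped to a negative value
		}
		return r
	}, s)
}
```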
## Data flow, concrete example
A curl to `https://httpbin.org/post` from pid 42317:
1. curl calls `SSL_write(ssl, "POST /post HTTP/1.1...", 254)`.
2. BPF uretprobe fires. The program reads `buf` using
   `bpf_probe_read_user`, caps the copy at 1024 bytes (the ring buffer
   slot size), and emits an event with `flow_id = hash(pid, ssl)`
   (sketched after this list), `direction = OUT`, `len = 254`.
3. Userspace ringbuf drain goroutine picks up the event, sends it on
the tracer channel.
4. The parser goroutine for that flow appends the bytes, advances
past the request line + headers + body, marks state = response.
5. curl blocks on `SSL_read`. Server responds. `SSL_read` returns;
another uretprobe fires; bytes flow up the same way with
`direction = IN`.
6. Parser pairs request with response, emits `FlowCompleted`.
7. TUI redraws the list with the new flow on top.
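For step 2's flow id, userspace only ever treats the value as an opaque
key. If you wanted to recompute it, the shape is something like this;
FNV-1a is a stand-in here, and the real hash is whatever `tracer.bpf.c`
implements:

```go
import (
	"encoding/binary"
	"hash/fnv"
)

// Illustrative flow-id derivation over the (pid, ssl_ptr) pair.
func flowID(pid uint32, sslPtr uint64) uint64 {
	h := fnv.New64a()
	binary.Write(h, binary.LittleEndian, pid)
	binary.Write(h, binary.LittleEndian, sslPtr)
	return h.Sum64()
}
```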
The whole thing is under 5 ms of added latency on my laptop, mostly in
userspace. The kernel side is invisible in `perf top`.
## What lives where
| Path | Purpose |
|------------------------------------------------------------------------------------------------------|-----------------------------------------|
| [`cmd/httptap/main.go`](/src/httptap/cmd-httptap-main-go/) | CLI flag parsing, tracer startup |
| [`internal/tracer/tracer.go`](/src/httptap/internal-tracer-tracer-go/) | BPF loader + ringbuf drain |
| [`internal/tracer/tracer.bpf.c`](/src/httptap/internal-tracer-tracer-bpf-c/) | kernel-side probes |
| [`internal/parser/http.go`](/src/httptap/internal-parser-http-go/) | HTTP/1 parser |
| [`internal/tui/model.go`](/src/httptap/internal-tui-model-go/) | bubbletea model + renderer |
## Non-goals
- MITM. There will never be a mode that intercepts and rewrites. The
whole point is to avoid that complexity.
- Capture on a foreign box. If you need that, reach for `tcpdump` plus
  key log files from `SSLKEYLOGFILE`.
- Long-term flow storage. httptap is a live inspector. Scroll back, yank
a curl, move on. If you need persistence, pipe via `--jsonl` to your
own collector.
## Future shape
The BPF side will stay small. The interesting evolutions I see are:
- Getting HTTP/2 stream reassembly right enough to show request bodies.
- A Windows port, probably by hooking Schannel's `EncryptMessage` and
  `DecryptMessage`. I don't own a Windows box to test on, so it's parked.
- Optional export of flows as HAR files, for replay in Charles/Proxyman.
Each of those is independent; I would tackle them in that order if I
had to choose.