From 2962590a7493b8db7d09ea97d729dcbb013c1c39 Mon Sep 17 00:00:00 2001 From: Pim van Pelt Date: Sat, 14 Mar 2026 20:50:10 +0100 Subject: [PATCH] Update README.md --- README.md | 348 +++++++++++++++++++++++++----------------------------- 1 file changed, 162 insertions(+), 186 deletions(-) diff --git a/README.md b/README.md index a18f4c8..54b1d61 100644 --- a/README.md +++ b/README.md @@ -1,27 +1,27 @@ SPECIFICATION -This project contains three programs: -1) A collector that can tail any number of nginx logfiles, and will keep a data structure of -{website,client_prefix,http_request_uri,http_response} across all logfiles in memory. It is -queryable and can give topN clients by website and by http_request; in other words I can see "who is -causing the most HTTP 429" or "what is the busiest website". This program pre-aggregates the logs -into a queryable structure. It runs on any number (10 or so) of nginx machines in a cluster. There -is no UI here, only a gRPC interface. +This project contains four programs: -2) An aggregator that can query the first one and show global stats and trending information. It needs -to be able to show global aggregated information from the first (collectors) to show 'what is the -busiest nginx' in addition to 'what is the busiest website' or 'which client_prefix or -http_request_uri is causing the most HTTP 503s'. It runs on a central machine and can show trending -information; useful for ddos detection. This aggregator is an RPC client of the collectors, and -itself presents a gRPC interface. +1) A **collector** that tails any number of nginx log files and maintains an in-memory structure of +`{website, client_prefix, http_request_uri, http_response}` counts across all files. It answers +TopN and Trend queries via gRPC and pushes minute snapshots to the aggregator via server-streaming. +Runs on each nginx machine in the cluster. No UI — gRPC interface only. 
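The tuple-counting structure in (1) can be pictured as a capped Go counter map. A minimal sketch — `Tuple4` and `LiveMap` are illustrative names (the real types live in `internal/store` and `cmd/collector/store.go` and may differ), but the drop-new-keys-when-full behaviour mirrors the hard cap described under Data Model:

```go
package main

import "fmt"

// Tuple4 is an illustrative rendering of the four-dimension key;
// the real type in internal/store may differ.
type Tuple4 struct {
	Website string // $host
	Prefix  string // truncated $remote_addr
	URI     string // $request_uri, query string stripped
	Status  int    // $status
}

// LiveMap is a hard-capped counter map: once full, existing keys keep
// counting but new keys are dropped until the next minute rotation.
type LiveMap struct {
	maxEntries int
	counts     map[Tuple4]int64
}

func NewLiveMap(maxEntries int) *LiveMap {
	return &LiveMap{maxEntries: maxEntries, counts: make(map[Tuple4]int64)}
}

func (l *LiveMap) Inc(k Tuple4) {
	if _, ok := l.counts[k]; !ok && len(l.counts) >= l.maxEntries {
		return // map full: drop the new key, bounding memory under cardinality attacks
	}
	l.counts[k]++
}

func main() {
	lm := NewLiveMap(2)
	lm.Inc(Tuple4{"www.example.com", "1.2.3.0/24", "/api", 200})
	lm.Inc(Tuple4{"www.example.com", "1.2.3.0/24", "/api", 429})
	lm.Inc(Tuple4{"evil.example.com", "5.6.7.0/24", "/x", 200}) // dropped: cap reached
	lm.Inc(Tuple4{"www.example.com", "1.2.3.0/24", "/api", 200}) // existing key still counts
	fmt.Println(len(lm.counts)) // 2
}
```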
-3) An HTTP companion frontend to the aggregator that can query either collector or aggregator and -answer user queries in a drilldown fashion, eg 'restrict to http_response=429' then 'restrict to -website=www.example.com' and so on. This is an interactive rollup UI that helps operators see -which websites are performing well, and which are performing poorly (eg excessive requests, -excessive http response errors, DDoS) +2) An **aggregator** that subscribes to the snapshot stream from all collectors, merges their data +into a unified in-memory cache, and exposes the same gRPC interface. Answers questions like "what +is the busiest website globally", "which client prefix is causing the most HTTP 503s", and shows +trending information useful for DDoS detection. Runs on a central machine. -Programs are written in Golang with a modern, responsive interactive interface. +3) An **HTTP frontend** companion to the aggregator that renders a drilldown dashboard. Operators +can restrict by `http_response=429`, then by `website=www.example.com`, and so on. Works with +either a collector or aggregator as its backend. Zero JavaScript — server-rendered HTML with inline +SVG sparklines and meta-refresh. + +4) A **CLI** for shell-based debugging. Sends `topn`, `trend`, and `stream` queries to any +collector or aggregator, fans out to multiple targets in parallel, and outputs human-readable +tables or newline-delimited JSON. + +Programs are written in Go. No CGO, no external runtime dependencies. 
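The `client_prefix` dimension used throughout is a truncated network prefix (/24 for IPv4, /48 for IPv6, configurable, per the parser.go notes below). A sketch of that truncation using the standard `net/netip` package — the function name `clientPrefix` is hypothetical:

```go
package main

import (
	"fmt"
	"net/netip"
)

// clientPrefix truncates an address to /24 (IPv4) or /48 (IPv6).
// The prefix lengths are the spec's defaults; they are flag-configurable.
func clientPrefix(remoteAddr string) (string, error) {
	addr, err := netip.ParseAddr(remoteAddr)
	if err != nil {
		return "", err
	}
	bits := 48
	if addr.Is4() {
		bits = 24
	}
	p, err := addr.Prefix(bits) // masks host bits: 1.2.3.4/24 → 1.2.3.0/24
	if err != nil {
		return "", err
	}
	return p.String(), nil
}

func main() {
	for _, a := range []string{"1.2.3.4", "2001:db8:abcd:12::1"} {
		p, _ := clientPrefix(a)
		fmt.Println(a, "→", p)
	}
}
```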
--- @@ -33,26 +33,39 @@ DESIGN nginx-logtail/ ├── proto/ │ └── logtail.proto # shared protobuf definitions +├── internal/ +│ └── store/ +│ └── store.go # shared types: Tuple4, Entry, Snapshot, ring helpers └── cmd/ ├── collector/ │ ├── main.go - │ ├── tailer.go # tail multiple log files via fsnotify, handle logrotate - │ ├── parser.go # tab-separated logtail log_format parser + │ ├── tailer.go # MultiTailer: tail N files via one shared fsnotify watcher + │ ├── parser.go # tab-separated logtail log_format parser (~50 ns/line) │ ├── store.go # bounded top-K in-memory store + tiered ring buffers - │ └── server.go # gRPC server with server-streaming StreamSnapshots + │ └── server.go # gRPC server: TopN, Trend, StreamSnapshots ├── aggregator/ │ ├── main.go - │ ├── subscriber.go # opens streaming RPC to each collector, merges into cache - │ ├── merger.go # merge/sum TopN entries across sources - │ ├── cache.go # merged snapshot + tiered ring buffer served to frontend + │ ├── subscriber.go # one goroutine per collector; StreamSnapshots with backoff + │ ├── merger.go # delta-merge: O(snapshot_size) per update + │ ├── cache.go # tick-based ring buffer cache served to clients │ └── server.go # gRPC server (same surface as collector) ├── frontend/ │ ├── main.go - │ ├── handler.go # HTTP handlers, filter state in URL query string - │ ├── client.go # gRPC client to aggregator (or collector) - │ └── templates/ # server-rendered HTML + inline SVG sparklines + │ ├── handler.go # URL param parsing, concurrent TopN+Trend, template exec + │ ├── client.go # gRPC dial helper + │ ├── sparkline.go # TrendPoints → inline SVG polyline + │ ├── format.go # fmtCount (space thousands separator) + │ └── templates/ + │ ├── base.html # outer HTML shell, inline CSS, meta-refresh + │ └── index.html # window tabs, group-by tabs, breadcrumb, table, footer └── cli/ - └── main.go # topn / trend / stream subcommands, JSON output + ├── main.go # subcommand dispatch and usage + ├── flags.go # shared 
flags, parseTargets, buildFilter, parseWindow + ├── client.go # gRPC dial helper + ├── format.go # printTable, fmtCount, fmtTime, targetHeader + ├── cmd_topn.go # topn: concurrent fan-out, table + JSON output + ├── cmd_trend.go # trend: concurrent fan-out, table + JSON output + └── cmd_stream.go # stream: multiplexed streams, auto-reconnect ``` ## Data Model @@ -78,13 +91,13 @@ Two ring buffers at different resolutions cover all query windows up to 24 hours Supported query windows and which tier they read from: | Window | Tier | Buckets summed | -|--------|--------|---------------| -| 1 min | fine | last 1 | -| 5 min | fine | last 5 | -| 15 min | fine | last 15 | -| 60 min | fine | all 60 | -| 6 h | coarse | last 72 | -| 24 h | coarse | all 288 | +|--------|--------|----------------| +| 1 min | fine | last 1 | +| 5 min | fine | last 5 | +| 15 min | fine | last 15 | +| 60 min | fine | all 60 | +| 6 h | coarse | last 72 | +| 24 h | coarse | all 288 | Every minute: snapshot live map → top-50K → append to fine ring, reset live map. Every 5 minutes: merge last 5 fine snapshots → top-5K → append to coarse ring. @@ -94,12 +107,12 @@ Every 5 minutes: merge last 5 fine snapshots → top-5K → append to coarse rin Entry size: ~30 B website + ~15 B prefix + ~50 B URI + 3 B status + 8 B count + ~80 B Go map overhead ≈ **~186 bytes per entry**. -| Structure | Entries | Size | -|-------------------------|------------|------------| -| Live map (capped) | 100 000 | ~19 MB | -| Fine ring (60 × 1-min) | 60 × 50 000 | ~558 MB | -| Coarse ring (288 × 5-min)| 288 × 5 000 | ~268 MB | -| **Total** | | **~845 MB** | +| Structure | Entries | Size | +|-------------------------|-------------|-------------| +| Live map (capped) | 100 000 | ~19 MB | +| Fine ring (60 × 1-min) | 60 × 50 000 | ~558 MB | +| Coarse ring (288 × 5-min)| 288 × 5 000| ~268 MB | +| **Total** | | **~845 MB** | The live map is **hard-capped at 100 K entries**. 
Once full, only updates to existing keys are accepted; new keys are dropped until the next rotation resets the map. This keeps memory bounded @@ -146,36 +159,38 @@ message TopNRequest { Filter filter = 1; GroupBy group_by = 2; int32 n = 3; Wi message TopNEntry { string label = 1; int64 count = 2; } message TopNResponse { repeated TopNEntry entries = 1; string source = 2; } -// Trend: one total count per minute bucket, for sparklines +// Trend: one total count per minute (or 5-min) bucket, for sparklines message TrendRequest { Filter filter = 1; Window window = 4; } message TrendPoint { int64 timestamp_unix = 1; int64 count = 2; } -message TrendResponse { repeated TrendPoint points = 1; } +message TrendResponse { repeated TrendPoint points = 1; string source = 2; } -// Streaming: collector pushes a snapshot after every minute rotation +// Streaming: collector pushes a fine snapshot after every minute rotation message SnapshotRequest {} message Snapshot { - string source = 1; - int64 timestamp = 2; - repeated TopNEntry entries = 3; // full top-50K for this bucket + string source = 1; + int64 timestamp = 2; + repeated TopNEntry entries = 3; // full top-50K for this bucket } service LogtailService { - rpc TopN(TopNRequest) returns (TopNResponse); - rpc Trend(TrendRequest) returns (TrendResponse); + rpc TopN(TopNRequest) returns (TopNResponse); + rpc Trend(TrendRequest) returns (TrendResponse); rpc StreamSnapshots(SnapshotRequest) returns (stream Snapshot); } // Both collector and aggregator implement LogtailService. -// Aggregator's StreamSnapshots fans out to all collectors and merges. +// The aggregator's StreamSnapshots re-streams the merged view. ``` ## Program 1 — Collector ### tailer.go -- One goroutine per log file. Opens file, seeks to EOF. -- Uses **fsnotify** (inotify on Linux) to detect writes. On `WRITE` event: read all new lines. 
-- On `RENAME`/`REMOVE` event (logrotate): drain to EOF of old fd, then **re-open** the original - path (with retry backoff) and resume from position 0. No lines are lost between drain and reopen. -- Emits `LogRecord` structs on a shared buffered channel (size 200 K — absorbs ~20 s of peak load). +- **`MultiTailer`**: one shared `fsnotify.Watcher` for all files regardless of count — avoids + the inotify instance limit when tailing hundreds of files. +- On `WRITE` event: read all new lines from that file's `bufio.Reader`. +- On `RENAME`/`REMOVE` (logrotate): drain old fd to EOF, close, start retry-open goroutine with + exponential backoff. Sends the new `*os.File` back via a channel to keep map access single-threaded. +- Emits `LogRecord` structs on a shared buffered channel (capacity 200 K — absorbs ~20 s of peak). +- Accepts paths via `--logs` (comma-separated or glob) and `--logs-file` (one path/glob per line). ### parser.go - Parses the fixed **logtail** nginx log format — tab-separated, fixed field order, no quoting: @@ -184,174 +199,130 @@ service LogtailService { log_format logtail '$host\t$remote_addr\t$msec\t$request_method\t$request_uri\t$status\t$body_bytes_sent\t$request_time'; ``` - Example line: - ``` - www.example.com 1.2.3.4 1741954800.123 GET /api/v1/search 200 1452 0.043 - ``` + | # | Field | Used for | + |---|-------------------|------------------| + | 0 | `$host` | website | + | 1 | `$remote_addr` | client_prefix | + | 2 | `$msec` | (discarded) | + | 3 | `$request_method` | (discarded) | + | 4 | `$request_uri` | http_request_uri | + | 5 | `$status` | http_response | + | 6 | `$body_bytes_sent`| (discarded) | + | 7 | `$request_time` | (discarded) | - Field positions (0-indexed): - - | # | Field | Used for | - |---|------------------|-----------------| - | 0 | `$host` | website | - | 1 | `$remote_addr` | client_prefix | - | 2 | `$msec` | (discarded) | - | 3 | `$request_method`| (discarded) | - | 4 | `$request_uri` | http_request_uri| - | 5 | 
`$status` | http_response | - | 6 | `$body_bytes_sent`| (discarded) | - | 7 | `$request_time` | (discarded) | - -- At runtime: `strings.SplitN(line, "\t", 8)` — single call, ~50 ns/line. No regex, no state machine. +- `strings.SplitN(line, "\t", 8)` — ~50 ns/line. No regex. - `$request_uri`: query string discarded at first `?`. -- `$remote_addr`: truncated to /24 (IPv4) or /48 (IPv6); prefix lengths configurable. -- Lines with fewer than 8 fields are silently skipped (malformed / truncated write). +- `$remote_addr`: truncated to /24 (IPv4) or /48 (IPv6); prefix lengths configurable via flags. +- Lines with fewer than 8 fields are silently skipped. ### store.go - **Single aggregator goroutine** reads from the channel and updates the live map — no locking on the hot path. At 10 K lines/s the goroutine uses <1% CPU. - Live map: `map[Tuple4]int64`, hard-capped at 100 K entries (new keys dropped when full). -- **Minute ticker**: goroutine heap-selects top-50K entries from live map, writes snapshot into - fine ring buffer slot, clears live map, advances fine ring head. -- Every 5 fine ticks: merge last 5 fine snapshots → heap-select top-5K → write to coarse ring. -- Fine ring: `[60]Snapshot` circular array. Coarse ring: `[288]Snapshot` circular array. - Each Snapshot is `[]TopNEntry` sorted desc by count (already sorted, merge is a heap pass). -- **TopN query path**: RLock relevant ring, sum the bucket range, group by dimension, apply filter, - heap-select top N. Worst case: 288×5K = 1.4M iterations — completes in <20 ms. -- **Trend query path**: for each bucket in range, sum counts of entries matching filter, emit one - `TrendPoint`. O(buckets × K) but result is tiny (max 288 points). +- **Minute ticker**: heap-selects top-50K entries, writes snapshot to fine ring, resets live map. +- Every 5 fine ticks: merge last 5 fine snapshots → top-5K → write to coarse ring. +- **TopN query**: RLock ring, sum bucket range, apply filter, group by dimension, heap-select top N. 
+- **Trend query**: per-bucket filtered sum, returns one `TrendPoint` per bucket.
+- **Subscriber fan-out**: per-subscriber buffered channel; `Subscribe`/`Unsubscribe` for streaming.

### server.go
-- gRPC server on configurable port (default :9090).
-- `TopN` and `Trend`: read-only calls into store, answered directly.
-- `StreamSnapshots`: on each minute rotation the store signals a broadcast channel; the streaming
-  handler wakes, reads the latest snapshot from the ring, and sends it to all connected aggregators.
-  Uses `sync.Cond` or a fan-out via per-subscriber buffered channels.
+- gRPC server on configurable port (default `:9090`).
+- `TopN` and `Trend`: unary, answered from the ring buffer under RLock.
+- `StreamSnapshots`: registers a subscriber channel; loops reading from it and forwarding each
+  snapshot via `Send`; 30 s keepalive ticker.

## Program 2 — Aggregator

### subscriber.go
-- On startup: dials each collector, calls `StreamSnapshots`, receives `Snapshot` messages.
-- Each incoming snapshot is handed to **merger.go**. Reconnects with exponential backoff on
-  stream error. Marks collector as degraded after 3 failed reconnects; clears on success.
+- One goroutine per collector. Dials, calls `StreamSnapshots`, forwards each `Snapshot` to the
+  merger.
+- Reconnects with exponential backoff (100 ms → doubles → cap 30 s).
+- After 3 consecutive failures: calls `merger.Zero(addr)` to remove that collector's contribution
+  from the merged view (prevents stale counts accumulating during outages).
+- Resets failure count on first successful `Recv`; logs recovery.

### merger.go
-- Maintains one `map[Tuple4]int64` per collector (latest snapshot only — no ring buffer here,
-  the aggregator's cache serves that role).
-- On each new snapshot from a collector: replace that collector's map, then rebuild the merged
-  view by summing across all collector maps. Store merged result into cache.go's ring buffer.
+- **Delta strategy**: on each new snapshot from collector X, subtract X's previous entries from + `merged`, add the new entries, store new map. O(snapshot_size) per update — not + O(N_collectors × snapshot_size). +- `Zero(addr)`: subtracts the collector's last-known contribution and deletes its entry — called + when a collector is marked degraded. ### cache.go -- Same ring-buffer structure as the collector store (60 slots), populated by merger. -- `TopN` and `Trend` queries are answered from this cache — no live fan-out needed at query time, - satisfying the 250 ms SLA with headroom. -- Also tracks per-collector entry counts for "busiest nginx" queries (answered by treating - `source` as an additional group-by dimension). +- **Tick-based rotation** (1-min ticker, not snapshot-triggered): keeps the aggregator ring aligned + to the same 1-minute cadence as collectors regardless of how many collectors are connected. +- Same tiered ring structure as the collector store; populated from `merger.TopK()` each tick. +- `QueryTopN`, `QueryTrend`, `Subscribe`/`Unsubscribe` — identical interface to collector store. ### server.go -- Implements the same `LogtailService` proto as the collector. -- `StreamSnapshots` on the aggregator re-streams merged snapshots to any downstream consumer - (e.g. a second-tier aggregator, or monitoring). +- Implements `LogtailService` backed by the cache (not live fan-out). +- `StreamSnapshots` re-streams merged fine snapshots; usable by a second-tier aggregator or + monitoring system. ## Program 3 — Frontend ### handler.go -- Filter state lives entirely in the **URL query string** (no server-side session needed; multiple - operators see independent views without shared state). Parameters: `w` (window), `by` (group_by), - `f_website`, `f_prefix`, `f_uri`, `f_status`. -- Main page: renders a ranked table. Clicking a row appends that dimension to the URL filter and - redirects. A breadcrumb shows active filters; each token is a link that removes it. 
-- **Auto-refresh**: `<meta http-equiv="refresh">` — simple, reliable, no JS required.
-- A `?raw=1` flag returns JSON for scripting/curl use.
+- All filter state in the **URL query string**: `w` (window), `by` (group_by), `f_website`,
+  `f_prefix`, `f_uri`, `f_status`, `n`, `target`. No server-side session — URLs are shareable
+  and bookmarkable; multiple operators see independent views.
+- `TopN` and `Trend` RPCs issued **concurrently** (both with a 5 s deadline); the page renders with
+  whatever completes. A Trend failure suppresses the sparkline without erroring the page.
+- **Drilldown**: clicking a table row adds the current dimension's filter and advances `by` through
+  `website → prefix → uri → status → website` (cycles).
+- **`raw=1`**: returns the TopN result as JSON — same URL, no CLI needed for scripting.
+- **`target=` override**: per-request gRPC endpoint override for comparing sources.
+- Error pages render at HTTP 502 with the window/group-by tabs still functional.
+
+### sparkline.go
+- `renderSparkline([]*pb.TrendPoint) template.HTML` — fixed `viewBox="0 0 300 60"` SVG,
+  Y-scaled to max count, rendered as a `<polyline>`. Returns `""` for fewer than 2 points or
+  all-zero data.

### templates/
-- Base layout with filter breadcrumb and window selector tabs (1m / 5m / 15m / 60m / 6h / 24h).
-- Table partial: columns are label, count, % of total, bar (inline `<div>`).
-- Sparkline partial: inline SVG polyline built from `TrendResponse.points` — 60 points, scaled to
-  the bucket's max, rendered server-side. No JS, no external assets.
+- `base.html`: outer shell, inline CSS (~40 lines), conditional `<meta http-equiv="refresh">`.
+- `index.html`: window tabs, group-by tabs, filter breadcrumb with `×` remove links, sparkline,
+  TopN table with `<div>` bars (% relative to rank-1), footer with source and refresh info.
+- No external CSS, no web fonts, no JavaScript. Renders in w3m/lynx.

## Program 4 — CLI

-A single binary (`cmd/cli/main.go`) for shell-based debugging and programmatic top-K queries.
-Talks to any collector or aggregator via gRPC. All output is JSON. - ### Subcommands ``` -cli topn --target HOST:PORT [filter flags] [--by DIM] [--window W] [--n N] [--pretty] -cli trend --target HOST:PORT [filter flags] [--window W] [--pretty] -cli stream --target HOST:PORT [--pretty] +logtail-cli topn [flags] ranked label → count table (exits after one response) +logtail-cli trend [flags] per-bucket time series (exits after one response) +logtail-cli stream [flags] live snapshot feed (runs until Ctrl-C, auto-reconnects) ``` ### Flags -| Flag | Default | Description | -|---------------|--------------|--------------------------------------------------------| -| `--target` | `localhost:9090` | gRPC address of collector or aggregator | -| `--by` | `website` | Group-by dimension: `website`, `prefix`, `uri`, `status` | -| `--window` | `5m` | Time window: `1m` `5m` `15m` `60m` `6h` `24h` | -| `--n` | `10` | Number of top entries to return | -| `--website` | — | Filter: restrict to this website | -| `--prefix` | — | Filter: restrict to this client prefix | -| `--uri` | — | Filter: restrict to this request URI | -| `--status` | — | Filter: restrict to this HTTP status code | -| `--pretty` | false | Indent JSON output | +**Shared** (all subcommands): -### Output format +| Flag | Default | Description | +|--------------|------------------|----------------------------------------------------------| +| `--target` | `localhost:9090` | Comma-separated `host:port` list; fan-out to all | +| `--json` | false | Emit newline-delimited JSON instead of a table | +| `--website` | — | Filter: website | +| `--prefix` | — | Filter: client prefix | +| `--uri` | — | Filter: request URI | +| `--status` | — | Filter: HTTP status code | -**`topn`** — single JSON object, exits after one response: -```json -{ - "target": "agg:9091", "window": "5m", "group_by": "prefix", - "filter": {"status": 429, "website": "www.example.com"}, - "queried_at": "2026-03-14T12:00:00Z", - "entries": [ - {"rank": 
1, "label": "1.2.3.0/24", "count": 8471}, - {"rank": 2, "label": "5.6.7.0/24", "count": 3201} - ] -} -``` +**`topn` only**: `--n 10`, `--window 5m`, `--group-by website` -**`trend`** — single JSON object, exits after one response: -```json -{ - "target": "agg:9091", "window": "24h", "filter": {"status": 503}, - "queried_at": "2026-03-14T12:00:00Z", - "points": [ - {"time": "2026-03-14T11:00:00Z", "count": 45}, - {"time": "2026-03-14T11:05:00Z", "count": 120} - ] -} -``` +**`trend` only**: `--window 5m` -**`stream`** — NDJSON (one JSON object per line, unbounded), suitable for `| jq -c 'select(...)'`: -```json -{"source": "nginx3:9090", "bucket_time": "2026-03-14T12:01:00Z", "entry_count": 42318, "top5": [{"label": "www.example.com", "count": 18000}, ...]} -``` +### Multi-target fan-out -### Example usage +`--target` accepts a comma-separated list. All targets are queried concurrently; results are +printed in order with a per-target header. Single-target output omits the header for clean +pipe-to-`jq` use. -```bash -# Who is hammering us with 429s right now? -cli topn --target agg:9091 --window 1m --by prefix --status 429 --n 20 | jq '.entries[]' +### Output -# Which website has the most 503s over the last 24h? -cli topn --target agg:9091 --window 24h --by website --status 503 +Default: human-readable table with space-separated thousands (`18 432`). +`--json`: one JSON object per target (NDJSON for `stream`). -# Trend of all traffic to one site over 6h (for a quick graph) -cli trend --target agg:9091 --window 6h --website api.example.com | jq '.points[] | [.time, .count]' - -# Watch live snapshots from one collector, filter for high-volume buckets -cli stream --target nginx3:9090 | jq -c 'select(.entry_count > 10000)' -``` - -### Implementation notes - -- Single `main.go` using the standard `flag` package with a manual subcommand dispatch — - no external CLI framework needed for three subcommands. 
-- Shares no code with the other binaries; duplicates the gRPC client setup locally (it's three - lines). Avoids creating a shared internal package for something this small. -- Non-zero exit code on any gRPC error so it composes cleanly in shell scripts. +`stream` reconnects automatically on error (5 s backoff). All other subcommands exit immediately +with a non-zero code on gRPC error. ## Key Design Decisions @@ -360,12 +331,17 @@ cli stream --target nginx3:9090 | jq -c 'select(.entry_count > 10000)' | Single aggregator goroutine in collector | Eliminates all map lock contention on the 10 K/s hot path | | Hard cap live map at 100 K entries | Bounds memory regardless of DDoS cardinality explosion | | Ring buffer of sorted snapshots (not raw maps) | TopN queries avoid re-sorting; merge is a single heap pass | -| Push-based streaming (collector → aggregator) | Aggregator cache is always fresh; query latency is cache-read only | -| Same `LogtailService` for collector and aggregator | Frontend works with either; useful for single-box and debugging | -| Filter state in URL, not session cookie | Supports multiple concurrent operators; shareable/bookmarkable URLs | +| Push-based streaming (collector → aggregator) | Aggregator cache always fresh; query latency is cache-read only | +| Delta merge in aggregator | O(snapshot_size) per update, not O(N_collectors × size) | +| Tick-based cache rotation in aggregator | Ring stays on the same 1-min cadence regardless of collector count | +| Degraded collector zeroing | Stale counts from failed collectors don't accumulate in the merged view | +| Same `LogtailService` for collector and aggregator | CLI and frontend work with either; no special-casing | +| `internal/store` shared package | ~200 lines of ring-buffer logic shared between collector and aggregator | +| Filter state in URL, not session cookie | Multiple concurrent operators; shareable/bookmarkable URLs | | Query strings stripped at ingest | Major cardinality reduction; 
prevents URI explosion under attack |
| No persistent storage | Simplicity; acceptable for ops dashboards (restart = lose history) |
| Trusted internal network, no TLS | Reduces operational complexity; add a TLS proxy if needed later |
| Server-side SVG sparklines, meta-refresh | Zero JS dependencies; works in terminal browsers and curl |
-| CLI outputs JSON, NDJSON for streaming | Composable with jq; non-zero exit on error for shell scripts |
-| CLI uses stdlib `flag`, no framework | Three subcommands don't justify a dependency; single file |
+| CLI default: human-readable table | Operator-friendly by default; `--json` opt-in for scripting |
+| CLI multi-target fan-out | Compare a collector vs. aggregator, or two collectors, in one command |
+| CLI uses stdlib `flag`, no framework | Three subcommands don't justify a dependency |