Update README.md

2026-03-14 20:50:10 +01:00
parent c092561af2
commit 2962590a74

README.md

@@ -1,27 +1,27 @@
SPECIFICATION

This project contains four programs:

1) A **collector** that tails any number of nginx log files and maintains an in-memory structure of
`{website, client_prefix, http_request_uri, http_response}` counts across all files. It answers
TopN and Trend queries via gRPC and pushes minute snapshots to the aggregator via server-streaming.
Runs on each nginx machine in the cluster. No UI — gRPC interface only.

2) An **aggregator** that subscribes to the snapshot stream from all collectors, merges their data
into a unified in-memory cache, and exposes the same gRPC interface. Answers questions like "what
is the busiest website globally", "which client prefix is causing the most HTTP 503s", and shows
trending information useful for DDoS detection. Runs on a central machine.

3) An **HTTP frontend** companion to the aggregator that renders a drilldown dashboard. Operators
can restrict by `http_response=429`, then by `website=www.example.com`, and so on. Works with
either a collector or aggregator as its backend. Zero JavaScript — server-rendered HTML with inline
SVG sparklines and meta-refresh.

4) A **CLI** for shell-based debugging. Sends `topn`, `trend`, and `stream` queries to any
collector or aggregator, fans out to multiple targets in parallel, and outputs human-readable
tables or newline-delimited JSON.

Programs are written in Go. No CGO, no external runtime dependencies.

---
@@ -33,26 +33,39 @@ DESIGN
nginx-logtail/
├── proto/
│   └── logtail.proto           # shared protobuf definitions
├── internal/
│   └── store/
│       └── store.go            # shared types: Tuple4, Entry, Snapshot, ring helpers
└── cmd/
    ├── collector/
    │   ├── main.go
    │   ├── tailer.go           # MultiTailer: tail N files via one shared fsnotify watcher
    │   ├── parser.go           # tab-separated logtail log_format parser (~50 ns/line)
    │   ├── store.go            # bounded top-K in-memory store + tiered ring buffers
    │   └── server.go           # gRPC server: TopN, Trend, StreamSnapshots
    ├── aggregator/
    │   ├── main.go
    │   ├── subscriber.go       # one goroutine per collector; StreamSnapshots with backoff
    │   ├── merger.go           # delta-merge: O(snapshot_size) per update
    │   ├── cache.go            # tick-based ring buffer cache served to clients
    │   └── server.go           # gRPC server (same surface as collector)
    ├── frontend/
    │   ├── main.go
    │   ├── handler.go          # URL param parsing, concurrent TopN+Trend, template exec
    │   ├── client.go           # gRPC dial helper
    │   ├── sparkline.go        # TrendPoints → inline SVG polyline
    │   ├── format.go           # fmtCount (space thousands separator)
    │   └── templates/
    │       ├── base.html       # outer HTML shell, inline CSS, meta-refresh
    │       └── index.html      # window tabs, group-by tabs, breadcrumb, table, footer
    └── cli/
        ├── main.go             # subcommand dispatch and usage
        ├── flags.go            # shared flags, parseTargets, buildFilter, parseWindow
        ├── client.go           # gRPC dial helper
        ├── format.go           # printTable, fmtCount, fmtTime, targetHeader
        ├── cmd_topn.go         # topn: concurrent fan-out, table + JSON output
        ├── cmd_trend.go        # trend: concurrent fan-out, table + JSON output
        └── cmd_stream.go       # stream: multiplexed streams, auto-reconnect
```
## Data Model
@@ -78,7 +91,7 @@ Two ring buffers at different resolutions cover all query windows up to 24 hours
Supported query windows and which tier they read from:

| Window | Tier   | Buckets summed |
|--------|--------|----------------|
| 1 min  | fine   | last 1         |
| 5 min  | fine   | last 5         |
| 15 min | fine   | last 15        |
@@ -95,7 +108,7 @@ Entry size: ~30 B website + ~15 B prefix + ~50 B URI + 3 B status + 8 B count +
overhead ≈ **~186 bytes per entry**.

| Structure                 | Entries     | Size    |
|---------------------------|-------------|---------|
| Live map (capped)         | 100 000     | ~19 MB  |
| Fine ring (60 × 1-min)    | 60 × 50 000 | ~558 MB |
| Coarse ring (288 × 5-min) | 288 × 5 000 | ~268 MB |
@@ -146,12 +159,12 @@ message TopNRequest { Filter filter = 1; GroupBy group_by = 2; int32 n = 3; Wi
message TopNEntry { string label = 1; int64 count = 2; }
message TopNResponse { repeated TopNEntry entries = 1; string source = 2; }

// Trend: one total count per minute (or 5-min) bucket, for sparklines
message TrendRequest { Filter filter = 1; Window window = 4; }
message TrendPoint { int64 timestamp_unix = 1; int64 count = 2; }
message TrendResponse { repeated TrendPoint points = 1; string source = 2; }

// Streaming: collector pushes a fine snapshot after every minute rotation
message SnapshotRequest {}
message Snapshot {
  string source = 1;
@@ -165,17 +178,19 @@ service LogtailService {
  rpc StreamSnapshots(SnapshotRequest) returns (stream Snapshot);
}

// Both collector and aggregator implement LogtailService.
// The aggregator's StreamSnapshots re-streams the merged view.
```
## Program 1 — Collector

### tailer.go

- **`MultiTailer`**: one shared `fsnotify.Watcher` for all files regardless of count — avoids
  the inotify instance limit when tailing hundreds of files.
- On `WRITE` event: read all new lines from that file's `bufio.Reader`.
- On `RENAME`/`REMOVE` (logrotate): drain old fd to EOF, close, start retry-open goroutine with
  exponential backoff. Sends the new `*os.File` back via a channel to keep map access single-threaded.
- Emits `LogRecord` structs on a shared buffered channel (capacity 200 K — absorbs ~20 s of peak).
- Accepts paths via `--logs` (comma-separated or glob) and `--logs-file` (one path/glob per line).
### parser.go

- Parses the fixed **logtail** nginx log format — tab-separated, fixed field order, no quoting:
@@ -184,15 +199,8 @@ service LogtailService {
```
log_format logtail '$host\t$remote_addr\t$msec\t$request_method\t$request_uri\t$status\t$body_bytes_sent\t$request_time';
```
| # | Field              | Used for        |
|---|--------------------|-----------------|
| 0 | `$host`            | website         |
| 1 | `$remote_addr`     | client_prefix   |
| 2 | `$msec`            | (discarded)     |
@@ -202,156 +210,119 @@ service LogtailService {
| 6 | `$body_bytes_sent` | (discarded)     |
| 7 | `$request_time`    | (discarded)     |
- `strings.SplitN(line, "\t", 8)` — ~50 ns/line. No regex.
- `$request_uri`: query string discarded at first `?`.
- `$remote_addr`: truncated to /24 (IPv4) or /48 (IPv6); prefix lengths configurable via flags.
- Lines with fewer than 8 fields are silently skipped.
### store.go

- **Single aggregator goroutine** reads from the channel and updates the live map — no locking on
  the hot path. At 10 K lines/s the goroutine uses <1% CPU.
- Live map: `map[Tuple4]int64`, hard-capped at 100 K entries (new keys dropped when full).
- **Minute ticker**: heap-selects top-50K entries, writes snapshot to fine ring, resets live map.
- Every 5 fine ticks: merge last 5 fine snapshots → top-5K → write to coarse ring.
- **TopN query**: RLock ring, sum bucket range, apply filter, group by dimension, heap-select top N.
- **Trend query**: per-bucket filtered sum, returns one `TrendPoint` per bucket.
- **Subscriber fan-out**: per-subscriber buffered channel; `Subscribe`/`Unsubscribe` for streaming.
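The heap-select step used by the minute ticker and the TopN path can be sketched like this; string labels stand in for full `Tuple4` keys, and all names are illustrative:

```go
package main

import (
	"container/heap"
	"fmt"
)

type entry struct {
	label string
	count int64
}

// minHeap keeps the K largest entries seen so far; its root is the
// smallest of them, so it is the one evicted by a larger newcomer.
type minHeap []entry

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].count < h[j].count }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(entry)) }
func (h *minHeap) Pop() any {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

// topK heap-selects the k highest counts in one pass over the map,
// returning them sorted descending (as ring snapshots are stored).
func topK(counts map[string]int64, k int) []entry {
	h := &minHeap{}
	for label, c := range counts {
		switch {
		case h.Len() < k:
			heap.Push(h, entry{label, c})
		case c > (*h)[0].count:
			(*h)[0] = entry{label, c} // evict the current minimum
			heap.Fix(h, 0)
		}
	}
	out := make([]entry, h.Len())
	for i := len(out) - 1; i >= 0; i-- {
		out[i] = heap.Pop(h).(entry)
	}
	return out
}

func main() {
	counts := map[string]int64{"www.a": 5, "www.b": 9, "www.c": 1, "www.d": 7}
	fmt.Println(topK(counts, 2)) // [{www.b 9} {www.d 7}]
}
```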
### server.go

- gRPC server on configurable port (default `:9090`).
- `TopN` and `Trend`: unary, answered from the ring buffer under RLock.
- `StreamSnapshots`: registers a subscriber channel, sends each snapshot received on it;
  30 s keepalive ticker.
## Program 2 — Aggregator

### subscriber.go

- One goroutine per collector. Dials, calls `StreamSnapshots`, forwards each `Snapshot` to the
  merger.
- Reconnects with exponential backoff (100 ms → doubles → cap 30 s).
- After 3 consecutive failures: calls `merger.Zero(addr)` to remove that collector's contribution
  from the merged view (prevents stale counts accumulating during outages).
- Resets the failure count on the first successful `Recv`; logs recovery.
### merger.go

- **Delta strategy**: on each new snapshot from collector X, subtract X's previous entries from
  `merged`, add the new entries, store the new map. O(snapshot_size) per update — not
  O(N_collectors × snapshot_size).
- `Zero(addr)`: subtracts the collector's last-known contribution and deletes its entry — called
  when a collector is marked degraded.
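A minimal sketch of the delta strategy, using string keys in place of `Tuple4` for brevity (the `Merger` shape here is an assumption):

```go
package main

import "fmt"

// Merger keeps the merged view plus each collector's last snapshot so
// updates cost O(snapshot_size), independent of collector count.
type Merger struct {
	merged map[string]int64            // key → summed count across collectors
	last   map[string]map[string]int64 // collector addr → its last snapshot
}

func NewMerger() *Merger {
	return &Merger{merged: map[string]int64{}, last: map[string]map[string]int64{}}
}

// Update applies a new snapshot from addr as a delta: subtract the
// previous contribution, add the new one. No full rebuild.
func (m *Merger) Update(addr string, snap map[string]int64) {
	for k, v := range m.last[addr] {
		if m.merged[k] -= v; m.merged[k] == 0 {
			delete(m.merged, k)
		}
	}
	for k, v := range snap {
		m.merged[k] += v
	}
	m.last[addr] = snap
}

// Zero drops a degraded collector's contribution entirely.
func (m *Merger) Zero(addr string) {
	m.Update(addr, nil)
	delete(m.last, addr)
}

func main() {
	m := NewMerger()
	m.Update("nginx1:9090", map[string]int64{"www.a": 5})
	m.Update("nginx2:9090", map[string]int64{"www.a": 3, "www.b": 1})
	m.Update("nginx1:9090", map[string]int64{"www.a": 2}) // delta: -5, +2
	m.Zero("nginx2:9090")
	fmt.Println(m.merged) // map[www.a:2]
}
```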
### cache.go

- **Tick-based rotation** (1-min ticker, not snapshot-triggered): keeps the aggregator ring aligned
  to the same 1-minute cadence as collectors regardless of how many collectors are connected.
- Same tiered ring structure as the collector store; populated from `merger.TopK()` each tick.
- `QueryTopN`, `QueryTrend`, `Subscribe`/`Unsubscribe` — identical interface to collector store.
### server.go

- Implements `LogtailService` backed by the cache (not live fan-out).
- `StreamSnapshots` re-streams merged fine snapshots; usable by a second-tier aggregator or
  monitoring system.
## Program 3 — Frontend

### handler.go

- All filter state in the **URL query string**: `w` (window), `by` (group_by), `f_website`,
  `f_prefix`, `f_uri`, `f_status`, `n`, `target`. No server-side session — URLs are shareable
  and bookmarkable; multiple operators see independent views.
- `TopN` and `Trend` RPCs issued **concurrently** (both with a 5 s deadline); page renders with
  whatever completes. Trend failure suppresses the sparkline without erroring the page.
- **Drilldown**: clicking a table row adds the current dimension's filter and advances `by` through
  `website → prefix → uri → status → website` (cycles).
- **`raw=1`**: returns the TopN result as JSON — same URL, no CLI needed for scripting.
- **`target=` override**: per-request gRPC endpoint override for comparing sources.
- Error pages render at HTTP 502 with the window/group-by tabs still functional.
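The `by` cycle can be sketched as follows (`nextGroupBy` is a hypothetical helper name):

```go
package main

import "fmt"

// nextGroupBy advances the drilldown dimension through the fixed
// cycle website → prefix → uri → status → website.
func nextGroupBy(by string) string {
	cycle := []string{"website", "prefix", "uri", "status"}
	for i, d := range cycle {
		if d == by {
			return cycle[(i+1)%len(cycle)]
		}
	}
	return "website" // unknown value: reset to the first dimension
}

func main() {
	fmt.Println(nextGroupBy("status")) // wraps around to "website"
}
```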
### sparkline.go
- `renderSparkline([]*pb.TrendPoint) template.HTML` — fixed `viewBox="0 0 300 60"` SVG,
Y-scaled to max count, rendered as `<polyline>`. Returns `""` for fewer than 2 points or
all-zero data.
### templates/

- `base.html`: outer shell, inline CSS (~40 lines), conditional `<meta http-equiv="refresh">`.
- `index.html`: window tabs, group-by tabs, filter breadcrumb with `×` remove links, sparkline,
  TopN table with `<meter>` bars (% relative to rank-1), footer with source and refresh info.
- No external CSS, no web fonts, no JavaScript. Renders in w3m/lynx.
## Program 4 — CLI

### Subcommands

```
logtail-cli topn [flags]     ranked label → count table (exits after one response)
logtail-cli trend [flags]    per-bucket time series (exits after one response)
logtail-cli stream [flags]   live snapshot feed (runs until Ctrl-C, auto-reconnects)
```

### Flags
**Shared** (all subcommands):
| Flag        | Default          | Description                                      |
|-------------|------------------|--------------------------------------------------|
| `--target`  | `localhost:9090` | Comma-separated `host:port` list; fan-out to all |
| `--json`    | false            | Emit newline-delimited JSON instead of a table   |
| `--website` | —                | Filter: website                                  |
| `--prefix`  | —                | Filter: client prefix                            |
| `--uri`     | —                | Filter: request URI                              |
| `--status`  | —                | Filter: HTTP status code                         |
**`topn` only**: `--n 10`, `--window 5m`, `--group-by website`

**`trend` only**: `--window 5m`

### Multi-target fan-out

`--target` accepts a comma-separated list. All targets are queried concurrently; results are
printed in order with a per-target header. Single-target output omits the header for clean
pipe-to-`jq` use.

### Output

Default: human-readable table with space-separated thousands (`18 432`).
`--json`: one JSON object per target (NDJSON for `stream`).

`stream` reconnects automatically on error (5 s backoff). All other subcommands exit immediately
with a non-zero code on gRPC error.
## Key Design Decisions
@@ -360,12 +331,17 @@ cli stream --target nginx3:9090 | jq -c 'select(.entry_count > 10000)'
| Single aggregator goroutine in collector | Eliminates all map lock contention on the 10 K/s hot path |
| Hard cap live map at 100 K entries | Bounds memory regardless of DDoS cardinality explosion |
| Ring buffer of sorted snapshots (not raw maps) | TopN queries avoid re-sorting; merge is a single heap pass |
| Push-based streaming (collector → aggregator) | Aggregator cache always fresh; query latency is cache-read only |
| Delta merge in aggregator | O(snapshot_size) per update, not O(N_collectors × size) |
| Tick-based cache rotation in aggregator | Ring stays on the same 1-min cadence regardless of collector count |
| Degraded collector zeroing | Stale counts from failed collectors don't accumulate in the merged view |
| Same `LogtailService` for collector and aggregator | CLI and frontend work with either; no special-casing |
| `internal/store` shared package | ~200 lines of ring-buffer logic shared between collector and aggregator |
| Filter state in URL, not session cookie | Multiple concurrent operators; shareable/bookmarkable URLs |
| Query strings stripped at ingest | Major cardinality reduction; prevents URI explosion under attack |
| No persistent storage | Simplicity; acceptable for ops dashboards (restart = lose history) |
| Trusted internal network, no TLS | Reduces operational complexity; add a TLS proxy if needed later |
| Server-side SVG sparklines, meta-refresh | Zero JS dependencies; works in terminal browsers and curl |
| CLI default: human-readable table | Operator-friendly by default; `--json` opt-in for scripting |
| CLI multi-target fan-out | Compare a collector vs. aggregator, or two collectors, in one command |
| CLI uses stdlib `flag`, no framework | Four subcommands don't justify a dependency |