CLI v0 — Implementation Plan
Module path: git.ipng.ch/ipng/nginx-logtail
Scope: A shell-facing debug tool that can query any number of collectors or aggregators
(they share the same LogtailService gRPC interface) and print results in a human-readable
table or JSON. Supports all three RPCs: TopN, Trend, and StreamSnapshots.
Overview
Single binary logtail-cli with three subcommands:
logtail-cli topn [flags] # ranked list of label → count
logtail-cli trend [flags] # per-bucket time series
logtail-cli stream [flags] # live snapshot feed
All subcommands accept one or more --target addresses. Requests are fanned out
concurrently; each target's results are printed under a labeled header. With a single
target the header is omitted for clean pipe-friendly output.
Step 1 — main.go and subcommand dispatch
No third-party CLI frameworks — plain os.Args subcommand dispatch, each subcommand
registers its own flag.FlagSet.
main():
if len(os.Args) < 2 → print usage, exit 1
switch os.Args[1]:
"topn" → runTopN(os.Args[2:])
"trend" → runTrend(os.Args[2:])
"stream" → runStream(os.Args[2:])
default → print usage, exit 1
Usage text lists all subcommands and their flags.
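The dispatch above can be sketched in plain Go. The `run*` stubs and their `int` exit-code return type are illustrative assumptions here; the real runners live in cmd_topn.go, cmd_trend.go, and cmd_stream.go:

```go
package main

import (
	"fmt"
	"os"
)

// usage prints the subcommand summary shown on bad input.
func usage() {
	fmt.Fprintln(os.Stderr, "usage: logtail-cli <topn|trend|stream> [flags]")
}

// dispatch routes os.Args-style input to a subcommand runner and
// returns the process exit code. Returning an int (rather than
// exiting inside the runner) keeps dispatch testable.
func dispatch(args []string) int {
	if len(args) < 2 {
		usage()
		return 1
	}
	switch args[1] {
	case "topn":
		return runTopN(args[2:])
	case "trend":
		return runTrend(args[2:])
	case "stream":
		return runStream(args[2:])
	default:
		usage()
		return 1
	}
}

// Stubs standing in for the real subcommand implementations.
func runTopN(args []string) int   { return 0 }
func runTrend(args []string) int  { return 0 }
func runStream(args []string) int { return 0 }

func main() { os.Exit(dispatch(os.Args)) }
```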
Step 2 — Shared flags and client helper (flags.go, client.go)
Shared flags (parsed by each subcommand's FlagSet):
| Flag | Default | Description |
|---|---|---|
| --target | localhost:9090 | Comma-separated host:port list (may be repeated) |
| --json | false | Emit newline-delimited JSON instead of a table |
| --website | — | Filter: exact website match |
| --prefix | — | Filter: exact client prefix match |
| --uri | — | Filter: exact URI match |
| --status | — | Filter: exact HTTP status match |
parseTargets(s string) []string — split on comma, trim spaces, deduplicate.
buildFilter(flags) *pb.Filter — returns nil if no filter flags set (signals "no filter"
to the server), otherwise populates the proto fields.
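A minimal sketch of these two helpers. The Filter struct is a local stand-in for the generated pb.Filter message, and its field names are assumptions about the proto schema:

```go
package main

import "strings"

// Filter is a local stand-in for the generated pb.Filter message;
// the field names are assumptions, not the real proto schema.
type Filter struct {
	Website, Prefix, URI, Status string
}

// parseTargets splits a comma-separated --target value, trims
// whitespace, and drops duplicates while preserving order.
func parseTargets(s string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, t := range strings.Split(s, ",") {
		t = strings.TrimSpace(t)
		if t == "" || seen[t] {
			continue
		}
		seen[t] = true
		out = append(out, t)
	}
	return out
}

// buildFilter returns nil when no filter flag was set, which the
// server interprets as "no filter"; otherwise it fills the fields.
func buildFilter(website, prefix, uri, status string) *Filter {
	if website == "" && prefix == "" && uri == "" && status == "" {
		return nil
	}
	return &Filter{Website: website, Prefix: prefix, URI: uri, Status: status}
}
```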
client.go:
func dial(addr string) (*grpc.ClientConn, pb.LogtailServiceClient, error)
Plain insecure dial (matching the servers' plain-TCP listener). Returns an error rather
than calling log.Fatal so callers can report which target failed without killing the process.
Step 3 — topn subcommand (cmd_topn.go)
Additional flags:
| Flag | Default | Description |
|---|---|---|
| --n | 10 | Number of entries to return |
| --window | 5m | Time window: 1m 5m 15m 60m 6h 24h |
| --group-by | website | Grouping: website prefix uri status |
parseWindow(s string) pb.Window — maps string → proto enum, exits on unknown value.
parseGroupBy(s string) pb.GroupBy — same pattern.
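The string-to-enum mapping can be sketched as below. The Window type and constant names are local stand-ins for the generated pb.Window enum, so the sketch compiles without the proto package; the real names are an assumption:

```go
package main

import (
	"fmt"
	"os"
)

// Window is a local stand-in for the generated pb.Window enum; the
// constant names are assumptions used only to show the mapping.
type Window int32

const (
	Window1m Window = iota
	Window5m
	Window15m
	Window60m
	Window6h
	Window24h
)

var windows = map[string]Window{
	"1m": Window1m, "5m": Window5m, "15m": Window15m,
	"60m": Window60m, "6h": Window6h, "24h": Window24h,
}

// parseWindow maps a --window flag value to the enum and exits the
// process on an unknown value, per the plan. parseGroupBy follows
// the same table-lookup pattern.
func parseWindow(s string) Window {
	w, ok := windows[s]
	if !ok {
		fmt.Fprintf(os.Stderr, "unknown --window %q (want 1m 5m 15m 60m 6h 24h)\n", s)
		os.Exit(2)
	}
	return w
}
```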
Fan-out: one goroutine per target, each calls TopN with a 10 s context deadline,
sends result (or error) on a typed result channel. Main goroutine collects all results
in target order.
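The fan-out can be sketched generically. This version collects into an index-ordered slice rather than a channel, which yields the same target-ordered output with less bookkeeping; the query callback stands in for the real TopN call with its 10 s deadline:

```go
package main

import "sync"

// result carries one target's response (or error) back to the main
// goroutine; T stands in for the RPC response type.
type result[T any] struct {
	target string
	resp   T
	err    error
}

// fanOut queries every target concurrently and returns results in
// the original target order. Writing into out[i] from goroutine i
// is race-free because each index is owned by exactly one goroutine.
func fanOut[T any](targets []string, query func(target string) (T, error)) []result[T] {
	out := make([]result[T], len(targets))
	var wg sync.WaitGroup
	for i, t := range targets {
		wg.Add(1)
		go func(i int, t string) {
			defer wg.Done()
			resp, err := query(t)
			out[i] = result[T]{target: t, resp: resp, err: err}
		}(i, t)
	}
	wg.Wait()
	return out
}
```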
Table output (default):
=== collector-1 (localhost:9090) ===
RANK COUNT LABEL
1 18 432 example.com
2 4 211 other.com
...
=== aggregator (localhost:9091) ===
RANK COUNT LABEL
1 22 643 example.com
...
Single-target: header omitted, plain table printed.
JSON output (--json): one JSON object per target, written sequentially to stdout:
{"source":"collector-1","target":"localhost:9090","entries":[{"label":"example.com","count":18432},...]}
Step 4 — trend subcommand (cmd_trend.go)
Additional flags:
| Flag | Default | Description |
|---|---|---|
| --window | 5m | Time window: 1m 5m 15m 60m 6h 24h |
Same fan-out pattern as topn.
Table output:
=== collector-1 (localhost:9090) ===
TIME (UTC) COUNT
2026-03-14 20:00 823
2026-03-14 20:01 941
...
Points are printed oldest-first (as returned by the server).
JSON output: one object per target:
{"source":"col-1","target":"localhost:9090","points":[{"ts":1773516000,"count":823},...]}
Step 5 — stream subcommand (cmd_stream.go)
No extra flags beyond shared ones. Each target gets one persistent StreamSnapshots
connection. All streams are multiplexed onto a single output goroutine via an internal
channel so lines from different targets don't interleave.
type streamEvent struct {
target string
source string
snap *pb.Snapshot
err error
}
One goroutine per target: connect → loop stream.Recv() → send event on channel.
On error: log to stderr, attempt reconnect after 5 s backoff (indefinitely, until
Ctrl-C).
signal.NotifyContext on SIGINT/SIGTERM cancels all stream goroutines.
Table output (one line per snapshot received):
2026-03-14 20:03:00 agg-test (localhost:9091) 950 entries top: example.com=18432
JSON output: one JSON object per snapshot event:
{"ts":1773516180,"source":"agg-test","target":"localhost:9091","top_label":"example.com","top_count":18432,"total_entries":950}
Step 6 — Formatting helpers (format.go)
func printTable(w io.Writer, headers []string, rows [][]string)
Right-aligns numeric columns (COUNT, RANK), left-aligns strings. Uses text/tabwriter
with padding=2. No external dependencies.
func fmtCount(n int64) string // "18 432" — space as thousands separator
func fmtTime(unix int64) string // "2026-03-14 20:03" UTC
Step 7 — Tests (cli_test.go)
Unit tests run entirely in-process with fake gRPC servers (same pattern as
cmd/aggregator/aggregator_test.go).
| Test | What it covers |
|---|---|
| TestParseWindow | All 6 window strings → correct proto enum; bad value exits |
| TestParseGroupBy | All 4 group-by strings → correct proto enum; bad value exits |
| TestParseTargets | Comma split, trim, dedup |
| TestBuildFilter | All combinations of filter flags → correct proto Filter |
| TestTopNSingleTarget | Fake server; runTopN output matches expected table |
| TestTopNMultiTarget | Two fake servers; both headers present in output |
| TestTopNJSON | --json flag; output is valid JSON with correct fields |
| TestTrendSingleTarget | Fake server; points printed oldest-first |
| TestTrendJSON | --json flag; output is valid JSON |
| TestStreamReceivesSnapshots | Fake server sends 3 snapshots; output has 3 lines |
| TestFmtCount | fmtCount(18432) → "18 432" |
| TestFmtTime | fmtTime(1773516000) → "2026-03-14 20:00" |
✓ COMPLETE — Implementation notes
Deviations from the plan
- TestFmtTime uses time.Date, not a hardcoded unix literal: the hardcoded value 1773516000 turned out to be 2026-03-14 19:20 UTC, not 20:00. Fixed by computing the timestamp dynamically with time.Date(2026, 3, 14, 20, 0, 0, 0, time.UTC).Unix().
- TestTopNJSON tests field values, not serialised bytes: calling printTopNJSON would require redirecting stdout. Instead the test verifies the response struct fields that the JSON formatter would use — simpler and equally effective.
- streamTarget reconnect loop lives in cmd_stream.go, not a separate file: the stream and reconnect logic are short enough to colocate.
Test results
$ go test ./... -count=1 -race -timeout 60s
ok git.ipng.ch/ipng/nginx-logtail/cmd/cli 1.0s (14 tests)
ok git.ipng.ch/ipng/nginx-logtail/cmd/aggregator 4.1s (13 tests)
ok git.ipng.ch/ipng/nginx-logtail/cmd/collector 9.9s (17 tests)
Test inventory
| Test | What it covers |
|---|---|
| TestParseTargets | Comma split, trim, deduplication |
| TestParseWindow | All 6 window strings → correct proto enum |
| TestParseGroupBy | All 4 group-by strings → correct proto enum |
| TestBuildFilter | Filter fields set correctly from flags |
| TestBuildFilterNil | Returns nil when no filter flags set |
| TestFmtCount | Space-separated thousands: 1234567 → "1 234 567" |
| TestFmtTime | Unix → "2026-03-14 20:00" UTC |
| TestTopNSingleTarget | Fake server; correct entry count and top label |
| TestTopNMultiTarget | Two fake servers; results ordered by target |
| TestTopNJSON | Response fields match expected values for JSON |
| TestTrendSingleTarget | Correct point count and ascending timestamp order |
| TestTrendJSON | JSON round-trip preserves source, ts, count |
| TestStreamReceivesSnapshots | 3 snapshots delivered from fake server via events channel |
| TestTargetHeader | Single-target → empty; multi-target → labeled header |
Step 8 — Smoke test
# Start a collector
./logtail-collector --listen :9090 --logs /var/log/nginx/access.log
# Start an aggregator
./logtail-aggregator --listen :9091 --collectors localhost:9090
# Query TopN from both in one shot
./logtail-cli topn --target localhost:9090,localhost:9091 --window 15m --n 5
# Stream live snapshots from both simultaneously
./logtail-cli stream --target localhost:9090,localhost:9091
# Filter to one website, group by URI
./logtail-cli topn --target localhost:9091 --website example.com --group-by uri --n 20
# JSON output for scripting
./logtail-cli topn --target localhost:9091 --json | jq '.entries[0]'
Deferred (not in v0)
- --format csv — easy to add later if needed for spreadsheet export
- --count / --watch N — repeat the query every N seconds (like watch(1))
- Color output (--color) — ANSI highlighting of top entries
- Connecting to TLS-secured endpoints (when TLS is added to the servers)
- Per-source breakdown (depends on a SOURCE GroupBy being added to the proto)