# CLI v0 — Implementation Plan

Module path: `git.ipng.ch/ipng/nginx-logtail`

**Scope:** A shell-facing debug tool that can query any number of collectors or aggregators
(they share the same `LogtailService` gRPC interface) and print results in a human-readable
table or JSON. Supports all three RPCs: `TopN`, `Trend`, and `StreamSnapshots`.

---

## Overview

Single binary `logtail-cli` with three subcommands:

```
logtail-cli topn   [flags]   # ranked list of label → count
logtail-cli trend  [flags]   # per-bucket time series
logtail-cli stream [flags]   # live snapshot feed
```

All subcommands accept one or more `--target` addresses. Requests are fanned out
concurrently; each target's results are printed under a labeled header. With a single
target the header is omitted for clean pipe-friendly output.

---

## Step 1 — main.go and subcommand dispatch

No third-party CLI frameworks — plain `os.Args` subcommand dispatch; each subcommand
registers its own `flag.FlagSet`.

```
main():
    if len(os.Args) < 2 → print usage, exit 1
    switch os.Args[1]:
        "topn"   → runTopN(os.Args[2:])
        "trend"  → runTrend(os.Args[2:])
        "stream" → runStream(os.Args[2:])
        default  → print usage, exit 1
```

Usage text lists all subcommands and their flags.

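A minimal sketch of this dispatch, assuming the handler functions exist elsewhere (the `dispatch` helper and the usage string are illustrative, not part of the plan's API):

```go
package main

import (
	"fmt"
	"os"
)

// dispatch maps the first argument to a subcommand name, or returns
// an error so main can print usage and exit in exactly one place.
func dispatch(args []string) (string, error) {
	if len(args) < 1 {
		return "", fmt.Errorf("missing subcommand")
	}
	switch args[0] {
	case "topn", "trend", "stream":
		return args[0], nil
	default:
		return "", fmt.Errorf("unknown subcommand %q", args[0])
	}
}

func main() {
	sub, err := dispatch(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "usage: logtail-cli <topn|trend|stream> [flags]")
		os.Exit(1)
	}
	// The real tool would call runTopN / runTrend / runStream here,
	// each parsing its own flag.FlagSet from os.Args[2:].
	fmt.Println("selected subcommand:", sub)
}
```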
---

## Step 2 — Shared flags and client helper (`flags.go`, `client.go`)

**Shared flags** (parsed by each subcommand's FlagSet):

| Flag | Default | Description |
|------|---------|-------------|
| `--target` | `localhost:9090` | Comma-separated `host:port` list (may be repeated) |
| `--json` | false | Emit newline-delimited JSON instead of a table |
| `--website` | — | Filter: exact website match |
| `--prefix` | — | Filter: exact client prefix match |
| `--uri` | — | Filter: exact URI match |
| `--status` | — | Filter: exact HTTP status match |

`parseTargets(s string) []string` — split on comma, trim spaces, deduplicate.

`buildFilter(flags) *pb.Filter` — returns nil if no filter flags are set (signals "no filter"
to the server); otherwise populates the proto fields.

**`client.go`**:

```go
func dial(addr string) (*grpc.ClientConn, pb.LogtailServiceClient, error)
```

Plain insecure dial (matching the servers' plain-TCP listener). Returns an error rather
than calling `log.Fatal`, so callers can report which target failed without killing the process.

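The `parseTargets` helper described above might look like this (a stdlib-only sketch):

```go
package main

import (
	"fmt"
	"strings"
)

// parseTargets splits a comma-separated host:port list, trims
// surrounding whitespace, and drops duplicates while preserving
// first-seen order.
func parseTargets(s string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, t := range strings.Split(s, ",") {
		t = strings.TrimSpace(t)
		if t == "" || seen[t] {
			continue
		}
		seen[t] = true
		out = append(out, t)
	}
	return out
}

func main() {
	fmt.Println(parseTargets("localhost:9090, localhost:9091,localhost:9090"))
	// [localhost:9090 localhost:9091]
}
```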
---

## Step 3 — `topn` subcommand (`cmd_topn.go`)

Additional flags:

| Flag | Default | Description |
|------|---------|-------------|
| `--n` | 10 | Number of entries to return |
| `--window` | `5m` | Time window: `1m 5m 15m 60m 6h 24h` |
| `--group-by` | `website` | Grouping: `website prefix uri status` |

`parseWindow(s string) pb.Window` — maps string → proto enum, exits on unknown value.
`parseGroupBy(s string) pb.GroupBy` — same pattern.

Fan-out: one goroutine per target; each calls `TopN` with a 10 s context deadline and
sends its result (or error) on a typed result channel. The main goroutine collects all
results in target order.

**Table output** (default):

```
=== collector-1 (localhost:9090) ===
RANK   COUNT   LABEL
   1  18 432   example.com
   2   4 211   other.com
...

=== aggregator (localhost:9091) ===
RANK   COUNT   LABEL
   1  22 643   example.com
...
```

Single-target: header omitted, plain table printed.

**JSON output** (`--json`): one JSON object per target, written sequentially to stdout:

```json
{"source":"collector-1","target":"localhost:9090","entries":[{"label":"example.com","count":18432},...]}
```

---

## Step 4 — `trend` subcommand (`cmd_trend.go`)

Additional flags:

| Flag | Default | Description |
|------|---------|-------------|
| `--window` | `5m` | Time window: `1m 5m 15m 60m 6h 24h` |

Same fan-out pattern as `topn`.

**Table output**:

```
=== collector-1 (localhost:9090) ===
TIME (UTC)          COUNT
2026-03-14 20:00      823
2026-03-14 20:01      941
...
```

Points are printed oldest-first (as returned by the server).

**JSON output**: one object per target:

```json
{"source":"col-1","target":"localhost:9090","points":[{"ts":1773516000,"count":823}]}
```

---

## Step 5 — `stream` subcommand (`cmd_stream.go`)

No extra flags beyond the shared ones. Each target gets one persistent `StreamSnapshots`
connection. All streams are multiplexed onto a single output goroutine via an internal
channel so lines from different targets don't interleave.

```
type streamEvent struct {
    target string
    source string
    snap   *pb.Snapshot
    err    error
}
```

One goroutine per target: connect → loop `stream.Recv()` → send event on channel.
On error: log to stderr, attempt reconnect after 5 s backoff (indefinitely, until
`Ctrl-C`).

`signal.NotifyContext` on SIGINT/SIGTERM cancels all stream goroutines.

**Table output** (one line per snapshot received):

```
2026-03-14 20:03:00  agg-test (localhost:9091)  950 entries  top: example.com=18432
```

**JSON output**: one JSON object per snapshot event:

```json
{"ts":1773516180,"source":"agg-test","target":"localhost:9091","top_label":"example.com","top_count":18432,"total_entries":950}
```

---

## Step 6 — Formatting helpers (`format.go`)

```go
func printTable(w io.Writer, headers []string, rows [][]string)
```

Right-aligns numeric columns (COUNT, RANK), left-aligns strings. Uses `text/tabwriter`
with padding=2. No external dependencies.

```go
func fmtCount(n int64) string   // "18 432" — space as thousands separator
func fmtTime(unix int64) string // "2026-03-14 20:03" UTC
```

---

## Step 7 — Tests (`cli_test.go`)

Unit tests run entirely in-process with fake gRPC servers (same pattern as
`cmd/aggregator/aggregator_test.go`).

| Test | What it covers |
|------|----------------|
| `TestParseWindow` | All 6 window strings → correct proto enum; bad value exits |
| `TestParseGroupBy` | All 4 group-by strings → correct proto enum; bad value exits |
| `TestParseTargets` | Comma split, trim, dedup |
| `TestBuildFilter` | All combinations of filter flags → correct proto Filter |
| `TestTopNSingleTarget` | Fake server; `runTopN` output matches expected table |
| `TestTopNMultiTarget` | Two fake servers; both headers present in output |
| `TestTopNJSON` | `--json` flag; output is valid JSON with correct fields |
| `TestTrendSingleTarget` | Fake server; points printed oldest-first |
| `TestTrendJSON` | `--json` flag; output is valid JSON |
| `TestStreamReceivesSnapshots` | Fake server sends 3 snapshots; output has 3 lines |
| `TestFmtCount` | `fmtCount(18432)` → `"18 432"` |
| `TestFmtTime` | `fmtTime(1773516000)` → `"2026-03-14 20:00"` |

---

## ✓ COMPLETE — Implementation notes

### Deviations from the plan

- **`TestFmtTime` uses `time.Date`, not a hardcoded unix literal**: the hardcoded value
  `1773516000` turned out to be 2026-03-14 19:20 UTC, not 20:00. Fixed by computing the
  timestamp dynamically with `time.Date(2026, 3, 14, 20, 0, 0, 0, time.UTC).Unix()`.
- **`TestTopNJSON` tests field values, not serialised bytes**: calling `printTopNJSON` would
  require redirecting stdout. Instead the test verifies the response struct fields that the
  JSON formatter would use — simpler and equally effective.
- **`streamTarget` reconnect loop lives in `cmd_stream.go`**, not a separate file. The stream
  and reconnect logic are short enough to colocate.

### Test results

```
$ go test ./... -count=1 -race -timeout 60s
ok  git.ipng.ch/ipng/nginx-logtail/cmd/cli         1.0s  (14 tests)
ok  git.ipng.ch/ipng/nginx-logtail/cmd/aggregator  4.1s  (13 tests)
ok  git.ipng.ch/ipng/nginx-logtail/cmd/collector   9.9s  (17 tests)
```

### Test inventory

| Test | What it covers |
|------|----------------|
| `TestParseTargets` | Comma split, trim, deduplication |
| `TestParseWindow` | All 6 window strings → correct proto enum |
| `TestParseGroupBy` | All 4 group-by strings → correct proto enum |
| `TestBuildFilter` | Filter fields set correctly from flags |
| `TestBuildFilterNil` | Returns nil when no filter flags set |
| `TestFmtCount` | Space-separated thousands: 1234567 → "1 234 567" |
| `TestFmtTime` | Unix → "2026-03-14 20:00" UTC |
| `TestTopNSingleTarget` | Fake server; correct entry count and top label |
| `TestTopNMultiTarget` | Two fake servers; results ordered by target |
| `TestTopNJSON` | Response fields match expected values for JSON |
| `TestTrendSingleTarget` | Correct point count and ascending timestamp order |
| `TestTrendJSON` | JSON round-trip preserves source, ts, count |
| `TestStreamReceivesSnapshots` | 3 snapshots delivered from fake server via events channel |
| `TestTargetHeader` | Single-target → empty; multi-target → labeled header |

---

## Step 8 — Smoke test

```bash
# Start a collector
./logtail-collector --listen :9090 --logs /var/log/nginx/access.log

# Start an aggregator
./logtail-aggregator --listen :9091 --collectors localhost:9090

# Query TopN from both in one shot
./logtail-cli topn --target localhost:9090,localhost:9091 --window 15m --n 5

# Stream live snapshots from both simultaneously
./logtail-cli stream --target localhost:9090,localhost:9091

# Filter to one website, group by URI
./logtail-cli topn --target localhost:9091 --website example.com --group-by uri --n 20

# JSON output for scripting
./logtail-cli topn --target localhost:9091 --json | jq '.entries[0]'
```

---

## Deferred (not in v0)

- `--format csv` — easy to add later if needed for spreadsheet export
- `--count` / `--watch N` — repeat the query every N seconds (like `watch(1)`)
- Color output (`--color`) — ANSI highlighting of top entries
- Connecting to TLS-secured endpoints (when TLS is added to the servers)
- Per-source breakdown (depends on `SOURCE` GroupBy being added to the proto)