Execute PLAN_FRONTEND.md

2026-03-14 20:42:51 +01:00
parent b9ec67ec00
commit 4369e66dee
9 changed files with 1571 additions and 0 deletions

PLAN_FRONTEND.md Normal file

@@ -0,0 +1,334 @@
# Frontend v0 — Implementation Plan
Module path: `git.ipng.ch/ipng/nginx-logtail`
**Scope:** An HTTP server that queries a collector or aggregator and renders a drilldown TopN
dashboard with trend sparklines. Zero JavaScript. Filter state in the URL. Auto-refreshes every
30 seconds. Works with any `LogtailService` endpoint (collector or aggregator).
---
## Overview
Single page, multiple views driven entirely by URL query parameters:
```
http://frontend:8080/?target=agg:9091&w=5m&by=website&f_status=429&n=25
```
Clicking a table row drills down: it adds a filter for the clicked label and advances
`by` to the next dimension in the hierarchy (`website → prefix → uri → status`). The
breadcrumb strip shows all active filters; each token is a link that removes it.
---
## Step 1 — main.go
Flags:
| Flag | Default | Description |
|------|---------|-------------|
| `--listen` | `:8080` | HTTP listen address |
| `--target` | `localhost:9091` | Default gRPC endpoint (aggregator or collector) |
| `--n` | `25` | Default number of table rows |
| `--refresh` | `30` | `<meta refresh>` interval in seconds; 0 to disable |
Wire-up:
1. Parse flags
2. Register `http.HandleFunc("/", handler)` (single handler, all state in URL)
3. `http.ListenAndServe`
4. `signal.NotifyContext` for clean shutdown on SIGINT/SIGTERM
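The flag table above could be parsed along these lines. This is a minimal sketch, not the project's actual `main.go`: the `config` struct, the `parseFlags` helper, and the range checks on `--n` and `--refresh` are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// config mirrors the four flags in the table (hypothetical struct name).
type config struct {
	Listen  string
	Target  string
	N       int
	Refresh int // seconds; 0 disables the meta refresh
}

// parseFlags parses args (without the program name) into a config.
// The validation of n and refresh is an assumption, not part of the plan.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("logtail-frontend", flag.ContinueOnError)
	fs.StringVar(&c.Listen, "listen", ":8080", "HTTP listen address")
	fs.StringVar(&c.Target, "target", "localhost:9091", "default gRPC endpoint (aggregator or collector)")
	fs.IntVar(&c.N, "n", 25, "default number of table rows")
	fs.IntVar(&c.Refresh, "refresh", 30, "meta refresh interval in seconds; 0 to disable")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	if c.N <= 0 {
		return config{}, fmt.Errorf("--n must be positive, got %d", c.N)
	}
	if c.Refresh < 0 {
		return config{}, fmt.Errorf("--refresh must be >= 0, got %d", c.Refresh)
	}
	return c, nil
}

func main() {
	cfg, err := parseFlags(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("listen=%s target=%s n=%d refresh=%d\n", cfg.Listen, cfg.Target, cfg.N, cfg.Refresh)
}
```

Using a dedicated `flag.FlagSet` rather than the global one keeps the parsing testable without touching `os.Args`.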
---
## Step 2 — client.go
```go
func dial(addr string) (*grpc.ClientConn, pb.LogtailServiceClient, error)
```
Identical to the CLI version — plain insecure dial. A new connection is opened per HTTP
request. At a 30-second page refresh rate this is negligible; pooling is not needed.
---
## Step 3 — handler.go
### URL parameters
| Param | Default | Values |
|-------|---------|--------|
| `target` | flag default | `host:port` |
| `w` | `5m` | `1m 5m 15m 60m 6h 24h` |
| `by` | `website` | `website prefix uri status` |
| `n` | flag default | positive integer |
| `f_website` | — | string |
| `f_prefix` | — | string |
| `f_uri` | — | string |
| `f_status` | — | integer string |
| `raw` | — | `1` → respond with JSON instead of HTML |
### Request flow
```
parseURLParams(r) → QueryParams
buildFilter(QueryParams) → *pb.Filter
dial(target) → client
concurrent:
client.TopN(filter, groupBy, n, window) → TopNResponse
client.Trend(filter, window) → TrendResponse
renderSparkline(TrendResponse.Points) → template.HTML
buildTableRows(TopNResponse, QueryParams) → []TableRow (includes drill-down URL per row)
buildBreadcrumbs(QueryParams) → []Crumb
execute template → w
```
TopN and Trend RPCs are issued concurrently (both have a 5 s context deadline). If Trend
fails, the sparkline is omitted silently rather than returning an error page.
### `raw=1` mode
Returns the TopN response as JSON (same format as the CLI's `--json`). Useful for scripting
and `curl` without needing the CLI binary.
### Drill-down URL construction
Dimension advance hierarchy (for row-click links):
```
website → prefix (CLIENT_PREFIX) → uri (REQUEST_URI) → status (HTTP_RESPONSE) → (no advance; all dims filtered)
```
Row-click URL: take current params, add the filter for the current `by` dimension, and set
`by` to the next dimension. If already on the last dimension (`status`), keep `by` unchanged.
### Types
```go
type QueryParams struct {
Target string
Window pb.Window
WindowS string // "5m" — for display
GroupBy pb.GroupBy
GroupByS string // "website" — for display
N int
Filter filterState
}
type filterState struct {
Website string
Prefix string
URI string
Status string // string so empty means "unset"
}
type TableRow struct {
Rank int
Label string
Count int64
Pct float64 // 0–100, relative to top entry
DrillURL string // href for this row
}
type Crumb struct {
Text string // e.g. "website=example.com"
RemoveURL string // current URL with this filter removed
}
type PageData struct {
Params QueryParams
Source string
Entries []TableRow
TotalCount int64
Sparkline template.HTML // "" if trend call failed
Breadcrumbs []Crumb
RefreshSecs int
Error string // non-empty → show error banner, no table
}
```
---
## Step 4 — sparkline.go
```go
func renderSparkline(points []*pb.TrendPoint) template.HTML
```
- Fixed `viewBox="0 0 300 60"` SVG.
- X axis: evenly-spaced buckets across 300 px.
- Y axis: linear scale from 0 to max count, inverted (SVG y=0 is top).
- Rendered as a `<polyline>` with `stroke` and `fill="none"`. Minimal inline style, no classes.
- If `len(points) < 2`, returns `""` (no sparkline).
- Returns `template.HTML` (trusted markup that `html/template` does not re-escape) so the template can emit it with `{{.Sparkline}}`.
---
## Step 5 — templates/
Two files, embedded with `//go:embed templates/*.html` and parsed once at startup.
### `templates/base.html` (define "base")
Outer HTML skeleton:
- `<meta http-equiv="refresh" content="30">` (omitted if `RefreshSecs == 0`)
- Minimal inline CSS: monospace font, max-width 1000px, table styling, breadcrumb strip
- Yields a `{{template "content" .}}` block
No external CSS, no web fonts, no icons. Legible in a terminal browser (w3m, lynx).
### `templates/index.html` (define "content")
Sections in order:
**Window tabs**: `1m | 5m | 15m | 60m | 6h | 24h`; current window is bold/underlined;
each is a link that swaps only `w=` in the URL.
**Group-by tabs**: `by website | by prefix | by uri | by status`; current group highlighted;
links swap `by=`.
**Filter breadcrumb** — shown only when at least one filter is active:
```
Filters: [website=example.com ×] [status=429 ×]
```
Each `×` is a link to the URL without that filter.
**Error banner** — shown instead of table when `.Error` is non-empty.
**Trend sparkline** — the SVG returned by `renderSparkline`, inline. Labelled with window
and source. Omitted when `.Sparkline == ""`.
**TopN table**:
```
RANK LABEL COUNT % TREND
1 example.com 18 432 100 % ████████████
2 other.com 4 211 23 % ███
```
- `LABEL` column is a link (`DrillURL`).
- `%` is relative to the top entry (rank-1 always 100 %).
- `TREND` bar is an inline `<meter value="N" max="100">` tag — renders as a native browser bar,
degrades gracefully in text browsers to `N/100`.
- Rows beyond rank 3 show the percentage bar only if it's > 5 %, to avoid noise.
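The percentage and bar-visibility rules in this list can be sketched as a small pure function. It assumes the entries arrive sorted by count in descending order (as a TopN response would be); the `ShowBar` field and the `entry` input type are hypothetical additions for the example.

```go
package main

import "fmt"

// entry is a simplified stand-in for one TopN result.
type entry struct {
	Label string
	Count int64
}

// tableRow follows the TableRow type from Step 3, plus a ShowBar flag.
type tableRow struct {
	Rank    int
	Label   string
	Count   int64
	Pct     float64 // 0-100, relative to the top entry
	ShowBar bool    // false for rows past rank 3 with Pct <= 5
}

// buildRows computes Pct relative to the first (largest) entry and applies
// the noise rule for the meter bar. Entries must be sorted descending.
func buildRows(entries []entry) []tableRow {
	rows := make([]tableRow, 0, len(entries))
	if len(entries) == 0 {
		return rows
	}
	top := entries[0].Count
	for i, e := range entries {
		pct := 0.0
		if top > 0 {
			pct = float64(e.Count) / float64(top) * 100
		}
		rank := i + 1
		rows = append(rows, tableRow{
			Rank:    rank,
			Label:   e.Label,
			Count:   e.Count,
			Pct:     pct,
			ShowBar: rank <= 3 || pct > 5,
		})
	}
	return rows
}

func main() {
	rows := buildRows([]entry{{"example.com", 18432}, {"other.com", 4211}})
	for _, r := range rows {
		bar := ""
		if r.ShowBar {
			bar = fmt.Sprintf(`<meter value="%.0f" max="100"></meter>`, r.Pct)
		}
		fmt.Printf("%d %s %d %.0f%% %s\n", r.Rank, r.Label, r.Count, r.Pct, bar)
	}
}
```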
**Footer** — "source: <source> queried <timestamp> refresh 30 s" — lets operators confirm
which endpoint they're looking at.
---
## Step 6 — Tests (`frontend_test.go`)
In-process fake gRPC server (same pattern as aggregator and CLI tests).
| Test | What it covers |
|------|----------------|
| `TestParseQueryParams` | All URL params parsed correctly; defaults applied |
| `TestParseQueryParamsInvalid` | Bad `n`, bad `w`, bad `f_status` → defaults or 400 |
| `TestBuildFilterFromParams` | Populated filter; nil when nothing set |
| `TestDrillURL` | website → prefix drill; prefix → uri drill; status → no advance |
| `TestBuildCrumbs` | One crumb per active filter; remove-URL drops just that filter |
| `TestRenderSparkline` | 5 points → valid SVG containing `<polyline`; 0 points → empty |
| `TestHandlerTopN` | Fake server; GET / returns 200 with table rows in body |
| `TestHandlerRaw` | `raw=1` returns JSON with correct entries |
| `TestHandlerBadTarget` | Unreachable target → 502 with error message |
| `TestHandlerFilter` | `f_website=x` passed through to fake server's received request |
| `TestHandlerWindow` | `w=60m` → correct `pb.Window_W60M` in fake server's received request |
| `TestPctBar` | `<meter` tag present in rendered HTML |
| `TestBreadcrumbInHTML` | Filter crumb rendered; `×` link present |
---
## Step 7 — Smoke test
```bash
# Start collector and aggregator (or use existing)
./logtail-collector --listen :9090 --logs /var/log/nginx/access.log
./logtail-aggregator --listen :9091 --collectors localhost:9090
# Start frontend
./logtail-frontend --listen :8080 --target localhost:9091
# Open in browser or curl
curl -s 'http://localhost:8080/' | grep '<tr'
curl -s 'http://localhost:8080/?w=60m&by=prefix&f_status=200&raw=1' | jq '.entries[0]'
# Drill-down link check
curl -s 'http://localhost:8080/' | grep 'f_website'
```
---
## ✓ COMPLETE — Implementation notes
### Files
| File | Role |
|------|------|
| `cmd/frontend/main.go` | Flags, template loading, HTTP server, graceful shutdown |
| `cmd/frontend/client.go` | `dial()` — plain insecure gRPC, new connection per request |
| `cmd/frontend/handler.go` | URL parsing, filter building, concurrent TopN+Trend fan-out, page data assembly |
| `cmd/frontend/sparkline.go` | `renderSparkline()`: `[]*pb.TrendPoint` → inline `<svg><polyline>` |
| `cmd/frontend/format.go` | `fmtCount()` — space-separated thousands, registered as template func |
| `cmd/frontend/templates/base.html` | Outer HTML shell, inline CSS, meta-refresh |
| `cmd/frontend/templates/index.html` | Window tabs, group-by tabs, breadcrumb, sparkline, table, footer |
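For reference, a space-separated thousands formatter like the `fmtCount()` listed in the table can be written as below. This is one possible implementation and not necessarily the one in `format.go`; the handling of negative numbers in particular is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fmtCount formats n with a space between groups of three digits,
// e.g. 18432 becomes "18 432".
func fmtCount(n int64) string {
	s := strconv.FormatInt(n, 10)
	sign := ""
	if n < 0 {
		sign, s = "-", s[1:]
	}
	var b strings.Builder
	for i, r := range s {
		// Insert a separator whenever a multiple of three digits remains.
		if i > 0 && (len(s)-i)%3 == 0 {
			b.WriteByte(' ')
		}
		b.WriteRune(r)
	}
	return sign + b.String()
}

func main() {
	for _, n := range []int64{0, 999, 4211, 18432, 1000000} {
		fmt.Println(fmtCount(n))
	}
}
```

Registered through `template.FuncMap`, a function of this shape can be called from the table template as `{{fmtCount .Count}}`.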
### Deviations from the plan
- **`format.go` extracted**: `fmtCount` placed in its own file (not in `handler.go`) so it can
be tested independently without loading the template.
- **`TestDialFake` added**: sanity check for the fake gRPC infrastructure used by the other tests.
- **`TestHandlerNoData` added**: verifies the "no data" message renders correctly when the server
returns an empty entry list. Total tests: 23 (plan listed 13).
- **`% relative to rank-1`** as planned; the `<meter max="100">` shows 100% for rank-1
and proportional bars below. Rank-1 is always the visual baseline.
- **`status → website` drill cycle**: clicking a row in the `by status` view adds `f_status`
and resets `by=website` (cycles back to the start of the drilldown hierarchy).
### Test results
```
$ go test ./... -count=1 -race -timeout 60s
ok git.ipng.ch/ipng/nginx-logtail/cmd/frontend 1.1s (23 tests)
ok git.ipng.ch/ipng/nginx-logtail/cmd/cli 1.0s (14 tests)
ok git.ipng.ch/ipng/nginx-logtail/cmd/aggregator 4.1s (13 tests)
ok git.ipng.ch/ipng/nginx-logtail/cmd/collector 9.7s (17 tests)
```
### Test inventory
| Test | What it covers |
|------|----------------|
| `TestParseWindowString` | All 6 window strings + bad input → default |
| `TestParseGroupByString` | All 4 group-by strings + bad input → default |
| `TestParseQueryParams` | All URL params parsed correctly |
| `TestParseQueryParamsDefaults` | Empty URL → handler defaults applied |
| `TestBuildFilter` | Filter proto fields set from filterState |
| `TestBuildFilterNil` | Returns nil when no filter set |
| `TestDrillURL` | website→prefix, prefix→uri, status→website cycle |
| `TestBuildCrumbs` | Correct text and remove-URLs for active filters |
| `TestRenderSparkline` | 5 points → SVG with polyline |
| `TestRenderSparklineTooFewPoints` | nil/1 point → empty string |
| `TestRenderSparklineAllZero` | All-zero counts → empty string |
| `TestFmtCount` | Space-thousands formatting |
| `TestHandlerTopN` | Fake server; labels and formatted counts in HTML |
| `TestHandlerRaw` | `raw=1` → JSON with source/window/group_by/entries |
| `TestHandlerBadTarget` | Unreachable target → 502 + error message in body |
| `TestHandlerFilterPassedToServer` | `f_website` + `f_status` reach gRPC filter |
| `TestHandlerWindowPassedToServer` | `w=60m` → `pb.Window_W60M` in request |
| `TestHandlerBreadcrumbInHTML` | Active filter renders crumb with × link |
| `TestHandlerSparklineInHTML` | Trend points → `<svg><polyline>` in page |
| `TestHandlerPctBar` | 100% for rank-1, 50% for half-count entry |
| `TestHandlerWindowTabsInHTML` | All 6 window labels rendered as links |
| `TestHandlerNoData` | Empty entry list → "no data" message |
| `TestDialFake` | Test infrastructure sanity check |
---
## Deferred (not in v0)
- Dark mode (prefers-color-scheme media query)
- Per-row mini sparklines (one Trend RPC per table row — expensive; need batching first)
- WebSocket or SSE for live push instead of meta-refresh
- Pagination for large N
- `?format=csv` download
- OIDC/basic-auth gating
- ClickHouse-backed 7d/30d windows (tracked in README)