nginx-logtail/PLAN_FRONTEND.md
2026-03-14 20:42:51 +01:00

# Frontend v0 — Implementation Plan
Module path: `git.ipng.ch/ipng/nginx-logtail`
**Scope:** An HTTP server that queries a collector or aggregator and renders a drilldown TopN
dashboard with trend sparklines. Zero JavaScript. Filter state in the URL. Auto-refreshes every
30 seconds. Works with any `LogtailService` endpoint (collector or aggregator).
---
## Overview
Single page, multiple views driven entirely by URL query parameters:
```
http://frontend:8080/?target=agg:9091&w=5m&by=website&f_status=429&n=25
```
Clicking a table row drills down: it adds a filter for the clicked label and advances
`by` to the next dimension in the hierarchy (`website → prefix → uri → status`). The
breadcrumb strip shows all active filters; each token is a link that removes it.
---
## Step 1 — main.go
Flags:
| Flag | Default | Description |
|------|---------|-------------|
| `--listen` | `:8080` | HTTP listen address |
| `--target` | `localhost:9091` | Default gRPC endpoint (aggregator or collector) |
| `--n` | `25` | Default number of table rows |
| `--refresh` | `30` | `<meta refresh>` interval in seconds; 0 to disable |
Wire-up:
1. Parse flags
2. Register `http.HandleFunc("/", handler)` (single handler, all state in URL)
3. `http.ListenAndServe`
4. `signal.NotifyContext` for clean shutdown on SIGINT/SIGTERM
---
## Step 2 — client.go
```go
func dial(addr string) (*grpc.ClientConn, pb.LogtailServiceClient, error)
```
Identical to the CLI version — plain insecure dial. A new connection is opened per HTTP
request. At a 30-second page refresh rate this is negligible; pooling is not needed.
---
## Step 3 — handler.go
### URL parameters
| Param | Default | Values |
|-------|---------|--------|
| `target` | flag default | `host:port` |
| `w` | `5m` | `1m 5m 15m 60m 6h 24h` |
| `by` | `website` | `website prefix uri status` |
| `n` | flag default | positive integer |
| `f_website` | — | string |
| `f_prefix` | — | string |
| `f_uri` | — | string |
| `f_status` | — | integer string |
| `raw` | — | `1` → respond with JSON instead of HTML |
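The parameter table above could be parsed roughly as follows. This is a minimal sketch, not the project's `parseURLParams`: the `params` type and `parseParams` function are hypothetical, window and group-by are kept as strings rather than `pb` enum values, and it assumes invalid values fall back to defaults instead of producing a 400.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// params is a hypothetical, simplified stand-in for QueryParams.
type params struct {
	Target, Window, GroupBy      string
	N                            int
	Website, Prefix, URI, Status string
	Raw                          bool
}

var (
	validWindows = map[string]bool{"1m": true, "5m": true, "15m": true, "60m": true, "6h": true, "24h": true}
	validGroupBy = map[string]bool{"website": true, "prefix": true, "uri": true, "status": true}
)

// parseParams applies the defaults from the table. Unknown or malformed
// values are ignored so the default stays in place (an assumption; the
// plan also allows answering with a 400).
func parseParams(rawQuery, defTarget string, defN int) params {
	q, _ := url.ParseQuery(rawQuery) // pairs that fail to parse are skipped
	p := params{Target: defTarget, Window: "5m", GroupBy: "website", N: defN}
	if v := q.Get("target"); v != "" {
		p.Target = v
	}
	if v := q.Get("w"); validWindows[v] {
		p.Window = v
	}
	if v := q.Get("by"); validGroupBy[v] {
		p.GroupBy = v
	}
	if n, err := strconv.Atoi(q.Get("n")); err == nil && n > 0 {
		p.N = n
	}
	p.Website = q.Get("f_website")
	p.Prefix = q.Get("f_prefix")
	p.URI = q.Get("f_uri")
	// f_status must be an integer string; anything else is treated as unset.
	if v := q.Get("f_status"); v != "" {
		if _, err := strconv.Atoi(v); err == nil {
			p.Status = v
		}
	}
	p.Raw = q.Get("raw") == "1"
	return p
}

func main() {
	p := parseParams("target=agg:9091&w=5m&by=website&f_status=429&n=25", "localhost:9091", 25)
	fmt.Printf("%+v\n", p)
}
```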
### Request flow
```
parseURLParams(r) → QueryParams
buildFilter(QueryParams) → *pb.Filter
dial(target) → client
concurrent:
    client.TopN(filter, groupBy, n, window) → TopNResponse
    client.Trend(filter, window) → TrendResponse
renderSparkline(TrendResponse.Points) → template.HTML
buildTableRows(TopNResponse, QueryParams) → []TableRow (includes drill-down URL per row)
buildBreadcrumbs(QueryParams) → []Crumb
execute template → w
```
TopN and Trend RPCs are issued concurrently (both have a 5 s context deadline). If Trend
fails, the sparkline is omitted silently rather than returning an error page.
### `raw=1` mode
Returns the TopN response as JSON (same format as the CLI's `--json`). Useful for scripting
and `curl` without needing the CLI binary.
### Drill-down URL construction
Dimension advance hierarchy (for row-click links):
```
website → prefix → uri → status → (no advance; all dims filtered)
```
Row-click URL: take current params, add the filter for the current `by` dimension, and set
`by` to the next dimension. If already on the last dimension (`status`), keep `by` unchanged.
### Types
```go
type QueryParams struct {
	Target   string
	Window   pb.Window
	WindowS  string // "5m" — for display
	GroupBy  pb.GroupBy
	GroupByS string // "website" — for display
	N        int
	Filter   filterState
}

type filterState struct {
	Website string
	Prefix  string
	URI     string
	Status  string // string so empty means "unset"
}

type TableRow struct {
	Rank     int
	Label    string
	Count    int64
	Pct      float64 // 0–100, relative to top entry
	DrillURL string  // href for this row
}

type Crumb struct {
	Text      string // e.g. "website=example.com"
	RemoveURL string // current URL with this filter removed
}

type PageData struct {
	Params      QueryParams
	Source      string
	Entries     []TableRow
	TotalCount  int64
	Sparkline   template.HTML // "" if trend call failed
	Breadcrumbs []Crumb
	RefreshSecs int
	Error       string // non-empty → show error banner, no table
}
```
---
## Step 4 — sparkline.go
```go
func renderSparkline(points []*pb.TrendPoint) template.HTML
```
- Fixed `viewBox="0 0 300 60"` SVG.
- X axis: evenly-spaced buckets across 300 px.
- Y axis: linear scale from 0 to max count, inverted (SVG y=0 is top).
- Rendered as a `<polyline>` with `stroke` and `fill="none"`. Minimal inline style, no classes.
- If `len(points) < 2`, returns `""` (no sparkline).
- Returns `template.HTML`, which marks the SVG as trusted markup so `html/template` emits `{{.Sparkline}}` without escaping it again.
---
## Step 5 — templates/
Two files, embedded with `//go:embed templates/*.html` and parsed once at startup.
### `templates/base.html` (define "base")
Outer HTML skeleton:
- `<meta http-equiv="refresh" content="30">` (omitted if `RefreshSecs == 0`)
- Minimal inline CSS: monospace font, max-width 1000px, table styling, breadcrumb strip
- Yields a `{{template "content" .}}` block
No external CSS, no web fonts, no icons. Legible in a terminal browser (w3m, lynx).
### `templates/index.html` (define "content")
Sections in order:
**Window tabs**: `1m | 5m | 15m | 60m | 6h | 24h`; current window is bold/underlined;
each is a link that swaps only `w=` in the URL.
**Group-by tabs**: `by website | by prefix | by uri | by status`; current group highlighted;
links swap `by=`.
**Filter breadcrumb** — shown only when at least one filter is active:
```
Filters: [website=example.com ×] [status=429 ×]
```
Each `×` is a link to the URL without that filter.
**Error banner** — shown instead of table when `.Error` is non-empty.
**Trend sparkline** — the SVG returned by `renderSparkline`, inline. Labelled with window
and source. Omitted when `.Sparkline == ""`.
**TopN table**:
```
RANK  LABEL        COUNT   %      TREND
1     example.com  18 432  100 %  ████████████
2     other.com    4 211   23 %   ███
```
- `LABEL` column is a link (`DrillURL`).
- `%` is relative to the top entry (rank-1 always 100 %).
- `TREND` bar is an inline `<meter value="N" max="100">` tag — renders as a native browser bar,
degrades gracefully in text browsers to `N/100`.
- Rows beyond rank 3 show the percentage bar only if it's > 5 %, to avoid noise.
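The percentage and bar rules above can be expressed as three small helpers. This is a sketch under assumptions: the names `pct`, `showBar` and `meter` are invented here, and rounding to a whole percent is a choice made for the example.

```go
package main

import (
	"fmt"
	"math"
)

// pct returns count as a whole-number percentage of the top (rank-1)
// count, so rank 1 is always 100.
func pct(count, top int64) float64 {
	if top <= 0 {
		return 0
	}
	return math.Round(float64(count) / float64(top) * 100)
}

// showBar applies the noise rule: ranks 1 to 3 always get a bar, lower
// ranks only when they exceed 5 %.
func showBar(rank int, p float64) bool {
	return rank <= 3 || p > 5
}

// meter renders the bar. The element's inner text is what browsers
// without <meter> support display, giving the "N/100" fallback.
func meter(p float64) string {
	return fmt.Sprintf(`<meter value="%.0f" max="100">%.0f/100</meter>`, p, p)
}

func main() {
	counts := []int64{18432, 4211, 600}
	for i, c := range counts {
		p := pct(c, counts[0])
		fmt.Println(i+1, p, showBar(i+1, p), meter(p))
	}
}
```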
**Footer** — "source: <source> queried <timestamp> refresh 30 s" — lets operators confirm
which endpoint they're looking at.
---
## Step 6 — Tests (`frontend_test.go`)
In-process fake gRPC server (same pattern as aggregator and CLI tests).
| Test | What it covers |
|------|----------------|
| `TestParseQueryParams` | All URL params parsed correctly; defaults applied |
| `TestParseQueryParamsInvalid` | Bad `n`, bad `w`, bad `f_status` → defaults or 400 |
| `TestBuildFilterFromParams` | Populated filter; nil when nothing set |
| `TestDrillURL` | website → prefix drill; prefix → uri drill; status → no advance |
| `TestBuildCrumbs` | One crumb per active filter; remove-URL drops just that filter |
| `TestRenderSparkline` | 5 points → valid SVG containing `<polyline`; 0 points → empty |
| `TestHandlerTopN` | Fake server; GET / returns 200 with table rows in body |
| `TestHandlerRaw` | `raw=1` returns JSON with correct entries |
| `TestHandlerBadTarget` | Unreachable target → 502 with error message |
| `TestHandlerFilter` | `f_website=x` passed through to fake server's received request |
| `TestHandlerWindow` | `w=60m` → correct `pb.Window_W60M` in fake server's received request |
| `TestPctBar` | `<meter` tag present in rendered HTML |
| `TestBreadcrumbInHTML` | Filter crumb rendered; `×` link present |
---
## Step 7 — Smoke test
```bash
# Start collector and aggregator (or use existing)
./logtail-collector --listen :9090 --logs /var/log/nginx/access.log
./logtail-aggregator --listen :9091 --collectors localhost:9090
# Start frontend
./logtail-frontend --listen :8080 --target localhost:9091
# Open in browser or curl
curl -s 'http://localhost:8080/' | grep '<tr'
curl -s 'http://localhost:8080/?w=60m&by=prefix&f_status=200&raw=1' | jq '.entries[0]'
# Drill-down link check
curl -s 'http://localhost:8080/' | grep 'f_website'
```
---
## ✓ COMPLETE — Implementation notes
### Files
| File | Role |
|------|------|
| `cmd/frontend/main.go` | Flags, template loading, HTTP server, graceful shutdown |
| `cmd/frontend/client.go` | `dial()` — plain insecure gRPC, new connection per request |
| `cmd/frontend/handler.go` | URL parsing, filter building, concurrent TopN+Trend fan-out, page data assembly |
| `cmd/frontend/sparkline.go` | `renderSparkline()`: `[]*pb.TrendPoint` → inline `<svg><polyline>` |
| `cmd/frontend/format.go` | `fmtCount()` — space-separated thousands, registered as template func |
| `cmd/frontend/templates/base.html` | Outer HTML shell, inline CSS, meta-refresh |
| `cmd/frontend/templates/index.html` | Window tabs, group-by tabs, breadcrumb, sparkline, table, footer |
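For reference, a space-separated thousands formatter like the `fmtCount()` listed for `format.go` could be written as below. This is a sketch, not the file's actual contents; the template function name used in `main` is assumed to match.

```go
package main

import (
	"html/template"
	"os"
	"strconv"
	"strings"
)

// fmtCount formats n with a space between each group of three digits,
// e.g. 18432 becomes "18 432".
func fmtCount(n int64) string {
	s := strconv.FormatInt(n, 10)
	neg := strings.HasPrefix(s, "-")
	if neg {
		s = s[1:]
	}
	var sb strings.Builder
	for i, r := range s {
		// Insert a separator whenever the remaining digits form whole groups.
		if i > 0 && (len(s)-i)%3 == 0 {
			sb.WriteByte(' ')
		}
		sb.WriteRune(r)
	}
	if neg {
		return "-" + sb.String()
	}
	return sb.String()
}

func main() {
	// Registering the helper so templates can call {{fmtCount .Count}}.
	t := template.Must(template.New("cell").
		Funcs(template.FuncMap{"fmtCount": fmtCount}).
		Parse("{{fmtCount .}}\n"))
	if err := t.Execute(os.Stdout, int64(18432)); err != nil {
		panic(err)
	}
}
```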
### Deviations from the plan
- **`format.go` extracted**: `fmtCount` placed in its own file (not in `handler.go`) so it can
be tested independently without loading the template.
- **`TestDialFake` added**: sanity check for the fake gRPC infrastructure used by the other tests.
- **`TestHandlerNoData` added**: verifies the "no data" message renders correctly when the server
returns an empty entry list. Total tests: 23 (plan listed 13).
- **`% relative to rank-1`** as planned; the `<meter max="100">` shows 100% for rank-1
and proportional bars below. Rank-1 is always the visual baseline.
- **`status → website` drill cycle**: clicking a row in the `by status` view adds `f_status`
and resets `by=website` (cycles back to the start of the drilldown hierarchy).
### Test results
```
$ go test ./... -count=1 -race -timeout 60s
ok git.ipng.ch/ipng/nginx-logtail/cmd/frontend 1.1s (23 tests)
ok git.ipng.ch/ipng/nginx-logtail/cmd/cli 1.0s (14 tests)
ok git.ipng.ch/ipng/nginx-logtail/cmd/aggregator 4.1s (13 tests)
ok git.ipng.ch/ipng/nginx-logtail/cmd/collector 9.7s (17 tests)
```
### Test inventory
| Test | What it covers |
|------|----------------|
| `TestParseWindowString` | All 6 window strings + bad input → default |
| `TestParseGroupByString` | All 4 group-by strings + bad input → default |
| `TestParseQueryParams` | All URL params parsed correctly |
| `TestParseQueryParamsDefaults` | Empty URL → handler defaults applied |
| `TestBuildFilter` | Filter proto fields set from filterState |
| `TestBuildFilterNil` | Returns nil when no filter set |
| `TestDrillURL` | website→prefix, prefix→uri, status→website cycle |
| `TestBuildCrumbs` | Correct text and remove-URLs for active filters |
| `TestRenderSparkline` | 5 points → SVG with polyline |
| `TestRenderSparklineTooFewPoints` | nil/1 point → empty string |
| `TestRenderSparklineAllZero` | All-zero counts → empty string |
| `TestFmtCount` | Space-thousands formatting |
| `TestHandlerTopN` | Fake server; labels and formatted counts in HTML |
| `TestHandlerRaw` | `raw=1` → JSON with source/window/group_by/entries |
| `TestHandlerBadTarget` | Unreachable target → 502 + error message in body |
| `TestHandlerFilterPassedToServer` | `f_website` + `f_status` reach gRPC filter |
| `TestHandlerWindowPassedToServer` | `w=60m` → `pb.Window_W60M` in request |
| `TestHandlerBreadcrumbInHTML` | Active filter renders crumb with × link |
| `TestHandlerSparklineInHTML` | Trend points → `<svg><polyline>` in page |
| `TestHandlerPctBar` | 100% for rank-1, 50% for half-count entry |
| `TestHandlerWindowTabsInHTML` | All 6 window labels rendered as links |
| `TestHandlerNoData` | Empty entry list → "no data" message |
| `TestDialFake` | Test infrastructure sanity check |
---
## Deferred (not in v0)
- Dark mode (prefers-color-scheme media query)
- Per-row mini sparklines (one Trend RPC per table row — expensive; need batching first)
- WebSocket or SSE for live push instead of meta-refresh
- Pagination for large N
- `?format=csv` download
- OIDC/basic-auth gating
- ClickHouse-backed 7d/30d windows (tracked in README)