Add design document; cross-reference it from existing docs and add a section on maglevt

2026-04-15 15:02:44 +02:00
parent 6d78921edd
commit 1664382d25
3 changed files with 1144 additions and 1 deletions


@@ -2,7 +2,7 @@
Health checker, gRPC control plane, CLI, and web dashboard for the VPP
`lb` (load-balancer) plugin. Runs as a set of three binaries under one
Debian package:
Debian package, plus an out-of-band tester built alongside:
- **`maglevd`** — the long-running health-checker daemon. Probes backends
(HTTP, TCP, ICMP), tracks their aggregate state, programs the VPP
@@ -14,6 +14,12 @@ Debian package:
SolidJS Single-Page-App; connects to one or more maglevds over gRPC and
serves a live HTTP view (read-only `/view/` and optional basic-auth
`/admin/` with mutating commands).
- **`maglevt`** — optional out-of-band VIP probe TUI. Reads a
`maglev.yaml` and hits each frontend on a live HTTP path, reporting
latency and a configurable response-header tally so operators can see
failover as it happens. Does not talk gRPC; useful for validating a
`maglevd` restart end-to-end from a client perspective. Built by
`make` but not installed by the Debian package.
## Build and install
@@ -94,6 +100,9 @@ deployments.
## Documentation
- [docs/design.md](docs/design.md) — architecture, components, and
numbered functional / non-functional requirements. Start here if
you want the big picture before diving into the code.
- A minimal configuration file in
[debian/maglev.yaml](debian/maglev.yaml) shows every knob.
- [docs/user-guide.md](docs/user-guide.md) — flags, signals, and

1076
docs/design.md Normal file

File diff suppressed because it is too large.


@@ -535,3 +535,61 @@ Nginx, HAProxy, or any proxy in front of `maglevd-frontend` must:
the live-stream property.
See `maglevd-frontend(8)` for the full reference.
---
## maglevt
`maglevt` is an optional out-of-band VIP probe TUI. It reads one or
more `maglev.yaml` files, enumerates the configured TCP/HTTP frontends,
and probes each one on a configurable HTTP path at a configurable
interval. It does not talk gRPC and does not depend on a running
`maglevd` — it's a purely client-side view of the VIPs, driven entirely
from the config file on disk.
It is particularly useful for:
- Validating a `maglevd` restart end-to-end from a client perspective:
the probe tally keeps running regardless of what the control plane
is doing, so a brief blip or a missed failover is visible directly.
- Debugging pool failover: with keep-alives off, every probe opens a
fresh TCP connection that VPP's Maglev hash places independently, so
the response-header tally visibly shifts the moment a standby pool
takes over.
- Sanity-checking VIP reachability across multi-site deployments,
especially when the gRPC control plane isn't reachable from the
machine you're debugging on.
`maglevt` is built by `make` alongside the other binaries but is not
shipped in the Debian package; run it from the `build/` tree or copy
it onto the host by hand.
### Flags
| Flag | Environment variable | Default | Description |
|---|---|---|---|
| `--config` | — | `/etc/vpp-maglev/maglev.yaml` | Path to a `maglev.yaml` file. Repeatable; also accepts a comma-separated list. Frontends are unioned across files and de-duplicated by `(address, protocol, port)`. |
| `--interval` | — | `100ms` | Probe interval per VIP, with ±10% jitter applied per probe to avoid phase-locking. |
| `--timeout` | — | `2s` | Per-request timeout. |
| `--host` | — | (VIP address) | Override for the HTTP `Host` header. Defaults to the VIP address literal. |
| `--uri` / `--path` | — | `/.well-known/ipng/healthz` | HTTP request path used in the GET. `--path` is an alias for `--uri`. |
| `--header` | — | `X-IPng-Frontend` | Response header whose value is extracted and tallied, so you can see which backend served each request. |
| `--insecure` | — | `true` | Skip TLS verification for HTTPS frontends. |
| `--keepalive` / `-k` | — | `false` | Enable HTTP keep-alives. Off by default so every probe opens a fresh connection — required for failover visibility, because a pinned keep-alive would mask a Maglev reshuffle. |
| `--filter` | — | — | Regular expression; only probe frontends whose name matches. |
| `--version` | — | — | Print version, commit hash, and build date, then exit. |
### UI
The TUI is built with Bubble Tea and shows a deterministic grid —
one tile per `(scheme, address, port)` VIP, IPv6 before IPv4 and
HTTPS before HTTP, so the layout is stable across runs and across
machines. Each tile carries a rolling latency summary (min, max,
average, plus a few percentiles), running success and failure
counts, and a tally of the configured response-header values seen
from that VIP. Press `d` to toggle reverse-DNS resolution on the
addresses shown in the tile headers; press `q` or `Ctrl-C` to
exit.
There is no machine-readable output. If you need metrics, scrape the
Prometheus metrics exposed by `maglevd` instead.