Commit Graph

9 Commits

Author SHA1 Message Date
Pim van Pelt
6a48c12449 Add multi-arch Docker build and docker-compose stack
Introduce a multi-stage Alpine Dockerfile that cross-compiles via
buildx ($BUILDPLATFORM -> $TARGETARCH) so a single invocation produces
both linux/amd64 and linux/arm64 images without a qemu-emulated
builder. `make docker` loads the native-arch image locally for smoke
tests; `make docker-push` publishes a multi-arch manifest. Ship a
docker-compose.yaml with opt-in profiles for maglevd/frontend and a
.env.example template so operators can mirror /etc/default/vpp-maglev
muscle memory into containers.
2026-04-15 18:09:35 +02:00
Pim van Pelt
bc6ccaa844 v1.0.0 — first release
Bump VERSION to 1.0.0 and cut the first tagged release of vpp-maglev.

Also in this commit:

- maglevc: MAGLEV_SERVER env var as an alternative to the --server
  flag, matching the MAGLEV_CONFIG / MAGLEV_GRPC_ADDR convention on
  the other binaries. The flag takes precedence when both are set.
- Rename cmd/maglevd -> cmd/server and cmd/maglevc -> cmd/client so
  the source directory names are decoupled from binary names (the
  frontend and tester commands already followed this convention).
  Build outputs and the Debian packages are unchanged.
2026-04-15 15:29:31 +02:00
Pim van Pelt
1664382d25 Add design document; Cross reference from existing docs, and add a section on maglevt 2026-04-15 15:02:44 +02:00
Pim van Pelt
744b1cb3d2 install-deps Makefile target; docs refresh; golangci-lint v2 clean
Makefile:
- New install-deps umbrella target split into three sub-targets:
  install-deps-apt        — Debian/Trixie-packaged build deps
                            (nodejs, npm, protobuf-compiler, git, make,
                            dpkg-dev, ca-certificates, curl, tar). Uses
                            sudo when not already root.
  install-deps-go         — ensures a Go toolchain >= GO_VERSION (go.mod
                            floor, default 1.25.0). Short-circuits when
                            the system Go is already recent enough;
                            otherwise downloads the upstream tarball
                            from go.dev/dl/ into /usr/local/go. Trixie
                            only ships 1.24 so this step is load-bearing.
  install-deps-go-tools   — go install protoc-gen-go, protoc-gen-go-grpc,
                            and golangci-lint/v2/cmd/golangci-lint. Then
                            asserts the installed golangci-lint version
                            parses as >= GOLANGCI_LINT_VERSION (default
                            1.64.0, the floor that supports Go 1.25
                            syntax) to catch stale binaries in $GOPATH
                            /bin before they silently run against Go
                            1.25 code.
- Parser bug fixed: golangci-lint v1.x prints "has version v1.64.8" but
  v2.x dropped the 'v' prefix and prints "has version 2.11.4". The
  original sed regex required the 'v' and returned an empty match on
  v2.x, making the assertion explode with "could not parse version
  output". Fixed by switching to extended regex (sed -En) with 'v?' so
  both forms parse cleanly.
- GO_VERSION and GOLANGCI_LINT_VERSION exposed as Makefile variables
  so operators can override on the command line, e.g.
    make install-deps GO_VERSION=1.25.5 GOLANGCI_LINT_VERSION=2.0.0
- .PHONY extended with the four new target names.

Docs:
- README.md: capability note rewritten to cover CAP_NET_RAW (ICMP) and
  the new CAP_SYS_ADMIN requirement when healthchecker.netns is set,
  plus a paragraph explaining that the Debian systemd unit grants both
  automatically. Docker example gained a second variant that shows the
  additional --cap-add SYS_ADMIN and /var/run/netns bind mount for
  netns-scoped deployments. Also notes that maglevd-frontend ignores
  SIGHUP so controlling-terminal disconnects don't kill it.
- docs/user-guide.md: Capabilities section rewritten as a bulleted
  list covering both caps, with the EPERM error string and three
  different ways to grant them (systemd unit, setcap, systemd-run);
  'show vpp lb counters' command description updated to explain that
  per-backend packet counts are no longer shown (LB plugin's
  forwarding node bypasses ip{4,6}_lookup_inline, so /net/route/to at
  the backend's FIB entry never ticks for LB-forwarded traffic); new
  ~75-line "What the SPA shows" subsection covering the scope
  selector + maglev_scope cookie, the per-maglevd frontend cards, the
  health-cascade icon table (ok / bug-buckets / primary-drained /
  degraded / unknown), the lb buckets column semantics, the
  maglev_zippy_open cookie, the admin-mode lifecycle dialogs with
  their plain-English consequence text, and the debug panel.
- docs/config-guide.md: healthchecker.netns field gains a capability-
  requirement note spelling out setns(CLONE_NEWNET), the EPERM
  symptom string, and the /var/run/netns/ readability requirement.
- docs/healthchecks.md: new "Jitter" subsection explaining the +/-10%
  scaling on every computed interval, and a "Probe timing while a
  probe is in flight" subsection that explains why fast-interval alone
  doesn't give fast fault detection against hanging backends (the
  probe loop is synchronous, so each iteration is timeout +
  fast-interval; the advice is to lower timeout, not fast-interval).
- docs/maglevd.8: description paragraph corrected (dropped the
  per-backend stats claim and added a short note pointing at the LB
  plugin forwarding-path bypass); new CAPABILITIES section between
  SIGNALS and FILES covering both CAP_NET_RAW and CAP_SYS_ADMIN with
  the drop-in-override hint.
- docs/maglevd-frontend.8: new SIGNALS section documenting the
  explicit SIGHUP ignore (so a controlling-terminal disconnect doesn't
  kill the daemon); description extended with paragraphs on the two
  persistence cookies (maglev_scope, maglev_zippy_open) and on the
  health-cascade icon + lb buckets column.
- docs/maglevc.1: left untouched — intentionally minimal and delegates
  to docs/user-guide.md.

Lint (26 issues across 12 files, all errcheck / ineffassign / S1021):
- cmd/frontend/handlers.go: _, _ = fmt.Fprintf(...) for the SSE retry
  hint and resync control-event writes.
- cmd/maglevc/commands.go: bulk-prefix every fmt.Fprintf(w, ...) with
  _, _ =; also merged 'var watchEventsOptSlot *Node; ... = &Node{...}'
  into a single := declaration (staticcheck S1021) — the self-
  referencing pattern still works because the Children back-ref is
  assigned on the next statement, not inside the struct literal.
- cmd/maglevc/complete.go: _, _ = fmt.Fprintf(ql.rl.Stderr(), ...)
  for the banner and help writes; removed the ineffectual
  'partial = ""' assignment (nothing downstream reads partial after
  that branch, so setting it was dead code flagged by ineffassign).
- cmd/maglevc/shell.go: defer func() { _ = rl.Close() }() for the
  readline instance; _, _ = fmt.Fprintf(rl.Stderr(), ...) for error
  display in the REPL loop.
- cmd/maglevc/main.go: defer func() { _ = conn.Close() }() for the
  gRPC client connection.
- internal/grpcapi/server_test.go: _ = conn.Close() in the test
  teardown closure.
- internal/prober/http.go: _ = c.Close() in the TLS-handshake-failed
  path; defer func() { _ = conn.Close() }() and defer func() { _ =
  resp.Body.Close() }() for the two deferred cleanups.
- internal/prober/http_test.go: defer func() { _ = resp.Body.Close()
  }() plus three _, _ = fmt.Fprint(w, ...) in the httptest.Server
  handlers and _, _ = fmt.Sscanf(...) when parsing the test listener's
  port.
- internal/prober/icmp.go: defer func() { _ = pc.Close() }() for the
  ICMP packet conn.
- internal/prober/netns.go: defer func() { _ = origNs.Close() }(),
  defer func() { _ = netns.Set(origNs) }(), defer func() { _ =
  targetNs.Close() }() — also dropped a stray //nolint:errcheck that
  was no longer needed once the closure wrapping handled the discard.
- internal/prober/tcp.go: _ = conn.Close() in the L4-only path,
  _ = tlsConn.Close() in the failed and succeeded handshake branches,
  _ = tlsConn.SetDeadline(...) (also dropped a //nolint:errcheck
  previously covering it).

Iterative 'make lint' runs were needed because golangci-lint v2.x
caps same-linter reports per pass, so the first pass reported 21,
then 4, then 3, then 1, then 0. Final pass: 0 issues. make test is
green across every package, and make build produces all three
binaries cleanly.
2026-04-14 17:37:53 +02:00
Pim van Pelt
35643fd774 Rename maglev-frontend → maglevd-frontend; v0.9.1; API RX/TX pulse
Rename the web dashboard binary to maglevd-frontend and move it to
/usr/sbin (it's a daemon and belongs with maglevd). The systemd unit
name stays vpp-maglev-frontend.service since that prefix is the
package name. Manpage, README, user-guide, and debian packaging all
updated in lockstep; bump to 0.9.1 for the first real release.

All frontend env vars are now prefixed MAGLEV_FRONTEND_ so a single
/etc/default/vpp-maglev can be shared with maglevd without collisions.
Every flag has an env equivalent for Docker use. MAGLEV_FRONTEND_USER
and MAGLEV_FRONTEND_PASSWORD still gate the /admin surface.

VPPInfoPanel now pulses "API: ↑↓" indicators in the zippy title
whenever a vpp-api-send / vpp-api-recv log event arrives on the SSE
stream for the scoped maglevd — 250ms blue flash, re-triggerable,
with the two arrows tightly kerned via negative letter-spacing.
2026-04-13 00:13:52 +02:00
Pim van Pelt
25e9d79aba Frontend: live clocks, admin mode, backend actions; packaging polish
Builds on the maglev-frontend component introduced in 284b4cc with
quality-of-life improvements, an authenticated /admin surface, a
live-action control plane, and Debian packaging cleanup.

 - Backend state now renders live: maglevd's FrontendEvent synthetic
   from==to replay hydrates FrontendSnapshot.State on WatchEvents
   subscribe, and live transitions update both the in-process cache
   and every connected browser via a new applyFrontendTransition
   reducer. Shown as a StatusBadge next to the frontend name.
 - VPP connection state surfaces in the VPP zippy title as a
   green/red badge. Driven by vpp-connect / vpp-disconnect and by
   the steady stream of vpp-api-send/recv debug heartbeats so a
   silent VPP drop is caught within one debug-log tick.
 - Probe heartbeat dot becomes ❤️ while a probe is in flight and
   reverts to · on probe-done. Fixed-size wrapper so the emoji swap
   doesn't jiggle the row; both states share the same font-size.
 - Flash component replaced its subtle background-only fade with a
   scale-pop + yellow halo box-shadow + longer duration so
   weight/effective/state changes are unmissable on tiny numeric
   cells. Initial mount still skipped via defer so no flash on load.
 - Last-transition age is now a live countdown driven by a global
   1-second ticker signal (one timer, many subscribers). Two most
   significant units: 10m30s / 1h12m / 1d16h. Sub-second ages
   render as "now" to absorb clock skew between maglevd and the
   browser.
 - Event stream is now chronological (oldest at top) with tail-
   style auto-scroll, pause/resume, and the toolbar moved below the
   list. Row separators removed. Also shown only in /admin (see
   below) so /view stays a focused read-only surface.
 - Table nowrap so backend names like nginx0-frggh0 and the
   "last transition" header don't wrap. Frontends render in the
   order returned by ListFrontends instead of Go map iteration
   order so reload doesn't shuffle VIP order.
 - IPng logo in the header, clickable, links to the git repo.
   Header padding reduced so the logo can fill the bar up to the
   separator. Version + commit + build date shown in the brand area
   (fetched once from /view/api/version).
 - "view" / "admin" mode tag moved to sit just left of the admin
   toggle button so it reads as a pair.
 - Prettier wired in as the web-side fixstyle via a new
   fixstyle-web Make target that also runs from `make fixstyle`.
   Added .prettierrc.json and .prettierignore; 8 existing files
   were normalized in place.

 - Fixed a "20555d ago" rendering bug: maglevd's synthetic
   backend-replay events (from==to, at_unix_ns=0) were corrupting
   the local cache's LastTransition via applyBackendTransition.
   Backend synthetic events are now skipped entirely (refreshAll
   covers initial hydration for backends), while frontend synthetic
   events are still applied because FrontendInfo doesn't carry
   state — the event is the only source.

 - New MAGLEV_FRONTEND_USER / MAGLEV_FRONTEND_PASSWORD env vars.
   When both are set and non-empty, /admin/ becomes a basic-auth-
   protected SPA shell backed by the same embedded index.html as
   /view/. The SPA detects its base path via a new stores/mode.ts
   isAdmin constant and conditionally renders admin-only sections
   (currently: the Event Stream / DebugPanel). When disabled,
   /admin/ returns 404 (not 501) so operators who didn't configure
   it see no teasing affordance, and the SPA's admin-toggle button
   is hidden entirely via the admin_enabled flag on
   /view/api/version.
 - basicAuth uses crypto/subtle.ConstantTimeCompare for both user
   and password so timing can't distinguish a wrong username from
   a wrong password.

 - New POST /admin/api/{maglevd}/backend/{name}/{pause|resume|
   enable|disable} endpoint, gated by the same basic-auth
   middleware as the SPA shell. maglevClient.BackendAction wraps
   the four matching gRPC RPCs and returns a fresh BackendSnapshot;
   the same transition lands via WatchEvents so every connected
   browser converges through the normal reducer path.
 - BackendActionsMenu Solid component: kebab (⋮) button in a new
   trailing column rendered only in /admin. Click-outside and
   Escape close the popover (document listeners installed only
   while open). Actions are state-aware: up/down/unknown → pause,
   disable; paused → resume, disable; disabled → enable;
   removed → menu suppressed entirely. Busy indicator per action;
   errors render inline under the item list.
 - Structured audit log: every mutation logs an
   admin-backend-action record with maglevd / backend / action /
   resulting state.

 - Renamed debian/vpp-maglevd.service → debian/vpp-maglev.service
   to align naming with the new vpp-maglev-frontend.service
   sibling. postinst handles upgrades by stopping + disabling any
   lingering vpp-maglevd.service before enabling the renamed unit;
   prerm stops both (the frontend unit is installed but not
   enabled by default — operators opt in with systemctl enable).
 - New debian/vpp-maglev-frontend.service (hardened:
   NoNewPrivileges, ProtectSystem=strict, ProtectHome, PrivateTmp,
   no capabilities). Reads the same /etc/default/vpp-maglev
   conffile and expands MAGLEV_FRONTEND_ARGS via
   `ExecStart=/usr/bin/maglev-frontend $MAGLEV_FRONTEND_ARGS` so
   word-splitting works.
 - docs/maglev-frontend.8 manpage documenting flags, endpoints,
   and SSE reverse-proxy requirements.
 - build-deb.sh: drops the commit hash from the .deb filename
   (now vpp-maglev_<version>_<arch>.deb) and no longer takes the
   commit as a CLI arg. Binaries continue to carry the commit via
   -ldflags so `maglevd --version` et al are the authoritative
   "which build is running" answer.
2026-04-12 20:04:53 +02:00
Pim van Pelt
58391f5463 Add WatchEvents, enable/disable/weight RPCs, and config check
gRPC / proto
- Rename WatchBackendEvents → WatchEvents; return a stream of Event
  oneof (LogEvent, BackendEvent, FrontendEvent) with optional filter
  flags (log, log_level, backend, frontend)
- Add EnableBackend, DisableBackend, SetFrontendPoolBackendWeight RPCs
- Rename PauseResumeRequest → BackendRequest
- Add CheckConfig RPC returning ok/parse_error/semantic_error

maglevd
- Route slog through a LogBroadcaster (slog.Handler) so WatchEvents
  subscribers can receive structured log records independently of the
  daemon's own --log-level
- Add --reflection flag (default true) to toggle gRPC server reflection
- Add --check flag: validates config file and exits 0/1/2
- SIGHUP: use config.Check before applying reload; log parse vs semantic
  error separately; refuse reload on any error
- Rename default config path /etc/maglev → /etc/vpp-maglev

maglevc
- Add 'watch events [num <n>] [log [level <level>]] [backend] [frontend]'
  command; prints compact protojson, stops on any keypress or Ctrl-C;
  uses cbreak mode (not raw) so output post-processing is preserved
- Add 'set backend <name> enable|disable'
- Add 'set frontend <name> pool <pool> backend <name> weight <0-100>'
- Add 'config check' command

Debian packaging
- Rename service unit to vpp-maglevd.service
- Rename conffiles to /etc/default/vpp-maglev and /etc/vpp-maglev/
- Create maglevd system user/group in postinst; add to vpp group if present
- Add postrm; add adduser to Depends
2026-04-11 16:42:11 +02:00
Pim van Pelt
d612086a5f Pools, CLI, versioning, Debian packaging, HTTPS fix
- Replaced flat `backends: [...]` list on frontends with an ordered `pools:`
  list; each pool has a name and a map of backends with per-pool weights (0–100,
  default 100). Pools express priority: first pool with a healthy backend wins.
- Removed global backend weight (was on the backend, now lives in the pool).
- Config validation enforces non-empty pools, non-empty pool names, weight
  range, and consistent address families across all pools of a frontend.

- Added `PoolBackendInfo { name, weight }` and changed `PoolInfo.backends` from
  `repeated string` to `repeated PoolBackendInfo` so weights are visible over
  the API.

- Full interactive shell with readline, tab completion, and `?` inline help.
- Command tree parser (Walk) handles fixed keywords and dynamic slot nodes;
  prefix matching with exact-match priority.
- Commands: `show version/frontends/frontend/backends/backend/healthchecks/
  healthcheck`, `set backend <name> pause|resume`, `quit`/`exit`.
- `show frontend` output is hierarchical (pools → backends) with per-backend
  weights and `[disabled]` notation; pool section uses fixed-width formatting
  so ANSI color codes don't corrupt tabwriter alignment.
- `-color` flag (default true) wraps static field labels in dark-blue ANSI;
  works correctly with tabwriter because all labels carry identical-length
  escape sequences.

- `cmd/version.go` package holds `version`, `commit`, `date` vars set at build
  time via `-ldflags -X`.
- `make build` / `make build-amd64` / `make build-arm64` all inject
  `VERSION=0.1.1`, `COMMIT_HASH` (from `git rev-parse --short HEAD`), and
  `DATE` (UTC ISO-8601).
- `maglevc` prints version on interactive startup and exposes `show version`.
- `maglevd` logs version/commit/date at startup; `-version` flag prints and exits.

- `doHTTPProbe` was building a `https://` target URL even though TLS was already
  applied to the connection inside `inNetns`. `http.Transport` then wrapped the
  connection in a second TLS layer, producing "http: server gave HTTP response
  to HTTPS client". Fixed by always using `http://` in the target URL.
- Added `TestHTTPSProbe` using `httptest.NewTLSServer` to cover the full path.

- New `docs/user-guide.md`: maglevd flags/signals, maglevc commands, shell
  completion, and command-tree parser walkthrough.
- New `docs/healthchecks.md`: state machine, rise/fall model, probe intervals,
  all transition events with log examples.
- Updated `docs/config-guide.md`: pools design, removed global weight from
  backends, updated all examples.
- Updated `README.md`: packaging table, build paths, corrected binary locations
  (`/usr/sbin/maglevd`), config filename (`.yaml`).

- `debian/` directory contains `control.in`, `maglevd.service`, `default.maglev`,
  `maglev.yaml` (example config), `conffiles`, `postinst`, `prerm`.
- `debian/build-deb.sh` stages a package tree and calls `dpkg-deb`; emits
  `build/vpp-maglev_<version>~<commit>_<arch>.deb`.
- Cross-compiles for amd64 and arm64 in one `make pkg-deb` invocation.
- `maglevd` installed to `/usr/sbin/`, `maglevc` to `/usr/bin/`.
- Service reads `MAGLEV_CONFIG` from `/etc/default/maglev`
  (default: `/etc/maglev/maglev.yaml`).
- Man pages `maglevd(8)` and `maglevc(1)` live in `docs/` and are gzip'd into
  the package.
- All build output goes to `build/<arch>/`; `build/` is gitignored.
2026-04-11 12:18:17 +02:00
Pim van Pelt
530d85740e Add LICENSE and README + config-guide 2026-04-10 22:22:56 +02:00