2 Commits

Author SHA1 Message Date
Pim van Pelt
224167ce39 Dataplane reconcile fixes; LB counters cleanup; SPA scope cookie
Checker / reload:
- Reload's update-in-place branch now mirrors b.Address onto the
  runtime health.Backend. Without this, GetBackend kept returning
  the pre-reload address indefinitely after a config edit that
  touched addresses but not healthcheck settings — the VPP sync
  path reads cfg.Backends directly so the dataplane moved on
  while the gRPC and SPA view stayed wedged on the old IPv4/IPv6.
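
A minimal sketch of the update-in-place branch described above. BackendConfig, RuntimeBackend and reloadInPlace are hypothetical stand-ins for the config entry and the runtime health.Backend, not the real types; the only point is that the address is copied alongside the other fields while runtime state is preserved.

```go
package main

import "fmt"

// BackendConfig and RuntimeBackend are hypothetical stand-ins for the
// config entry (b) and the runtime health.Backend named in the commit.
type BackendConfig struct {
	Name     string
	Address  string
	Interval int // seconds; stands in for "healthcheck settings"
}

type RuntimeBackend struct {
	Address  string
	Interval int
	State    string // runtime state that must survive a reload
}

// reloadInPlace updates existing runtime backends from the new config
// without recreating them, and creates entries for new backends.
func reloadInPlace(runtime map[string]*RuntimeBackend, cfg []BackendConfig) {
	for _, b := range cfg {
		cur, ok := runtime[b.Name]
		if !ok {
			runtime[b.Name] = &RuntimeBackend{Address: b.Address, Interval: b.Interval, State: "unknown"}
			continue
		}
		// Mirror the address too: omitting this line is the kind of
		// bug the commit describes (stale address after a reload).
		cur.Address = b.Address
		cur.Interval = b.Interval
	}
}

func main() {
	rt := map[string]*RuntimeBackend{
		"nginx0": {Address: "192.0.2.10", Interval: 5, State: "up"},
	}
	reloadInPlace(rt, []BackendConfig{{Name: "nginx0", Address: "2001:db8::10", Interval: 5}})
	fmt.Println(rt["nginx0"].Address, rt["nginx0"].State)
}
```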

Sync (internal/vpp/lbsync.go):
- reconcileVIP now detects encap mismatch in addition to
  src-ip-sticky mismatch and takes the full tear-down / re-add
  path via a new shared recreateVIP helper. Triggered when every
  backend flips address family (gre4 <-> gre6) and the existing
  VIP can no longer accept new ASes — previously the sync wedged
  with 'Invalid address family' until a full maglevd restart.
- setASWeight is issued whenever the state machine requests
  flush (a.Flush=true), not only on the weight-value transition
  edge. Fixes the case where a backend reached StateDisabled
  after its effective weight had already been drained to 0 by
  pool failover — the sticky-cache entries pointing at it were
  previously never cleared.
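
The two sync decisions above can be sketched as plain predicates. VIPSettings, ASUpdate, needsRecreate and shouldSetASWeight are illustrative names, not the actual lbsync.go code; the sketch assumes the reconciler can compare the VIP's current encap and sticky settings with the desired ones.

```go
package main

import "fmt"

// VIPSettings is a hypothetical view of the VIP properties that cannot
// be changed in place on an existing VIP.
type VIPSettings struct {
	Encap       string // e.g. "gre4" or "gre6"
	SrcIPSticky bool
}

// needsRecreate reports whether the VIP must take the tear-down /
// re-add path: either an encap or a src-ip-sticky mismatch qualifies.
func needsRecreate(have, want VIPSettings) bool {
	return have.Encap != want.Encap || have.SrcIPSticky != want.SrcIPSticky
}

// ASUpdate is a hypothetical per-backend reconcile request.
type ASUpdate struct {
	OldWeight, NewWeight uint8
	Flush                bool
}

// shouldSetASWeight issues the call on a weight change or whenever a
// flush is requested, even if the weight is already at its target.
func shouldSetASWeight(u ASUpdate) bool {
	return u.Flush || u.OldWeight != u.NewWeight
}

func main() {
	fmt.Println(needsRecreate(VIPSettings{Encap: "gre4"}, VIPSettings{Encap: "gre6"}))
	// Weight already drained to 0 by pool failover, then disabled:
	fmt.Println(shouldSetASWeight(ASUpdate{OldWeight: 0, NewWeight: 0, Flush: true}))
}
```

The second call models the fixed case: the weight does not change, but the flush request alone is enough to issue the update.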

maglev-frontend:
- signal.Ignore(SIGHUP) so a controlling-terminal disconnect
  doesn't kill the daemon.
- debian/vpp-maglev.service grants CAP_SYS_ADMIN in addition to
  CAP_NET_RAW so setns(CLONE_NEWNET) can join the healthcheck
  netns. Comment documents the 'operation not permitted' symptom
  and notes the knob can be dropped if the deployment doesn't use
  the 'netns:' healthcheck option.

LB plugin counters (internal/vpp/lbstats.go + friends):
- Fix the VIP counter regex: the LB plugin registers
  vlib_simple_counter_main_t names without a leading '/'
  (vlib_validate_simple_counter in counter.c:50 uses cm->name
  verbatim; only entries that set cm->stat_segment_name get a
  slash). first/next/untracked/no-server now read through as
  live values instead of zero.
- Drop the per-backend FIB counter block end-to-end (proto,
  grpcapi, metrics, vpp.Client, lbstats, maglevc). Traced from
  lb/node.c:558 into ip{4,6}_forward.h:141 — the LB plugin
  forwards by writing adj_index[VLIB_TX] directly and bypassing
  ip{4,6}_lookup_inline, which is the only path that increments
  lbm_to_counters. The backend's FIB load_balance stats_index
  literally never ticks for LB-forwarded traffic, so the column
  was always zero and misleading. docs/implementation/TODO
  records the full investigation and the recommended upstream
  path (new lb_as_stats_dump API message) for when we're ready
  to carry that VPP patch.
- maglevc show vpp lb counters: plain-text tabular headers.
  label() wraps strings in ANSI escapes (~11 bytes of overhead),
  and tabwriter counts those escape characters toward the cell
  width instead of measuring rendered width, so a header row
  with label()'d cells over data rows with plain cells pushes
  every column out of alignment. color.go comment now spells
  out the constraint: label() only works when column N is
  wrapped identically in every row (key-value layouts are fine,
  multi-column tables with header-only labelling are not).
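
The tabwriter alignment problem above can be reproduced in a few lines. This is a sketch: label() here is a hypothetical stand-in that wraps text in ANSI bold (8 bytes of overhead rather than the ~11 of the real helper), and visible() strips exactly those two sequences to approximate what a terminal renders.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"text/tabwriter"
)

// label is a hypothetical stand-in for maglevc's label(): ANSI bold on/off.
func label(s string) string { return "\x1b[1m" + s + "\x1b[0m" }

// visible removes the escapes added by label, leaving the rendered text.
func visible(s string) string {
	s = strings.ReplaceAll(s, "\x1b[1m", "")
	return strings.ReplaceAll(s, "\x1b[0m", "")
}

// render lays out one header row and one data row and returns both lines.
func render(header, row []string) []string {
	var buf bytes.Buffer
	tw := tabwriter.NewWriter(&buf, 0, 0, 2, ' ', 0)
	fmt.Fprintln(tw, strings.Join(header, "\t"))
	fmt.Fprintln(tw, strings.Join(row, "\t"))
	tw.Flush()
	return strings.Split(strings.TrimRight(buf.String(), "\n"), "\n")
}

// offset returns the rendered column at which cell starts in line.
func offset(line, cell string) int { return strings.Index(visible(line), cell) }

func main() {
	row := []string{"192.0.2.1/32", "42"}

	plain := render([]string{"vip", "first"}, row)
	fmt.Println(offset(plain[0], "first"), offset(plain[1], "42"))

	// Header-only labelling: the escapes inflate the header cell's
	// measured width, so it receives less padding than it needs.
	labelled := render([]string{label("vip"), label("first")}, row)
	fmt.Println(offset(labelled[0], "first"), offset(labelled[1], "42"))
}
```

With plain headers both offsets match. With labelled headers the second header cell lands to the left of the data beneath it.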

SPA:
- stores/scope.ts is cookie-backed (maglev_scope, 1 year,
  SameSite=Lax). App.tsx hydrates from the cookie then validates
  against the fetched snapshots: a cookie referencing a maglevd
  that no longer exists falls through to snaps[0] instead of
  leaving the user on a ghost selection.
- components/Flash.tsx wraps props.value in createMemo. Solid's
  on() fires its callback on every dep notification, not on
  value change — source is right in solid-js/dist/solid.js:460,
  no equality check. Without the memo, flipping scope between
  two 'connected' maglevds (or any other cross-store reactive
  re-eval that doesn't actually change the concrete string)
  replays the animation every time. createMemo's default ===
  dedupe fixes it in one place for every Flash consumer,
  superseding the local createMemo workaround we'd added in
  BackendRow earlier.
2026-04-14 14:40:16 +02:00
Pim van Pelt
25e9d79aba Frontend: live clocks, admin mode, backend actions; packaging polish
Builds on the maglev-frontend component introduced in 284b4cc with
quality-of-life improvements, an authenticated /admin surface, a
live-action control plane, and Debian packaging cleanup.

 - Frontend state now renders live: maglevd's synthetic from==to
   FrontendEvent replay hydrates FrontendSnapshot.State on
   WatchEvents subscribe, and live transitions update both the
   in-process cache and every connected browser via a new
   applyFrontendTransition reducer. Shown as a StatusBadge next
   to the frontend name.
 - VPP connection state surfaces in the VPP zippy title as a
   green/red badge. Driven by vpp-connect / vpp-disconnect and by
   the steady stream of vpp-api-send/recv debug heartbeats so a
   silent VPP drop is caught within one debug-log tick.
 - Probe heartbeat dot becomes ❤️ while a probe is in flight and
   reverts to · on probe-done. Fixed-size wrapper so the emoji swap
   doesn't jiggle the row; both states share the same font-size.
 - Flash component replaced its subtle background-only fade with a
   scale-pop + yellow halo box-shadow + longer duration so
   weight/effective/state changes are unmissable on tiny numeric
   cells. Initial mount still skipped via defer so no flash on load.
 - Last-transition age is now a live counter driven by a global
   1-second ticker signal (one timer, many subscribers). It shows
   the two most significant units: 10m30s / 1h12m / 1d16h.
   Sub-second ages render as "now" to absorb clock skew between
   maglevd and the browser.
 - Event stream is now chronological (oldest at top) with tail-
   style auto-scroll, pause/resume, and the toolbar moved below the
   list. Row separators removed. Also shown only in /admin (see
   below) so /view stays a focused read-only surface.
 - Table nowrap so backend names like nginx0-frggh0 and the
   "last transition" header don't wrap. Frontends render in the
   order returned by ListFrontends instead of Go map iteration
   order so reload doesn't shuffle VIP order.
 - IPng logo in the header, clickable, links to the git repo.
   Header padding reduced so the logo can fill the bar up to the
   separator. Version + commit + build date shown in the brand area
   (fetched once from /view/api/version).
 - "view" / "admin" mode tag moved to sit just left of the admin
   toggle button so it reads as a pair.
 - Prettier wired in as the web-side fixstyle via a new
   fixstyle-web Make target that also runs from `make fixstyle`.
   Added .prettierrc.json and .prettierignore; 8 existing files
   were normalized in place.

 - Fixed a "20555d ago" rendering bug: maglevd's synthetic
   backend-replay events (from==to, at_unix_ns=0) were corrupting
   the local cache's LastTransition via applyBackendTransition.
   Backend synthetic events are now skipped entirely (refreshAll
   covers initial hydration for backends), while frontend synthetic
   events are still applied because FrontendInfo doesn't carry
   state — the event is the only source.

 - New MAGLEV_FRONTEND_USER / MAGLEV_FRONTEND_PASSWORD env vars.
   When both are set and non-empty, /admin/ becomes a basic-auth-
   protected SPA shell backed by the same embedded index.html as
   /view/. The SPA detects its base path via a new stores/mode.ts
   isAdmin constant and conditionally renders admin-only sections
   (currently: the Event Stream / DebugPanel). When disabled,
   /admin/ returns 404 (not 501) so operators who didn't configure
   it see no teasing affordance, and the SPA's admin-toggle button
   is hidden entirely via the admin_enabled flag on
   /view/api/version.
 - basicAuth uses crypto/subtle.ConstantTimeCompare for both user
   and password so timing can't distinguish a wrong username from
   a wrong password.

 - New POST /admin/api/{maglevd}/backend/{name}/{pause|resume|
   enable|disable} endpoint, gated by the same basic-auth
   middleware as the SPA shell. maglevClient.BackendAction wraps
   the four matching gRPC RPCs and returns a fresh BackendSnapshot;
   the same transition lands via WatchEvents so every connected
   browser converges through the normal reducer path.
 - BackendActionsMenu Solid component: kebab (⋮) button in a new
   trailing column rendered only in /admin. Click-outside and
   Escape close the popover (document listeners installed only
   while open). Actions are state-aware: up/down/unknown → pause,
   disable; paused → resume, disable; disabled → enable;
   removed → menu suppressed entirely. Busy indicator per action;
   errors render inline under the item list.
 - Structured audit log: every mutation logs an
   admin-backend-action record with maglevd / backend / action /
   resulting state.

 - Renamed debian/vpp-maglevd.service → debian/vpp-maglev.service
   to align naming with the new vpp-maglev-frontend.service
   sibling. postinst handles upgrades by stopping + disabling any
   lingering vpp-maglevd.service before enabling the renamed unit;
   prerm stops both (the frontend unit is installed but not
   enabled by default — operators opt in with systemctl enable).
 - New debian/vpp-maglev-frontend.service (hardened:
   NoNewPrivileges, ProtectSystem=strict, ProtectHome, PrivateTmp,
   no capabilities). Reads the same /etc/default/vpp-maglev
   conffile and expands MAGLEV_FRONTEND_ARGS via
   `ExecStart=/usr/bin/maglev-frontend $MAGLEV_FRONTEND_ARGS` so
   word-splitting works.
 - docs/maglev-frontend.8 manpage documenting flags, endpoints,
   and SSE reverse-proxy requirements.
 - build-deb.sh: drops the commit hash from the .deb filename
   (now vpp-maglev_<version>_<arch>.deb) and no longer takes the
   commit as a CLI arg. Binaries continue to carry the commit via
   -ldflags so `maglevd --version` et al are the authoritative
   "which build is running" answer.
2026-04-12 20:04:53 +02:00