224167ce396c4441070dc3705e74579933c16e3a
2 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
224167ce39 |
Dataplane reconcile fixes; LB counters cleanup; SPA scope cookie
Checker / reload:
- Reload's update-in-place branch now mirrors b.Address onto the
runtime health.Backend. Without this, GetBackend kept returning
the pre-reload address indefinitely after a config edit that
touched addresses but not healthcheck settings — the VPP sync
path reads cfg.Backends directly so the dataplane moved on
while the gRPC and SPA view stayed wedged on the old IPv4/IPv6.
Sync (internal/vpp/lbsync.go):
- reconcileVIP now detects encap mismatch in addition to
src-ip-sticky mismatch and takes the full tear-down / re-add
path via a new shared recreateVIP helper. Triggered when every
backend flips address family (gre4 <-> gre6) and the existing
VIP can no longer accept new ASes — previously the sync wedged
with 'Invalid address family' until a full maglevd restart.
- setASWeight is issued whenever the state machine requests
flush (a.Flush=true), not only on the weight-value transition
edge. Fixes the case where a backend reached StateDisabled
after its effective weight had already been drained to 0 by
pool failover — the sticky-cache entries pointing at it were
previously never cleared.
maglev-frontend:
- signal.Ignore(SIGHUP) so a controlling-terminal disconnect
doesn't kill the daemon.
- debian/vpp-maglev.service grants CAP_SYS_ADMIN in addition to
CAP_NET_RAW so setns(CLONE_NEWNET) can join the healthcheck
netns. Comment documents the 'operation not permitted' symptom
and notes the knob can be dropped if the deployment doesn't use
the 'netns:' healthcheck option.
LB plugin counters (internal/vpp/lbstats.go + friends):
- Fix the VIP counter regex: the LB plugin registers
vlib_simple_counter_main_t names without a leading '/'
(vlib_validate_simple_counter in counter.c:50 uses cm->name
verbatim; only entries that set cm->stat_segment_name get a
slash). first/next/untracked/no-server now read through as
live values instead of zero.
- Drop the per-backend FIB counter block end-to-end (proto,
grpcapi, metrics, vpp.Client, lbstats, maglevc). Traced from
lb/node.c:558 into ip{4,6}_forward.h:141 — the LB plugin
forwards by writing adj_index[VLIB_TX] directly and bypassing
ip{4,6}_lookup_inline, which is the only path that increments
lbm_to_counters. The backend's FIB load_balance stats_index
literally never ticks for LB-forwarded traffic, so the column
was always zero and misleading. docs/implementation/TODO
records the full investigation and the recommended upstream
path (new lb_as_stats_dump API message) for when we're ready
to carry that VPP patch.
- maglevc show vpp lb counters: plain-text tabular headers.
label() wraps strings in ANSI escapes (~11 bytes of overhead),
but tabwriter counts bytes, not rendered width — so a header
row with label()'d cells and data rows with plain cells drifts
column alignment on every row. color.go comment now spells
out the constraint: label() only works when column N is
wrapped identically in every row (key-value layouts are fine,
multi-column tables with header-only labelling are not).
SPA:
- stores/scope.ts is cookie-backed (maglev_scope, 1 year,
SameSite=Lax). App.tsx hydrates from the cookie then validates
against the fetched snapshots: a cookie referencing a maglevd
that no longer exists falls through to snaps[0] instead of
leaving the user on a ghost selection.
- components/Flash.tsx wraps props.value in createMemo. Solid's
on() fires its callback on every dep notification, not on
value change — source is right in solid-js/dist/solid.js:460,
no equality check. Without the memo, flipping scope between
two 'connected' maglevds (or any other cross-store reactive
re-eval that doesn't actually change the concrete string)
replays the animation every time. createMemo's default ===
dedupe fixes it in one place for every Flash consumer,
superseding the local createMemo workaround we'd added in
BackendRow earlier.
|
||
|
|
1a1c48ef54 |
LB buckets column + health cascade; VPP dump fix; maglevc strictness
SPA (cmd/frontend/web): - New "lb buckets" column backed by a 1s-debounced GetVPPLBState fetch loop with leading+trailing edge coalesce. - Per-frontend health icon (✅/⚠️/❗/‼️/❓) in the Zippy header, gated by a settling flag that suppresses ‼️ until the next lb-state reconciliation after a backend transition or weight change. - In-place leaf merge on lb-state so stable bucket values (e.g. "0") don't retrigger the Flash animation on every refresh. - Zippy cards remember open state in a cookie, default closed on fresh load; fixed-width frontend-title-name + reserved icon slot so headers line up across all cards. - Clock-drift watchdog in sse.ts that forces a fresh EventSource on laptop-wake so the broker emits a resync instead of hanging on a dead half-open socket. Frontend service (cmd/frontend): - maglevClient.lbStateLoop, trigger on backend transitions + vpp-connect, best-effort fetch on refreshAll. - Admin handlers explicitly wake the lb-state loop after lifecycle ops and set-weight (the latter emits no transition event on the maglevd side, so the WatchEvents path wouldn't have caught it). - /favicon.ico served from embedded web/public IPng logo. VPP integration: - internal/vpp/lbstate.go: dumpASesForVIP drops Pfx from the dump request (setting it silently wipes IPv4 replies in the LB plugin) and filters results by prefix on the response side instead, which also demuxes multi-VIP-on-same-port cases correctly. maglevc: - Walk now returns the unconsumed token tail; dispatch and the question listener reject unknown commands with a targeted error instead of dumping the full command tree prefixed with garbage. - On '?', echo the current line (including the '?') before the help list so the output reads like birdc. Checker / prober: - internal/checker: ±10% jitter on NextInterval so probes across restart don't all fire on the same tick. - internal/prober: HTTP User-Agent now carries the build version and project URL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |