Frontend aggregate state: SPA-side derive + checker fixes

The web UI showed the wrong up/down state for frontends whose pool
composition had been touched by a mix of runtime disable/enable and
weight changes: a frontend with every backend at effective_weight=0
would still display "up", while a sibling frontend with a serving
fallback backend would display "down". Two independent bugs, each
fixed on its own layer.

On the fast path (healthCheckEqual returns true), Reload did
`w.entry = b`, blindly replacing the runtime worker entry with the
fresh YAML record. YAML's default for Enabled is true, so any
backend the operator had runtime-disabled would have its Enabled
flag silently reset while the worker's backend.State stayed at
StateDisabled. Subsequent EnableBackend calls then early-returned
on `if w.entry.Enabled` and never transitioned the state machine
— the CLI reported "enabled, state is 'disabled'" and the backend
was permanently stuck.

Fix: preserve w.entry.Enabled across the fast-path replacement.

    runtimeEnabled := w.entry.Enabled
    w.entry = b
    w.entry.Enabled = runtimeEnabled

Runtime operator state now outlives config reloads. On the worker-
restart path (different health check) the new worker is
structurally fresh and the YAML's Enabled is still authoritative.
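A minimal sketch of the fast-path behaviour, for illustration only. The `Backend` and `worker` types and the `reloadFastPath` method are hypothetical stand-ins for the real checker structures, which this commit page does not show; only the three-line flag-preserving assignment is taken from the fix above.

```go
package main

import "fmt"

// Backend is a hypothetical stand-in for the YAML backend record.
type Backend struct {
	Name    string
	Address string
	Enabled bool
}

// worker is a hypothetical stand-in for the checker's per-backend worker.
type worker struct {
	entry Backend
}

// reloadFastPath swaps in the fresh config record but carries the
// runtime Enabled flag across, as in the fix described above.
func (w *worker) reloadFastPath(b Backend) {
	runtimeEnabled := w.entry.Enabled
	w.entry = b
	w.entry.Enabled = runtimeEnabled
}

func main() {
	// The operator disabled be1 at runtime; the YAML still defaults to Enabled=true.
	w := &worker{entry: Backend{Name: "be1", Address: "10.0.0.1", Enabled: false}}
	w.reloadFastPath(Backend{Name: "be1", Address: "10.0.0.2", Enabled: true})
	fmt.Println(w.entry.Address, w.entry.Enabled) // 10.0.0.2 false
}
```

Other fields from the reloaded record (here the assumed `Address`) still take effect; only the operator-controlled flag survives the replacement.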

DisableBackend and EnableBackend both used `w.entry.Enabled` as
their idempotency check, which meant a stuck `Enabled=true,
State=disabled` combo couldn't be repaired even after the Reload
fix: backends already stuck before the upgrade would stay stuck
after it. Both methods now key on `w.backend.State` instead:

 - DisableBackend: if state == StateDisabled, sync the flag but
   don't emit a redundant transition; otherwise do the full
   state transition + flag flip + worker cancel.
 - EnableBackend: if state != StateDisabled, sync the flag but
   don't emit a redundant transition; otherwise do the full
   transition + flag flip + probe-goroutine restart.

Either method will now unstick any inconsistency between the
flag and the state machine. Future drift from a panic or from a
code path we haven't thought of, and backends already stuck
before this commit, are all repaired on the next enable/disable
call.
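The state-keyed idempotency check can be sketched as follows. This is an illustrative model, not the checker's code: the `worker` fields, the `transitions` counter, and the choice of `StateUnknown` as the post-enable state are all assumptions made for the example.

```go
package main

import "fmt"

type State string

const (
	StateUnknown  State = "unknown"
	StateDisabled State = "disabled"
)

// worker is a hypothetical model: enabled stands in for w.entry.Enabled,
// state for w.backend.State, and transitions counts emitted transitions.
type worker struct {
	enabled     bool
	state       State
	transitions int
}

// disable keys on the state machine, not on the flag.
func (w *worker) disable() {
	if w.state == StateDisabled {
		w.enabled = false // sync the flag, no redundant transition
		return
	}
	w.state = StateDisabled
	w.enabled = false
	w.transitions++ // the real method would also cancel the worker
}

// enable keys on the state machine, not on the flag.
func (w *worker) enable() {
	if w.state != StateDisabled {
		w.enabled = true // sync the flag, no redundant transition
		return
	}
	w.state = StateUnknown // assumption: probing decides the next state
	w.enabled = true
	w.transitions++ // the real method would also restart the probe goroutine
}

func main() {
	// The stuck combination described above: flag says enabled, state says disabled.
	w := &worker{enabled: true, state: StateDisabled}
	w.enable()
	fmt.Println(w.state, w.enabled, w.transitions) // unknown true 1
}
```

With the old flag-keyed check, the `enable()` call above would have returned early because `enabled` was already true.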

Changing a backend's weight can flip a frontend between up and
down (e.g. zeroing the last non-zero-weighted backend in the
active pool), but SetFrontendPoolBackendWeight never called
updateFrontendState, so the checker's cached frontend state
would drift from reality until the next genuine backend
transition happened to trigger a recompute. The symptom was
"show frontends nginx-ip4-http" reporting up even with every
effective_weight=0.

Fix: call c.updateFrontendState(frontendName, fe) after the
weight mutation, under the same lock. The recompute emits a
FrontendEvent transition if the aggregate flipped, so any
WatchEvents consumer picks up the change live.
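A simplified sketch of the recompute-after-mutation pattern. The `checker`, `frontend`, and `poolBackend` types, the `setWeight` method, and the single-pool `computeState` rule are hypothetical simplifications; the real SetFrontendPoolBackendWeight and updateFrontendState are not shown on this page.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical, single-pool model of a frontend.
type poolBackend struct {
	Name   string
	Weight int
	Up     bool
}

type frontend struct {
	Backends []poolBackend
}

type checker struct {
	mu             sync.Mutex
	frontends      map[string]*frontend
	frontendStates map[string]string // cached aggregate state
	events         []string          // stands in for emitted FrontendEvents
}

// computeState is a reduced rule: up if any up backend has weight > 0.
func computeState(fe *frontend) string {
	for _, b := range fe.Backends {
		if b.Up && b.Weight > 0 {
			return "up"
		}
	}
	return "down"
}

// updateFrontendState refreshes the cache and records an event on a flip.
// The caller must hold c.mu.
func (c *checker) updateFrontendState(name string, fe *frontend) {
	next := computeState(fe)
	if prev := c.frontendStates[name]; prev != next {
		c.frontendStates[name] = next
		c.events = append(c.events, fmt.Sprintf("%s: %s -> %s", name, prev, next))
	}
}

// setWeight mutates the weight and recomputes under the same lock.
func (c *checker) setWeight(feName, backend string, weight int) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	fe, ok := c.frontends[feName]
	if !ok {
		return false
	}
	for i := range fe.Backends {
		if fe.Backends[i].Name == backend {
			fe.Backends[i].Weight = weight
			c.updateFrontendState(feName, fe)
			return true
		}
	}
	return false
}

func main() {
	c := &checker{
		frontends:      map[string]*frontend{"fe": {Backends: []poolBackend{{Name: "a", Weight: 100, Up: true}}}},
		frontendStates: map[string]string{"fe": "up"},
	}
	c.setWeight("fe", "a", 0)
	fmt.Println(c.frontendStates["fe"], c.events)
}
```

Doing the recompute inside the same critical section means no reader can observe the new weight paired with the old cached state.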

stores/state.ts recomputeEffectiveWeights is renamed and
extended to recomputeDerivedState, which now also writes
fe.state using the same rule as health.ComputeFrontendState:
unknown if no backends or all unknown, up if any effective
weight > 0, down otherwise. Called from every mutation path
(replaceAll, replaceSnapshot, applyBackendTransition,
applyConfiguredWeight) so the SPA is authoritative for *display*
state and doesn't inherit any staleness the server's cached
frontendStates map might have.
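The aggregate rule stated above can be written out as a short Go sketch. It is reconstructed from the description in this message, not copied from internal/health/weights.go: the type names, the string-valued states, and the `-1` return for "no active pool" are assumptions. It also does not deduplicate backend names the way the SPA code does, which does not change the result.

```go
package main

import "fmt"

type PoolBackend struct {
	Name   string
	Weight int
}

type Pool struct {
	Backends []PoolBackend
}

// activePoolIndex returns the first pool holding a backend that is up
// and has a non-zero configured weight, or -1 if there is none.
func activePoolIndex(pools []Pool, stateOf map[string]string) int {
	for i, p := range pools {
		for _, pb := range p.Backends {
			if stateOf[pb.Name] == "up" && pb.Weight > 0 {
				return i
			}
		}
	}
	return -1
}

// computeFrontendState: unknown if there are no backends or all are
// unknown, up if any effective weight is > 0, down otherwise.
func computeFrontendState(pools []Pool, stateOf map[string]string) string {
	active := activePoolIndex(pools, stateOf)
	seenAny, allUnknown, anyEffective := false, true, false
	for i, p := range pools {
		for _, pb := range p.Backends {
			st := stateOf[pb.Name]
			seenAny = true
			if st != "unknown" {
				allUnknown = false
			}
			// Effective weight is the configured weight only for up
			// backends in the active pool; everything else is 0.
			if st == "up" && i == active && pb.Weight > 0 {
				anyEffective = true
			}
		}
	}
	switch {
	case !seenAny || allUnknown:
		return "unknown"
	case anyEffective:
		return "up"
	default:
		return "down"
	}
}

func main() {
	// Primary pool is up but zero-weighted; the fallback pool serves.
	pools := []Pool{
		{Backends: []PoolBackend{{Name: "primary", Weight: 0}}},
		{Backends: []PoolBackend{{Name: "fallback", Weight: 100}}},
	}
	stateOf := map[string]string{"primary": "up", "fallback": "up"}
	fmt.Println(computeFrontendState(pools, stateOf)) // up
}
```

The two cases from the top of this message map directly onto the rule: a serving fallback yields up, and a frontend whose backends all have effective weight 0 yields down.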

applyFrontendTransition is now a no-op for the state field —
the server's `to` value is no longer trusted because
recomputeDerivedState walks the local backends array on every
update and produces a fresh, correct answer. The reducer is kept
as a named function so sse.ts's dispatch table still has a
landing spot for "frontend" events (they still feed the
DebugPanel via pushEvent); the empty body is deliberate, not a
bug — a comment at the top spells it out.
2026-04-12 23:50:22 +02:00
parent 4347bb9b05
commit 1191b3d994
5 changed files with 125 additions and 34 deletions

@@ -8,22 +8,36 @@ import type {
 } from "../types";
 import { tick } from "./tick";
-// recomputeEffectiveWeights mirrors the server-side
-// health.EffectiveWeights / ActivePoolIndex logic so the SPA can keep
-// pool.effective_weight correct the moment a backend transitions,
-// without waiting for the 30s refresh. Walking every frontend is cheap
-// — O(frontends × pools × backends-per-pool) with tiny constants —
-// and it's strictly a function of the backend state map, so there's no
-// risk of drift vs. the server as long as the rule stays the same.
+// recomputeDerivedState mirrors the server-side
+// health.EffectiveWeights / ActivePoolIndex / ComputeFrontendState
+// logic so the SPA can keep pool.effective_weight AND the
+// per-frontend aggregate state correct the moment any backend
+// transitions or any configured weight changes, without waiting for
+// the 30s refresh. Walking every frontend is cheap — O(frontends ×
+// pools × backends-per-pool) with tiny constants — and it's
+// strictly a function of the backend state map + configured
+// weights, so there's no risk of drift vs. the server as long as
+// the rules stay identical. The SPA is the authoritative source of
+// truth for *display* state: the server's cached frontendStates
+// field can be stale (e.g. after a SetFrontendPoolBackendWeight
+// call that doesn't re-run updateFrontendState, or after a long-
+// lived WatchEvents stream where a past transition corrupted the
+// client's cache) and the SPA recomputes from its own live
+// backends array to avoid inheriting any staleness.
 //
-// Rule: a backend gets its configured pool weight iff it is up AND
-// belongs to the currently-active pool; everything else is 0. The
-// active pool is the first pool containing a backend that is both
-// up AND has a non-zero configured weight — a pool whose up backends
-// are all weight=0 contributes no serving capacity and gets skipped
-// over in priority failover. Kept in lock-step with
-// internal/health/weights.go.
-function recomputeEffectiveWeights(snap: StateSnapshot) {
+// Effective weight rule: a backend gets its configured pool weight
+// iff it is up AND belongs to the currently-active pool; everything
+// else is 0. The active pool is the first pool containing a backend
+// that is both up AND has a non-zero configured weight — a pool
+// whose up backends are all weight=0 contributes no serving
+// capacity and gets skipped over in priority failover. Kept in
+// lock-step with internal/health/weights.go ActivePoolIndex.
+//
+// Frontend state rule: unknown if no backends or every referenced
+// backend is still in StateUnknown; up if any backend in any pool
+// has effective_weight > 0; otherwise down. Kept in lock-step with
+// internal/health/weights.go ComputeFrontendState.
+function recomputeDerivedState(snap: StateSnapshot) {
   const stateOf: Record<string, string> = {};
   for (const b of snap.backends) stateOf[b.name] = b.state;
   for (const fe of snap.frontends) {
@@ -41,12 +55,29 @@ function recomputeEffectiveWeights(snap: StateSnapshot) {
         break;
       }
     }
+    let anyEffective = false;
+    let seenAny = false;
+    let allUnknown = true;
+    const seen = new Set<string>();
     for (let i = 0; i < fe.pools.length; i++) {
       for (const pb of fe.pools[i].backends) {
         const st = stateOf[pb.name];
         pb.effective_weight = st === "up" && i === activePool ? pb.weight : 0;
+        if (pb.effective_weight > 0) anyEffective = true;
+        if (!seen.has(pb.name)) {
+          seen.add(pb.name);
+          seenAny = true;
+          if (st !== "unknown") allUnknown = false;
+        }
       }
     }
+    if (!seenAny || allUnknown) {
+      fe.state = "unknown";
+    } else if (anyEffective) {
+      fe.state = "up";
+    } else {
+      fe.state = "down";
+    }
   }
 }
@@ -61,6 +92,14 @@ const [state, setState] = createStore<FrontendState>({ byName: {} });
 export { state };
 export function replaceSnapshot(snap: StateSnapshot) {
+  // Recompute effective weights + aggregate frontend state locally
+  // from the snapshot's backends array, rather than trusting the
+  // server's state field verbatim. The server can be stale (the
+  // checker's frontendStates map is only updated on backend
+  // transitions, not on weight changes), so deriving from our own
+  // backend data is the only way to guarantee the display stays
+  // consistent with reality.
+  recomputeDerivedState(snap);
   setState(
     produce((s) => {
       s.byName[snap.maglevd.name] = snap;
@@ -70,7 +109,10 @@ export function replaceSnapshot(snap: StateSnapshot) {
 export function replaceAll(snaps: StateSnapshot[]) {
   const byName: Record<string, StateSnapshot> = {};
-  for (const s of snaps) byName[s.maglevd.name] = s;
+  for (const s of snaps) {
+    recomputeDerivedState(s);
+    byName[s.maglevd.name] = s;
+  }
   setState({ byName });
 }
@@ -96,25 +138,26 @@ export function applyBackendTransition(maglevd: string, p: BackendEventPayload)
       }
       // A backend state change can shift which pool is active and
       // therefore which pool-memberships get non-zero effective
-      // weights. Recompute for every frontend — not just the one
+      // weights, and in turn can flip the frontend's aggregate
+      // state. Recompute for every frontend — not just the one
       // pointed at by this backend — because pool-failover is a
       // per-frontend computation and the same backend can appear in
       // multiple frontends with different pool placements.
-      recomputeEffectiveWeights(snap);
+      recomputeDerivedState(snap);
     }),
   );
 }
-export function applyFrontendTransition(maglevd: string, p: FrontendEventPayload) {
-  setState(
-    produce((s) => {
-      const snap = s.byName[maglevd];
-      if (!snap) return;
-      const fe = snap.frontends.find((x) => x.name === p.frontend);
-      if (!fe) return;
-      fe.state = p.transition.to;
-    }),
-  );
+// Frontend-transition events arrive from the server's checker, but
+// the SPA no longer trusts their `to` field — recomputeDerivedState
+// walks the local backends array on every backend event and every
+// hydration to produce an up-to-date frontend state that the server
+// can't make stale. Kept as a named reducer so sse.ts's dispatch
+// table still has a landing spot for "frontend" events (they also
+// flow into the DebugPanel via pushEvent); the body is deliberately
+// empty — not a bug.
+export function applyFrontendTransition(_maglevd: string, _p: FrontendEventPayload) {
+  // no-op — state is derived client-side, see recomputeDerivedState
 }
 export function applyVPPStatus(maglevd: string, state: string) {
@@ -165,7 +208,7 @@ export function applyConfiguredWeight(
       const pb = p.backends.find((x) => x.name === backend);
       if (!pb) return;
       pb.weight = weight;
-      recomputeEffectiveWeights(snap);
+      recomputeDerivedState(snap);
     }),
   );
 }