Frontend flush-on-down policy; v0.9.3

Adds a per-frontend flush-on-down flag (default true). When the
flag is set and a backend transitions to StateDown, maglevd sets
is_flush=true on lb_as_set_weight, tearing down existing flows
pinned to the dead AS instead of just draining them. The health
checker's rise/fall debouncing already absorbs single-probe flaps,
so a fall-counted down is almost always a real outage. During a
real outage, VPP otherwise keeps steering existing flows at the
dead AS until retry, and that client-visible "connection refused"
oscillation window is a reliability regression worth closing by
default. Operators who want the pre-flag drain-only behaviour can
set flush-on-down: false per frontend.

BackendEffectiveWeight's truth table grows one axis: StateDown
now returns (0, flushOnDown); StateDisabled still unconditionally
flushes; StateUnknown / StatePaused still never flush. The unit
test pins all four combinations.
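
The truth table above can be sketched in Go roughly as follows. This
is an illustration only, not the code in this commit: the State type,
the StateUp case, the parameter list, and the zero weight returned for
StateUnknown / StatePaused are assumptions; only the flush column for
StateDown, StateDisabled, StateUnknown and StatePaused comes from the
description above.

```go
package main

import "fmt"

// State is a hypothetical stand-in for the health checker's backend state.
type State int

const (
	StateUnknown State = iota
	StatePaused
	StateDown
	StateDisabled
	StateUp // assumed healthy state, not named in the commit message
)

// BackendEffectiveWeight returns the weight to program and whether
// existing flows should be flushed. The signature is an assumption.
func BackendEffectiveWeight(s State, configured int, flushOnDown bool) (int, bool) {
	switch s {
	case StateUp:
		// Assumed: a healthy backend keeps its configured weight.
		return configured, false
	case StateDown:
		// The flush decision follows the per-frontend flag.
		return 0, flushOnDown
	case StateDisabled:
		// Disabled backends are flushed unconditionally.
		return 0, true
	default:
		// StateUnknown and StatePaused never flush; weight 0 is assumed.
		return 0, false
	}
}

func main() {
	for _, flag := range []bool{true, false} {
		w, f := BackendEffectiveWeight(StateDown, 100, flag)
		fmt.Printf("StateDown flushOnDown=%v -> weight=%d flush=%v\n", flag, w, f)
	}
}
```

Keeping the flag as a plain argument means the per-state decision
stays a pure function, which is what makes a table-driven unit test
over the combinations straightforward.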

The flag surfaces in the gRPC FrontendInfo message and in
`maglevc show frontend <name>` right next to src-ip-sticky.
2026-04-15 01:42:46 +02:00
parent 6293521157
commit 6b2b04b2d1
9 changed files with 78 additions and 36 deletions

@@ -212,6 +212,7 @@ message FrontendInfo {
   repeated PoolInfo pools = 5;
   string description = 6;
   bool src_ip_sticky = 7; // VPP LB uses src-IP-based stickiness for this VIP
+  bool flush_on_down = 8; // tear down existing flows when a backend goes down
 }
 message ListBackendsResponse {