VPP LB counters, src-ip-sticky, and frontend state aggregation

New feature: per-VIP / per-backend runtime counters
  * New GetVPPLBCounters RPC serving an in-process snapshot refreshed
    by a 5s scrape loop (internal/vpp/lbstats.go). Each cycle pulls
    the LB plugin's four SimpleCounters (next, first, untracked,
    no-server) plus the FIB /net/route/to CombinedCounter for every
    VIP and every backend host prefix via a single DumpStats call.
  * FIB stats-index discovery via ip_route_lookup (internal/vpp/
    fibstats.go); per-worker reduction happens in the collector.
  * Prometheus collector exports vip_packets_total (kind label),
    vip_route_{packets,bytes}_total, and backend_route_{packets,
    bytes}_total. Metrics source interface extended with VIPStats /
    BackendRouteStats; vpp.Client publishes snapshots via
    atomic.Pointer and clears them on disconnect (sketched after this
    list).
  * New 'show vpp lb counters' CLI command. The 'show vpp lbstate'
    and 'sync vpp lbstate' commands are restructured under 'show
    vpp lb {state,counters}' / 'sync vpp lb state' to make room
    for the new verb.
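
  The snapshot handover mentioned above can be sketched roughly as
  follows. This is illustrative only: the type and method names
  (lbSnapshot, publish, clear, Snapshot) are stand-ins, not the actual
  internal/vpp API.

    package vpp

    import (
        "sync/atomic"
        "time"
    )

    // lbSnapshot is one immutable scrape result; each cycle builds a
    // fresh one and swaps it in wholesale, so readers never see a
    // half-updated view.
    type lbSnapshot struct {
        TakenAt time.Time
        // per-VIP LB-plugin packet counters, keyed by VIP and then by
        // kind (next, first, untracked, no-server)
        VIPPackets map[string]map[string]uint64
    }

    // Client stands in for vpp.Client; a nil pointer means "no data yet".
    type Client struct {
        lbStats atomic.Pointer[lbSnapshot]
    }

    // publish swaps in the latest snapshot after a scrape cycle.
    func (c *Client) publish(s *lbSnapshot) { c.lbStats.Store(s) }

    // clear drops the snapshot on disconnect so stale counters are not
    // served to the RPC handler or the Prometheus collector.
    func (c *Client) clear() { c.lbStats.Store(nil) }

    // Snapshot is what readers call; they get either a complete
    // snapshot or nil, never a partially updated one.
    func (c *Client) Snapshot() *lbSnapshot { return c.lbStats.Load() }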

New feature: src-ip-sticky frontends
  * New frontend YAML key 'src-ip-sticky' (bool). Plumbed through
    config.Frontend, desiredVIP, and the lb_add_del_vip_v2 call.
  * Reflected in gRPC FrontendInfo.src_ip_sticky and VPPLBVIP.
    src_ip_sticky, and shown in 'show vpp lb state' output.
  * Scraped back from VPP by parsing 'show lb vips verbose' through
    cli_inband — lb_vip_details does not expose the flag. The same
    scrape also recovers the LB pool index for each VIP, which the
    stats-segment counters are keyed on. This is a documented
    temporary workaround until VPP ships an lb_vip_v2_dump.
  * src_ip_sticky cannot be mutated on a live VIP, so a flipped flag
    triggers a tear-down-and-recreate in reconcileVIP (ASes deleted
    with flush, VIP deleted, then re-added). The flip is logged.
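
  A minimal sketch of that recreate path, assuming illustrative
  stand-in names (VIPSpec, deleteASesWithFlush, deleteVIP, addVIP);
  the real reconcileVIP in internal/vpp works on the actual VIP types
  and issues binary-API calls instead of the stubs below:

    package vpp

    import "log/slog"

    // VIPSpec is a stand-in for the desired/observed VIP state.
    type VIPSpec struct {
        Prefix      string
        SrcIPSticky bool
    }

    // Client is a stand-in for the VPP API client; the stubs below
    // elide the actual binary-API calls.
    type Client struct{}

    func (c *Client) deleteASesWithFlush(prefix string) error { return nil }
    func (c *Client) deleteVIP(prefix string) error           { return nil }
    func (c *Client) addVIP(v VIPSpec) error                  { return nil }

    // reconcileVIP applies the desired spec. src_ip_sticky cannot be
    // changed on a live VIP, so a flipped flag tears the VIP down
    // (ASes first, with flush) and re-adds it with the new setting.
    func (c *Client) reconcileVIP(want, have VIPSpec) error {
        if want.SrcIPSticky != have.SrcIPSticky {
            slog.Info("src-ip-sticky changed, recreating VIP",
                "vip", want.Prefix, "sticky", want.SrcIPSticky)
            if err := c.deleteASesWithFlush(have.Prefix); err != nil {
                return err
            }
            if err := c.deleteVIP(have.Prefix); err != nil {
                return err
            }
        }
        return c.addVIP(want)
    }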

New feature: frontend state aggregation and events
  * New health.FrontendState (unknown/up/down) and FrontendTransition
    types. A frontend is 'up' iff at least one backend has a nonzero
    effective weight, 'unknown' iff no backend has real state yet,
    and 'down' otherwise (see the sketch following this list).
  * Checker tracks per-frontend aggregate state, recomputing after
    each backend transition and emitting a frontend-transition Event
    on change. Reload drops entries for removed frontends.
  * checker.Event gains an optional FrontendTransition pointer;
    backend- vs. frontend-transition events are demultiplexed on
    that field.
  * WatchEvents now sends an initial snapshot of frontend state on
    connect (mirroring the existing backend snapshot), subscribes
    once to the checker stream, and fans out to backend/frontend
    handlers based on the client's filter flags. The proto
    FrontendEvent message grows name + transition fields.
  * New Checker.FrontendState accessor.
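
  The aggregation rule from the first bullet, sketched with simplified
  inputs. This is a sketch only: the real health.ComputeFrontendState
  works from the frontend config and per-backend health states rather
  than the flattened maps used here.

    package health

    // FrontendState mirrors the unknown/up/down aggregate described
    // above (simplified re-declaration for this sketch).
    type FrontendState int

    const (
        FrontendStateUnknown FrontendState = iota
        FrontendStateUp
        FrontendStateDown
    )

    // aggregate applies the rule: up if any backend carries a nonzero
    // effective weight, unknown if no backend has reported a real
    // state yet, down otherwise.
    func aggregate(effWeight map[string]uint32, known map[string]bool) FrontendState {
        anyKnown := false
        for backend, w := range effWeight {
            if w > 0 {
                return FrontendStateUp
            }
            if known[backend] {
                anyKnown = true
            }
        }
        if !anyKnown {
            return FrontendStateUnknown
        }
        return FrontendStateDown
    }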

Refactor: pure health helpers
  * Moved the priority-failover selector and the (pool idx, active
    pool, state, cfg weight) → (vpp weight, flush) mapping out of
    internal/vpp/lbsync.go into a new internal/health/weights.go so
    the checker can reuse them for frontend-state computation
    without importing internal/vpp.
  * New functions: health.ActivePoolIndex, BackendEffectiveWeight,
    EffectiveWeights, ComputeFrontendState. lbsync.go now calls
    these directly; vpp.EffectiveWeights is a thin wrapper over
    health.EffectiveWeights retained for the gRPC observability
    path. Fully unit-tested in internal/health/weights_test.go.
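
  The priority-failover idea behind health.ActivePoolIndex, sketched
  under the assumption that pools are tried in priority order and the
  first pool with at least one healthy backend wins; the real helper's
  signature and tie-breaking may differ.

    package health

    // activePoolIndex returns the index of the first pool (in priority
    // order) containing at least one healthy backend, or -1 when no
    // pool qualifies and the frontend has no usable backends.
    func activePoolIndex(pools [][]string, healthy map[string]bool) int {
        for i, backends := range pools {
            for _, b := range backends {
                if healthy[b] {
                    return i
                }
            }
        }
        return -1
    }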

maglevc polish
  * --color default is now mode-aware: on in the interactive shell,
    off in one-shot mode so piped output is script-safe. Explicit
    --color=true/false still overrides.
  * New stripHostMask helper drops /32 and /128 from VIP display;
    non-host prefixes pass through unchanged (see the sketch after
    this list).
  * Counter table column order fixed (first before next) and
    packets/bytes columns renamed to fib-packets/fib-bytes to
    clarify they come from the FIB, not the LB plugin.
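
  The stripHostMask behaviour, sketched here with net/netip (the
  concrete implementation in maglevc may differ): host prefixes lose
  their mask for display, everything else passes through.

    package main

    import (
        "fmt"
        "net/netip"
    )

    // stripHostMask drops the /32 or /128 mask from a host prefix for
    // display; any other prefix (or an unparsable string) is returned
    // unchanged.
    func stripHostMask(s string) string {
        p, err := netip.ParsePrefix(s)
        if err != nil {
            return s
        }
        if p.Bits() == p.Addr().BitLen() { // /32 for IPv4, /128 for IPv6
            return p.Addr().String()
        }
        return s
    }

    func main() {
        fmt.Println(stripHostMask("192.0.2.10/32"))   // 192.0.2.10
        fmt.Println(stripHostMask("2001:db8::1/128")) // 2001:db8::1
        fmt.Println(stripHostMask("2001:db8::/32"))   // unchanged: not a host prefix
    }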

Docs
  * config-guide: document src-ip-sticky, including the VIP
    recreate-on-change caveat.
  * user-guide, maglevc.1, maglevd.8: updated command tree, new
    counters command, color defaults, and the src-ip-sticky field.

@@ -23,13 +23,26 @@ type BackendSnapshot struct {
 	Config config.Backend
 }
 
-// Event is emitted on every backend state transition, once per frontend that
-// references the backend.
+// Event is emitted on every state transition the checker observes. There are
+// two kinds, distinguished by which of BackendName or FrontendTransition is
+// populated:
+//
+//   - Backend transition: FrontendName is the frontend that references the
+//     backend (one event per frontend per backend transition), BackendName
+//     and Backend are set, and Transition carries the health.Transition.
+//     FrontendTransition is nil.
+//   - Frontend transition: FrontendName is the frontend whose aggregate state
+//     changed, FrontendTransition is non-nil. BackendName and Backend are
+//     empty, Transition is the zero value.
+//
+// Consumers dispatch on FrontendTransition != nil.
 type Event struct {
 	FrontendName string
 	BackendName string
 	Backend net.IP
 	Transition health.Transition
+
+	FrontendTransition *health.FrontendTransition
 }
 
 type worker struct {
@@ -49,6 +62,13 @@ type Checker struct {
 	mu sync.RWMutex
 	workers map[string]*worker // keyed by backend name
 
+	// frontendStates tracks the aggregated state of every configured frontend
+	// (unknown/up/down). Updated whenever a backend transition happens; a
+	// change emits a frontend-transition Event. The zero value for a missing
+	// key is FrontendStateUnknown, so initial-reference accesses behave
+	// correctly even without explicit seeding.
+	frontendStates map[string]health.FrontendState
+
 	subsMu sync.Mutex
 	nextID int
 	subs map[int]chan Event
@@ -58,10 +78,11 @@ type Checker struct {
 
 // New creates a Checker. Call Run to start probing.
 func New(cfg *config.Config) *Checker {
 	return &Checker{
-		cfg:     cfg,
-		workers: make(map[string]*worker),
-		subs:    make(map[int]chan Event),
-		eventCh: make(chan Event, 256),
+		cfg:            cfg,
+		workers:        make(map[string]*worker),
+		frontendStates: make(map[string]health.FrontendState),
+		subs:           make(map[int]chan Event),
+		eventCh:        make(chan Event, 256),
 	}
 }
@@ -131,6 +152,13 @@ func (c *Checker) Reload(ctx context.Context, cfg *config.Config) error {
 		c.emitForBackend(name, c.workers[name].backend.Address, c.workers[name].backend.Transitions[0], cfg.Frontends)
 	}
 
+	// Drop frontendStates entries for frontends no longer in config.
+	for feName := range c.frontendStates {
+		if _, ok := cfg.Frontends[feName]; !ok {
+			delete(c.frontendStates, feName)
+		}
+	}
+
 	c.cfg = cfg
 	return nil
 }
@@ -174,6 +202,18 @@ func (c *Checker) BackendState(name string) (health.State, bool) {
 	return w.backend.State, true
 }
 
+// FrontendState returns the current aggregate state of a frontend (unknown,
+// up, or down). Returns (FrontendStateUnknown, false) when the frontend is
+// not known to the checker.
+func (c *Checker) FrontendState(name string) (health.FrontendState, bool) {
+	c.mu.RLock()
+	defer c.mu.RUnlock()
+	if _, ok := c.cfg.Frontends[name]; !ok {
+		return health.FrontendStateUnknown, false
+	}
+	return c.frontendStates[name], true
+}
+
 // ListFrontends returns the names of all configured frontends.
 func (c *Checker) ListFrontends() []string {
 	c.mu.RLock()
@@ -575,24 +615,60 @@ func (c *Checker) runProbe(ctx context.Context, name string, pos, total int) {
 	}
 }
 
-// emitForBackend emits one Event per frontend that references backendName
-// (in any pool), using the provided frontends map. Must be called with c.mu held.
+// emitForBackend emits one backend-transition Event per frontend that
+// references backendName (in any pool), using the provided frontends map.
+// After emitting the backend event for a frontend, it also re-computes that
+// frontend's aggregate state and emits a frontend-transition Event if the
+// state has changed. Must be called with c.mu held.
 func (c *Checker) emitForBackend(backendName string, addr net.IP, t health.Transition, frontends map[string]config.Frontend) {
 	for feName, fe := range frontends {
-		emitted := false
-		for _, pool := range fe.Pools {
-			if emitted {
-				break
-			}
-			for name := range pool.Backends {
-				if name == backendName {
-					c.emit(Event{FrontendName: feName, BackendName: backendName, Backend: addr, Transition: t})
-					emitted = true
-					break
-				}
-			}
-		}
+		if !frontendReferencesBackend(fe, backendName) {
+			continue
+		}
+		c.emit(Event{FrontendName: feName, BackendName: backendName, Backend: addr, Transition: t})
+		c.updateFrontendState(feName, fe)
 	}
 }
 
+// frontendReferencesBackend reports whether fe has the named backend in any
+// of its pools.
+func frontendReferencesBackend(fe config.Frontend, backendName string) bool {
+	for _, pool := range fe.Pools {
+		if _, ok := pool.Backends[backendName]; ok {
+			return true
+		}
+	}
+	return false
+}
+
+// updateFrontendState recomputes the aggregate state of fe, compares against
+// the last known state, and emits a frontend-transition Event on change.
+// Must be called with c.mu held. The current state is read from the worker
+// map — so the caller (who already holds c.mu) sees a consistent view.
+func (c *Checker) updateFrontendState(feName string, fe config.Frontend) {
+	states := make(map[string]health.State)
+	for _, pool := range fe.Pools {
+		for bName := range pool.Backends {
+			if w, ok := c.workers[bName]; ok {
+				states[bName] = w.backend.State
+			} else {
+				states[bName] = health.StateUnknown
+			}
+		}
+	}
+	newState := health.ComputeFrontendState(fe, states)
+	old := c.frontendStates[feName] // zero value (Unknown) on first access
+	if old == newState {
+		return
+	}
+	c.frontendStates[feName] = newState
+	ft := health.FrontendTransition{From: old, To: newState, At: time.Now()}
+	slog.Info("frontend-transition",
+		"frontend", feName,
+		"from", old.String(),
+		"to", newState.String(),
+	)
+	c.emit(Event{FrontendName: feName, FrontendTransition: &ft})
+}
+
 // emit sends an event to the internal fan-out channel (non-blocking).