VPP LB counters, src-ip-sticky, and frontend state aggregation

New feature: per-VIP / per-backend runtime counters
  * New GetVPPLBCounters RPC serving an in-process snapshot refreshed
    by a 5s scrape loop (internal/vpp/lbstats.go). Each cycle pulls
    the LB plugin's four SimpleCounters (next, first, untracked,
    no-server) plus the FIB /net/route/to CombinedCounter for every
    VIP and every backend host prefix via a single DumpStats call.
  * FIB stats-index discovery via ip_route_lookup
    (internal/vpp/fibstats.go); per-worker reduction happens in the
    collector.
  * Prometheus collector exports vip_packets_total (kind label),
    vip_route_{packets,bytes}_total, and
    backend_route_{packets,bytes}_total. Metrics source interface
    extended with VIPStats / BackendRouteStats; vpp.Client publishes
    snapshots via atomic.Pointer and clears them on disconnect.
  * New 'show vpp lb counters' CLI command. The 'show vpp lbstate'
    and 'sync vpp lbstate' commands move to 'show vpp lb state' and
    'sync vpp lb state' so that 'counters' can sit beside 'state'
    under 'show vpp lb'.
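
A minimal sketch of the two mechanisms named above: summing per-worker
counter values into one total, and publishing an immutable snapshot
through atomic.Pointer that is cleared on disconnect. The Counter and
Snapshot types and the reduceWorkers helper are hypothetical names for
illustration only; the real code lives in internal/vpp/lbstats.go and
the collector and may be shaped differently.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Counter is a hypothetical packets/bytes pair, similar in spirit to
// what a combined counter carries for each worker thread.
type Counter struct {
	Packets uint64
	Bytes   uint64
}

// Snapshot is a hypothetical immutable view of one scrape cycle.
type Snapshot struct {
	VIPRoute map[string]Counter // keyed by VIP prefix
}

// reduceWorkers sums the per-worker values for a single stats index.
func reduceWorkers(perWorker []Counter) Counter {
	var total Counter
	for _, c := range perWorker {
		total.Packets += c.Packets
		total.Bytes += c.Bytes
	}
	return total
}

// current holds the latest snapshot; nil means "no data", for example
// after a disconnect.
var current atomic.Pointer[Snapshot]

func main() {
	perWorker := []Counter{{Packets: 10, Bytes: 1000}, {Packets: 5, Bytes: 400}}
	snap := &Snapshot{VIPRoute: map[string]Counter{
		"192.0.2.1/32": reduceWorkers(perWorker),
	}}
	current.Store(snap) // publish: readers never observe a half-built map

	if s := current.Load(); s != nil {
		c := s.VIPRoute["192.0.2.1/32"]
		fmt.Println(c.Packets, c.Bytes) // prints "15 1400"
	}

	current.Store(nil)                 // clear on disconnect
	fmt.Println(current.Load() == nil) // prints "true"
}
```

Swapping a pointer to a fully built snapshot lets RPC and Prometheus
readers run without taking a lock against the scrape loop.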

New feature: src-ip-sticky frontends
  * New frontend YAML key 'src-ip-sticky' (bool). Plumbed through
    config.Frontend, desiredVIP, and the lb_add_del_vip_v2 call.
  * Reflected in gRPC FrontendInfo.src_ip_sticky and
    VPPLBVIP.src_ip_sticky, and shown in 'show vpp lb state' output.
  * Scraped back from VPP by parsing 'show lb vips verbose' through
    cli_inband, because lb_vip_details does not expose the flag. The
    same scrape also recovers the LB pool index for each VIP, which
    the stats-segment counters are keyed on. This is a documented
    temporary workaround until VPP ships an lb_vip_v2_dump.
  * src_ip_sticky cannot be mutated on a live VIP, so a flipped flag
    triggers a tear-down-and-recreate in reconcileVIP (ASes deleted
    with flush, VIP deleted, then VIP re-added). The flip is logged.
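
The recreate-on-flip rule can be sketched as a small planning function.
This is an illustration under assumptions, not the actual reconcileVIP:
vipSpec, planVIP, and the step names are hypothetical, and the real code
issues VPP binary-API calls rather than returning strings.

```go
package main

import "fmt"

// vipSpec is a hypothetical, trimmed-down view of a VIP.
type vipSpec struct {
	Prefix      string
	SrcIPSticky bool
}

// planVIP returns the ordered steps needed to move VPP from actual to
// desired. actual == nil means the VIP does not exist yet.
func planVIP(desired vipSpec, actual *vipSpec) []string {
	switch {
	case actual == nil:
		return []string{"add-vip"}
	case actual.SrcIPSticky != desired.SrcIPSticky:
		// The flag cannot be changed in place, so the VIP is recreated:
		// application servers go first (with flush), then the VIP.
		return []string{"del-as-flush", "del-vip", "add-vip"}
	default:
		return nil // nothing to do for the sticky flag
	}
}

func main() {
	live := &vipSpec{Prefix: "2001:db8::1/128", SrcIPSticky: false}
	want := vipSpec{Prefix: "2001:db8::1/128", SrcIPSticky: true}
	fmt.Println(planVIP(want, live)) // prints "[del-as-flush del-vip add-vip]"
}
```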

New feature: frontend state aggregation and events
  * New health.FrontendState (unknown/up/down) and FrontendTransition
    types. A frontend is 'up' iff at least one backend has a nonzero
    effective weight, 'unknown' iff no backend has real state yet,
    and 'down' otherwise.
  * Checker tracks per-frontend aggregate state, recomputing after
    each backend transition and emitting a frontend-transition Event
    on change. Reload drops entries for removed frontends.
  * checker.Event gains an optional FrontendTransition pointer;
    backend- vs. frontend-transition events are demultiplexed on
    that field.
  * WatchEvents now sends an initial snapshot of frontend state on
    connect (mirroring the existing backend snapshot), subscribes
    once to the checker stream, and fans out to backend/frontend
    handlers based on the client's filter flags. The proto
    FrontendEvent message grows name + transition fields.
  * New Checker.FrontendState accessor.
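
The up/unknown/down rule stated above can be written as a short pure
function. This is a sketch, assuming a hypothetical backendView input
type; the real health.ComputeFrontendState takes whatever inputs the
checker has and may order its checks differently. A frontend with no
backends at all is treated here as 'unknown', which is an assumption.

```go
package main

import "fmt"

// FrontendState mirrors the three states named in the commit message.
type FrontendState int

const (
	FrontendUnknown FrontendState = iota
	FrontendUp
	FrontendDown
)

func (s FrontendState) String() string {
	switch s {
	case FrontendUp:
		return "up"
	case FrontendDown:
		return "down"
	default:
		return "unknown"
	}
}

// backendView is a hypothetical per-backend input for the aggregate.
type backendView struct {
	Known           bool // false while the backend has no real state yet
	EffectiveWeight int  // weight after pool failover
}

// computeFrontendState applies the rule: up iff any backend has a
// nonzero effective weight, unknown iff no backend has real state yet,
// down otherwise.
func computeFrontendState(backends []backendView) FrontendState {
	anyKnown := false
	for _, b := range backends {
		if b.EffectiveWeight > 0 {
			return FrontendUp
		}
		if b.Known {
			anyKnown = true
		}
	}
	if !anyKnown {
		return FrontendUnknown
	}
	return FrontendDown
}

func main() {
	fmt.Println(computeFrontendState([]backendView{{Known: true, EffectiveWeight: 100}})) // up
	fmt.Println(computeFrontendState([]backendView{{Known: false}}))                      // unknown
	fmt.Println(computeFrontendState([]backendView{{Known: true}, {Known: false}}))       // down
}
```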

Refactor: pure health helpers
  * Moved the priority-failover selector and the (pool idx, active
    pool, state, cfg weight) → (vpp weight, flush) mapping out of
    internal/vpp/lbsync.go into a new internal/health/weights.go so
    the checker can reuse them for frontend-state computation
    without importing internal/vpp.
  * New functions: health.ActivePoolIndex, BackendEffectiveWeight,
    EffectiveWeights, ComputeFrontendState. lbsync.go now calls
    these directly; vpp.EffectiveWeights is a thin wrapper over
    health.EffectiveWeights retained for the gRPC observability
    path. Fully unit-tested in internal/health/weights_test.go.

maglevc polish
  * --color default is now mode-aware: on in the interactive shell,
    off in one-shot mode so piped output is script-safe. Explicit
    --color=true/false still overrides.
  * New stripHostMask helper drops /32 and /128 from VIP display;
    non-host prefixes pass through unchanged.
  * Counter table column order fixed (first before next) and
    packets/bytes columns renamed to fib-packets/fib-bytes to
    clarify they come from the FIB, not the LB plugin.
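
Two of the maglevc changes above are small enough to sketch. The code
below is an illustration, not the shipped implementation: stripHostMask
is written here with net/netip, and colorEnabled with its explicit
parameter is a hypothetical way to model "flag not given".

```go
package main

import (
	"fmt"
	"net/netip"
)

// stripHostMask drops the mask from host prefixes (/32 for IPv4, /128
// for IPv6) and returns every other input unchanged.
func stripHostMask(s string) string {
	p, err := netip.ParsePrefix(s)
	if err != nil {
		return s // not a prefix: leave it alone
	}
	if p.Bits() == p.Addr().BitLen() {
		return p.Addr().String()
	}
	return s
}

// colorEnabled resolves the mode-aware default. explicit is nil when
// the user did not pass the color flag on the command line.
func colorEnabled(interactive bool, explicit *bool) bool {
	if explicit != nil {
		return *explicit // an explicit flag always wins
	}
	return interactive // on in the shell, off in one-shot mode
}

func main() {
	fmt.Println(stripHostMask("192.0.2.1/32"))    // 192.0.2.1
	fmt.Println(stripHostMask("2001:db8::1/128")) // 2001:db8::1
	fmt.Println(stripHostMask("10.0.0.0/24"))     // 10.0.0.0/24

	forceOn := true
	fmt.Println(colorEnabled(false, nil))      // false: one-shot default
	fmt.Println(colorEnabled(false, &forceOn)) // true: explicit override
}
```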

Docs
  * config-guide: document src-ip-sticky, including the VIP
    recreate-on-change caveat.
  * user-guide, maglevc.1, maglevd.8: updated command tree, new
    counters command, color defaults, and the src-ip-sticky field.
2026-04-12 15:59:02 +02:00
parent d5fbf5c640
commit fb62532fd5
25 changed files with 2163 additions and 549 deletions


@@ -288,6 +288,15 @@ ordered list of backend pools. The gRPC API exposes frontends by name.
cascades across further tiers. See [healthchecks.md](healthchecks.md#pool-failover)
for the full failover semantics. All backends across all pools in a frontend must
have addresses of the same address family (all IPv4 or all IPv6).
* ***src-ip-sticky***: Boolean, default `false`. When `true`, the VPP load-balancer
programs this VIP with source-IP-based stickiness — all flows from the same client
source IP hash to the same backend (subject to the Maglev consistent-hash bucket
assignment). Use this for protocols that require session affinity at the L3 level,
or when clients open many short flows that should land on one backend. Changing this
field in a running config and reloading causes maglevd to tear down the VIP (all
application servers are deleted with flush, then the VIP itself is deleted) and
recreate it with the new value; VPP has no API to mutate `src_ip_sticky` on an
existing VIP, and existing flow state cannot be preserved across the flip.
Each pool has:
@@ -324,6 +333,7 @@ frontends:
address: 2001:db8::1
protocol: tcp
port: 993
src-ip-sticky: true
pools:
- name: primary
backends:


@@ -12,15 +12,21 @@ is an interactive CLI client for
.BR maglevd (8).
Without arguments it opens a readline shell with tab completion and
inline help.
A command may also be passed directly on the command line for one\-shot use,
which is useful for scripting (use
.B \-color=false
to suppress ANSI codes).
A command may also be passed directly on the command line for one\-shot
use, which is useful for scripting; in that mode ANSI color is
disabled by default so the output is script\-safe. Pass
.B \-color=true
explicitly if you want color in one\-shot mode.
.PP
When the shell starts it prints the build version and connects to the
.B maglevd
gRPC server specified by
.BR \-server .
All static tokens support tab completion; dynamic names (frontend,
backend, health\-check names) are completed by querying the server.
Type
.B ?
at any point to list completions without advancing the input.
.SH OPTIONS
.TP
.BI \-server " addr"
@@ -31,81 +37,59 @@ gRPC server.
.IR localhost:9090 )
.TP
.BR \-color [=\fIbool\fR]
Colorize static field labels in output using ANSI dark blue.
(default: true)
Pass
Colorize static field labels in output using ANSI dark blue. The
default is mode\-aware: enabled (true) in the interactive shell, and
disabled (false) in one\-shot mode so that output piped into scripts
or files stays free of escape codes. Pass
.B \-color=true
or
.B \-color=false
to disable, e.g.\& when piping output.
.SH COMMANDS
Commands are entered at the
.B maglevc>
prompt or passed as arguments on the command line.
All static tokens support tab completion; dynamic names (frontend, backend,
health\-check names) are completed by querying the server.
Type
.B ?
at any point to list completions without advancing the input.
.SS Show commands
.TP
.B show version
Print build version, commit hash, and build date.
.TP
.B show frontends
List all configured frontends.
.TP
.BI "show frontend " name
Show address, protocol, port, description, and pools (with weights and
disabled\-backend notation) for the named frontend.
.TP
.B show backends
List all active backends.
.TP
.BI "show backend " name
Show address, current health state (with duration), enabled flag,
health\-check name, and recent state transitions with timestamps.
.TP
.B show healthchecks
List all configured health checks.
.TP
.BI "show healthcheck " name
Show the full configuration of the named health check.
.SS Set commands
.TP
.BI "set backend " "name " pause
Pause health checking for a backend, freezing its state.
.TP
.BI "set backend " "name " resume
Resume health checking for a backend; state resets to
.BR unknown .
.SS Shell commands
.TP
.BR quit ", " exit
Exit the interactive shell.
.SH COMPLETION
In interactive mode, press
.B Tab
to complete the current token.
If more than one completion is possible, all candidates are listed.
Type
.B ?
anywhere on the line to list candidates at that position without consuming
the character or advancing the cursor.
explicitly to override the default for either mode.
.SH EXAMPLES
One\-shot query (no color, suitable for scripts):
Open the interactive shell (no command on the command line). Tab
completes the current token; typing
.B ?
lists candidates at the cursor.
.B quit
or
.B exit
(or Ctrl\-D) leaves the shell:
.PP
.RS
.EX
maglevc \-color=false show backends
$ maglevc
maglevc> show frontends
\&...
maglevc> quit
.EE
.RE
.PP
Interactive session:
One\-shot query passed on the command line (color is off by default
in this mode so the output is script\-safe):
.PP
.RS
.EX
maglevc \-server 10.0.0.1:9090
$ maglevc show frontends
.EE
.RE
.PP
Query VPP version and connection status, forcing color on:
.PP
.RS
.EX
$ maglevc \-color=true show vpp info
.EE
.RE
.SH "FULL DOCUMENTATION"
This manpage documents only the invocation of
.BR maglevc .
For the complete command reference — every
.BR show ", " set ", " sync ", " config ", and " watch
command, with examples and operational notes — see the user guide at:
.PP
.RS
https://git.ipng.ch/ipng/vpp-maglev/docs/user-guide.md
.RE
.SH SEE ALSO
.BR maglevd (8)
.SH AUTHOR


@@ -1,6 +1,6 @@
.TH MAGLEVD 8 "April 2026" "vpp\-maglev" "System Administration"
.SH NAME
maglevd \- Maglev health\-checker daemon
maglevd \- Maglev health\-checker daemon and VPP load\-balancer controller
.SH SYNOPSIS
.B maglevd
[\fB\-config\fR \fIfile\fR]
@@ -10,23 +10,44 @@ maglevd \- Maglev health\-checker daemon
[\fB\-version\fR]
.SH DESCRIPTION
.B maglevd
is a health\-checker daemon that monitors backends (HTTP, TCP, ICMP) and
exposes their aggregated state via a gRPC API.
Configuration is loaded from a YAML file.
A running daemon reloads its configuration when it receives
.BR SIGHUP .
has two responsibilities:
.IP \(bu 2
It monitors backends with active health checks (HTTP, TCP, ICMP) and
aggregates the results into a state machine per backend. Probe
intervals adapt automatically: a faster interval is used while a
backend is degraded, and a slower one once it has been continuously
down. Operators can also
.B pause/resume
or
.B disable/enable
a backend out of band.
.IP \(bu 2
It programs the VPP dataplane via the
.B lb
(load\-balancer) plugin. Each frontend in the config becomes a VPP
load\-balancer VIP; each healthy backend becomes an application server
under its frontend's VIPs. State transitions drive
application\-server weight updates over the VPP binary API so that
unhealthy backends stop receiving new flows. Drift between the
running config and VPP's view is reconciled periodically
.RB ( maglev.vpp.lb.sync-interval ,
default 30s), on
.B SIGHUP
reloads, and on operator request via
.BR maglevc .
.PP
Backends are tracked with a rise/fall counter model.
Each backend cycles through the states
.BR unknown ,
.BR up ,
.BR down ,
and
.B paused
(operator\-set).
Health\-check intervals adapt automatically: a faster interval is used when
a backend is not fully healthy, and a slower interval when it has been
continuously down.
The aggregated backend state, VPP dataplane state, and per\-VIP /
per\-backend stats\-segment counters are exposed via a gRPC API (and
scraped into Prometheus when the
.B /metrics
endpoint is enabled).
See
.BR maglevc (1)
for the interactive CLI client.
.PP
Configuration is loaded from a YAML file. A running daemon reloads
its configuration when it receives
.BR SIGHUP .
.SH OPTIONS
Each flag may also be supplied via an environment variable (shown in
parentheses); the flag takes precedence.
@@ -63,12 +84,16 @@ Print version, commit hash, and build date, then exit.
.SH SIGNALS
.TP
.B SIGHUP
Reload the configuration file without restarting.
New backends are added, removed backends are stopped, and unchanged
backend workers are left running.
Reload the configuration file without restarting. New backends are
added, removed backends are stopped, and unchanged backend workers
keep running. After the reload, the VPP dataplane is reconciled so
added/removed frontends and application servers are programmed
immediately.
.TP
.BR SIGTERM ", " SIGINT
Gracefully shut down: drain active gRPC streams, then exit.
Gracefully shut down: drain active gRPC streams, then exit. VPP
dataplane state is left in place so that existing VIPs continue to
forward traffic during a restart.
.SH FILES
.TP
.I /etc/vpp-maglev/maglev.yaml
@@ -77,20 +102,34 @@ Default configuration file (YAML).
.I /etc/default/vpp-maglev
Environment file sourced by the systemd unit before starting
.BR maglevd .
.SH CONFIGURATION
The configuration file uses YAML and has four top\-level sections under the
.B maglev
key:
.BR healthchecker ,
.BR healthchecks ,
.BR backends ,
and
.BR frontends .
.TP
.I /run/vpp/api.sock
VPP binary\-API socket
.BR maglevd
connects to when programming the
.B lb
plugin.
.TP
.I /run/vpp/stats.sock
VPP stats\-segment socket used to scrape per\-VIP and per\-backend
packet/byte counters.
.SH "FULL DOCUMENTATION"
This manpage documents only the invocation of
.BR maglevd .
For the configuration file reference (including the
.B maglev.vpp.lb
section controlling the VPP integration), the health\-check state
machine, and the full operational guide, see:
.PP
See the example at
.I /etc/vpp-maglev/maglev.yaml
and the full reference in the project documentation.
.RS
https://git.ipng.ch/ipng/vpp-maglev/docs/config-guide.md
.br
https://git.ipng.ch/ipng/vpp-maglev/docs/healthchecks.md
.br
https://git.ipng.ch/ipng/vpp-maglev/docs/user-guide.md
.RE
.SH SEE ALSO
.BR maglevc (1)
.BR maglevc (1),
.BR vpp (8)
.SH AUTHOR
Pim van Pelt <pim@ipng.ch>


@@ -100,11 +100,12 @@ maglevc [--server host:port] [--color[=bool]] [command...]
| Flag | Default | Description |
|---|---|---|
| `--server` | `localhost:9090` | Address of the `maglevd` gRPC server. |
| `--color` | `true` | Colorize static field labels in output (dark blue ANSI). Pass `--color=false` to disable, e.g. when piping. |
| `--color` | mode-aware | Colorize static field labels (dark blue ANSI). Defaults to `true` in the interactive shell and `false` in one-shot mode, so output piped into scripts stays free of escape codes. Pass `--color=true` or `--color=false` explicitly to override either default. |
When `command` arguments are supplied the command is executed and `maglevc`
exits. When no arguments are given an interactive shell is started and the
build version is printed on entry.
exits; in this mode ANSI color is off by default so the output is script-safe.
When no arguments are given an interactive shell is started, the build version
is printed on entry, and color is on by default.
### Commands
@@ -112,9 +113,9 @@ build version is printed on entry.
show version Print build version, commit hash, and build date.
show frontends [<name>] Without name: list all frontend names.
With name: show address, protocol, port, description,
and pools. Each pool lists its backends with two
weight columns:
With name: show address, protocol, port, src-ip-sticky,
description, and pools. Each pool lists its backends
with two weight columns:
weight — configured weight from the YAML
effective — state-aware weight after pool failover
(what gets programmed into VPP)
@@ -131,12 +132,20 @@ show healthchecks [<name>] Without name: list all health-check names.
show vpp info Show VPP version, build date, PID, uptime, and when
maglevd connected. Returns an error if VPP is not
connected.
show vpp lbstate Show the VPP load-balancer plugin state: global
show vpp lb state Show the VPP load-balancer plugin state: global
configuration, configured VIPs, and their attached
application servers (address, weight, bucket count).
Returns an error if VPP is not connected.
show vpp lb counters Show per-VIP and per-backend packet/byte counters
from the VPP stats segment, refreshed roughly every
five seconds by maglevd. Each VIP row reports the LB
plugin counters (next, first, untracked, no-server)
and the FIB packets/bytes at the VIP's host prefix.
Each backend row reports FIB packets/bytes at the
backend's /32 or /128 prefix. Use Prometheus for
live rates; this command shows absolute values.
sync vpp lbstate [<name>] Reconcile the VPP load-balancer dataplane from the
sync vpp lb state [<name>] Reconcile the VPP load-balancer dataplane from the
running config. Without a name: runs a full sync —
creates missing VIPs, removes stale VIPs, and adjusts
application-server membership and weights across all