VPP load-balancer dataplane integration: state, sync, and global conf

This commit wires maglevd through to VPP's LB plugin end-to-end, using
locally-generated GoVPP bindings for the newer v2 API messages.

VPP binapi (vendored)
- New package internal/vpp/binapi/ containing lb, lb_types, ip_types, and
  interface_types, generated from a local VPP build (~/src/vpp) via a new
  'make vpp-binapi' target. GoVPP v0.12.0 upstream lacks the v2 messages we
  need (lb_conf_get, lb_add_del_vip_v2, lb_add_del_as_v2, lb_as_v2_dump,
  lb_as_set_weight), so we commit the generated output in-tree.
- All generated files go through our loggedChannel wrapper; every VPP API
  send/receive is recorded at DEBUG via slog (vpp-api-send / vpp-api-recv /
  vpp-api-send-multi / vpp-api-recv-multi) so the full wire-level trail is
  auditable. NewAPIChannel is unexported — callers must use c.apiChannel().

Read path: GetLBState{All,VIP}
- GetLBStateAll returns a full snapshot (global conf + every VIP with its
  attached application servers).
- GetLBStateVIP looks up a single VIP by (prefix, protocol, port) and
  returns (nil, nil) when the VIP doesn't exist in VPP. This is the
  efficient path for targeted updates on a busy LB.
- Helpers factored out: getLBConf, dumpAllVIPs, dumpASesForVIP, lookupVIP,
  vipFromDetails.

Write path: SyncLBState{All,VIP}
- SyncLBStateAll reconciles every configured frontend with VPP: creates
  missing VIPs, removes stale ones (with AS flush), and reconciles AS
  membership and weights within VIPs that exist on both sides.
- SyncLBStateVIP targets a single frontend by name. Never removes VIPs.
  Returns ErrFrontendNotFound (wrapped with the name) when the frontend
  isn't in config, so callers can use errors.Is.
- Shared reconcileVIP helper does the per-VIP AS diff; removeVIP is used
  only by the full-sync pass.
- LbAddDelVipV2 requests always set NewFlowsTableLength=1024. The .api
  default=1024 annotation is only applied by VAT/CLI parsers, not wire-
  level marshalling — sending 0 caused VPP to vec_validate with mask
  0xFFFFFFFF and OOM-panic.
- Pool semantics: backends in the primary (first) pool of a frontend get
  their configured weight; backends in secondary pools get weight 0. All
  backends are installed so higher layers can flip weights on failover
  without add/remove churn.
- Every individual change emits a DEBUG slog (vpp-lbsync-vip-add/del,
  vpp-lbsync-as-add/del, vpp-lbsync-as-weight). Start/done INFO logs
  carry a scope=all|vip label plus aggregate counts.

Global conf push: SetLBConf
- New SetLBConf(cfg) sends lb_conf with ipv4-src, ipv6-src, sticky-buckets,
  and flow-timeout. Called automatically on VPP (re)connect and after
  every config reload (via doReloadConfig). Results are cached on the
  Client so redundant pushes are silently skipped — only actual changes
  produce a vpp-lb-conf-set INFO log line.

Periodic drift reconciliation
- vpp.Client.lbSyncLoop runs in a goroutine tied to each VPP connection's
  lifetime. Its first tick is immediate (startup and post-reconnect
  sync quickly); subsequent ticks fire every vpp.lb.sync-interval from
  config (default 30s). Purpose: catch drift if something/someone
  modifies VPP state by hand. The loop uses a ConfigSource interface
  (satisfied by checker.Checker via its new Config() accessor) to avoid
  an import cycle with the checker package.

Config schema additions (maglev.vpp.lb)
- sync-interval: positive Go duration, default 30s.
- ipv4-src-address: REQUIRED. Used as the outer source for GRE4 encap
  to application servers. Missing this is a hard semantic error —
  maglevd --check exits 2 and the daemon refuses to start. VPP GRE
  needs a source address and every VIP we program uses GRE, so there
  is no meaningful config without it.
- ipv6-src-address: REQUIRED. Same treatment as ipv4-src-address.
- sticky-buckets-per-core: default 65536, must be a power of 2.
- flow-timeout: default 40s, must be a whole number of seconds in [1s, 120s].
- VPP validation runs at the end of convert() so structural errors in
  healthchecks/backends/frontends surface first — operators fix those,
  then get the VPP-specific requirements.

gRPC API
- New GetVPPLBState RPC returning VPPLBState: global conf + VIPs with
  ASes. Mirrors the read-path but strips fields irrelevant to our
  GRE-only deployment (srv_type, dscp, target_port).
- New SyncVPPLBState RPC with optional frontend_name. Unset → full sync
  (may remove stale VIPs). Set → single-VIP sync (never removes).
  Returns codes.NotFound for unknown frontends, codes.Unavailable when
  VPP integration is disabled or disconnected.

maglevc (CLI)
- New 'show vpp lbstate' command displaying the LB plugin state. VPP-only
  fields the dataplane irrelevant to GRE are suppressed. Per-AS lines use
  a key-value format ("address X  weight Y  flow-table-buckets Z")
  instead of a tabwriter column, which avoids the ANSI-color alignment
  issue we hit with mixed label/data rows.
- New 'sync vpp lbstate [<name>]' command. Without a name, triggers a
  full reconciliation; with a name, targets one frontend.
- Previous 'show vpp lb' renamed to 'show vpp lbstate' for consistency
  with the new sync command.

Test fixtures
- validConfig and all ad-hoc config_test.go fixtures that reach the end
  of convert() now include the two required vpp.lb src addresses.
- tests/01-maglevd/maglevd-lab/maglev.yaml gains a vpp.lb section so the
  robot integration tests can still load the config.
- cmd/maglevc/tree_test.go gains expected paths for the new commands.

Docs
- config-guide.md: new 'vpp' section in the basic structure, detailed
  vpp.lb field reference, noting ipv4/ipv6 src addresses as REQUIRED
  (hard error) with no defaults; example config updated.
- user-guide.md: documented 'show vpp info', 'show vpp lbstate',
  'sync vpp lbstate [<name>]', new --vpp-api-addr and --vpp-stats-addr
  flags, the vpp-lb-conf-set log line, and corrected the pause/resume
  description to reflect that pause cancels the probe goroutine.
- debian/maglev.yaml: example config gains a vpp.lb block with src
  addresses and commented optional overrides.
This commit is contained in:
2026-04-12 10:58:39 +02:00
parent 3227263d68
commit d3c5c86037
24 changed files with 4900 additions and 161 deletions

View File

@@ -11,7 +11,7 @@ in two stages:
ensuring that every backend referenced by a frontend exists, that address families are
consistent within a frontend, and that IP source addresses are the correct family.
If you want to get started quickly, take a look at the [[example config](../debian/mavleg.yaml)].
If you want to get started quickly, take a look at the [example config](../debian/maglev.yaml).
## Basic structure
@@ -22,6 +22,10 @@ maglev:
healthchecker:
[ Global health checker settings ]
vpp:
lb:
[ VPP load-balancer integration settings ]
healthchecks:
my-check:
[ Health check definition ]
@@ -35,9 +39,11 @@ maglev:
[ Frontend (VIP) definition ]
```
All four sections live under the top-level `maglev:` key. The `healthchecks`, `backends`,
All five sections live under the top-level `maglev:` key. The `healthchecks`, `backends`,
and `frontends` sections are maps keyed by an arbitrary name of your choosing. Names must be
unique within their section and are case-sensitive.
unique within their section and are case-sensitive. The `vpp` section is required when
`maglevd` has a working VPP connection — its `lb.ipv4-src-address` and `lb.ipv6-src-address`
fields are mandatory and `maglevd` will refuse to start without them.
---
@@ -61,6 +67,50 @@ maglev:
---
## vpp
Settings controlling the integration with a locally running VPP instance. The
`vpp` section is a map with a single sub-section, `lb`. Both `lb.ipv4-src-address`
and `lb.ipv6-src-address` are **required**`maglevd --check` exits with a
semantic error and the daemon refuses to start when either is missing, because
VPP's GRE encap needs a source address and every VIP `maglevd` programs uses GRE.
* ***lb.ipv4-src-address***: Required. The IPv4 source address VPP uses when
encapsulating IPv4 traffic into GRE4 tunnels to application servers. Must
be a valid IPv4 address. No default.
* ***lb.ipv6-src-address***: Required. The IPv6 source address VPP uses when
encapsulating IPv6 traffic into GRE6 tunnels. Must be a valid IPv6 address.
No default.
* ***lb.sync-interval***: A positive Go duration (e.g. `30s`, `1m`) controlling
how often `maglevd` reconciles the VPP load-balancer dataplane against its
running configuration. On startup, an immediate full sync runs; subsequent
syncs fire at this interval as long as the VPP connection is up. Defaults
to `30s`. The purpose is to catch drift — for example, a VIP added to VPP
by hand — and bring VPP back in line with the maglev config.
* ***lb.sticky-buckets-per-core***: The number of buckets per worker thread in
the established-flow table. Must be a power of 2. Defaults to `65536` (64k).
* ***lb.flow-timeout***: Idle time after which an established flow is removed
from the table. Must be a whole number of seconds between `1s` and `120s`
inclusive. Defaults to `40s`.
These four values are pushed to VPP via `lb_conf` when `maglevd` connects to
VPP and again after every config reload (whenever they change). A log line
`vpp-lb-conf-set` records the effective values.
Example:
```yaml
maglev:
vpp:
lb:
sync-interval: 60s
ipv4-src-address: 10.0.0.1
ipv6-src-address: 2001:db8::1
sticky-buckets-per-core: 65536
flow-timeout: 40s
```
---
## healthchecks
A named map of health check definitions. Each health check describes *how* to probe a backend.
@@ -278,6 +328,6 @@ frontends:
---
For a detailed description of the health state machine, probe intervals, and all transition events,
see [[healthchecks.md](healthchecks.md)]. For a user guide on how to use the maglev daemon and client,
see the [[user-guide.md](user-guide.md)].
see [healthchecks.md](healthchecks.md). For a user guide on how to use the maglev daemon and client,
see the [user-guide.md](user-guide.md).

View File

@@ -88,10 +88,26 @@ show healthchecks [<name>] Without name: list all health-check names.
show vpp info Show VPP version, build date, PID, uptime, and when
maglevd connected. Returns an error if VPP is not
connected.
show vpp lbstate Show the VPP load-balancer plugin state: global
configuration, configured VIPs, and their attached
application servers (address, weight, bucket count).
Returns an error if VPP is not connected.
set backend <name> pause Suspend health checking for a backend, freezing its state.
set backend <name> resume Resume health checking; backend re-enters unknown state
and is probed immediately.
sync vpp lbstate [<name>] Reconcile the VPP load-balancer dataplane from the
running config. Without a name: runs a full sync —
creates missing VIPs, removes stale VIPs, and adjusts
application-server membership and weights across all
frontends. With a name: only the named frontend's VIP
is reconciled, and no VIPs are removed. A full sync
also runs automatically every
maglev.vpp.lb.sync-interval (default 30s) to catch
drift, and once on startup.
set backend <name> pause Stop health checking for a backend. Cancels the probe
goroutine so no further traffic is sent, and freezes
the state at whatever it was when paused.
set backend <name> resume Resume health checking. A fresh probe goroutine is
started and the backend re-enters unknown state.
set backend <name> disable Stop probing entirely and remove the backend from rotation.
The backend remains visible (state: disabled) and can be
re-enabled without reloading configuration.