This commit wires maglevd through to VPP's LB plugin end-to-end, using
locally-generated GoVPP bindings for the newer v2 API messages.
VPP binapi (vendored)
- New package internal/vpp/binapi/ containing lb, lb_types, ip_types, and
interface_types, generated from a local VPP build (~/src/vpp) via a new
'make vpp-binapi' target. GoVPP v0.12.0 upstream lacks the v2 messages we
need (lb_conf_get, lb_add_del_vip_v2, lb_add_del_as_v2, lb_as_v2_dump,
lb_as_set_weight), so we commit the generated output in-tree.
- All generated files go through our loggedChannel wrapper; every VPP API
send/receive is recorded at DEBUG via slog (vpp-api-send / vpp-api-recv /
vpp-api-send-multi / vpp-api-recv-multi) so the full wire-level trail is
auditable. NewAPIChannel is unexported — callers must use c.apiChannel().
Read path: GetLBState{All,VIP}
- GetLBStateAll returns a full snapshot (global conf + every VIP with its
attached application servers).
- GetLBStateVIP looks up a single VIP by (prefix, protocol, port) and
returns (nil, nil) when the VIP doesn't exist in VPP. This is the
efficient path for targeted updates on a busy LB.
- Helpers factored out: getLBConf, dumpAllVIPs, dumpASesForVIP, lookupVIP,
vipFromDetails.
Write path: SyncLBState{All,VIP}
- SyncLBStateAll reconciles every configured frontend with VPP: creates
missing VIPs, removes stale ones (with AS flush), and reconciles AS
membership and weights within VIPs that exist on both sides.
- SyncLBStateVIP targets a single frontend by name. Never removes VIPs.
Returns ErrFrontendNotFound (wrapped with the name) when the frontend
isn't in config, so callers can use errors.Is.
- Shared reconcileVIP helper does the per-VIP AS diff; removeVIP is used
only by the full-sync pass.
- LbAddDelVipV2 requests always set NewFlowsTableLength=1024. The .api
default=1024 annotation is only applied by VAT/CLI parsers, not wire-
level marshalling — sending 0 caused VPP to vec_validate with mask
0xFFFFFFFF and OOM-panic.
- Pool semantics: backends in the primary (first) pool of a frontend get
their configured weight; backends in secondary pools get weight 0. All
backends are installed so higher layers can flip weights on failover
without add/remove churn.
- Every individual change emits a DEBUG slog (vpp-lbsync-vip-add/del,
vpp-lbsync-as-add/del, vpp-lbsync-as-weight). Start/done INFO logs
carry a scope=all|vip label plus aggregate counts.
Global conf push: SetLBConf
- New SetLBConf(cfg) sends lb_conf with ipv4-src, ipv6-src, sticky-buckets,
and flow-timeout. Called automatically on VPP (re)connect and after
every config reload (via doReloadConfig). Results are cached on the
Client so redundant pushes are silently skipped — only actual changes
produce a vpp-lb-conf-set INFO log line.
Periodic drift reconciliation
- vpp.Client.lbSyncLoop runs in a goroutine tied to each VPP connection's
lifetime. Its first tick is immediate (startup and post-reconnect
sync quickly); subsequent ticks fire every vpp.lb.sync-interval from
config (default 30s). Purpose: catch drift if something/someone
modifies VPP state by hand. The loop uses a ConfigSource interface
(satisfied by checker.Checker via its new Config() accessor) to avoid
an import cycle with the checker package.
Config schema additions (maglev.vpp.lb)
- sync-interval: positive Go duration, default 30s.
- ipv4-src-address: REQUIRED. Used as the outer source for GRE4 encap
to application servers. Missing this is a hard semantic error —
maglevd --check exits 2 and the daemon refuses to start. VPP GRE
needs a source address and every VIP we program uses GRE, so there
is no meaningful config without it.
- ipv6-src-address: REQUIRED. Same treatment as ipv4-src-address.
- sticky-buckets-per-core: default 65536, must be a power of 2.
- flow-timeout: default 40s, must be a whole number of seconds in [1s, 120s].
- VPP validation runs at the end of convert() so structural errors in
healthchecks/backends/frontends surface first — operators fix those,
then get the VPP-specific requirements.
gRPC API
- New GetVPPLBState RPC returning VPPLBState: global conf + VIPs with
ASes. Mirrors the read-path but strips fields irrelevant to our
GRE-only deployment (srv_type, dscp, target_port).
- New SyncVPPLBState RPC with optional frontend_name. Unset → full sync
(may remove stale VIPs). Set → single-VIP sync (never removes).
Returns codes.NotFound for unknown frontends, codes.Unavailable when
VPP integration is disabled or disconnected.
maglevc (CLI)
- New 'show vpp lbstate' command displaying the LB plugin state. VPP-only
fields the dataplane irrelevant to GRE are suppressed. Per-AS lines use
a key-value format ("address X weight Y flow-table-buckets Z")
instead of a tabwriter column, which avoids the ANSI-color alignment
issue we hit with mixed label/data rows.
- New 'sync vpp lbstate [<name>]' command. Without a name, triggers a
full reconciliation; with a name, targets one frontend.
- Previous 'show vpp lb' renamed to 'show vpp lbstate' for consistency
with the new sync command.
Test fixtures
- validConfig and all ad-hoc config_test.go fixtures that reach the end
of convert() now include the two required vpp.lb src addresses.
- tests/01-maglevd/maglevd-lab/maglev.yaml gains a vpp.lb section so the
robot integration tests can still load the config.
- cmd/maglevc/tree_test.go gains expected paths for the new commands.
Docs
- config-guide.md: new 'vpp' section in the basic structure, detailed
vpp.lb field reference, noting ipv4/ipv6 src addresses as REQUIRED
(hard error) with no defaults; example config updated.
- user-guide.md: documented 'show vpp info', 'show vpp lbstate',
'sync vpp lbstate [<name>]', new --vpp-api-addr and --vpp-stats-addr
flags, the vpp-lb-conf-set log line, and corrected the pause/resume
description to reflect that pause cancels the probe goroutine.
- debian/maglev.yaml: example config gains a vpp.lb block with src
addresses and commented optional overrides.
347 lines
9.1 KiB
Go
347 lines
9.1 KiB
Go
// Copyright (c) 2026, Pim van Pelt <pim@ipng.ch>
|
|
|
|
// Package vpp manages the connection to a local VPP instance over its
|
|
// binary API and stats sockets. The Client reconnects automatically when
|
|
// VPP restarts.
|
|
package vpp
|
|
|
|
import (
|
|
"context"
|
|
"log/slog"
|
|
"sync"
|
|
"time"
|
|
|
|
"go.fd.io/govpp/adapter"
|
|
"go.fd.io/govpp/adapter/socketclient"
|
|
"go.fd.io/govpp/adapter/statsclient"
|
|
"go.fd.io/govpp/binapi/vpe"
|
|
"go.fd.io/govpp/core"
|
|
|
|
"git.ipng.ch/ipng/vpp-maglev/internal/config"
|
|
lb "git.ipng.ch/ipng/vpp-maglev/internal/vpp/binapi/lb"
|
|
)
|
|
|
|
// ConfigSource provides a snapshot of the current maglev config to the VPP
|
|
// sync loop. checker.Checker satisfies this interface via its Config() method.
|
|
// Decoupling via an interface avoids an import cycle with the checker package.
|
|
type ConfigSource interface {
|
|
Config() *config.Config
|
|
}
|
|
|
|
const retryInterval = 5 * time.Second
|
|
const pingInterval = 10 * time.Second
|
|
const defaultLBSyncInterval = 30 * time.Second
|
|
|
|
// Info holds VPP version and connection metadata, populated on connect.
|
|
type Info struct {
|
|
Version string
|
|
BuildDate string
|
|
BuildDirectory string
|
|
PID uint32
|
|
BootTime time.Time // when VPP started (from /sys/boottime stats counter)
|
|
ConnectedSince time.Time // when maglevd connected to VPP
|
|
}
|
|
|
|
// Client manages connections to both the VPP API and stats sockets.
|
|
// Both connections are treated as a unit: if either drops, both are
|
|
// torn down and re-established together.
|
|
type Client struct {
|
|
apiAddr string
|
|
statsAddr string
|
|
|
|
mu sync.Mutex
|
|
apiConn *core.Connection
|
|
statsConn *core.StatsConnection
|
|
statsClient adapter.StatsAPI // raw adapter for DumpStats
|
|
info Info // populated on successful connect
|
|
cfgSrc ConfigSource // optional; enables periodic LB sync
|
|
lastLBConf *lb.LbConf // cached last-pushed lb_conf (dedup)
|
|
}
|
|
|
|
// SetConfigSource attaches a live config source. When set, the VPP client
|
|
// runs a periodic SyncLBStateAll loop (at the interval from cfg.VPP.LB.SyncInterval)
|
|
// for as long as the VPP connection is up. Must be called before Run.
|
|
func (c *Client) SetConfigSource(src ConfigSource) {
|
|
c.mu.Lock()
|
|
defer c.mu.Unlock()
|
|
c.cfgSrc = src
|
|
}
|
|
|
|
// New creates a Client for the given socket paths.
|
|
func New(apiAddr, statsAddr string) *Client {
|
|
return &Client{apiAddr: apiAddr, statsAddr: statsAddr}
|
|
}
|
|
|
|
// Run connects to VPP and maintains the connection until ctx is cancelled.
|
|
// If VPP is unavailable or restarts, Run reconnects automatically.
|
|
func (c *Client) Run(ctx context.Context) {
|
|
for {
|
|
if err := c.connect(); err != nil {
|
|
slog.Debug("vpp-connect-failed", "err", err)
|
|
select {
|
|
case <-ctx.Done():
|
|
return
|
|
case <-time.After(retryInterval):
|
|
continue
|
|
}
|
|
}
|
|
|
|
// Fetch version info and record connect time.
|
|
// fetchInfo uses NewAPIChannel and statsClient which both take c.mu,
|
|
// so we must not hold c.mu here.
|
|
info := c.fetchInfo()
|
|
c.mu.Lock()
|
|
c.info = info
|
|
c.mu.Unlock()
|
|
slog.Info("vpp-connect", "version", c.info.Version,
|
|
"build-date", c.info.BuildDate,
|
|
"pid", c.info.PID,
|
|
"api", c.apiAddr, "stats", c.statsAddr)
|
|
|
|
// Read the current LB plugin state so we can log what's programmed.
|
|
if state, err := c.GetLBStateAll(); err != nil {
|
|
slog.Warn("vpp-lb-read-failed", "err", err)
|
|
} else {
|
|
totalAS := 0
|
|
for _, v := range state.VIPs {
|
|
totalAS += len(v.ASes)
|
|
}
|
|
slog.Info("vpp-lb-state",
|
|
"vips", len(state.VIPs),
|
|
"application-servers", totalAS,
|
|
"sticky-buckets-per-core", state.Conf.StickyBucketsPerCore,
|
|
"flow-timeout", state.Conf.FlowTimeout)
|
|
}
|
|
|
|
// Push global LB conf (src addresses, buckets, timeout) from the
|
|
// running config. On startup this is the initial set; on reconnect
|
|
// (VPP restart) VPP has forgotten everything, so we set it again.
|
|
c.mu.Lock()
|
|
src := c.cfgSrc
|
|
c.mu.Unlock()
|
|
if src != nil {
|
|
if cfg := src.Config(); cfg != nil {
|
|
if err := c.SetLBConf(cfg); err != nil {
|
|
slog.Warn("vpp-lb-conf-set-failed", "err", err)
|
|
}
|
|
}
|
|
}
|
|
|
|
// Start the LB sync loop for as long as the connection is up.
|
|
// It exits when connCtx is cancelled (on disconnect or shutdown).
|
|
connCtx, connCancel := context.WithCancel(ctx)
|
|
go c.lbSyncLoop(connCtx)
|
|
|
|
// Hold the connection, pinging periodically to detect VPP restarts.
|
|
c.monitor(ctx)
|
|
connCancel()
|
|
|
|
// If ctx is done we're shutting down; otherwise VPP dropped and we retry.
|
|
c.disconnect()
|
|
if ctx.Err() != nil {
|
|
return
|
|
}
|
|
slog.Warn("vpp-disconnect", "msg", "connection lost, reconnecting")
|
|
}
|
|
}
|
|
|
|
// lbSyncLoop periodically runs SyncLBStateAll to catch drift between the
|
|
// maglev config and the VPP dataplane. The first run happens immediately
|
|
// on loop start (VPP has just connected, so any pre-existing state needs
|
|
// reconciliation). Subsequent runs fire every cfg.VPP.LB.SyncInterval.
|
|
// Exits when ctx is cancelled.
|
|
func (c *Client) lbSyncLoop(ctx context.Context) {
|
|
c.mu.Lock()
|
|
src := c.cfgSrc
|
|
c.mu.Unlock()
|
|
if src == nil {
|
|
return // no config source registered; nothing to sync
|
|
}
|
|
|
|
// next-run timestamp starts at "now" so the first tick is immediate.
|
|
next := time.Now()
|
|
for {
|
|
wait := time.Until(next)
|
|
if wait < 0 {
|
|
wait = 0
|
|
}
|
|
select {
|
|
case <-ctx.Done():
|
|
return
|
|
case <-time.After(wait):
|
|
}
|
|
|
|
cfg := src.Config()
|
|
if cfg == nil {
|
|
next = time.Now().Add(defaultLBSyncInterval)
|
|
continue
|
|
}
|
|
interval := cfg.VPP.LB.SyncInterval
|
|
if interval <= 0 {
|
|
interval = defaultLBSyncInterval
|
|
}
|
|
|
|
if err := c.SyncLBStateAll(cfg); err != nil {
|
|
slog.Warn("vpp-lbsync-error", "err", err)
|
|
}
|
|
next = time.Now().Add(interval)
|
|
}
|
|
}
|
|
|
|
// IsConnected returns true if both API and stats connections are active.
|
|
func (c *Client) IsConnected() bool {
|
|
c.mu.Lock()
|
|
defer c.mu.Unlock()
|
|
return c.apiConn != nil && c.statsConn != nil
|
|
}
|
|
|
|
// GetInfo returns the VPP version and connection metadata, or an error
|
|
// if VPP is not connected.
|
|
func (c *Client) GetInfo() (Info, error) {
|
|
c.mu.Lock()
|
|
defer c.mu.Unlock()
|
|
if c.apiConn == nil {
|
|
return Info{}, errNotConnected
|
|
}
|
|
return c.info, nil
|
|
}
|
|
|
|
// connect establishes both API and stats connections. If either fails,
|
|
// both are torn down.
|
|
func (c *Client) connect() error {
|
|
sc := socketclient.NewVppClient(c.apiAddr)
|
|
sc.SetClientName("vpp-maglev")
|
|
apiConn, err := core.Connect(sc)
|
|
if err != nil {
|
|
return err
|
|
}
|
|
|
|
stc := statsclient.NewStatsClient(c.statsAddr)
|
|
statsConn, err := core.ConnectStats(stc)
|
|
if err != nil {
|
|
safeDisconnectAPI(apiConn)
|
|
return err
|
|
}
|
|
|
|
c.mu.Lock()
|
|
c.apiConn = apiConn
|
|
c.statsConn = statsConn
|
|
c.statsClient = stc
|
|
c.mu.Unlock()
|
|
return nil
|
|
}
|
|
|
|
// disconnect tears down both connections.
|
|
func (c *Client) disconnect() {
|
|
c.mu.Lock()
|
|
apiConn := c.apiConn
|
|
statsConn := c.statsConn
|
|
c.apiConn = nil
|
|
c.statsConn = nil
|
|
c.statsClient = nil
|
|
c.info = Info{}
|
|
c.lastLBConf = nil // force re-push of lb_conf on reconnect
|
|
c.mu.Unlock()
|
|
|
|
safeDisconnectAPI(apiConn)
|
|
safeDisconnectStats(statsConn)
|
|
}
|
|
|
|
// monitor blocks until the context is cancelled or a liveness ping fails.
|
|
func (c *Client) monitor(ctx context.Context) {
|
|
ticker := time.NewTicker(pingInterval)
|
|
defer ticker.Stop()
|
|
for {
|
|
select {
|
|
case <-ctx.Done():
|
|
return
|
|
case <-ticker.C:
|
|
if !c.ping() {
|
|
return
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
// ping sends a control_ping to VPP and returns true if it succeeds.
|
|
func (c *Client) ping() bool {
|
|
ch, err := c.apiChannel()
|
|
if err != nil {
|
|
return false
|
|
}
|
|
defer ch.Close()
|
|
|
|
req := &core.ControlPing{}
|
|
reply := &core.ControlPingReply{}
|
|
if err := ch.SendRequest(req).ReceiveReply(reply); err != nil {
|
|
slog.Debug("vpp-ping-failed", "err", err)
|
|
return false
|
|
}
|
|
return true
|
|
}
|
|
|
|
// fetchInfo queries VPP for version info, PID, and boot time.
|
|
// Must be called after connect succeeds (apiConn and statsClient are set).
|
|
func (c *Client) fetchInfo() Info {
|
|
info := Info{ConnectedSince: time.Now()}
|
|
|
|
ch, err := c.apiChannel()
|
|
if err != nil {
|
|
return info
|
|
}
|
|
defer ch.Close()
|
|
|
|
ver := &vpe.ShowVersionReply{}
|
|
if err := ch.SendRequest(&vpe.ShowVersion{}).ReceiveReply(ver); err == nil {
|
|
info.Version = ver.Version
|
|
info.BuildDate = ver.BuildDate
|
|
info.BuildDirectory = ver.BuildDirectory
|
|
}
|
|
|
|
ping := &core.ControlPingReply{}
|
|
if err := ch.SendRequest(&core.ControlPing{}).ReceiveReply(ping); err == nil {
|
|
info.PID = ping.VpePID
|
|
}
|
|
|
|
// Read VPP boot time from the stats segment.
|
|
c.mu.Lock()
|
|
sc := c.statsClient
|
|
c.mu.Unlock()
|
|
if sc != nil {
|
|
if entries, err := sc.DumpStats("/sys/boottime"); err == nil {
|
|
for _, e := range entries {
|
|
if s, ok := e.Data.(adapter.ScalarStat); ok && s != 0 {
|
|
info.BootTime = time.Unix(int64(s), 0)
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
return info
|
|
}
|
|
|
|
// safeDisconnectAPI disconnects an API connection, recovering from any panic
|
|
// that GoVPP may raise on a stale connection.
|
|
func safeDisconnectAPI(conn *core.Connection) {
|
|
if conn == nil {
|
|
return
|
|
}
|
|
defer func() { recover() }() //nolint:errcheck
|
|
conn.Disconnect()
|
|
}
|
|
|
|
// safeDisconnectStats disconnects a stats connection, recovering from panics.
|
|
func safeDisconnectStats(conn *core.StatsConnection) {
|
|
if conn == nil {
|
|
return
|
|
}
|
|
defer func() { recover() }() //nolint:errcheck
|
|
conn.Disconnect()
|
|
}
|
|
|
|
type vppError struct{ msg string }
|
|
|
|
func (e *vppError) Error() string { return e.msg }
|
|
|
|
var errNotConnected = &vppError{msg: "VPP API connection not established"}
|