vpp-maglev/internal/config/config_test.go
Pim van Pelt d3c5c86037 VPP load-balancer dataplane integration: state, sync, and global conf
This commit wires maglevd through to VPP's LB plugin end-to-end, using
locally-generated GoVPP bindings for the newer v2 API messages.

VPP binapi (vendored)
- New package internal/vpp/binapi/ containing lb, lb_types, ip_types, and
  interface_types, generated from a local VPP build (~/src/vpp) via a new
  'make vpp-binapi' target. GoVPP v0.12.0 upstream lacks the v2 messages we
  need (lb_conf_get, lb_add_del_vip_v2, lb_add_del_as_v2, lb_as_v2_dump,
  lb_as_set_weight), so we commit the generated output in-tree.
- All generated files go through our loggedChannel wrapper; every VPP API
  send/receive is recorded at DEBUG via slog (vpp-api-send / vpp-api-recv /
  vpp-api-send-multi / vpp-api-recv-multi) so the full wire-level trail is
  auditable. NewAPIChannel is unexported — callers must use c.apiChannel().

Read path: GetLBState{All,VIP}
- GetLBStateAll returns a full snapshot (global conf + every VIP with its
  attached application servers).
- GetLBStateVIP looks up a single VIP by (prefix, protocol, port) and
  returns (nil, nil) when the VIP doesn't exist in VPP. This is the
  efficient path for targeted updates on a busy LB.
- Helpers factored out: getLBConf, dumpAllVIPs, dumpASesForVIP, lookupVIP,
  vipFromDetails.

Write path: SyncLBState{All,VIP}
- SyncLBStateAll reconciles every configured frontend with VPP: creates
  missing VIPs, removes stale ones (with AS flush), and reconciles AS
  membership and weights within VIPs that exist on both sides.
- SyncLBStateVIP targets a single frontend by name. Never removes VIPs.
  Returns ErrFrontendNotFound (wrapped with the name) when the frontend
  isn't in config, so callers can use errors.Is.
- Shared reconcileVIP helper does the per-VIP AS diff; removeVIP is used
  only by the full-sync pass.
- LbAddDelVipV2 requests always set NewFlowsTableLength=1024. The .api
  default=1024 annotation is only applied by VAT/CLI parsers, not wire-
  level marshalling — sending 0 caused VPP to vec_validate with mask
  0xFFFFFFFF and OOM-panic.
- Pool semantics: backends in the primary (first) pool of a frontend get
  their configured weight; backends in secondary pools get weight 0. All
  backends are installed so higher layers can flip weights on failover
  without add/remove churn.
- Every individual change emits a DEBUG slog (vpp-lbsync-vip-add/del,
  vpp-lbsync-as-add/del, vpp-lbsync-as-weight). Start/done INFO logs
  carry a scope=all|vip label plus aggregate counts.

Global conf push: SetLBConf
- New SetLBConf(cfg) sends lb_conf with ipv4-src, ipv6-src, sticky-buckets,
  and flow-timeout. Called automatically on VPP (re)connect and after
  every config reload (via doReloadConfig). Results are cached on the
  Client so redundant pushes are silently skipped — only actual changes
  produce a vpp-lb-conf-set INFO log line.

Periodic drift reconciliation
- vpp.Client.lbSyncLoop runs in a goroutine tied to each VPP connection's
  lifetime. Its first tick is immediate (startup and post-reconnect
  sync quickly); subsequent ticks fire every vpp.lb.sync-interval from
  config (default 30s). Purpose: catch drift if something/someone
  modifies VPP state by hand. The loop uses a ConfigSource interface
  (satisfied by checker.Checker via its new Config() accessor) to avoid
  an import cycle with the checker package.

Config schema additions (maglev.vpp.lb)
- sync-interval: positive Go duration, default 30s.
- ipv4-src-address: REQUIRED. Used as the outer source for GRE4 encap
  to application servers. Missing this is a hard semantic error —
  maglevd --check exits 2 and the daemon refuses to start. VPP GRE
  needs a source address and every VIP we program uses GRE, so there
  is no meaningful config without it.
- ipv6-src-address: REQUIRED. Same treatment as ipv4-src-address.
- sticky-buckets-per-core: default 65536, must be a power of 2.
- flow-timeout: default 40s, must be a whole number of seconds in [1s, 120s].
- VPP validation runs at the end of convert() so structural errors in
  healthchecks/backends/frontends surface first — operators fix those,
  then get the VPP-specific requirements.

gRPC API
- New GetVPPLBState RPC returning VPPLBState: global conf + VIPs with
  ASes. Mirrors the read-path but strips fields irrelevant to our
  GRE-only deployment (srv_type, dscp, target_port).
- New SyncVPPLBState RPC with optional frontend_name. Unset → full sync
  (may remove stale VIPs). Set → single-VIP sync (never removes).
  Returns codes.NotFound for unknown frontends, codes.Unavailable when
  VPP integration is disabled or disconnected.

maglevc (CLI)
- New 'show vpp lbstate' command displaying the LB plugin state. VPP-only
  fields irrelevant to the GRE-only dataplane are suppressed. Per-AS lines use
  a key-value format ("address X  weight Y  flow-table-buckets Z")
  instead of a tabwriter column, which avoids the ANSI-color alignment
  issue we hit with mixed label/data rows.
- New 'sync vpp lbstate [<name>]' command. Without a name, triggers a
  full reconciliation; with a name, targets one frontend.
- Previous 'show vpp lb' renamed to 'show vpp lbstate' for consistency
  with the new sync command.

Test fixtures
- validConfig and all ad-hoc config_test.go fixtures that reach the end
  of convert() now include the two required vpp.lb src addresses.
- tests/01-maglevd/maglevd-lab/maglev.yaml gains a vpp.lb section so the
  robot integration tests can still load the config.
- cmd/maglevc/tree_test.go gains expected paths for the new commands.

Docs
- config-guide.md: new 'vpp' section in the basic structure, detailed
  vpp.lb field reference, noting ipv4/ipv6 src addresses as REQUIRED
  (hard error) with no defaults; example config updated.
- user-guide.md: documented 'show vpp info', 'show vpp lbstate',
  'sync vpp lbstate [<name>]', new --vpp-api-addr and --vpp-stats-addr
  flags, the vpp-lb-conf-set log line, and corrected the pause/resume
  description to reflect that pause cancels the probe goroutine.
- debian/maglev.yaml: example config gains a vpp.lb block with src
  addresses and commented optional overrides.
2026-04-12 10:58:44 +02:00


// Copyright (c) 2026, Pim van Pelt <pim@ipng.ch>
package config

import (
	"testing"
	"time"
)
const validConfig = `
maglev:
  healthchecker:
    transition-history: 5
    netns: dataplane
  vpp:
    lb:
      ipv4-src-address: 10.0.0.1
      ipv6-src-address: 2001:db8::1
  healthchecks:
    http-check:
      type: http
      port: 80
      probe-ipv4-src: 10.0.0.1
      params:
        path: /healthz
        host: example.com
        response-code: "200"
      interval: 2s
      timeout: 3s
      rise: 2
      fall: 3
    icmp-check:
      type: icmp
      probe-ipv6-src: 2001:db8:1::1
      interval: 1s
      timeout: 3s
      fall: 5
  backends:
    be-v4:
      address: 192.0.2.10
      healthcheck: http-check
    be-v6a:
      address: 2001:db8:2::1
      healthcheck: icmp-check
    be-v6b:
      address: 2001:db8:2::2
      healthcheck: icmp-check
      enabled: true
  frontends:
    web4:
      description: "IPv4 VIP"
      address: 192.0.2.1
      protocol: tcp
      port: 80
      pools:
        - name: primary
          backends:
            be-v4: {}
    web6:
      description: "IPv6 VIP"
      address: 2001:db8::1
      protocol: tcp
      port: 443
      pools:
        - name: primary
          backends:
            be-v6a:
              weight: 100
            be-v6b:
              weight: 50
`
func TestValidConfig(t *testing.T) {
	cfg, err := parse([]byte(validConfig))
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if cfg.HealthChecker.Netns != "dataplane" {
		t.Errorf("healthchecker.netns: got %q, want dataplane", cfg.HealthChecker.Netns)
	}
	if cfg.HealthChecker.TransitionHistory != 5 {
		t.Errorf("transition-history: got %d, want 5", cfg.HealthChecker.TransitionHistory)
	}
	if len(cfg.Frontends) != 2 {
		t.Fatalf("frontends: got %d, want 2", len(cfg.Frontends))
	}
	hc := cfg.HealthChecks["http-check"]
	if hc.Type != "http" {
		t.Errorf("http-check type: got %q, want http", hc.Type)
	}
	if hc.Fall != 3 || hc.Rise != 2 {
		t.Errorf("http-check fall/rise: got %d/%d, want 3/2", hc.Fall, hc.Rise)
	}
	if hc.ProbeIPv4Src.String() != "10.0.0.1" {
		t.Errorf("http-check probe-ipv4-src: got %s, want 10.0.0.1", hc.ProbeIPv4Src)
	}
	if hc.HTTP == nil {
		t.Fatal("http-check HTTP params should not be nil")
	}
	if hc.HTTP.Path != "/healthz" {
		t.Errorf("http-check path: got %q, want /healthz", hc.HTTP.Path)
	}
	if hc.HTTP.Host != "example.com" {
		t.Errorf("http-check host: got %q, want example.com", hc.HTTP.Host)
	}
	if hc.HTTP.ResponseCodeMin != 200 || hc.HTTP.ResponseCodeMax != 200 {
		t.Errorf("http-check response-code: got %d-%d, want 200-200",
			hc.HTTP.ResponseCodeMin, hc.HTTP.ResponseCodeMax)
	}
	icmp := cfg.HealthChecks["icmp-check"]
	if icmp.Fall != 5 {
		t.Errorf("icmp-check fall: got %d, want 5", icmp.Fall)
	}
	if icmp.ProbeIPv6Src.String() != "2001:db8:1::1" {
		t.Errorf("icmp-check probe-ipv6-src: got %s, want 2001:db8:1::1", icmp.ProbeIPv6Src)
	}
	// Backend fields.
	beV4 := cfg.Backends["be-v4"]
	if beV4.Address.String() != "192.0.2.10" {
		t.Errorf("be-v4 address: got %s", beV4.Address)
	}
	if beV4.HealthCheck != "http-check" {
		t.Errorf("be-v4 healthcheck: got %q", beV4.HealthCheck)
	}
	if !beV4.Enabled {
		t.Error("be-v4 enabled: want true (default)")
	}
	// Pool structure.
	web4 := cfg.Frontends["web4"]
	if len(web4.Pools) != 1 || web4.Pools[0].Name != "primary" {
		t.Errorf("web4 pools: got %v", web4.Pools)
	}
	if _, ok := web4.Pools[0].Backends["be-v4"]; !ok {
		t.Error("web4 primary pool missing be-v4")
	}
	if web4.Pools[0].Backends["be-v4"].Weight != 100 {
		t.Errorf("web4 be-v4 weight: got %d, want 100 (default)", web4.Pools[0].Backends["be-v4"].Weight)
	}
	web6 := cfg.Frontends["web6"]
	if len(web6.Pools) != 1 || len(web6.Pools[0].Backends) != 2 {
		t.Errorf("web6 pools[0] backends: got %d, want 2", len(web6.Pools[0].Backends))
	}
	if web6.Pools[0].Backends["be-v6b"].Weight != 50 {
		t.Errorf("web6 be-v6b weight: got %d, want 50", web6.Pools[0].Backends["be-v6b"].Weight)
	}
}
func TestDefaults(t *testing.T) {
	raw := `
maglev:
  vpp:
    lb:
      ipv4-src-address: 10.0.0.1
      ipv6-src-address: 2001:db8::1
  healthchecks:
    icmp:
      type: icmp
      interval: 1s
      timeout: 2s
  backends:
    be:
      address: 10.0.0.2
      healthcheck: icmp
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            be: {}
`
	cfg, err := parse([]byte(raw))
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if cfg.HealthChecker.Netns != "" {
		t.Errorf("default netns: got %q, want empty", cfg.HealthChecker.Netns)
	}
	if cfg.HealthChecker.TransitionHistory != 5 {
		t.Errorf("default transition-history: got %d, want 5", cfg.HealthChecker.TransitionHistory)
	}
	hc := cfg.HealthChecks["icmp"]
	if hc.Rise != 2 || hc.Fall != 3 {
		t.Errorf("defaults rise/fall: got %d/%d, want 2/3", hc.Rise, hc.Fall)
	}
	be := cfg.Backends["be"]
	if !be.Enabled {
		t.Errorf("backend default enabled: got false, want true")
	}
	// Pool backend weight defaults to 100.
	v := cfg.Frontends["v"]
	if v.Pools[0].Backends["be"].Weight != 100 {
		t.Errorf("pool backend default weight: got %d, want 100", v.Pools[0].Backends["be"].Weight)
	}
}
func TestBackendNoHealthcheck(t *testing.T) {
	// A backend with no healthcheck reference is valid; probe is skipped.
	raw := `
maglev:
  vpp:
    lb:
      ipv4-src-address: 10.0.0.1
      ipv6-src-address: 2001:db8::1
  healthchecks: {}
  backends:
    be:
      address: 10.0.0.2
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            be: {}
`
	cfg, err := parse([]byte(raw))
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if cfg.Backends["be"].HealthCheck != "" {
		t.Error("expected empty healthcheck")
	}
}
func TestOptionalIntervals(t *testing.T) {
	raw := `
maglev:
  vpp:
    lb:
      ipv4-src-address: 10.0.0.1
      ipv6-src-address: 2001:db8::1
  healthchecks:
    icmp:
      type: icmp
      interval: 2s
      fast-interval: 500ms
      down-interval: 30s
      timeout: 1s
  backends:
    be:
      address: 10.0.0.2
      healthcheck: icmp
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            be: {}
`
	cfg, err := parse([]byte(raw))
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	hc := cfg.HealthChecks["icmp"]
	if hc.Interval != 2*time.Second {
		t.Errorf("interval: got %v, want 2s", hc.Interval)
	}
	if hc.FastInterval != 500*time.Millisecond {
		t.Errorf("fast-interval: got %v, want 500ms", hc.FastInterval)
	}
	if hc.DownInterval != 30*time.Second {
		t.Errorf("down-interval: got %v, want 30s", hc.DownInterval)
	}
}
func TestValidationErrors(t *testing.T) {
	base := func(hcExtra, beExtra, feExtra string) string {
		return `
maglev:
  vpp:
    lb:
      ipv4-src-address: 10.0.0.1
      ipv6-src-address: 2001:db8::1
  healthchecks:
    c:
      type: icmp
      interval: 1s
      timeout: 2s
` + hcExtra + `
  backends:
    be:
      address: 10.0.0.2
      healthcheck: c
` + beExtra + `
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            be: {}
` + feExtra
	}
	tests := []struct {
		name   string
		yaml   string
		errSub string
	}{
		{
			name:   "wrong family probe-ipv4-src",
			yaml:   base("      probe-ipv4-src: 2001:db8::1\n", "", ""),
			errSub: "probe-ipv4-src",
		},
		{
			name: "mixed backend address families in pool",
			yaml: `
maglev:
  vpp:
    lb:
      ipv4-src-address: 10.0.0.1
      ipv6-src-address: 2001:db8::1
  healthchecks:
    c:
      type: icmp
      interval: 1s
      timeout: 2s
  backends:
    v4: {address: 10.0.0.2, healthcheck: c}
    v6: {address: 2001:db8::1, healthcheck: c}
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            v4: {}
            v6: {}
`,
			errSub: "address family",
		},
		{
			name:   "port without protocol",
			yaml:   base("", "", "      port: 80\n"),
			errSub: "port requires protocol",
		},
		{
			name: "protocol without port",
			yaml: `
maglev:
  healthchecks:
    c:
      type: icmp
      interval: 1s
      timeout: 2s
  backends:
    be: {address: 10.0.0.2, healthcheck: c}
  frontends:
    v:
      address: 192.0.2.1
      protocol: tcp
      pools:
        - name: primary
          backends:
            be: {}
`,
			errSub: "requires port",
		},
		{
			name: "invalid healthcheck type",
			yaml: `
maglev:
  healthchecks:
    c:
      type: dns
      interval: 1s
      timeout: 2s
  backends:
    be: {address: 10.0.0.2, healthcheck: c}
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            be: {}
`,
			errSub: "type must be",
		},
		{
			name: "http missing path",
			yaml: `
maglev:
  healthchecks:
    c:
      type: http
      port: 80
      interval: 1s
      timeout: 2s
  backends:
    be: {address: 10.0.0.2, healthcheck: c}
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            be: {}
`,
			errSub: "params.path",
		},
		{
			name:   "no error case",
			yaml:   base("", "", ""),
			errSub: "",
		},
		{
			name: "undefined healthcheck reference",
			yaml: `
maglev:
  healthchecks: {}
  backends:
    be: {address: 10.0.0.2, healthcheck: missing}
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            be: {}
`,
			errSub: "not defined",
		},
		{
			name: "undefined backend reference in pool",
			yaml: `
maglev:
  healthchecks:
    c:
      type: icmp
      interval: 1s
      timeout: 2s
  backends: {}
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            missing: {}
`,
			errSub: "not defined",
		},
		{
			name: "pool weight out of range",
			yaml: `
maglev:
  healthchecks:
    c:
      type: icmp
      interval: 1s
      timeout: 2s
  backends:
    be: {address: 10.0.0.2, healthcheck: c}
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            be:
              weight: 150
`,
			errSub: "out of range",
		},
		{
			name:   "fall zero becomes default",
			yaml:   base("      fall: 0\n", "", ""),
			errSub: "",
		},
		{
			name: "tcp missing port",
			yaml: `
maglev:
  healthchecks:
    c:
      type: tcp
      interval: 1s
      timeout: 2s
  backends:
    be: {address: 10.0.0.2, healthcheck: c}
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            be: {}
`,
			errSub: "requires port",
		},
		{
			name: "http missing port",
			yaml: `
maglev:
  healthchecks:
    c:
      type: http
      interval: 1s
      timeout: 2s
      params:
        path: /
  backends:
    be: {address: 10.0.0.2, healthcheck: c}
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - name: primary
          backends:
            be: {}
`,
			errSub: "requires port",
		},
		{
			name: "empty pools",
			yaml: `
maglev:
  healthchecks:
    c:
      type: icmp
      interval: 1s
      timeout: 2s
  backends:
    be: {address: 10.0.0.2, healthcheck: c}
  frontends:
    v:
      address: 192.0.2.1
      pools: []
`,
			errSub: "pools must not be empty",
		},
		{
			name: "pool missing name",
			yaml: `
maglev:
  healthchecks:
    c:
      type: icmp
      interval: 1s
      timeout: 2s
  backends:
    be: {address: 10.0.0.2, healthcheck: c}
  frontends:
    v:
      address: 192.0.2.1
      pools:
        - backends:
            be: {}
`,
			errSub: "name must not be empty",
		},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			_, err := parse([]byte(tt.yaml))
			if tt.errSub == "" {
				if err != nil {
					t.Fatalf("expected no error, got: %v", err)
				}
				return
			}
			if err == nil {
				t.Fatalf("expected error containing %q, got nil", tt.errSub)
			}
			if !contains(err.Error(), tt.errSub) {
				t.Errorf("error %q does not contain %q", err.Error(), tt.errSub)
			}
		})
	}
}
// contains reports whether sub occurs in s; the empty string is contained
// in every string. Equivalent to strings.Contains, kept local to avoid an
// extra import.
func contains(s, sub string) bool {
	for i := 0; i+len(sub) <= len(s); i++ {
		if s[i:i+len(sub)] == sub {
			return true
		}
	}
	return false
}