Makefile:
- New install-deps umbrella target split into three sub-targets:
  install-deps-apt: Debian/Trixie-packaged build deps (nodejs, npm,
    protobuf-compiler, git, make, dpkg-dev, ca-certificates, curl,
    tar). Uses sudo when not already root.
  install-deps-go: ensures a Go toolchain >= GO_VERSION (go.mod
    floor, default 1.25.0). Short-circuits when the system Go is
    already recent enough; otherwise downloads the upstream tarball
    from go.dev/dl/ into /usr/local/go. Trixie only ships 1.24, so
    this step is load-bearing.
  install-deps-go-tools: go install protoc-gen-go,
    protoc-gen-go-grpc, and golangci-lint/v2/cmd/golangci-lint. Then
    asserts the installed golangci-lint version parses as
    >= GOLANGCI_LINT_VERSION (default 1.64.0, the floor that supports
    Go 1.25 syntax) to catch stale binaries in $GOPATH/bin before
    they silently run against Go 1.25 code.
- Parser bug fixed: golangci-lint v1.x prints "has version v1.64.8" but
v2.x dropped the 'v' prefix and prints "has version 2.11.4". The
original sed regex required the 'v' and returned an empty match on
v2.x, making the assertion explode with "could not parse version
output". Fixed by switching to extended regex (sed -En) with 'v?' so
both forms parse cleanly.
- GO_VERSION and GOLANGCI_LINT_VERSION exposed as Makefile variables
so operators can override on the command line, e.g.
make install-deps GO_VERSION=1.25.5 GOLANGCI_LINT_VERSION=2.0.0
- .PHONY extended with the four new target names.
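
The version check in the Makefile is done with sed; the same logic can
be sketched in Go to show why the optional 'v' matters. This is only an
illustration, not the Makefile's implementation: the function names and
the text surrounding "has version ..." in the sample banners are
assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// versionRe matches the "has version X.Y.Z" fragment. The leading 'v' is
// optional, so both the v1.x form ("v1.64.8") and the v2.x form ("2.11.4")
// are accepted.
var versionRe = regexp.MustCompile(`has version v?([0-9]+)\.([0-9]+)\.([0-9]+)`)

// parseVersion extracts major, minor and patch from a version banner.
func parseVersion(banner string) ([3]int, error) {
	var v [3]int
	m := versionRe.FindStringSubmatch(banner)
	if m == nil {
		return v, fmt.Errorf("could not parse version output: %q", banner)
	}
	for i := range v {
		n, err := strconv.Atoi(m[i+1])
		if err != nil {
			return v, err
		}
		v[i] = n
	}
	return v, nil
}

// atLeast reports whether got >= floor, comparing major, minor, patch in order.
func atLeast(got, floor [3]int) bool {
	for i := range got {
		if got[i] != floor[i] {
			return got[i] > floor[i]
		}
	}
	return true
}

func main() {
	floor := [3]int{1, 64, 0} // default GOLANGCI_LINT_VERSION from the text above
	banners := []string{
		"golangci-lint has version v1.64.8", // v1.x style (rest of banner assumed)
		"golangci-lint has version 2.11.4",  // v2.x style (rest of banner assumed)
	}
	for _, b := range banners {
		v, err := parseVersion(b)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%v meets floor: %v\n", v, atLeast(v, floor))
	}
}
```

A pattern that requires the 'v' would fail to match the second banner,
which is the failure mode described in the parser-bug item.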
Docs:
- README.md: capability note rewritten to cover CAP_NET_RAW (ICMP) and
the new CAP_SYS_ADMIN requirement when healthchecker.netns is set,
plus a paragraph explaining that the Debian systemd unit grants both
automatically. Docker example gained a second variant that shows the
additional --cap-add SYS_ADMIN and /var/run/netns bind mount for
netns-scoped deployments. Also notes that maglevd-frontend ignores
SIGHUP so controlling-terminal disconnects don't kill it.
- docs/user-guide.md: Capabilities section rewritten as a bulleted
list covering both caps, with the EPERM error string and three
different ways to grant them (systemd unit, setcap, systemd-run);
'show vpp lb counters' command description updated to explain that
per-backend packet counts are no longer shown (LB plugin's
forwarding node bypasses ip{4,6}_lookup_inline, so /net/route/to at
the backend's FIB entry never ticks for LB-forwarded traffic); new
~75-line "What the SPA shows" subsection covering the scope
selector + maglev_scope cookie, the per-maglevd frontend cards, the
health-cascade icon table (ok / bug-buckets / primary-drained /
degraded / unknown), the lb buckets column semantics, the
maglev_zippy_open cookie, the admin-mode lifecycle dialogs with
their plain-English consequence text, and the debug panel.
- docs/config-guide.md: healthchecker.netns field gains a capability-
requirement note spelling out setns(CLONE_NEWNET), the EPERM
symptom string, and the /var/run/netns/ readability requirement.
- docs/healthchecks.md: new "Jitter" subsection explaining the +/-10%
scaling on every computed interval, and a "Probe timing while a
probe is in flight" subsection that explains why fast-interval alone
doesn't give fast fault detection against hanging backends (the
probe loop is synchronous, so each iteration is timeout +
fast-interval; the advice is to lower timeout, not fast-interval).
- docs/maglevd.8: description paragraph corrected (dropped the
per-backend stats claim and added a short note pointing at the LB
plugin forwarding-path bypass); new CAPABILITIES section between
SIGNALS and FILES covering both CAP_NET_RAW and CAP_SYS_ADMIN with
the drop-in-override hint.
- docs/maglevd-frontend.8: new SIGNALS section documenting the
explicit SIGHUP ignore (so a controlling-terminal disconnect doesn't
kill the daemon); description extended with paragraphs on the two
persistence cookies (maglev_scope, maglev_zippy_open) and on the
health-cascade icon + lb buckets column.
- docs/maglevc.1: left untouched — intentionally minimal and delegates
to docs/user-guide.md.
Lint (26 issues across 12 files, all errcheck / ineffassign / S1021):
- cmd/frontend/handlers.go: _, _ = fmt.Fprintf(...) for the SSE retry
hint and resync control-event writes.
- cmd/maglevc/commands.go: bulk-prefix every fmt.Fprintf(w, ...) with
_, _ =; also merged 'var watchEventsOptSlot *Node; ... = &Node{...}'
into a single := declaration (staticcheck S1021) — the self-
referencing pattern still works because the Children back-ref is
assigned on the next statement, not inside the struct literal.
- cmd/maglevc/complete.go: _, _ = fmt.Fprintf(ql.rl.Stderr(), ...)
for the banner and help writes; removed the ineffectual
'partial = ""' assignment (nothing downstream reads partial after
that branch, so setting it was dead code flagged by ineffassign).
- cmd/maglevc/shell.go: defer func() { _ = rl.Close() }() for the
readline instance; _, _ = fmt.Fprintf(rl.Stderr(), ...) for error
display in the REPL loop.
- cmd/maglevc/main.go: defer func() { _ = conn.Close() }() for the
gRPC client connection.
- internal/grpcapi/server_test.go: _ = conn.Close() in the test
teardown closure.
- internal/prober/http.go: _ = c.Close() in the TLS-handshake-failed
path; defer func() { _ = conn.Close() }() and defer func() { _ =
resp.Body.Close() }() for the two deferred cleanups.
- internal/prober/http_test.go: defer func() { _ = resp.Body.Close()
}() plus three _, _ = fmt.Fprint(w, ...) in the httptest.Server
handlers and _, _ = fmt.Sscanf(...) when parsing the test listener's
port.
- internal/prober/icmp.go: defer func() { _ = pc.Close() }() for the
ICMP packet conn.
- internal/prober/netns.go: defer func() { _ = origNs.Close() }(),
defer func() { _ = netns.Set(origNs) }(), defer func() { _ =
targetNs.Close() }() — also dropped a stray //nolint:errcheck that
was no longer needed once the closure wrapping handled the discard.
- internal/prober/tcp.go: _ = conn.Close() in the L4-only path,
_ = tlsConn.Close() in the failed and succeeded handshake branches,
_ = tlsConn.SetDeadline(...) (also dropped a //nolint:errcheck
previously covering it).
Iterative 'make lint' runs were needed because golangci-lint v2.x
caps same-linter reports per pass, so the first pass reported 21,
then 4, then 3, then 1, then 0. Final pass: 0 issues. make test is
green across every package, and make build produces all three
binaries cleanly.
181 lines
5.1 KiB
Groff
.TH MAGLEVD 8 "April 2026" "vpp\-maglev" "System Administration"
.SH NAME
maglevd \- Maglev health\-checker daemon and VPP load\-balancer controller
.SH SYNOPSIS
.B maglevd
[\fB\-config\fR \fIfile\fR]
[\fB\-grpc\-addr\fR \fIaddr\fR]
[\fB\-log\-level\fR \fIlevel\fR]
[\fB\-reflection\fR[=\fIbool\fR]]
[\fB\-version\fR]
.SH DESCRIPTION
.B maglevd
has two responsibilities:
.IP \(bu 2
It monitors backends with active health checks (HTTP, TCP, ICMP) and
aggregates the results into a state machine per backend. Probe
intervals adapt automatically: a faster interval is used while a
backend is degraded, and a slower one once it has been continuously
down. Operators can also
.B pause/resume
or
.B disable/enable
a backend out of band.
.IP \(bu 2
It programs the VPP dataplane via the
.B lb
(load\-balancer) plugin. Each frontend in the config becomes a VPP
load\-balancer VIP; each healthy backend becomes an application server
under its frontend's VIPs. State transitions drive
application\-server weight updates over the VPP binary API so that
unhealthy backends stop receiving new flows. Drift between the
running config and VPP's view is reconciled periodically
.RB ( maglev.vpp.lb.sync-interval ,
default 30s), on
.B SIGHUP
reloads, and on operator request via
.BR maglevc .
.PP
The aggregated backend state, VPP dataplane state, and per\-VIP
stats\-segment counters are exposed via a gRPC API (and scraped
into Prometheus when the
.B /metrics
endpoint is enabled). Per\-backend packet counters are intentionally
not exposed: VPP's LB plugin forwards by writing
.B adj_index[VLIB_TX]
directly and bypassing
.BR ip4_lookup_inline " / " ip6_lookup_inline ,
which is the only path that increments
.BR /net/route/to ,
so the backend's FIB entry stats index never ticks for LB\-forwarded
traffic.
See
.BR maglevc (1)
for the interactive CLI client.
.PP
Configuration is loaded from a YAML file. A running daemon reloads
its configuration when it receives
.BR SIGHUP .
.SH OPTIONS
Each flag may also be supplied via an environment variable (shown in
parentheses); the flag takes precedence.
.TP
.BI \-config " file"
Path to the YAML configuration file.
.RI "(default: " /etc/vpp-maglev/maglev.yaml "; env: " MAGLEV_CONFIG )
.TP
.BI \-grpc\-addr " addr"
TCP address on which the gRPC server listens.
.RI "(default: " :9090 "; env: " MAGLEV_GRPC_ADDR )
.TP
.BI \-log\-level " level"
Structured\-log verbosity:
.BR debug ,
.BR info ,
.BR warn ,
or
.BR error .
.RI "(default: " info "; env: " MAGLEV_LOG_LEVEL )
.TP
.B \-reflection
Enable gRPC server reflection so that clients such as
.BR grpcurl (1)
can introspect the API without access to the
.I .proto
file.
Enabled by default; pass
.B \-reflection=false
to disable.
.TP
.B \-version
Print version, commit hash, and build date, then exit.
.SH SIGNALS
.TP
.B SIGHUP
Reload the configuration file without restarting. New backends are
added, removed backends are stopped, and unchanged backend workers
keep running. After the reload, the VPP dataplane is reconciled so
added/removed frontends and application servers are programmed
immediately.
.TP
.BR SIGTERM ", " SIGINT
Gracefully shut down: drain active gRPC streams, then exit. VPP
dataplane state is left in place so that existing VIPs continue to
forward traffic during a restart.
.SH CAPABILITIES
.TP
.B CAP_NET_RAW
Required when any health check uses
.BR "type: icmp" .
ICMP echo probes are sent over raw sockets. TCP and HTTP(S) checks use
normal TCP sockets and need no special capability.
.TP
.B CAP_SYS_ADMIN
Required when the
.B healthchecker.netns
field is set in the YAML configuration. The probe loop calls
.BR setns (2)
with
.B CLONE_NEWNET
to enter the target network namespace before each probe; the
kernel only permits that to processes holding
.B CAP_SYS_ADMIN
in the target namespace's user namespace. Without it, every probe
fails with
.B "enter netns \(dq<name>\(dq: operation not permitted"
and every backend flips to
.B down
on its first probe. The capability can be omitted when the deployment
doesn't use namespace\-scoped health checks. The Debian systemd unit
ships with both
.B CAP_NET_RAW
and
.B CAP_SYS_ADMIN
in its
.B AmbientCapabilities
and
.B CapabilityBoundingSet
by default, and operators can drop
.B CAP_SYS_ADMIN
via a drop\-in override if they prefer the narrower surface.
.SH FILES
.TP
.I /etc/vpp-maglev/maglev.yaml
Default configuration file (YAML).
.TP
.I /etc/default/vpp-maglev
Environment file sourced by the systemd unit before starting
.BR maglevd .
.TP
.I /run/vpp/api.sock
VPP binary\-API socket
.BR maglevd
connects to when programming the
.B lb
plugin.
.TP
.I /run/vpp/stats.sock
VPP stats\-segment socket used to scrape per\-VIP packet/byte
counters.
.SH "FULL DOCUMENTATION"
This manpage documents only the invocation of
.BR maglevd .
For the configuration file reference (including the
.B maglev.vpp.lb
section controlling the VPP integration), the health\-check state
machine, and the full operational guide, see:
.PP
.RS
https://git.ipng.ch/ipng/vpp-maglev/docs/config-guide.md
.br
https://git.ipng.ch/ipng/vpp-maglev/docs/healthchecks.md
.br
https://git.ipng.ch/ipng/vpp-maglev/docs/user-guide.md
.RE
.SH SEE ALSO
.BR maglevc (1),
.BR maglevd\-frontend (8),
.BR vpp (8)
.SH AUTHOR
Pim van Pelt <pim@ipng.ch>