Files
vpp-maglev/internal/grpcapi/server_test.go
Pim van Pelt 744b1cb3d2 install-deps Makefile target; docs refresh; golangci-lint v2 clean
Makefile:
- New install-deps umbrella target split into three sub-targets:
  install-deps-apt        — Debian/Trixie-packaged build deps
                            (nodejs, npm, protobuf-compiler, git, make,
                            dpkg-dev, ca-certificates, curl, tar). Uses
                            sudo when not already root.
  install-deps-go         — ensures a Go toolchain >= GO_VERSION (go.mod
                            floor, default 1.25.0). Short-circuits when
                            the system Go is already recent enough;
                            otherwise downloads the upstream tarball
                            from go.dev/dl/ into /usr/local/go. Trixie
                            only ships 1.24 so this step is load-bearing.
  install-deps-go-tools   — go install protoc-gen-go, protoc-gen-go-grpc,
                            and golangci-lint/v2/cmd/golangci-lint. Then
                            asserts the installed golangci-lint version
                            parses as >= GOLANCI_LINT_VERSION (default
                            1.64.0, the floor that supports Go 1.25
                            syntax) to catch stale binaries in
                            $GOPATH/bin before they silently run
                            against Go 1.25 code.
- Parser bug fixed: golangci-lint v1.x prints "has version v1.64.8" but
  v2.x dropped the 'v' prefix and prints "has version 2.11.4". The
  original sed regex required the 'v' and returned an empty match on
  v2.x, making the assertion explode with "could not parse version
  output". Fixed by switching to extended regex (sed -En) with 'v?' so
  both forms parse cleanly.
- GO_VERSION and GOLANGCI_LINT_VERSION exposed as Makefile variables
  so operators can override on the command line, e.g.
    make install-deps GO_VERSION=1.25.5 GOLANGCI_LINT_VERSION=2.0.0
- .PHONY extended with the four new target names.
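
The optional-'v' fix can also be sketched in Go: the regexp below mirrors
the sed -En 'v?' expression, accepting both the v1.x and v2.x banner forms
(the sample output strings are illustrative, not captured from the tool):

```go
package main

import (
	"fmt"
	"regexp"
)

// verRe tolerates both golangci-lint banner styles: v1.x prints
// "has version v1.64.8", v2.x prints "has version 2.11.4".
// The 'v' prefix is optional, so both forms yield a capture group.
var verRe = regexp.MustCompile(`has version v?([0-9]+\.[0-9]+\.[0-9]+)`)

// parseVersion extracts the dotted version, reporting whether a match
// was found at all (the failure mode the original regex hit on v2.x).
func parseVersion(out string) (string, bool) {
	m := verRe.FindStringSubmatch(out)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	for _, out := range []string{
		"golangci-lint has version v1.64.8 built from abc123",
		"golangci-lint has version 2.11.4 built with go1.25.0",
	} {
		v, ok := parseVersion(out)
		fmt.Println(v, ok) // both lines print the version and true
	}
}
```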

Docs:
- README.md: capability note rewritten to cover CAP_NET_RAW (ICMP) and
  the new CAP_SYS_ADMIN requirement when healthchecker.netns is set,
  plus a paragraph explaining that the Debian systemd unit grants both
  automatically. Docker example gained a second variant that shows the
  additional --cap-add SYS_ADMIN and /var/run/netns bind mount for
  netns-scoped deployments. Also notes that maglevd-frontend ignores
  SIGHUP so controlling-terminal disconnects don't kill it.
- docs/user-guide.md: Capabilities section rewritten as a bulleted
  list covering both caps, with the EPERM error string and three
  different ways to grant them (systemd unit, setcap, systemd-run);
  'show vpp lb counters' command description updated to explain that
  per-backend packet counts are no longer shown (LB plugin's
  forwarding node bypasses ip{4,6}_lookup_inline, so /net/route/to at
  the backend's FIB entry never ticks for LB-forwarded traffic); new
  ~75-line "What the SPA shows" subsection covering the scope
  selector + maglev_scope cookie, the per-maglevd frontend cards, the
  health-cascade icon table (ok / bug-buckets / primary-drained /
  degraded / unknown), the lb buckets column semantics, the
  maglev_zippy_open cookie, the admin-mode lifecycle dialogs with
  their plain-English consequence text, and the debug panel.
- docs/config-guide.md: healthchecker.netns field gains a capability-
  requirement note spelling out setns(CLONE_NEWNET), the EPERM
  symptom string, and the /var/run/netns/ readability requirement.
- docs/healthchecks.md: new "Jitter" subsection explaining the +/-10%
  scaling on every computed interval, and a "Probe timing while a
  probe is in flight" subsection that explains why fast-interval alone
  doesn't give fast fault detection against hanging backends (the
  probe loop is synchronous, so each iteration is timeout +
  fast-interval; the advice is to lower timeout, not fast-interval).
- docs/maglevd.8: description paragraph corrected (dropped the
  per-backend stats claim and added a short note pointing at the LB
  plugin forwarding-path bypass); new CAPABILITIES section between
  SIGNALS and FILES covering both CAP_NET_RAW and CAP_SYS_ADMIN with
  the drop-in-override hint.
- docs/maglevd-frontend.8: new SIGNALS section documenting the
  explicit SIGHUP ignore (so a controlling-terminal disconnect doesn't
  kill the daemon); description extended with paragraphs on the two
  persistence cookies (maglev_scope, maglev_zippy_open) and on the
  health-cascade icon + lb buckets column.
- docs/maglevc.1: left untouched — intentionally minimal and delegates
  to docs/user-guide.md.

Lint (26 issues across 12 files, all errcheck / ineffassign / S1021):
- cmd/frontend/handlers.go: _, _ = fmt.Fprintf(...) for the SSE retry
  hint and resync control-event writes.
- cmd/maglevc/commands.go: bulk-prefix every fmt.Fprintf(w, ...) with
  _, _ =; also merged 'var watchEventsOptSlot *Node; ... = &Node{...}'
  into a single := declaration (staticcheck S1021) — the self-
  referencing pattern still works because the Children back-ref is
  assigned on the next statement, not inside the struct literal.
- cmd/maglevc/complete.go: _, _ = fmt.Fprintf(ql.rl.Stderr(), ...)
  for the banner and help writes; removed the ineffectual
  'partial = ""' assignment (nothing downstream reads partial after
  that branch, so setting it was dead code flagged by ineffassign).
- cmd/maglevc/shell.go: defer func() { _ = rl.Close() }() for the
  readline instance; _, _ = fmt.Fprintf(rl.Stderr(), ...) for error
  display in the REPL loop.
- cmd/maglevc/main.go: defer func() { _ = conn.Close() }() for the
  gRPC client connection.
- internal/grpcapi/server_test.go: _ = conn.Close() in the test
  teardown closure.
- internal/prober/http.go: _ = c.Close() in the TLS-handshake-failed
  path; defer func() { _ = conn.Close() }() and defer func() { _ =
  resp.Body.Close() }() for the two deferred cleanups.
- internal/prober/http_test.go: defer func() { _ = resp.Body.Close()
  }() plus three _, _ = fmt.Fprint(w, ...) in the httptest.Server
  handlers and _, _ = fmt.Sscanf(...) when parsing the test listener's
  port.
- internal/prober/icmp.go: defer func() { _ = pc.Close() }() for the
  ICMP packet conn.
- internal/prober/netns.go: defer func() { _ = origNs.Close() }(),
  defer func() { _ = netns.Set(origNs) }(), defer func() { _ =
  targetNs.Close() }() — also dropped a stray //nolint:errcheck that
  was no longer needed once the closure wrapping handled the discard.
- internal/prober/tcp.go: _ = conn.Close() in the L4-only path,
  _ = tlsConn.Close() in the failed and succeeded handshake branches,
  _ = tlsConn.SetDeadline(...) (also dropped a //nolint:errcheck
  previously covering it).
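
The S1021 merge in cmd/maglevc/commands.go relies on ordering, which is
worth a sketch. Only the Node type name comes from the commit message;
the fields here are illustrative:

```go
package main

import "fmt"

// Node is a stand-in for the option-tree node type; field names are
// hypothetical.
type Node struct {
	Name     string
	Children []*Node
}

func main() {
	// Before: var watchEventsOptSlot *Node; watchEventsOptSlot = &Node{...}
	// After (S1021): one short variable declaration. The self-reference
	// still works because Children is assigned on the NEXT statement,
	// once the pointer exists — putting it inside the struct literal
	// would not compile with :=, since the name is undefined there.
	opt := &Node{Name: "watch-events"}
	opt.Children = []*Node{opt} // back-reference established afterwards

	fmt.Println(opt.Children[0].Name) // the cycle resolves to itself
}
```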

Iterative 'make lint' runs were needed because golangci-lint v2.x
caps same-linter reports per pass, so the first pass reported 21,
then 4, then 3, then 1, then 0. Final pass: 0 issues. make test is
green across every package, and make build produces all three
binaries cleanly.
2026-04-14 17:37:53 +02:00


// Copyright (c) 2026, Pim van Pelt <pim@ipng.ch>

package grpcapi

import (
	"context"
	"net"
	"testing"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	"git.ipng.ch/ipng/vpp-maglev/internal/checker"
	"git.ipng.ch/ipng/vpp-maglev/internal/config"
	"git.ipng.ch/ipng/vpp-maglev/internal/health"
)

func makeTestChecker(ctx context.Context) *checker.Checker {
	cfg := &config.Config{
		HealthChecker: config.HealthCheckerConfig{TransitionHistory: 5},
		HealthChecks: map[string]config.HealthCheck{
			"icmp": {
				Type:     "icmp",
				Interval: time.Hour, // long interval: probes won't fire during tests
				Timeout:  time.Second,
				Fall:     3,
				Rise:     2,
			},
		},
		Backends: map[string]config.Backend{
			"be0": {
				Address:     net.ParseIP("10.0.0.2"),
				HealthCheck: "icmp",
				Enabled:     true,
			},
		},
		Frontends: map[string]config.Frontend{
			"web": {
				Address:  net.ParseIP("192.0.2.1"),
				Protocol: "tcp",
				Port:     80,
				Pools: []config.Pool{
					{Name: "primary", Backends: map[string]config.PoolBackend{
						"be0": {Weight: 100},
					}},
				},
			},
		},
	}
	c := checker.New(cfg)
	go c.Run(ctx) //nolint:errcheck
	// Allow the Run goroutine to initialize workers.
	time.Sleep(10 * time.Millisecond)
	return c
}

func startTestServer(t *testing.T, ctx context.Context, c *checker.Checker) (MaglevClient, func()) {
	t.Helper()
	lis, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		t.Fatalf("listen: %v", err)
	}
	srv := grpc.NewServer()
	RegisterMaglevServer(srv, NewServer(ctx, c, nil, "", nil))
	go srv.Serve(lis) //nolint:errcheck
	conn, err := grpc.NewClient(lis.Addr().String(),
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		t.Fatalf("dial: %v", err)
	}
	return NewMaglevClient(conn), func() {
		_ = conn.Close()
		srv.Stop()
	}
}

func TestListFrontends(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	resp, err := client.ListFrontends(ctx, &ListFrontendsRequest{})
	if err != nil {
		t.Fatalf("ListFrontends: %v", err)
	}
	if len(resp.FrontendNames) != 1 || resp.FrontendNames[0] != "web" {
		t.Errorf("ListFrontends: got %v, want [web]", resp.FrontendNames)
	}
}

func TestGetFrontend(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	info, err := client.GetFrontend(ctx, &GetFrontendRequest{Name: "web"})
	if err != nil {
		t.Fatalf("GetFrontend: %v", err)
	}
	if info.Address != "192.0.2.1" {
		t.Errorf("GetFrontend address: got %q, want 192.0.2.1", info.Address)
	}
	if info.Port != 80 {
		t.Errorf("GetFrontend port: got %d, want 80", info.Port)
	}
	if len(info.Pools) != 1 || info.Pools[0].Name != "primary" {
		t.Errorf("GetFrontend pools: got %v, want [{primary [be0]}]", info.Pools)
	}
	if len(info.Pools[0].Backends) != 1 || info.Pools[0].Backends[0].Name != "be0" {
		t.Errorf("GetFrontend pools[0].backends: got %v, want [{be0 100}]", info.Pools[0].Backends)
	}
	if info.Pools[0].Backends[0].Weight != 100 {
		t.Errorf("GetFrontend pools[0].backends[0].weight: got %d, want 100", info.Pools[0].Backends[0].Weight)
	}
}

func TestGetFrontendNotFound(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	_, err := client.GetFrontend(ctx, &GetFrontendRequest{Name: "nope"})
	if err == nil {
		t.Error("expected error for unknown frontend")
	}
}

func TestListBackends(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	resp, err := client.ListBackends(ctx, &ListBackendsRequest{})
	if err != nil {
		t.Fatalf("ListBackends: %v", err)
	}
	if len(resp.BackendNames) != 1 || resp.BackendNames[0] != "be0" {
		t.Errorf("ListBackends: got %v, want [be0]", resp.BackendNames)
	}
}

func TestGetBackend(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	info, err := client.GetBackend(ctx, &GetBackendRequest{Name: "be0"})
	if err != nil {
		t.Fatalf("GetBackend: %v", err)
	}
	if info.State != health.StateUnknown.String() {
		t.Errorf("initial state: got %q, want unknown", info.State)
	}
	if !info.Enabled {
		t.Error("expected enabled=true")
	}
	if info.Healthcheck != "icmp" {
		t.Errorf("healthcheck: got %q, want icmp", info.Healthcheck)
	}
}

func TestGetBackendNotFound(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	_, err := client.GetBackend(ctx, &GetBackendRequest{Name: "nope"})
	if err == nil {
		t.Error("expected error for unknown backend")
	}
}

func TestPauseResumeBackend(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	info, err := client.PauseBackend(ctx, &BackendRequest{Name: "be0"})
	if err != nil {
		t.Fatalf("PauseBackend: %v", err)
	}
	if info.State != health.StatePaused.String() {
		t.Errorf("after pause: got %q, want paused", info.State)
	}
	info, err = client.ResumeBackend(ctx, &BackendRequest{Name: "be0"})
	if err != nil {
		t.Fatalf("ResumeBackend: %v", err)
	}
	if info.State != health.StateUnknown.String() {
		t.Errorf("after resume: got %q, want unknown", info.State)
	}
}

func TestSetFrontendPoolBackendWeight(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	info, err := client.SetFrontendPoolBackendWeight(ctx, &SetWeightRequest{
		Frontend: "web",
		Pool:     "primary",
		Backend:  "be0",
		Weight:   42,
	})
	if err != nil {
		t.Fatalf("SetFrontendPoolBackendWeight: %v", err)
	}
	if len(info.Pools) == 0 || len(info.Pools[0].Backends) == 0 {
		t.Fatal("response missing pools/backends")
	}
	if info.Pools[0].Backends[0].Weight != 42 {
		t.Errorf("weight: got %d, want 42", info.Pools[0].Backends[0].Weight)
	}
	// Invalid weight.
	_, err = client.SetFrontendPoolBackendWeight(ctx, &SetWeightRequest{
		Frontend: "web", Pool: "primary", Backend: "be0", Weight: 101,
	})
	if err == nil {
		t.Error("expected error for weight 101")
	}
	// Unknown frontend.
	_, err = client.SetFrontendPoolBackendWeight(ctx, &SetWeightRequest{
		Frontend: "nope", Pool: "primary", Backend: "be0", Weight: 50,
	})
	if err == nil {
		t.Error("expected error for unknown frontend")
	}
}

func TestEnableDisableBackend(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	info, err := client.DisableBackend(ctx, &BackendRequest{Name: "be0"})
	if err != nil {
		t.Fatalf("DisableBackend: %v", err)
	}
	if info.State != "disabled" {
		t.Errorf("after disable: got %q, want disabled", info.State)
	}
	if info.Enabled {
		t.Error("after disable: Enabled should be false")
	}
	info, err = client.EnableBackend(ctx, &BackendRequest{Name: "be0"})
	if err != nil {
		t.Fatalf("EnableBackend: %v", err)
	}
	if info.State != "unknown" {
		t.Errorf("after enable: got %q, want unknown", info.State)
	}
	if !info.Enabled {
		t.Error("after enable: Enabled should be true")
	}
}

func TestListHealthChecks(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	resp, err := client.ListHealthChecks(ctx, &ListHealthChecksRequest{})
	if err != nil {
		t.Fatalf("ListHealthChecks: %v", err)
	}
	if len(resp.Names) != 1 || resp.Names[0] != "icmp" {
		t.Errorf("ListHealthChecks: got %v, want [icmp]", resp.Names)
	}
}

func TestGetHealthCheck(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	info, err := client.GetHealthCheck(ctx, &GetHealthCheckRequest{Name: "icmp"})
	if err != nil {
		t.Fatalf("GetHealthCheck: %v", err)
	}
	if info.Type != "icmp" {
		t.Errorf("type: got %q, want icmp", info.Type)
	}
	if info.Fall != 3 || info.Rise != 2 {
		t.Errorf("fall/rise: got %d/%d, want 3/2", info.Fall, info.Rise)
	}
}

func TestGetHealthCheckNotFound(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	_, err := client.GetHealthCheck(ctx, &GetHealthCheckRequest{Name: "nope"})
	if err == nil {
		t.Error("expected error for unknown healthcheck")
	}
}

func TestWatchEventsServerShutdown(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	// Use a separate server context so we can cancel it independently.
	srvCtx, srvCancel := context.WithCancel(ctx)
	client, cleanup := startTestServer(t, srvCtx, c)
	defer cleanup()
	stream, err := client.WatchEvents(ctx, &WatchRequest{})
	if err != nil {
		t.Fatalf("WatchEvents: %v", err)
	}
	// Drain the initial synthetic snapshots (one per backend, one per frontend).
	for i := 0; i < 2; i++ {
		if _, err := stream.Recv(); err != nil {
			t.Fatalf("initial Recv %d: %v", i, err)
		}
	}
	// Cancel the server context; the stream must terminate.
	srvCancel()
	_, err = stream.Recv()
	if err == nil {
		t.Fatal("expected stream to close after server shutdown, got nil error")
	}
}

func TestWatchEventsBackend(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	stream, err := client.WatchEvents(ctx, &WatchRequest{})
	if err != nil {
		t.Fatalf("WatchEvents: %v", err)
	}
	// Should receive the current state for be0 immediately as a BackendEvent.
	ev, err := stream.Recv()
	if err != nil {
		t.Fatalf("Recv: %v", err)
	}
	be, ok := ev.Event.(*Event_Backend)
	if !ok {
		t.Fatalf("expected BackendEvent, got %T", ev.Event)
	}
	if be.Backend.BackendName != "be0" {
		t.Errorf("initial event: backend=%q, want be0", be.Backend.BackendName)
	}
}

func TestWatchEventsLogOnly(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	f := false
	stream, err := client.WatchEvents(ctx, &WatchRequest{Backend: &f, Frontend: &f})
	if err != nil {
		t.Fatalf("WatchEvents: %v", err)
	}
	// No initial snapshots should arrive (backend and frontend events are
	// disabled in the request). Verify by checking that the stream has no
	// immediately-readable event.
	recvCh := make(chan *Event, 1)
	go func() {
		ev, _ := stream.Recv()
		recvCh <- ev
	}()
	select {
	case ev := <-recvCh:
		if _, isLog := ev.Event.(*Event_Log); !isLog {
			t.Errorf("expected only LogEvents, got %T", ev.Event)
		}
	case <-time.After(50 * time.Millisecond):
		// expected: no backend snapshot arrived
	}
}

func TestWatchEventsInvalidLogLevel(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	c := makeTestChecker(ctx)
	client, cleanup := startTestServer(t, ctx, c)
	defer cleanup()
	// For streaming RPCs the server error arrives on the first Recv, not on the
	// initial call.
	stream, err := client.WatchEvents(ctx, &WatchRequest{LogLevel: "verbose"})
	if err != nil {
		t.Fatalf("WatchEvents: %v", err)
	}
	_, err = stream.Recv()
	if err == nil {
		t.Fatal("expected error for invalid log_level")
	}
}