vpp-maglev/docs/healthchecks.md
Pim van Pelt d612086a5f Pools, CLI, versioning, Debian packaging, HTTPS fix
- Replaced flat `backends: [...]` list on frontends with an ordered `pools:`
  list; each pool has a name and a map of backends with per-pool weights (0–100,
  default 100). Pools express priority: first pool with a healthy backend wins.
- Removed global backend weight (was on the backend, now lives in the pool).
- Config validation enforces non-empty pools, non-empty pool names, weight
  range, and consistent address families across all pools of a frontend.

- Added `PoolBackendInfo { name, weight }` and changed `PoolInfo.backends` from
  `repeated string` to `repeated PoolBackendInfo` so weights are visible over
  the API.

- Full interactive shell with readline, tab completion, and `?` inline help.
- Command tree parser (Walk) handles fixed keywords and dynamic slot nodes;
  prefix matching with exact-match priority.
- Commands: `show version/frontends/frontend/backends/backend/healthchecks/
  healthcheck`, `set backend <name> pause|resume`, `quit`/`exit`.
- `show frontend` output is hierarchical (pools → backends) with per-backend
  weights and `[disabled]` notation; pool section uses fixed-width formatting
  so ANSI color codes don't corrupt tabwriter alignment.
- `-color` flag (default true) wraps static field labels in dark-blue ANSI;
  works correctly with tabwriter because all labels carry identical-length
  escape sequences.

- `cmd/version.go` package holds `version`, `commit`, `date` vars set at build
  time via `-ldflags -X`.
- `make build` / `make build-amd64` / `make build-arm64` all inject
  `VERSION=0.1.1`, `COMMIT_HASH` (from `git rev-parse --short HEAD`), and
  `DATE` (UTC ISO-8601).
- `maglevc` prints version on interactive startup and exposes `show version`.
- `maglevd` logs version/commit/date at startup; `-version` flag prints and exits.

- `doHTTPProbe` was building a `https://` target URL even though TLS was already
  applied to the connection inside `inNetns`. `http.Transport` then wrapped the
  connection in a second TLS layer, producing "http: server gave HTTP response
  to HTTPS client". Fixed by always using `http://` in the target URL.
- Added `TestHTTPSProbe` using `httptest.NewTLSServer` to cover the full path.

- New `docs/user-guide.md`: maglevd flags/signals, maglevc commands, shell
  completion, and command-tree parser walkthrough.
- New `docs/healthchecks.md`: state machine, rise/fall model, probe intervals,
  all transition events with log examples.
- Updated `docs/config-guide.md`: pools design, removed global weight from
  backends, updated all examples.
- Updated `README.md`: packaging table, build paths, corrected binary locations
  (`/usr/sbin/maglevd`), config filename (`.yaml`).

- `debian/` directory contains `control.in`, `maglevd.service`, `default.maglev`,
  `maglev.yaml` (example config), `conffiles`, `postinst`, `prerm`.
- `debian/build-deb.sh` stages a package tree and calls `dpkg-deb`; emits
  `build/vpp-maglev_<version>~<commit>_<arch>.deb`.
- Cross-compiles for amd64 and arm64 in one `make pkg-deb` invocation.
- `maglevd` installed to `/usr/sbin/`, `maglevc` to `/usr/bin/`.
- Service reads `MAGLEV_CONFIG` from `/etc/default/maglev`
  (default: `/etc/maglev/maglev.yaml`).
- Man pages `maglevd(8)` and `maglevc(1)` live in `docs/` and are gzip'd into
  the package.
- All build output goes to `build/<arch>/`; `build/` is gitignored.
2026-04-11 12:18:17 +02:00


# Health Checking

maglevd probes each backend independently of how many frontends reference it. Every backend runs exactly one probe goroutine. State changes are broadcast as gRPC events to all connected WatchBackendEvents subscribers.


## States

| State | Meaning |
|---|---|
| unknown | Initial state; also entered after a resume or backend restart. |
| up | Backend is healthy and eligible to receive traffic. |
| down | Backend has failed enough consecutive probes to be considered offline. |
| paused | Health checking suspended by an operator. Probes fire but results are discarded. |
| removed | Backend was removed from configuration. No further probes are accepted. |

## Rise / fall counter

The state machine is driven by a single-integer health counter, modeled on HAProxy's:

    counter ∈ [0, rise + fall − 1]   (called Max below)

    backend is UP   when counter ≥ rise
    backend is DOWN when counter < rise

On each probe:

- pass — counter increments, ceiling at Max.
- fail — counter decrements, floor at 0.

This gives hysteresis: a backend that is fully healthy (counter = Max) needs fall consecutive failures before it transitions to down, and a backend that is fully down (counter = 0) needs rise consecutive passes to come back up. A backend that oscillates between passing and failing stays in the degraded range without bouncing between up and down.

### Expedited unknown resolution

When a backend enters unknown state (new, restarted, or resumed) its counter is pre-loaded to rise − 1. This means a single probe result is enough to resolve the state:

- 1 pass → up
- 1 fail → down (also via the special unknown shortcut below)

In addition, any failure while the state is unknown transitions immediately to down, regardless of the counter value.

### Example: rise=2, fall=3 (Max=4)

    counter:   0    1    2    3    4
    state:   DOWN DOWN  UP   UP   UP
                         ^
                         rise boundary

A backend starting from unknown has counter=1 (rise − 1). One pass → counter=2 → up. One fail while unknown → down immediately.

A backend that just became up sits at counter=2, right at the rise boundary. A single failure (2→1) drops the counter below rise and takes it straight back down.

A backend that has been fully healthy for a while sits at counter=4. It needs 3 (= fall) consecutive failures to go down (4→3→2→1, crossing the rise boundary at 2→1).


## Probe intervals

The interval used between probes depends on the backend's counter state:

| Condition | Interval used |
|---|---|
| State is unknown | fast-interval (falls back to interval) |
| Counter = Max (fully healthy) | interval |
| Counter = 0 (fully down) | down-interval (falls back to interval) |
| 0 < Counter < Max (degraded) | fast-interval (falls back to interval) |

Using fast-interval in degraded and unknown states means a flapping or recovering backend is re-evaluated quickly without waiting a full interval. Using down-interval for fully down backends reduces probe traffic to servers that are known to be offline.


## Transition events

Every state change is logged as backend-transition and emitted as a gRPC BackendEvent to all active WatchBackendEvents streams.

### Backend added (config load or reload)

    unknown → unknown  (code: start)

The counter is pre-loaded to rise − 1. The first probe fires immediately at fast-interval (or interval if not configured). One pass produces unknown → up; one fail produces unknown → down.

If multiple backends start together they are staggered across the first interval to avoid probe bursts.

### Probe pass

- Counter increments.
- If counter reaches rise from below: down → up (or unknown → up).
- If already up: no transition. Next probe at fast-interval if degraded, interval if fully healthy.

### Probe fail

- Counter decrements.
- If counter drops below rise from above: up → down.
- If state is unknown: transition immediately to down regardless of counter.
- Next probe at down-interval if fully down, fast-interval if degraded.

### Pause

    <any> → paused  (operator action)

The counter is reset to 0. Probes continue to fire on their normal schedule but all results are discarded. The backend stays paused until explicitly resumed.

### Resume

    paused → unknown  (operator action)

The counter is reset to rise − 1. The probe goroutine is woken immediately (no wait for the next scheduled probe). One subsequent pass produces unknown → up; one fail produces unknown → down.

### Backend removed (config reload)

    <any> → removed  (code: removed)

The probe goroutine stops. No further state changes occur. The removed event is emitted using the frontend map from before the reload so that consumers can correlate it to the correct frontend.

### Backend healthcheck config changed (config reload)

The old probe goroutine is stopped (<any> → removed) and a new one started (unknown → unknown, code: start). The new goroutine resolves state on the first probe as described under Backend added above.

### Backend metadata changed without healthcheck change (config reload)

Weight, enabled flag, and similar fields are updated in place. The probe goroutine is not restarted and no transition event is emitted.


## Log lines

All state changes produce a structured log line at INFO level:

    {"level":"INFO","msg":"backend-transition","backend":"nginx0-ams","from":"up","to":"paused"}
    {"level":"INFO","msg":"backend-transition","backend":"nginx0-ams","from":"paused","to":"unknown"}
    {"level":"INFO","msg":"backend-transition","backend":"nginx0-ams","from":"unknown","to":"up","code":"L7OK","detail":""}

Probe-driven transitions also carry code and detail fields from the probe result (e.g. L4CON, L7STS, connection refused). Operator-driven transitions (pause, resume) carry empty code and detail.