- Replaced flat `backends: [...]` list on frontends with an ordered `pools:`
list; each pool has a name and a map of backends with per-pool weights (0–100,
default 100). Pools express priority: first pool with a healthy backend wins.
- Removed global backend weight (was on the backend, now lives in the pool).
- Config validation enforces non-empty pools, non-empty pool names, weight
range, and consistent address families across all pools of a frontend.
- Added `PoolBackendInfo { name, weight }` and changed `PoolInfo.backends` from
`repeated string` to `repeated PoolBackendInfo` so weights are visible over
the API.
- Full interactive shell with readline, tab completion, and `?` inline help.
- Command tree parser (Walk) handles fixed keywords and dynamic slot nodes;
prefix matching with exact-match priority.
- Commands: `show version/frontends/frontend/backends/backend/healthchecks/
healthcheck`, `set backend <name> pause|resume`, `quit`/`exit`.
- `show frontend` output is hierarchical (pools → backends) with per-backend
weights and `[disabled]` notation; pool section uses fixed-width formatting
so ANSI color codes don't corrupt tabwriter alignment.
- `-color` flag (default true) wraps static field labels in dark-blue ANSI;
works correctly with tabwriter because all labels carry identical-length
escape sequences.
- `cmd/version.go` package holds `version`, `commit`, `date` vars set at build
time via `-ldflags -X`.
- `make build` / `make build-amd64` / `make build-arm64` all inject
`VERSION=0.1.1`, `COMMIT_HASH` (from `git rev-parse --short HEAD`), and
`DATE` (UTC ISO-8601).
- `maglevc` prints version on interactive startup and exposes `show version`.
- `maglevd` logs version/commit/date at startup; `-version` flag prints and exits.
- `doHTTPProbe` was building a `https://` target URL even though TLS was already
applied to the connection inside `inNetns`. `http.Transport` then wrapped the
connection in a second TLS layer, producing "http: server gave HTTP response
to HTTPS client". Fixed by always using `http://` in the target URL.
- Added `TestHTTPSProbe` using `httptest.NewTLSServer` to cover the full path.
- New `docs/user-guide.md`: maglevd flags/signals, maglevc commands, shell
completion, and command-tree parser walkthrough.
- New `docs/healthchecks.md`: state machine, rise/fall model, probe intervals,
all transition events with log examples.
- Updated `docs/config-guide.md`: pools design, removed global weight from
backends, updated all examples.
- Updated `README.md`: packaging table, build paths, corrected binary locations
(`/usr/sbin/maglevd`), config filename (`.yaml`).
- `debian/` directory contains `control.in`, `maglevd.service`, `default.maglev`,
`maglev.yaml` (example config), `conffiles`, `postinst`, `prerm`.
- `debian/build-deb.sh` stages a package tree and calls `dpkg-deb`; emits
`build/vpp-maglev_<version>~<commit>_<arch>.deb`.
- Cross-compiles for amd64 and arm64 in one `make pkg-deb` invocation.
- `maglevd` installed to `/usr/sbin/`, `maglevc` to `/usr/bin/`.
- Service reads `MAGLEV_CONFIG` from `/etc/default/maglev`
(default: `/etc/maglev/maglev.yaml`).
- Man pages `maglevd(8)` and `maglevc(1)` live in `docs/` and are gzip'd into
the package.
- All build output goes to `build/<arch>/`; `build/` is gitignored.

# Health Checking

`maglevd` probes each backend independently of how many frontends reference it.
Every backend runs exactly one probe goroutine. State changes are broadcast as
gRPC events to all connected `WatchBackendEvents` subscribers.


---

## States

| State | Meaning |
|---|---|
| `unknown` | Initial state; also entered after a resume or backend restart. |
| `up` | Backend is healthy and eligible to receive traffic. |
| `down` | Backend has failed enough consecutive probes to be considered offline. |
| `paused` | Health checking suspended by an operator. Probes fire but results are discarded. |
| `removed` | Backend was removed from configuration. No further probes are accepted. |


---

## Rise / fall counter

The state machine is driven by a single-integer health counter in the style of
HAProxy.

```
counter ∈ [0, rise + fall − 1]   (called Max below)

backend is UP   when counter ≥ rise
backend is DOWN when counter < rise
```

On each probe:
- **pass**: counter increments, ceiling at Max.
- **fail**: counter decrements, floor at 0.

This gives **hysteresis**: a backend that is fully up (counter = Max) needs
`fall` consecutive failures before it transitions to down. A backend that is
fully down (counter = 0) needs `rise` consecutive passes to come back up. A
backend whose counter sits away from the rise boundary can alternate between
passing and failing without bouncing between up and down.

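The arithmetic behind the hysteresis can be sketched in Go as follows. This is an illustration derived only from the counter rules stated above, not code from `maglevd`; the function names `failuresToDown` and `passesToUp` are hypothetical.

```go
package main

import "fmt"

// failuresToDown returns how many consecutive failed probes move an up
// backend (counter >= rise) below the rise boundary.
func failuresToDown(counter, rise int) int { return counter - rise + 1 }

// passesToUp returns how many consecutive passed probes lift a down
// backend (counter < rise) to the rise boundary.
func passesToUp(counter, rise int) int { return rise - counter }

func main() {
	rise, fall := 2, 3
	ceiling := rise + fall - 1 // "Max" in the text above

	for c := 0; c <= ceiling; c++ {
		if c >= rise {
			fmt.Printf("counter=%d up, %d failure(s) to go down\n", c, failuresToDown(c, rise))
		} else {
			fmt.Printf("counter=%d down, %d pass(es) to come up\n", c, passesToUp(c, rise))
		}
	}
}
```

At counter = Max the first function returns `fall`, and at counter = 0 the second returns `rise`, which matches the two extremes described above.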
### Expedited unknown resolution

When a backend enters `unknown` state (new, restarted, or resumed) its counter
is pre-loaded to `rise − 1`. This means a single probe result is enough to
resolve the state:

- **1 pass** → `up`
- **1 fail** → `down` (also via the special unknown shortcut below)

In addition, any failure while state is `unknown` transitions immediately to
`down`, regardless of the counter value.

### Example: rise=2, fall=3 (Max=4)

```
counter:    0     1     2     3     4
state:    DOWN  DOWN   UP    UP    UP
                        ^
                  rise boundary
```

A backend starting from unknown has counter=1 (rise−1). One pass → counter=2
→ up. One fail while unknown → down immediately.

A backend that just became up sits at counter=2. A single failure takes it down
(2→1, crossing the rise boundary).

A backend that has been fully healthy for a while sits at counter=4. It needs 3
failures to go down (4→3→2→1, crossing the rise boundary at 2→1).

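The example can be replayed with a small Go model of the state machine. This is a minimal sketch under the rules stated in this section, not the `maglevd` implementation; the `backend` type, its fields, and the `probe` method are hypothetical names.

```go
package main

import "fmt"

type state string

const (
	unknown state = "unknown"
	up      state = "up"
	down    state = "down"
)

// backend is a hypothetical model: a state plus the health counter.
type backend struct {
	rise, fall int
	counter    int
	st         state
}

// newBackend starts in unknown with the counter pre-loaded to rise-1.
func newBackend(rise, fall int) *backend {
	return &backend{rise: rise, fall: fall, counter: rise - 1, st: unknown}
}

// probe applies one probe result and returns the resulting state.
func (b *backend) probe(pass bool) state {
	wasUnknown := b.st == unknown
	if pass && b.counter < b.rise+b.fall-1 {
		b.counter++ // ceiling at Max
	} else if !pass && b.counter > 0 {
		b.counter-- // floor at 0
	}
	switch {
	case wasUnknown && !pass:
		b.st = down // unknown shortcut: any failure resolves to down
	case b.counter >= b.rise:
		b.st = up
	default:
		b.st = down
	}
	return b.st
}

func main() {
	b := newBackend(2, 3)
	for _, pass := range []bool{true, true, true, false, false, false} {
		st := b.probe(pass)
		fmt.Printf("pass=%-5v counter=%d state=%s\n", pass, b.counter, st)
	}
}
```

The run walks the counter from 1 up to 4 and back down to 1, so the state only flips on the first pass and on the third failure.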

---

## Probe intervals

The interval used between probes depends on the backend's state and counter:

| Condition | Interval used |
|---|---|
| State is `unknown` | `fast-interval` (falls back to `interval`) |
| Counter = Max (fully healthy) | `interval` |
| Counter = 0 (fully down) | `down-interval` (falls back to `interval`) |
| Counter strictly between 0 and Max (degraded) | `fast-interval` (falls back to `interval`) |

Using `fast-interval` in degraded and unknown states means a flapping or
recovering backend is re-evaluated quickly without waiting a full `interval`.
Using `down-interval` for fully down backends reduces probe traffic to servers
that are known to be offline.

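The table can be read as a small selection function. The Go sketch below assumes that an unset `fast-interval` or `down-interval` is represented by a zero duration; the `intervals` struct and its method names are hypothetical and only mirror the table.

```go
package main

import (
	"fmt"
	"time"
)

// intervals mirrors the three settings named in the table. A zero value
// means "not configured" (an assumption of this sketch).
type intervals struct {
	interval, fast, down time.Duration
}

// orDefault returns d when it is set, otherwise the base interval.
func (iv intervals) orDefault(d time.Duration) time.Duration {
	if d > 0 {
		return d
	}
	return iv.interval
}

// next picks the delay before the next probe, row by row as in the table.
func (iv intervals) next(unknown bool, counter, ceiling int) time.Duration {
	switch {
	case unknown:
		return iv.orDefault(iv.fast)
	case counter == ceiling: // fully healthy
		return iv.interval
	case counter == 0: // fully down
		return iv.orDefault(iv.down)
	default: // degraded
		return iv.orDefault(iv.fast)
	}
}

func main() {
	iv := intervals{interval: 2 * time.Second, fast: 500 * time.Millisecond}
	fmt.Println("unknown: ", iv.next(true, 1, 4))
	fmt.Println("healthy: ", iv.next(false, 4, 4))
	fmt.Println("down:    ", iv.next(false, 0, 4))
	fmt.Println("degraded:", iv.next(false, 2, 4))
}
```

In this usage `down-interval` is left unset, so the fully down case falls back to `interval`.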

---

## Transition events

Every state change is logged as `backend-transition` and emitted as a gRPC
`BackendEvent` to all active `WatchBackendEvents` streams.

### Backend added (config load or reload)

```
unknown → unknown   (code: start)
```

The counter is pre-loaded to `rise − 1`. The first probe fires immediately at
`fast-interval` (or `interval` if not configured). One pass produces
`unknown → up`; one fail produces `unknown → down`.

If multiple backends start together they are staggered across the first
`interval` to avoid probe bursts.

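One way such staggering could work is to spread the start delays evenly over a single interval. The Go sketch below is an assumption made for illustration: the text does not say how `maglevd` computes the offsets, and `staggerOffsets` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// staggerOffsets returns n start delays spread evenly across one interval.
// Even spacing is an assumption; other schemes (e.g. random jitter) would
// serve the same purpose.
func staggerOffsets(n int, interval time.Duration) []time.Duration {
	offsets := make([]time.Duration, n)
	for i := range offsets {
		offsets[i] = interval * time.Duration(i) / time.Duration(n)
	}
	return offsets
}

func main() {
	for i, d := range staggerOffsets(4, 2*time.Second) {
		fmt.Printf("backend %d starts after %v\n", i, d)
	}
}
```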
### Probe pass

- Counter increments.
- If counter reaches `rise` from below: `down → up` (or `unknown → up`).
- If already up: no transition. Next probe at `fast-interval` if degraded,
  `interval` if fully healthy.

### Probe fail

- Counter decrements.
- If counter drops below `rise` from above: `up → down`.
- If state is `unknown`: transition immediately to `down` regardless of counter.
- Next probe at `down-interval` if fully down, `fast-interval` if degraded.

### Pause

```
<any> → paused   (operator action)
```

The counter is reset to 0. Probes continue to fire on their normal schedule but
all results are discarded. The backend stays `paused` until explicitly resumed.

### Resume

```
paused → unknown   (operator action)
```

The counter is reset to `rise − 1`. The probe goroutine is woken immediately
(no wait for the next scheduled probe). One subsequent pass produces
`unknown → up`; one fail produces `unknown → down`.

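The counter resets and the immediate wake-up on resume can be sketched in Go as follows. This is a single-goroutine illustration with hypothetical names (`backend`, `wake`, `pause`, `resume`); a real implementation would need locking, and the buffered channel is only one possible way to wake a probe loop.

```go
package main

import "fmt"

type state string

const (
	unknown state = "unknown"
	up      state = "up"
	paused  state = "paused"
)

// backend is a hypothetical model of the fields that pause and resume touch.
type backend struct {
	rise    int
	counter int
	st      state
	wake    chan struct{} // buffered with capacity 1; tells the probe loop to run now
}

// pause moves any state to paused and resets the counter to 0.
func (b *backend) pause() {
	b.st = paused
	b.counter = 0
}

// resume moves paused to unknown, pre-loads the counter to rise-1, and
// signals the probe loop without blocking.
func (b *backend) resume() {
	if b.st != paused {
		return
	}
	b.st = unknown
	b.counter = b.rise - 1
	select {
	case b.wake <- struct{}{}:
	default: // a wake-up is already pending
	}
}

func main() {
	b := &backend{rise: 2, counter: 4, st: up, wake: make(chan struct{}, 1)}
	b.pause()
	fmt.Printf("after pause:  state=%s counter=%d\n", b.st, b.counter)
	b.resume()
	fmt.Printf("after resume: state=%s counter=%d pending wake-ups=%d\n", b.st, b.counter, len(b.wake))
}
```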
### Backend removed (config reload)

```
<any> → removed   (code: removed)
```

The probe goroutine stops. No further state changes occur. The removed event is
emitted using the frontend map from before the reload so that consumers can
correlate it to the correct frontend.

### Backend healthcheck config changed (config reload)

The old probe goroutine is stopped (`<any> → removed`) and a new one started
(`unknown → unknown`, code: `start`). The new goroutine resolves state on the
first probe as described under *Backend added* above.

### Backend metadata changed without healthcheck change (config reload)

Weight, enabled flag, and similar fields are updated in place. The probe
goroutine is not restarted and no transition event is emitted.

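The reload cases above amount to comparing the old and new configuration per backend. The Go sketch below shows one way to classify them; the `backendConfig` fields, the `action` values, and the `classify` function are hypothetical and do not reflect `maglevd`'s real types.

```go
package main

import "fmt"

// backendConfig is a hypothetical, reduced view of one backend's config.
type backendConfig struct {
	healthcheck string // stands in for the full healthcheck settings
	weight      int
	enabled     bool
}

type action string

const (
	actionStart   action = "start"   // new backend: unknown → unknown
	actionRemove  action = "remove"  // <any> → removed
	actionRestart action = "restart" // removed, then start again
	actionUpdate  action = "update"  // fields updated in place, no event
	actionNone    action = "none"    // nothing changed
)

// classify compares two config generations keyed by backend name.
func classify(old, next map[string]backendConfig) map[string]action {
	out := make(map[string]action, len(old)+len(next))
	for name, o := range old {
		n, ok := next[name]
		switch {
		case !ok:
			out[name] = actionRemove
		case o.healthcheck != n.healthcheck:
			out[name] = actionRestart
		case o != n:
			out[name] = actionUpdate
		default:
			out[name] = actionNone
		}
	}
	for name := range next {
		if _, ok := old[name]; !ok {
			out[name] = actionStart
		}
	}
	return out
}

func main() {
	old := map[string]backendConfig{
		"a": {healthcheck: "http", weight: 100, enabled: true},
		"b": {healthcheck: "http", weight: 100, enabled: true},
		"c": {healthcheck: "tcp", weight: 100, enabled: true},
	}
	next := map[string]backendConfig{
		"a": {healthcheck: "http", weight: 50, enabled: true},
		"b": {healthcheck: "tcp", weight: 100, enabled: true},
		"d": {healthcheck: "http", weight: 100, enabled: true},
	}
	actions := classify(old, next)
	for _, name := range []string{"a", "b", "c", "d"} {
		fmt.Printf("%s: %s\n", name, actions[name])
	}
}
```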

---

## Log lines

All state changes produce a structured log line at `INFO` level:

```json
{"level":"INFO","msg":"backend-transition","backend":"nginx0-ams","from":"up","to":"paused"}
{"level":"INFO","msg":"backend-transition","backend":"nginx0-ams","from":"paused","to":"unknown"}
{"level":"INFO","msg":"backend-transition","backend":"nginx0-ams","from":"unknown","to":"up","code":"L7OK","detail":""}
```

Probe-driven transitions also carry `code` and `detail` fields from the probe
result (e.g. `L4CON`, `L7STS`, `connection refused`). Operator-driven
transitions (pause, resume) carry empty code and detail.
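
A consumer of these logs could decode the lines with Go's `encoding/json`. The sketch below uses only the field names visible in the examples above; the `transition` struct and `parseTransition` helper are hypothetical, and any additional fields `maglevd` may emit are simply ignored.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// transition holds the fields seen in the example log lines.
type transition struct {
	Level   string `json:"level"`
	Msg     string `json:"msg"`
	Backend string `json:"backend"`
	From    string `json:"from"`
	To      string `json:"to"`
	Code    string `json:"code"`   // empty when absent
	Detail  string `json:"detail"` // empty when absent
}

// parseTransition decodes one log line and reports whether it is a
// backend-transition record.
func parseTransition(line string) (transition, bool) {
	var t transition
	if err := json.Unmarshal([]byte(line), &t); err != nil {
		return transition{}, false
	}
	return t, t.Msg == "backend-transition"
}

func main() {
	line := `{"level":"INFO","msg":"backend-transition","backend":"nginx0-ams","from":"unknown","to":"up","code":"L7OK","detail":""}`
	if t, ok := parseTransition(line); ok {
		fmt.Printf("%s: %s -> %s (code=%q)\n", t.Backend, t.From, t.To, t.Code)
	}
}
```

Because missing keys decode to empty strings, the same struct handles both probe-driven and operator-driven lines.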