Pools, CLI, versioning, Debian packaging, HTTPS fix
- Replaced flat `backends: [...]` list on frontends with an ordered `pools:`
list; each pool has a name and a map of backends with per-pool weights (0–100,
default 100). Pools express priority: first pool with a healthy backend wins.
- Removed global backend weight (was on the backend, now lives in the pool).
- Config validation enforces non-empty pools, non-empty pool names, weight
range, and consistent address families across all pools of a frontend.
- Added `PoolBackendInfo { name, weight }` and changed `PoolInfo.backends` from
`repeated string` to `repeated PoolBackendInfo` so weights are visible over
the API.
- Full interactive shell with readline, tab completion, and `?` inline help.
- Command tree parser (Walk) handles fixed keywords and dynamic slot nodes;
prefix matching with exact-match priority.
- Commands: `show version/frontends/frontend/backends/backend/healthchecks/
healthcheck`, `set backend <name> pause|resume`, `quit`/`exit`.
- `show frontend` output is hierarchical (pools → backends) with per-backend
weights and `[disabled]` notation; pool section uses fixed-width formatting
so ANSI color codes don't corrupt tabwriter alignment.
- `-color` flag (default true) wraps static field labels in dark-blue ANSI;
works correctly with tabwriter because all labels carry identical-length
escape sequences.
- `cmd/version.go` package holds `version`, `commit`, `date` vars set at build
time via `-ldflags -X`.
- `make build` / `make build-amd64` / `make build-arm64` all inject
`VERSION=0.1.1`, `COMMIT_HASH` (from `git rev-parse --short HEAD`), and
`DATE` (UTC ISO-8601).
- `maglevc` prints version on interactive startup and exposes `show version`.
- `maglevd` logs version/commit/date at startup; `-version` flag prints and exits.
- `doHTTPProbe` was building a `https://` target URL even though TLS was already
applied to the connection inside `inNetns`. `http.Transport` then wrapped the
connection in a second TLS layer, producing "http: server gave HTTP response
to HTTPS client". Fixed by always using `http://` in the target URL.
- Added `TestHTTPSProbe` using `httptest.NewTLSServer` to cover the full path.
- New `docs/user-guide.md`: maglevd flags/signals, maglevc commands, shell
completion, and command-tree parser walkthrough.
- New `docs/healthchecks.md`: state machine, rise/fall model, probe intervals,
all transition events with log examples.
- Updated `docs/config-guide.md`: pools design, removed global weight from
backends, updated all examples.
- Updated `README.md`: packaging table, build paths, corrected binary locations
(`/usr/sbin/maglevd`), config filename (`.yaml`).
- `debian/` directory contains `control.in`, `maglevd.service`, `default.maglev`,
`maglev.yaml` (example config), `conffiles`, `postinst`, `prerm`.
- `debian/build-deb.sh` stages a package tree and calls `dpkg-deb`; emits
`build/vpp-maglev_<version>~<commit>_<arch>.deb`.
- Cross-compiles for amd64 and arm64 in one `make pkg-deb` invocation.
- `maglevd` installed to `/usr/sbin/`, `maglevc` to `/usr/bin/`.
- Service reads `MAGLEV_CONFIG` from `/etc/default/maglev`
(default: `/etc/maglev/maglev.yaml`).
- Man pages `maglevd(8)` and `maglevc(1)` live in `docs/` and are gzip'd into
the package.
- All build output goes to `build/<arch>/`; `build/` is gitignored.
@@ -77,12 +77,13 @@ Common fields (all types):
 * ***probe-ipv6-src***: An optional IPv6 source address used when probing IPv6 backends.
 Must be an IPv6 address. When omitted, the OS chooses the source address.
 * ***interval***: Required. A positive Go duration string (e.g. `2s`, `500ms`) controlling
-how often a probe is sent when the backend is fully healthy or in the initial unknown state.
+how often a probe is sent when the backend is fully healthy (counter at maximum).
 * ***fast-interval***: Optional. A positive duration used instead of `interval` while the
-backend's health counter is degraded (between down and up). When omitted, `interval` is used.
+backend's health counter is degraded (between down and up) or in `unknown` state. When
+omitted, `interval` is used.
 * ***down-interval***: Optional. A positive duration used instead of `interval` while the
-backend is fully down. When omitted, `interval` is used. Setting this to a longer value
-reduces probe traffic to backends that are known to be offline.
+backend is fully down (counter at zero). When omitted, `interval` is used. Setting this to
+a longer value reduces probe traffic to backends that are known to be offline.
 * ***timeout***: Required. A positive duration after which an in-flight probe is abandoned
 and counted as a failure.
 * ***rise***: The number of consecutive successes required to transition from down to up.
@@ -193,9 +194,6 @@ multiple frontends.
 * ***enabled***: A boolean controlling whether this backend participates in any frontend.
 When `false`, the backend is excluded entirely and no probe goroutine is started.
 Defaults to `true`.
-* ***weight***: An integer between 0 and 100 (inclusive) expressing the relative weight of
-this backend in a frontend's pool. `0` keeps the backend in the pool but assigns it no
-traffic. Defaults to `100`.
 
 Examples:
 ```yaml
@@ -206,7 +204,6 @@ backends:
   nginx0-lon:
     address: 198.51.100.11
     healthcheck: nginx-http
-    weight: 50
   nginx0-draining:
     address: 198.51.100.12
     healthcheck: nginx-http
@@ -220,8 +217,8 @@ backends:
 
 ## frontends
 
-A named map of virtual IPs (VIPs). Each frontend ties together a listener address with a set
-of backends. The gRPC API exposes frontends by name.
+A named map of virtual IPs (VIPs). Each frontend ties together a listener address with an
+ordered list of backend pools. The gRPC API exposes frontends by name.
 
 * ***description***: An optional free-text string for documentation purposes.
 * ***address***: Required. The IPv4 or IPv6 address of the VIP.
@@ -232,38 +229,50 @@ of backends. The gRPC API exposes frontends by name.
 `protocol` to be set. When omitted, the frontend matches all ports. Note that the
 frontend port is independent of the healthcheck port: a frontend on port 443 may use
 a healthcheck that probes port 80.
-* ***backends***: Required. A non-empty list of backend names. All backends in a frontend
-must have addresses of the same address family (all IPv4 or all IPv6). Every name must
-refer to an existing entry in the `backends` section.
+* ***pools***: Required. A non-empty ordered list of pool objects. Pools express priority:
+the first pool is preferred; subsequent pools act as fallbacks. All backends across all
+pools in a frontend must have addresses of the same address family (all IPv4 or all IPv6).
+
+Each pool has:
+
+* ***name***: Required. A non-empty string identifying the pool (e.g. `primary`, `fallback`).
+* ***backends***: A map of backend names to per-pool backend options. Every name must refer
+to an existing entry in the `backends` section.
+
+Per-pool backend options:
+
+* ***weight***: An integer between 0 and 100 (inclusive) expressing the relative weight of
+this backend within the pool. `0` keeps the backend in the pool but assigns it no traffic.
+Defaults to `100`. Weight is per-pool, not global — the same backend can appear with
+different weights in different frontends.
 
 Examples:
 ```yaml
 frontends:
   nginx-v4-http:
-    description: "IPv4 HTTP VIP"
+    description: "IPv4 HTTP VIP with fallback"
     address: 198.51.100.1
     protocol: tcp
     port: 80
-    backends: [nginx0-ams, nginx0-lon]
-
-  nginx-v4-https:
-    description: "IPv4 HTTPS VIP — reuses the same backends as HTTP"
-    address: 198.51.100.1
-    protocol: tcp
-    port: 443
-    backends: [nginx0-ams, nginx0-lon]
+    pools:
+      - name: primary
+        backends:
+          nginx0-ams: { weight: 10 }
+          nginx0-lon: {}
+      - name: fallback
+        backends:
+          nginx0-fra: {}
 
   maildrop-imaps:
     description: "IMAPS VIP"
     address: 2001:db8::1
     protocol: tcp
     port: 993
-    backends: [maildrop0-ams, maildrop0-lon]
-
-  catchall:
-    description: "Match all traffic to this VIP regardless of protocol or port"
-    address: 198.51.100.2
-    backends: [static-backend]
+    pools:
+      - name: primary
+        backends:
+          maildrop0-ams: {}
+          maildrop0-lon: {}
 ```
 
 ---
@@ -322,7 +331,6 @@ maglev:
     nginx0-fra:
       address: 198.51.100.12
       healthcheck: nginx
-      weight: 50
     maildrop0-ams:
       address: 2001:db8:1::10
       healthcheck: dovecot
@@ -332,23 +340,46 @@ maglev:
 
   frontends:
     nginx-http:
-      description: "HTTP VIP"
+      description: "HTTP VIP with fallback"
       address: 198.51.100.1
       protocol: tcp
       port: 80
-      backends: [nginx0-ams, nginx0-lon, nginx0-fra]
+      pools:
+        - name: primary
+          backends:
+            nginx0-ams: { weight: 10 }
+            nginx0-lon: {}
+        - name: fallback
+          backends:
+            nginx0-fra: {}
 
     nginx-https:
       description: "HTTPS VIP — same backends, different port"
       address: 198.51.100.1
       protocol: tcp
       port: 443
-      backends: [nginx0-ams, nginx0-lon, nginx0-fra]
+      pools:
+        - name: primary
+          backends:
+            nginx0-ams: { weight: 10 }
+            nginx0-lon: {}
+        - name: fallback
+          backends:
+            nginx0-fra: {}
 
     maildrop-imaps:
       description: "IMAPS VIP"
       address: 2001:db8::1
       protocol: tcp
       port: 993
-      backends: [maildrop0-ams, maildrop0-lon]
+      pools:
+        - name: primary
+          backends:
+            maildrop0-ams: {}
+            maildrop0-lon: {}
 ```
+
+---
+
+For a detailed description of the health state machine, probe intervals, and all
+transition events, see [healthchecks.md](healthchecks.md).
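
The pool priority rule from the `pools` field (the first pool is preferred, later pools act as fallbacks) can be sketched roughly as follows. This is an illustration only: the `Pool` type, the `activePool` function, and the `healthy` lookup are hypothetical names, and the sketch assumes that a pool counts as usable as soon as any one of its backends is up, regardless of weight.

```go
package main

import "fmt"

// Pool mirrors one entry of a frontend's `pools:` list (hypothetical type).
type Pool struct {
	Name     string
	Backends map[string]int // backend name -> per-pool weight (0-100)
}

// activePool returns the first pool, in configuration order, that contains at
// least one backend reported healthy by the lookup function. The boolean is
// false when no pool has a healthy backend.
func activePool(pools []Pool, healthy func(name string) bool) (Pool, bool) {
	for _, p := range pools {
		for name := range p.Backends {
			if healthy(name) {
				return p, true
			}
		}
	}
	return Pool{}, false
}

func main() {
	pools := []Pool{
		{Name: "primary", Backends: map[string]int{"nginx0-ams": 10, "nginx0-lon": 100}},
		{Name: "fallback", Backends: map[string]int{"nginx0-fra": 100}},
	}

	// Both primary backends are down; only the fallback backend is up.
	up := map[string]bool{"nginx0-fra": true}

	p, ok := activePool(pools, func(n string) bool { return up[n] })
	fmt.Println(p.Name, ok) // prints "fallback true"
}
```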
178
docs/healthchecks.md
Normal file
@@ -0,0 +1,178 @@
# Health Checking

`maglevd` probes each backend independently of how many frontends reference it.
Every backend runs exactly one probe goroutine. State changes are broadcast as
gRPC events to all connected `WatchBackendEvents` subscribers.

---

## States

| State | Meaning |
|---|---|
| `unknown` | Initial state; also entered after a resume or backend restart. |
| `up` | Backend is healthy and eligible to receive traffic. |
| `down` | Backend has failed enough consecutive probes to be considered offline. |
| `paused` | Health checking suspended by an operator. Probes fire but results are discarded. |
| `removed` | Backend was removed from configuration. No further probes are accepted. |

---

## Rise / fall counter

The state machine is driven by a single-integer health counter in the style of
HAProxy.

```
counter ∈ [0, rise + fall − 1] (called Max below)

backend is UP when counter ≥ rise
backend is DOWN when counter < rise
```

On each probe:
- **pass**: counter increments, ceiling at Max.
- **fail**: counter decrements, floor at 0.

This gives **hysteresis**: a backend that is fully healthy (counter = Max) needs
`fall` consecutive failures before it transitions to down, and a backend that is
fully down (counter = 0) needs `rise` consecutive passes to come back up. A
backend that alternates between passing and failing while its counter is away
from the `rise` boundary stays in the degraded range without bouncing between up
and down.
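
The counter rules above can be modeled in a few lines. This is a minimal sketch that applies only the increment/decrement rules and the `rise` boundary as listed; the `Counter` type and its helper functions are hypothetical names, and the special handling of the `unknown` state is not modeled.

```go
package main

import "fmt"

// Counter is a hypothetical model of the rise/fall health counter.
type Counter struct {
	Rise, Fall int
	Value      int
}

// Max is the ceiling of the counter: rise + fall - 1.
func (c *Counter) Max() int { return c.Rise + c.Fall - 1 }

// Up reports whether the counter is at or above the rise boundary.
func (c *Counter) Up() bool { return c.Value >= c.Rise }

// Record applies one probe result: a pass increments (ceiling at Max),
// a fail decrements (floor at 0).
func (c *Counter) Record(pass bool) {
	if pass {
		if c.Value < c.Max() {
			c.Value++
		}
		return
	}
	if c.Value > 0 {
		c.Value--
	}
}

// failuresUntilDown counts the consecutive failures needed before an up
// backend is considered down. It works on a copy of the counter.
func failuresUntilDown(c Counter) int {
	n := 0
	for c.Up() {
		c.Record(false)
		n++
	}
	return n
}

// passesUntilUp counts the consecutive passes needed before a down backend
// is considered up. It works on a copy of the counter.
func passesUntilUp(c Counter) int {
	n := 0
	for !c.Up() {
		c.Record(true)
		n++
	}
	return n
}

func main() {
	fullyHealthy := Counter{Rise: 2, Fall: 3, Value: 4} // counter at Max
	fullyDown := Counter{Rise: 2, Fall: 3, Value: 0}    // counter at 0

	fmt.Println(failuresUntilDown(fullyHealthy)) // prints 3 (= fall)
	fmt.Println(passesUntilUp(fullyDown))        // prints 2 (= rise)
}
```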

### Expedited unknown resolution

When a backend enters `unknown` state (new, restarted, or resumed) its counter
is pre-loaded to `rise − 1`. This means a single probe result is enough to
resolve the state:

- **1 pass** → `up`
- **1 fail** → `down` (also via the special unknown shortcut below)

In addition, any failure while state is `unknown` transitions immediately to
`down`, regardless of the counter value.

### Example: rise=2, fall=3 (Max=4)

```
counter:  0     1     2     3     4
state:    DOWN  DOWN  UP    UP    UP
                      ^
                      rise boundary
```

A backend starting from unknown has counter=1 (rise−1). One pass → counter=2
→ up. One fail while unknown → down immediately.

A backend that just became up sits at counter=2. A single failure takes it down
(2→1 crosses the rise boundary); a second failure brings it to fully down
(2→1→0).

A backend that has been fully healthy for a while sits at counter=4. It needs 3
failures to go down (4→3→2→1, crossing the rise boundary at 2→1).

---

## Probe intervals

The interval used between probes depends on the backend's counter state:

| Condition | Interval used |
|---|---|
| State is `unknown` | `fast-interval` (falls back to `interval`) |
| Counter = Max (fully healthy) | `interval` |
| Counter = 0 (fully down) | `down-interval` (falls back to `interval`) |
| Counter strictly between 0 and Max (degraded) | `fast-interval` (falls back to `interval`) |

Using `fast-interval` in degraded and unknown states means a flapping or
recovering backend is re-evaluated quickly without waiting a full `interval`.
Using `down-interval` for fully down backends reduces probe traffic to servers
that are known to be offline.


---

## Transition events

Every state change is logged as `backend-transition` and emitted as a gRPC
`BackendEvent` to all active `WatchBackendEvents` streams.

### Backend added (config load or reload)

```
unknown → unknown (code: start)
```

The counter is pre-loaded to `rise − 1`. The first probe fires immediately at
`fast-interval` (or `interval` if not configured). One pass produces
`unknown → up`; one fail produces `unknown → down`.

If multiple backends start together they are staggered across the first
`interval` to avoid probe bursts.

### Probe pass

- Counter increments.
- If counter reaches `rise` from below: `down → up` (or `unknown → up`).
- If already up: no transition. Next probe at `fast-interval` if degraded,
  `interval` if fully healthy.

### Probe fail

- Counter decrements.
- If counter drops below `rise` from above: `up → down`.
- If state is `unknown`: transition immediately to `down` regardless of counter.
- Next probe at `down-interval` if fully down, `fast-interval` if degraded.

### Pause

```
<any> → paused (operator action)
```

The counter is reset to 0. Probes continue to fire on their normal schedule but
all results are discarded. The backend stays `paused` until explicitly resumed.

### Resume

```
paused → unknown (operator action)
```

The counter is reset to `rise − 1`. The probe goroutine is woken immediately
(no wait for the next scheduled probe). One subsequent pass produces
`unknown → up`; one fail produces `unknown → down`.

### Backend removed (config reload)

```
<any> → removed (code: removed)
```

The probe goroutine stops. No further state changes occur. The removed event is
emitted using the frontend map from before the reload so that consumers can
correlate it to the correct frontend.

### Backend healthcheck config changed (config reload)

The old probe goroutine is stopped (`<any> → removed`) and a new one started
(`unknown → unknown`, code: `start`). The new goroutine resolves state on the
first probe as described under *Backend added* above.

### Backend metadata changed without healthcheck change (config reload)

Weight, enabled flag, and similar fields are updated in place. The probe
goroutine is not restarted and no transition event is emitted.

---

## Log lines

All state changes produce a structured log line at `INFO` level:

```json
{"level":"INFO","msg":"backend-transition","backend":"nginx0-ams","from":"up","to":"paused"}
{"level":"INFO","msg":"backend-transition","backend":"nginx0-ams","from":"paused","to":"unknown"}
{"level":"INFO","msg":"backend-transition","backend":"nginx0-ams","from":"unknown","to":"up","code":"L7OK","detail":""}
```

Probe-driven transitions also carry `code` and `detail` fields from the probe
result (e.g. `L4CON`, `L7STS`, `connection refused`). Operator-driven
transitions (pause, resume) carry empty code and detail.
112
docs/maglevc.1
Normal file
@@ -0,0 +1,112 @@
.TH MAGLEVC 1 "April 2026" "vpp\-maglev" "User Commands"
.SH NAME
maglevc \- Maglev health\-checker CLI client
.SH SYNOPSIS
.B maglevc
[\fB\-server\fR \fIaddr\fR]
[\fB\-color\fR[=\fIbool\fR]]
[\fIcommand\fR [\fIargs\fR...]]
.SH DESCRIPTION
.B maglevc
is an interactive CLI client for
.BR maglevd (8).
Without arguments it opens a readline shell with tab completion and
inline help.
A command may also be passed directly on the command line for one\-shot use,
which is useful for scripting (use
.B \-color=false
to suppress ANSI codes).
.PP
When the shell starts it prints the build version and connects to the
.B maglevd
gRPC server specified by
.BR \-server .
.SH OPTIONS
.TP
.BI \-server " addr"
Address of the
.B maglevd
gRPC server.
(default:
.IR localhost:9090 )
.TP
.BR \-color [=\fIbool\fR]
Colorize static field labels in output using ANSI dark blue.
(default: true)
Pass
.B \-color=false
to disable, e.g.\& when piping output.
.SH COMMANDS
Commands are entered at the
.B maglevc>
prompt or passed as arguments on the command line.
All static tokens support tab completion; dynamic names (frontend, backend,
health\-check names) are completed by querying the server.
Type
.B ?
at any point to list completions without advancing the input.
.SS Show commands
.TP
.B show version
Print build version, commit hash, and build date.
.TP
.B show frontends
List all configured frontends.
.TP
.BI "show frontend " name
Show address, protocol, port, description, and pools (with weights and
disabled\-backend notation) for the named frontend.
.TP
.B show backends
List all active backends.
.TP
.BI "show backend " name
Show address, current health state (with duration), enabled flag,
health\-check name, and recent state transitions with timestamps.
.TP
.B show healthchecks
List all configured health checks.
.TP
.BI "show healthcheck " name
Show the full configuration of the named health check.
.SS Set commands
.TP
.BI "set backend " "name " pause
Pause health checking for a backend, freezing its state.
.TP
.BI "set backend " "name " resume
Resume health checking for a backend; state resets to
.BR unknown .
.SS Shell commands
.TP
.BR quit ", " exit
Exit the interactive shell.
.SH COMPLETION
In interactive mode, press
.B Tab
to complete the current token.
If more than one completion is possible, all candidates are listed.
Type
.B ?
anywhere on the line to list candidates at that position without consuming
the character or advancing the cursor.
.SH EXAMPLES
One\-shot query (no color, suitable for scripts):
.PP
.RS
.EX
maglevc \-color=false show backends
.EE
.RE
.PP
Interactive session:
.PP
.RS
.EX
maglevc \-server 10.0.0.1:9090
.EE
.RE
.SH SEE ALSO
.BR maglevd (8)
.SH AUTHOR
Pim van Pelt <pim@ipng.ch>
85
docs/maglevd.8
Normal file
@@ -0,0 +1,85 @@
.TH MAGLEVD 8 "April 2026" "vpp\-maglev" "System Administration"
.SH NAME
maglevd \- Maglev health\-checker daemon
.SH SYNOPSIS
.B maglevd
[\fB\-config\fR \fIfile\fR]
[\fB\-grpc\-addr\fR \fIaddr\fR]
[\fB\-log\-level\fR \fIlevel\fR]
[\fB\-version\fR]
.SH DESCRIPTION
.B maglevd
is a health\-checker daemon that monitors backends (HTTP, TCP, ICMP) and
exposes their aggregated state via a gRPC API.
Configuration is loaded from a YAML file.
A running daemon reloads its configuration when it receives
.BR SIGHUP .
.PP
Backends are tracked with a rise/fall counter model.
Each backend cycles through the states
.BR unknown ,
.BR up ,
.BR down ,
and
.B paused
(operator\-set).
Health\-check intervals adapt automatically: a faster interval is used when
a backend is not fully healthy, and a slower interval when it has been
continuously down.
.SH OPTIONS
Each flag may also be supplied via an environment variable (shown in
parentheses); the flag takes precedence.
.TP
.BI \-config " file"
Path to the YAML configuration file.
.RI "(default: " /etc/maglev/maglev.yaml "; env: " MAGLEV_CONFIG )
.TP
.BI \-grpc\-addr " addr"
TCP address on which the gRPC server listens.
.RI "(default: " :9090 "; env: " MAGLEV_GRPC_ADDR )
.TP
.BI \-log\-level " level"
Structured\-log verbosity:
.BR debug ,
.BR info ,
.BR warn ,
or
.BR error .
.RI "(default: " info "; env: " MAGLEV_LOG_LEVEL )
.TP
.B \-version
Print version, commit hash, and build date, then exit.
.SH SIGNALS
.TP
.B SIGHUP
Reload the configuration file without restarting.
New backends are added, removed backends are stopped, and unchanged
backend workers are left running.
.TP
.BR SIGTERM ", " SIGINT
Gracefully shut down: drain active gRPC streams, then exit.
.SH FILES
.TP
.I /etc/maglev/maglev.yaml
Default configuration file (YAML).
.TP
.I /etc/default/maglev
Environment file sourced by the systemd unit before starting
.BR maglevd .
.SH CONFIGURATION
The configuration file uses YAML and has four top\-level sections under the
.B maglev
key:
.BR healthchecker ,
.BR healthchecks ,
.BR backends ,
and
.BR frontends .
.PP
See the example at
.I /etc/maglev/maglev.yaml
and the full reference in the project documentation.
.SH SEE ALSO
.BR maglevc (1)
.SH AUTHOR
Pim van Pelt <pim@ipng.ch>
133
docs/user-guide.md
Normal file
@@ -0,0 +1,133 @@
# User Guide

## maglevd

`maglevd` is the health-checker daemon. It probes backends according to the
configuration file, maintains their health state, and exposes a gRPC API for
inspection and control.

### Flags

| Flag | Environment variable | Default | Description |
|---|---|---|---|
| `--config` | `MAGLEV_CONFIG` | `/etc/maglev/maglev.yaml` | Path to the YAML configuration file. |
| `--grpc-addr` | `MAGLEV_GRPC_ADDR` | `:9090` | TCP address on which the gRPC server listens. |
| `--log-level` | `MAGLEV_LOG_LEVEL` | `info` | Log verbosity: `debug`, `info`, `warn`, or `error`. |
| `--version` | — | — | Print version, commit hash, and build date, then exit. |

Flags take precedence over environment variables. Both are optional; defaults
are used for anything not set.
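
One common way to get this precedence with Go's standard `flag` package is to use the environment variable as the flag's default value, so an explicit flag always wins. The sketch below illustrates that pattern only; `envOr` and `parseConfigPath` are hypothetical helpers, and whether `maglevd` is implemented this way is an assumption.

```go
package main

import (
	"flag"
	"fmt"
)

// envOr returns the value of the environment variable key, or def when the
// variable is unset or empty. The lookup function is injected so the sketch
// does not depend on the real process environment.
func envOr(getenv func(string) string, key, def string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return def
}

// parseConfigPath resolves the configuration path with the precedence
// flag > environment variable > built-in default.
func parseConfigPath(args []string, getenv func(string) string) (string, error) {
	fs := flag.NewFlagSet("maglevd", flag.ContinueOnError)
	config := fs.String("config",
		envOr(getenv, "MAGLEV_CONFIG", "/etc/maglev/maglev.yaml"),
		"path to the YAML configuration file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *config, nil
}

func main() {
	env := map[string]string{"MAGLEV_CONFIG": "/tmp/env.yaml"}
	getenv := func(k string) string { return env[k] }
	noenv := func(string) string { return "" }

	fromEnv, _ := parseConfigPath(nil, getenv)
	fromFlag, _ := parseConfigPath([]string{"--config", "/tmp/flag.yaml"}, getenv)
	fromDefault, _ := parseConfigPath(nil, noenv)

	fmt.Println(fromEnv)     // /tmp/env.yaml
	fmt.Println(fromFlag)    // /tmp/flag.yaml
	fmt.Println(fromDefault) // /etc/maglev/maglev.yaml
}
```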

### Signals

| Signal | Effect |
|---|---|
| `SIGHUP` | Reload the configuration file. New backends are started, removed backends are stopped, and backends whose health-check config is unchanged continue probing without interruption. |
| `SIGTERM` / `SIGINT` | Graceful shutdown. Active gRPC streams are closed, the server drains, then the process exits. |

### Capabilities

`maglevd` requires `CAP_NET_RAW` when any health check uses `type: icmp`.
All other check types (`tcp`, `http`) use normal TCP sockets and require no
special capabilities.

### Logging

All log output is written to stdout as JSON using Go's `log/slog`. The first
line logged after the logger is configured is a `starting` record that includes
`version`, `commit`, and `date`. Every state change emits a `backend-transition`
line at `INFO` level. Set `--log-level debug` to see individual probe attempts
and their outcomes.

---

## maglevc

`maglevc` is the interactive control-plane client. It connects to a running
`maglevd` over gRPC and either executes a single command or drops into an
interactive shell.

### Usage

```sh
maglevc [--server host:port] [--color[=bool]] [command...]
```

| Flag | Default | Description |
|---|---|---|
| `--server` | `localhost:9090` | Address of the `maglevd` gRPC server. |
| `--color` | `true` | Colorize static field labels in output (dark blue ANSI). Pass `--color=false` to disable, e.g. when piping. |

When `command` arguments are supplied the command is executed and `maglevc`
exits. When no arguments are given an interactive shell is started and the
build version is printed on entry.

### Commands

```
show version                  Print build version, commit hash, and build date.

show frontends                List all frontend names.
show frontend <name>          Show address, protocol, port, description, and pools.
                              Each pool lists its backends with weights (if != 100)
                              and marks disabled backends with [disabled].

show backends                 List all backend names.
show backend <name>           Show address, current state (with duration in that state),
                              enabled flag, health check, and recent state transitions
                              with timestamps and how long ago each occurred.

show healthchecks             List all health-check names.
show healthcheck <name>       Show full health-check configuration.

set backend <name> pause      Suspend health checking for a backend, freezing its state.
set backend <name> resume     Resume health checking; backend re-enters unknown state
                              and is probed immediately.

quit / exit                   Leave the interactive shell.
```

### Interactive shell

The shell prompt is `maglev> `. Two completion mechanisms are available:

**Tab completion**: pressing `<Tab>` at any point completes the current token.
Fixed keywords (commands and subcommands) are completed from the command tree.
Backend, frontend, and health-check names are fetched live from the server with
a 1-second timeout. If the partial token is unambiguous the word is completed
in place; if multiple candidates exist they are listed and the prompt is
restored.

**Inline help (`?`)**: typing `?` at any point prints the available
completions for the current position, with a short description next to each
keyword. The `?` character is not added to the input line.

Commands and keywords support **prefix matching**: typing `sh v` is equivalent
to `show version` because each prefix matches exactly one keyword. A prefix that
matches several keywords (for example `b` under `show`, which matches both
`backend` and `backends`) is ambiguous and is not expanded. Exact matches always
take priority over prefix matches, so `show backend` and `show backends` are
unambiguous even though one is a prefix of the other.

### Command tree and parser

Commands form a tree of `Node` values. Each node has a fixed `Word` (a keyword)
or is a *slot node* (marked by a `Dynamic` function that enumerates valid
values at completion time). The parser (`Walk`) descends the tree token by
token:

1. Try to match the current token against the fixed-keyword children of the
   current node (exact match first, then unique prefix match).
2. If no fixed child matches, try a slot child: any token is accepted and
   stored as an argument.
3. Stop when tokens are exhausted or no match is found.

The leaf node reached by `Walk` must have a `Run` function; otherwise the
available sub-commands at that position are printed as help. Arguments
collected from slot nodes are passed to `Run` as a slice.

Example walk for `set backend nginx0-ams pause`:

```
root → set → backend → <name>(nginx0-ams collected as arg) → pause
```

`pause.Run` is called with `args = ["nginx0-ams"]`.