VPP LB counters, src-ip-sticky, and frontend state aggregation
New feature: per-VIP / per-backend runtime counters
* New GetVPPLBCounters RPC serving an in-process snapshot refreshed
by a 5s scrape loop (internal/vpp/lbstats.go). Each cycle pulls
the LB plugin's four SimpleCounters (next, first, untracked,
no-server) plus the FIB /net/route/to CombinedCounter for every
VIP and every backend host prefix via a single DumpStats call.
* FIB stats-index discovery via ip_route_lookup (internal/vpp/
fibstats.go); per-worker reduction happens in the collector.
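VPP stats-segment counters arrive as one row per worker thread, so the per-worker reduction mentioned above is a plain element-wise sum. A minimal sketch (the [][]uint64 shape mirrors a scraped SimpleCounter; the function name here is illustrative, not the collector's actual API):

```go
package main

import "fmt"

// sumWorkers reduces a per-worker counter matrix (one row per VPP worker,
// one column per counter index) into per-index totals. This mirrors the
// per-worker reduction the collector performs on scraped SimpleCounters.
func sumWorkers(perWorker [][]uint64) []uint64 {
	if len(perWorker) == 0 {
		return nil
	}
	out := make([]uint64, len(perWorker[0]))
	for _, row := range perWorker {
		for i, v := range row {
			out[i] += v
		}
	}
	return out
}

func main() {
	// Two workers, three VIP indices.
	fmt.Println(sumWorkers([][]uint64{{1, 2, 3}, {10, 20, 30}})) // [11 22 33]
}
```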
* Prometheus collector exports vip_packets_total (kind label),
vip_route_{packets,bytes}_total, and backend_route_{packets,
bytes}_total. Metrics source interface extended with VIPStats /
BackendRouteStats; vpp.Client publishes snapshots via
atomic.Pointer and clears them on disconnect.
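The atomic.Pointer publish/clear pattern is small enough to sketch in full; the Snapshot and Client shapes below are illustrative stand-ins, not the real internal/vpp types:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Snapshot is an immutable, point-in-time view of scraped counters.
// The field here is invented for illustration; the real struct carries
// per-VIP and per-backend counter data.
type Snapshot struct {
	VIPPackets map[string]uint64
}

// Client publishes snapshots lock-free: readers always see either the
// previous complete snapshot or the new one, never a partial update.
type Client struct {
	snap atomic.Pointer[Snapshot]
}

// publish atomically replaces the current snapshot.
func (c *Client) publish(s *Snapshot) { c.snap.Store(s) }

// clear drops the snapshot on disconnect so readers see "no data"
// rather than stale counters.
func (c *Client) clear() { c.snap.Store(nil) }

// VIPStats returns the packet count for a VIP, and false when no
// snapshot is currently published.
func (c *Client) VIPStats(vip string) (uint64, bool) {
	s := c.snap.Load()
	if s == nil {
		return 0, false
	}
	n, ok := s.VIPPackets[vip]
	return n, ok
}

func main() {
	var c Client
	c.publish(&Snapshot{VIPPackets: map[string]uint64{"192.0.2.1": 42}})
	n, ok := c.VIPStats("192.0.2.1")
	fmt.Println(n, ok) // 42 true
}
```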
* New 'show vpp lb counters' CLI command. The 'show vpp lbstate'
and 'sync vpp lbstate' commands are restructured under 'show
vpp lb {state,counters}' / 'sync vpp lb state' to make room
for the new verb.
New feature: src-ip-sticky frontends
* New frontend YAML key 'src-ip-sticky' (bool). Plumbed through
config.Frontend, desiredVIP, and the lb_add_del_vip_v2 call.
* Reflected in gRPC FrontendInfo.src_ip_sticky and VPPLBVIP.
src_ip_sticky, and shown in 'show vpp lb state' output.
* Scraped back from VPP by parsing 'show lb vips verbose' through
cli_inband — lb_vip_details does not expose the flag. The same
scrape also recovers the LB pool index for each VIP, which the
stats-segment counters are keyed on. This is a documented
temporary workaround until VPP ships an lb_vip_v2_dump.
* src_ip_sticky cannot be mutated on a live VIP, so a flipped flag
triggers a tear-down-and-recreate in reconcileVIP (ASes deleted
with flush, VIP deleted, then re-added). Flip is logged.
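The decision itself is a one-field comparison; a hypothetical sketch (the VIP type and needsRecreate helper are invented for illustration, not the reconcileVIP internals):

```go
package main

import "fmt"

// VIP is a hypothetical stand-in for the desired/actual VIP views that
// the reconciler compares; only the field relevant here is shown.
type VIP struct {
	SrcIPSticky bool
}

// needsRecreate reports whether a VIP must be torn down and re-added
// rather than updated in place: VPP's lb plugin cannot mutate
// src_ip_sticky on a live VIP, so a flipped flag forces recreation.
func needsRecreate(desired, actual VIP) bool {
	return desired.SrcIPSticky != actual.SrcIPSticky
}

func main() {
	fmt.Println(needsRecreate(VIP{SrcIPSticky: true}, VIP{SrcIPSticky: false})) // true
}
```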
New feature: frontend state aggregation and events
* New health.FrontendState (unknown/up/down) and FrontendTransition
types. A frontend is 'up' iff at least one backend has a nonzero
effective weight, 'unknown' iff no backend has real state yet,
and 'down' otherwise.
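A minimal sketch of that aggregation rule (the Backend and FrontendState types here are invented stand-ins; the real ComputeFrontendState in internal/health operates on checker state and may have a different signature):

```go
package main

import "fmt"

// FrontendState mirrors the tri-state described above.
type FrontendState int

const (
	FrontendUnknown FrontendState = iota
	FrontendUp
	FrontendDown
)

// Backend is a minimal stand-in: HasState reports whether the checker
// has produced a real up/down verdict yet, and EffectiveWeight is the
// post-failover weight that would be programmed into VPP.
type Backend struct {
	HasState        bool
	EffectiveWeight uint32
}

// ComputeFrontendState applies the aggregation rule: 'up' iff at least
// one backend has a nonzero effective weight, 'unknown' iff no backend
// has real state yet, and 'down' otherwise.
func ComputeFrontendState(backends []Backend) FrontendState {
	anyState := false
	for _, b := range backends {
		if b.EffectiveWeight > 0 {
			return FrontendUp
		}
		if b.HasState {
			anyState = true
		}
	}
	if !anyState {
		return FrontendUnknown
	}
	return FrontendDown
}

func main() {
	backends := []Backend{
		{HasState: true, EffectiveWeight: 0},
		{HasState: true, EffectiveWeight: 10},
	}
	fmt.Println(ComputeFrontendState(backends) == FrontendUp) // true
}
```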
* Checker tracks per-frontend aggregate state, recomputing after
each backend transition and emitting a frontend-transition Event
on change. Reload drops entries for removed frontends.
* checker.Event gains an optional FrontendTransition pointer;
backend- vs. frontend-transition events are demultiplexed on
that field.
* WatchEvents now sends an initial snapshot of frontend state on
connect (mirroring the existing backend snapshot), subscribes
once to the checker stream, and fans out to backend/frontend
handlers based on the client's filter flags. The proto
FrontendEvent message grows name + transition fields.
* New Checker.FrontendState accessor.
Refactor: pure health helpers
* Moved the priority-failover selector and the (pool idx, active
pool, state, cfg weight) → (vpp weight, flush) mapping out of
internal/vpp/lbsync.go into a new internal/health/weights.go so
the checker can reuse them for frontend-state computation
without importing internal/vpp.
* New functions: health.ActivePoolIndex, BackendEffectiveWeight,
EffectiveWeights, ComputeFrontendState. lbsync.go now calls
these directly; vpp.EffectiveWeights is a thin wrapper over
health.EffectiveWeights retained for the gRPC observability
path. Fully unit-tested in internal/health/weights_test.go.
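The priority-failover selector has a one-loop shape; the Pool type below is a hypothetical stand-in for the checker state that health.ActivePoolIndex actually inspects, assuming first-healthy-tier-wins semantics as described in the pool-failover docs:

```go
package main

import "fmt"

// Pool is a minimal stand-in: Up counts backends currently considered
// healthy in this tier.
type Pool struct {
	Name string
	Up   int
}

// ActivePoolIndex sketches the priority-failover selector: pools are
// tried in declaration order and the first tier with at least one
// healthy backend wins; -1 means every tier is down.
func ActivePoolIndex(pools []Pool) int {
	for i, p := range pools {
		if p.Up > 0 {
			return i
		}
	}
	return -1
}

func main() {
	pools := []Pool{{Name: "primary", Up: 0}, {Name: "fallback", Up: 2}}
	fmt.Println(ActivePoolIndex(pools)) // 1
}
```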
maglevc polish
* --color default is now mode-aware: on in the interactive shell,
off in one-shot mode so piped output is script-safe. Explicit
--color=true/false still overrides.
* New stripHostMask helper drops /32 and /128 from VIP display;
non-host prefixes pass through unchanged.
* Counter table column order fixed (first before next) and
packets/bytes columns renamed to fib-packets/fib-bytes to
clarify they come from the FIB, not the LB plugin.
Docs
* config-guide: document src-ip-sticky, including the VIP
recreate-on-change caveat.
* user-guide, maglevc.1, maglevd.8: updated command tree, new
counters command, color defaults, and the src-ip-sticky field.
@@ -71,13 +71,19 @@ func buildTree() *Node {
 		Children: []*Node{showHealthCheckName},
 	}
 
-	// show vpp info / lbstate
+	// show vpp info / lb state / lb counters
 	showVPPInfo := &Node{Word: "info", Help: "Show VPP version, uptime, and connection status", Run: runShowVPPInfo}
-	showVPPLBState := &Node{Word: "lbstate", Help: "Show VPP load-balancer state (VIPs and application servers)", Run: runShowVPPLBState}
+	showVPPLBState := &Node{Word: "state", Help: "Show VPP load-balancer state (VIPs and application servers)", Run: runShowVPPLBState}
+	showVPPLBCounters := &Node{Word: "counters", Help: "Show VPP per-VIP and per-backend packet/byte counters (refreshed every ~5s server-side)", Run: runShowVPPLBCounters}
+	showVPPLB := &Node{
+		Word:     "lb",
+		Help:     "VPP load-balancer information",
+		Children: []*Node{showVPPLBState, showVPPLBCounters},
+	}
 	showVPP := &Node{
 		Word:     "vpp",
 		Help:     "VPP dataplane information",
-		Children: []*Node{showVPPInfo, showVPPLBState},
+		Children: []*Node{showVPPInfo, showVPPLB},
 	}
 
 	show.Children = []*Node{
@@ -175,7 +181,7 @@ func buildTree() *Node {
 		Children: []*Node{configCheck, configReload},
 	}
 
-	// sync vpp lbstate [<name>]
+	// sync vpp lb state [<name>]
 	//
 	// Without a name: run SyncLBStateAll (may remove stale VIPs).
 	// With a name: run SyncLBStateVIP(name) for just that frontend (no removals).
@@ -186,15 +192,20 @@ func buildTree() *Node {
 		Run:      runSyncVPPLBState,
 	}
 	syncVPPLBState := &Node{
-		Word:     "lbstate",
+		Word:     "state",
 		Help:     "Sync the VPP load-balancer dataplane from the running config",
 		Run:      runSyncVPPLBState,
 		Children: []*Node{syncVPPLBStateName},
 	}
+	syncVPPLB := &Node{
+		Word:     "lb",
+		Help:     "VPP load-balancer sync commands",
+		Children: []*Node{syncVPPLBState},
+	}
 	syncVPP := &Node{
 		Word:     "vpp",
 		Help:     "VPP dataplane sync commands",
-		Children: []*Node{syncVPPLBState},
+		Children: []*Node{syncVPPLB},
 	}
 	syncNode := &Node{
 		Word: "sync",
@@ -295,10 +306,11 @@ func runShowVPPLBState(ctx context.Context, client grpcapi.MaglevClient, _ []str
 	for _, v := range state.Vips {
 		fmt.Println()
 		w := tabwriter.NewWriter(os.Stdout, 0, 0, 2, ' ', 0)
-		fmt.Fprintf(w, "%s\t%s\n", label("vip"), v.Prefix)
+		fmt.Fprintf(w, "%s\t%s\n", label("vip"), stripHostMask(v.Prefix))
 		fmt.Fprintf(w, " %s\t%s\n", label("protocol"), protoString(v.Protocol))
 		fmt.Fprintf(w, " %s\t%d\n", label("port"), v.Port)
 		fmt.Fprintf(w, " %s\t%s\n", label("encap"), v.Encap)
+		fmt.Fprintf(w, " %s\t%t\n", label("src-ip-sticky"), v.SrcIpSticky)
 		fmt.Fprintf(w, " %s\t%d\n", label("flow-table-length"), v.FlowTableLength)
 		fmt.Fprintf(w, " %s\t%d\n", label("application-servers"), len(v.ApplicationServers))
 		if err := w.Flush(); err != nil {
@@ -314,6 +326,80 @@ func runShowVPPLBState(ctx context.Context, client grpcapi.MaglevClient, _ []str
 	return nil
 }
 
+// runShowVPPLBCounters prints the per-VIP and per-backend runtime counters
+// captured by maglevd's 5s scrape loop. Values are up to 5 seconds stale;
+// Prometheus is the right tool if you need live rates.
+func runShowVPPLBCounters(ctx context.Context, client grpcapi.MaglevClient, _ []string) error {
+	ctx, cancel := context.WithTimeout(ctx, callTimeout)
+	defer cancel()
+	resp, err := client.GetVPPLBCounters(ctx, &grpcapi.GetVPPLBCountersRequest{})
+	if err != nil {
+		return err
+	}
+
+	if len(resp.Vips) == 0 && len(resp.Backends) == 0 {
+		fmt.Println("(no counters — VPP disconnected or scrape pending)")
+		return nil
+	}
+
+	// ---- frontend-counters ----
+	fmt.Println(label("frontend-counters"))
+	if len(resp.Vips) == 0 {
+		fmt.Println(" (none)")
+	} else {
+		w := tabwriter.NewWriter(os.Stdout, 0, 0, 2, ' ', 0)
+		fmt.Fprintf(w, " %s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n",
+			label("vip"), label("proto"), label("port"),
+			label("first"), label("next"),
+			label("untracked"), label("no-server"),
+			label("fib-packets"), label("fib-bytes"),
+		)
+		for _, v := range resp.Vips {
+			fmt.Fprintf(w, " %s\t%s\t%d\t%d\t%d\t%d\t%d\t%d\t%d\n",
+				stripHostMask(v.Prefix), v.Protocol, v.Port,
+				v.FirstPacket, v.NextPacket,
+				v.UntrackedPacket, v.NoServer,
+				v.Packets, v.Bytes,
+			)
+		}
+		if err := w.Flush(); err != nil {
+			return err
+		}
+	}
+
+	fmt.Println()
+
+	// ---- backend-counters ----
+	fmt.Println(label("backend-counters"))
+	if len(resp.Backends) == 0 {
+		fmt.Println(" (none)")
+		return nil
+	}
+	w := tabwriter.NewWriter(os.Stdout, 0, 0, 2, ' ', 0)
+	fmt.Fprintf(w, " %s\t%s\t%s\t%s\n",
+		label("backend"), label("address"),
+		label("fib-packets"), label("fib-bytes"),
+	)
+	for _, b := range resp.Backends {
+		fmt.Fprintf(w, " %s\t%s\t%d\t%d\n",
+			b.Backend, b.Address, b.Packets, b.Bytes,
+		)
+	}
+	return w.Flush()
+}
+
+// stripHostMask trims "/32" (IPv4) or "/128" (IPv6) from a VIP's CIDR
+// string. maglevd only programs host-prefix VIPs so the mask is always
+// one of these two values and carries no information for a human reader.
+// Non-host prefixes and unparseable strings are returned unchanged so
+// future changes don't silently lose data.
+func stripHostMask(prefix string) string {
+	if strings.HasSuffix(prefix, "/32") || strings.HasSuffix(prefix, "/128") {
+		return prefix[:strings.LastIndexByte(prefix, '/')]
+	}
+	return prefix
+}
+
 // protoString renders an IP protocol number as a name (tcp, udp, any, or numeric).
 func protoString(p uint32) string {
 	switch p {
@@ -385,6 +471,7 @@ func runShowFrontend(ctx context.Context, client grpcapi.MaglevClient, args []st
 	fmt.Fprintf(w, "%s\t%s\n", label("address"), info.Address)
 	fmt.Fprintf(w, "%s\t%s\n", label("protocol"), info.Protocol)
 	fmt.Fprintf(w, "%s\t%d\n", label("port"), info.Port)
+	fmt.Fprintf(w, "%s\t%t\n", label("src-ip-sticky"), info.SrcIpSticky)
 	if info.Description != "" {
 		fmt.Fprintf(w, "%s\t%s\n", label("description"), info.Description)
 	}
@@ -25,9 +25,18 @@ func main() {
 
 func run() error {
 	serverAddr := flag.String("server", "localhost:9090", "maglev server address")
-	color := flag.Bool("color", true, "colorize static labels in output")
+	color := flag.Bool("color", true, "colorize static labels in output (defaults to false in one-shot mode)")
 	flag.Parse()
-	colorEnabled = *color
+
+	// Detect whether -color was explicitly set so we can pick a
+	// mode-aware default: color is useful in the interactive shell but
+	// noise (ANSI escapes) when piping one-shot output into scripts.
+	colorExplicit := false
+	flag.Visit(func(f *flag.Flag) {
+		if f.Name == "color" {
+			colorExplicit = true
+		}
+	})
 
 	conn, err := grpc.NewClient(*serverAddr,
 		grpc.WithTransportCredentials(insecure.NewCredentials()))
@@ -41,13 +50,25 @@ func run() error {
 
 	args := flag.Args()
 	if len(args) == 0 {
-		// Interactive shell: announce version on startup.
+		// Interactive shell: color defaults to true.
+		if colorExplicit {
+			colorEnabled = *color
+		} else {
+			colorEnabled = true
+		}
 		fmt.Printf("maglevc %s (commit %s, built %s)\n",
 			buildinfo.Version(), buildinfo.Commit(), buildinfo.Date())
 		return runShell(ctx, client)
 	}
 
-	// One-shot command from CLI arguments.
+	// One-shot command from CLI arguments: color defaults to false so
+	// output is script-safe. Operators wanting color can still pass
+	// -color=true explicitly.
+	if colorExplicit {
+		colorEnabled = *color
+	} else {
+		colorEnabled = false
+	}
 	root := buildTree()
 	tokens := splitTokens(strings.Join(args, " "))
 	return dispatch(ctx, root, client, tokens)
@@ -29,10 +29,11 @@ func TestExpandPathsRoot(t *testing.T) {
 		"watch events <opt>",
 		"config check",
 		"show vpp info",
-		"show vpp lbstate",
+		"show vpp lb state",
+		"show vpp lb counters",
 		"config reload",
-		"sync vpp lbstate",
-		"sync vpp lbstate <name>",
+		"sync vpp lb state",
+		"sync vpp lb state <name>",
 		"quit",
 		"exit",
 	}
@@ -63,9 +64,10 @@ func TestExpandPathsShow(t *testing.T) {
 		}
 	}
 	// version, frontends, frontends <name>, backends, backends <name>,
-	// healthchecks, healthchecks <name>, vpp info, vpp lb = 9 lines
-	if len(lines) != 9 {
-		t.Errorf("expected exactly 9 show subcommands, got %d", len(lines))
+	// healthchecks, healthchecks <name>, vpp info, vpp lb state,
+	// vpp lb counters = 10 lines
+	if len(lines) != 10 {
+		t.Errorf("expected exactly 10 show subcommands, got %d", len(lines))
 	}
 }
 
@@ -288,6 +288,15 @@ ordered list of backend pools. The gRPC API exposes frontends by name.
   cascades across further tiers. See [healthchecks.md](healthchecks.md#pool-failover)
   for the full failover semantics. All backends across all pools in a frontend must
   have addresses of the same address family (all IPv4 or all IPv6).
+* ***src-ip-sticky***: Boolean, default `false`. When `true`, the VPP load-balancer
+  programs this VIP with source-IP-based stickiness — all flows from the same client
+  source IP hash to the same backend (subject to the Maglev consistent-hash bucket
+  assignment). Use this for protocols that require session affinity at the L3 level,
+  or when clients open many short flows that should land on one backend. Changing this
+  field in a running config and reloading causes maglevd to tear down the VIP (all
+  application servers are deleted with flush, then the VIP itself is deleted) and
+  recreate it with the new value; VPP has no API to mutate `src_ip_sticky` on an
+  existing VIP, and existing flow state cannot be preserved across the flip.
+
 Each pool has:
 
@@ -324,6 +333,7 @@ frontends:
     address: 2001:db8::1
     protocol: tcp
    port: 993
+    src-ip-sticky: true
     pools:
       - name: primary
         backends:
--- a/docs/maglevc.1
+++ b/docs/maglevc.1
@@ -12,15 +12,21 @@ is an interactive CLI client for
 .BR maglevd (8).
 Without arguments it opens a readline shell with tab completion and
 inline help.
-A command may also be passed directly on the command line for one\-shot use,
-which is useful for scripting (use
-.B \-color=false
-to suppress ANSI codes).
+A command may also be passed directly on the command line for one\-shot
+use, which is useful for scripting; in that mode ANSI color is
+disabled by default so the output is script\-safe. Pass
+.B \-color=true
+explicitly if you want color in one\-shot mode.
 .PP
 When the shell starts it prints the build version and connects to the
 .B maglevd
 gRPC server specified by
 .BR \-server .
+All static tokens support tab completion; dynamic names (frontend,
+backend, health\-check names) are completed by querying the server.
+Type
+.B ?
+at any point to list completions without advancing the input.
 .SH OPTIONS
 .TP
 .BI \-server " addr"
@@ -31,81 +37,59 @@ gRPC server.
 .IR localhost:9090 )
 .TP
 .BR \-color [=\fIbool\fR]
-Colorize static field labels in output using ANSI dark blue.
-(default: true)
-Pass
+Colorize static field labels in output using ANSI dark blue. The
+default is mode\-aware: enabled (true) in the interactive shell, and
+disabled (false) in one\-shot mode so that output piped into scripts
+or files stays free of escape codes. Pass
+.B \-color=true
+or
 .B \-color=false
-to disable, e.g.\& when piping output.
-.SH COMMANDS
-Commands are entered at the
-.B maglevc>
-prompt or passed as arguments on the command line.
-All static tokens support tab completion; dynamic names (frontend, backend,
-health\-check names) are completed by querying the server.
-Type
-.B ?
-at any point to list completions without advancing the input.
-.SS Show commands
-.TP
-.B show version
-Print build version, commit hash, and build date.
-.TP
-.B show frontends
-List all configured frontends.
-.TP
-.BI "show frontend " name
-Show address, protocol, port, description, and pools (with weights and
-disabled\-backend notation) for the named frontend.
-.TP
-.B show backends
-List all active backends.
-.TP
-.BI "show backend " name
-Show address, current health state (with duration), enabled flag,
-health\-check name, and recent state transitions with timestamps.
-.TP
-.B show healthchecks
-List all configured health checks.
-.TP
-.BI "show healthcheck " name
-Show the full configuration of the named health check.
-.SS Set commands
-.TP
-.BI "set backend " "name " pause
-Pause health checking for a backend, freezing its state.
-.TP
-.BI "set backend " "name " resume
-Resume health checking for a backend; state resets to
-.BR unknown .
-.SS Shell commands
-.TP
-.BR quit ", " exit
-Exit the interactive shell.
-.SH COMPLETION
-In interactive mode, press
-.B Tab
-to complete the current token.
-If more than one completion is possible, all candidates are listed.
-Type
-.B ?
-anywhere on the line to list candidates at that position without consuming
-the character or advancing the cursor.
+explicitly to override the default for either mode.
 .SH EXAMPLES
-One\-shot query (no color, suitable for scripts):
+Open the interactive shell (no command on the command line). Tab
+completes the current token; typing
+.B ?
+lists candidates at the cursor.
+.B quit
+or
+.B exit
+(or Ctrl\-D) leaves the shell:
 .PP
 .RS
 .EX
-maglevc \-color=false show backends
+$ maglevc
+maglevc> show frontends
+\&...
+maglevc> quit
 .EE
 .RE
 .PP
-Interactive session:
+One\-shot query passed on the command line (color is off by default
+in this mode so the output is script\-safe):
 .PP
 .RS
 .EX
-maglevc \-server 10.0.0.1:9090
+$ maglevc show frontends
 .EE
 .RE
+.PP
+Query VPP version and connection status, forcing color on:
+.PP
+.RS
+.EX
+$ maglevc \-color=true show vpp info
+.EE
+.RE
+.SH "FULL DOCUMENTATION"
+This manpage documents only the invocation of
+.BR maglevc .
+For the complete command reference — every
+.BR show ", " set ", " sync ", " config ", and " watch
+command, with examples and operational notes — see the user guide at:
+.PP
+.RS
+https://git.ipng.ch/ipng/vpp-maglev/docs/user-guide.md
+.RE
 .SH SEE ALSO
 .BR maglevd (8)
 .SH AUTHOR
--- a/docs/maglevd.8
+++ b/docs/maglevd.8
@@ -1,6 +1,6 @@
 .TH MAGLEVD 8 "April 2026" "vpp\-maglev" "System Administration"
 .SH NAME
-maglevd \- Maglev health\-checker daemon
+maglevd \- Maglev health\-checker daemon and VPP load\-balancer controller
 .SH SYNOPSIS
 .B maglevd
 [\fB\-config\fR \fIfile\fR]
@@ -10,23 +10,44 @@ maglevd \- Maglev health\-checker daemon
 [\fB\-version\fR]
 .SH DESCRIPTION
 .B maglevd
-is a health\-checker daemon that monitors backends (HTTP, TCP, ICMP) and
-exposes their aggregated state via a gRPC API.
-Configuration is loaded from a YAML file.
-A running daemon reloads its configuration when it receives
-.BR SIGHUP .
+has two responsibilities:
+.IP \(bu 2
+It monitors backends with active health checks (HTTP, TCP, ICMP) and
+aggregates the results into a state machine per backend. Probe
+intervals adapt automatically: a faster interval is used while a
+backend is degraded, and a slower one once it has been continuously
+down. Operators can also
+.B pause/resume
+or
+.B disable/enable
+a backend out of band.
+.IP \(bu 2
+It programs the VPP dataplane via the
+.B lb
+(load\-balancer) plugin. Each frontend in the config becomes a VPP
+load\-balancer VIP; each healthy backend becomes an application server
+under its frontend's VIPs. State transitions drive
+application\-server weight updates over the VPP binary API so that
+unhealthy backends stop receiving new flows. Drift between the
+running config and VPP's view is reconciled periodically
+.RB ( maglev.vpp.lb.sync-interval ,
+default 30s), on
+.B SIGHUP
+reloads, and on operator request via
+.BR maglevc .
 .PP
-Backends are tracked with a rise/fall counter model.
-Each backend cycles through the states
-.BR unknown ,
-.BR up ,
-.BR down ,
-and
-.B paused
-(operator\-set).
-Health\-check intervals adapt automatically: a faster interval is used when
-a backend is not fully healthy, and a slower interval when it has been
-continuously down.
+The aggregated backend state, VPP dataplane state, and per\-VIP /
+per\-backend stats\-segment counters are exposed via a gRPC API (and
+scraped into Prometheus when the
+.B /metrics
+endpoint is enabled).
+See
+.BR maglevc (1)
+for the interactive CLI client.
+.PP
+Configuration is loaded from a YAML file. A running daemon reloads
+its configuration when it receives
+.BR SIGHUP .
 .SH OPTIONS
 Each flag may also be supplied via an environment variable (shown in
 parentheses); the flag takes precedence.
@@ -63,12 +84,16 @@ Print version, commit hash, and build date, then exit.
 .SH SIGNALS
 .TP
 .B SIGHUP
-Reload the configuration file without restarting.
-New backends are added, removed backends are stopped, and unchanged
-backend workers are left running.
+Reload the configuration file without restarting. New backends are
+added, removed backends are stopped, and unchanged backend workers
+keep running. After the reload, the VPP dataplane is reconciled so
+added/removed frontends and application servers are programmed
+immediately.
 .TP
 .BR SIGTERM ", " SIGINT
-Gracefully shut down: drain active gRPC streams, then exit.
+Gracefully shut down: drain active gRPC streams, then exit. VPP
+dataplane state is left in place so that existing VIPs continue to
+forward traffic during a restart.
 .SH FILES
 .TP
 .I /etc/vpp-maglev/maglev.yaml
@@ -77,20 +102,34 @@ Default configuration file (YAML).
 .I /etc/default/vpp-maglev
 Environment file sourced by the systemd unit before starting
 .BR maglevd .
-.SH CONFIGURATION
-The configuration file uses YAML and has four top\-level sections under the
-.B maglev
-key:
-.BR healthchecker ,
-.BR healthchecks ,
-.BR backends ,
-and
-.BR frontends .
+.TP
+.I /run/vpp/api.sock
+VPP binary\-API socket
+.BR maglevd
+connects to when programming the
+.B lb
+plugin.
+.TP
+.I /run/vpp/stats.sock
+VPP stats\-segment socket used to scrape per\-VIP and per\-backend
+packet/byte counters.
+.SH "FULL DOCUMENTATION"
+This manpage documents only the invocation of
+.BR maglevd .
+For the configuration file reference (including the
+.B maglev.vpp.lb
+section controlling the VPP integration), the health\-check state
+machine, and the full operational guide, see:
 .PP
-See the example at
-.I /etc/vpp-maglev/maglev.yaml
-and the full reference in the project documentation.
+.RS
+https://git.ipng.ch/ipng/vpp-maglev/docs/config-guide.md
+.br
+https://git.ipng.ch/ipng/vpp-maglev/docs/healthchecks.md
+.br
+https://git.ipng.ch/ipng/vpp-maglev/docs/user-guide.md
+.RE
 .SH SEE ALSO
-.BR maglevc (1)
+.BR maglevc (1),
+.BR vpp (8)
 .SH AUTHOR
 Pim van Pelt <pim@ipng.ch>
|
|||||||
@@ -100,11 +100,12 @@ maglevc [--server host:port] [--color[=bool]] [command...]
 | Flag | Default | Description |
 |---|---|---|
 | `--server` | `localhost:9090` | Address of the `maglevd` gRPC server. |
-| `--color` | `true` | Colorize static field labels in output (dark blue ANSI). Pass `--color=false` to disable, e.g. when piping. |
+| `--color` | mode-aware | Colorize static field labels (dark blue ANSI). Defaults to `true` in the interactive shell and `false` in one-shot mode, so output piped into scripts stays free of escape codes. Pass `--color=true` or `--color=false` explicitly to override either default. |
 
 When `command` arguments are supplied the command is executed and `maglevc`
-exits. When no arguments are given an interactive shell is started and the
-build version is printed on entry.
+exits; in this mode ANSI color is off by default so the output is script-safe.
+When no arguments are given an interactive shell is started, the build version
+is printed on entry, and color is on by default.
 
 ### Commands
 
@@ -112,9 +113,9 @@ build version is printed on entry.
 show version            Print build version, commit hash, and build date.
 
 show frontends [<name>] Without name: list all frontend names.
-                        With name: show address, protocol, port, description,
-                        and pools. Each pool lists its backends with two
-                        weight columns:
+                        With name: show address, protocol, port, src-ip-sticky,
+                        description, and pools. Each pool lists its backends
+                        with two weight columns:
                           weight    — configured weight from the YAML
                           effective — state-aware weight after pool failover
                                       (what gets programmed into VPP)
@@ -131,12 +132,20 @@ show healthchecks [<name>] Without name: list all health-check names.
 show vpp info           Show VPP version, build date, PID, uptime, and when
                         maglevd connected. Returns an error if VPP is not
                         connected.
-show vpp lbstate        Show the VPP load-balancer plugin state: global
+show vpp lb state       Show the VPP load-balancer plugin state: global
                         configuration, configured VIPs, and their attached
                         application servers (address, weight, bucket count).
                         Returns an error if VPP is not connected.
+show vpp lb counters    Show per-VIP and per-backend packet/byte counters
+                        from the VPP stats segment, refreshed roughly every
+                        five seconds by maglevd. Each VIP row reports the LB
+                        plugin counters (next, first, untracked, no-server)
+                        and the FIB packets/bytes at the VIP's host prefix.
+                        Each backend row reports FIB packets/bytes at the
+                        backend's /32 or /128 prefix. Use Prometheus for
+                        live rates; this command shows absolute values.
 
-sync vpp lbstate [<name>]   Reconcile the VPP load-balancer dataplane from the
+sync vpp lb state [<name>]  Reconcile the VPP load-balancer dataplane from the
                         running config. Without a name: runs a full sync —
                         creates missing VIPs, removes stale VIPs, and adjusts
                         application-server membership and weights across all
@@ -23,13 +23,26 @@ type BackendSnapshot struct {
 	Config config.Backend
 }
 
-// Event is emitted on every backend state transition, once per frontend that
-// references the backend.
+// Event is emitted on every state transition the checker observes. There are
+// two kinds, distinguished by which of BackendName or FrontendTransition is
+// populated:
+//
+//   - Backend transition: FrontendName is the frontend that references the
+//     backend (one event per frontend per backend transition), BackendName
+//     and Backend are set, and Transition carries the health.Transition.
+//     FrontendTransition is nil.
+//   - Frontend transition: FrontendName is the frontend whose aggregate state
+//     changed, FrontendTransition is non-nil. BackendName and Backend are
+//     empty, Transition is the zero value.
+//
+// Consumers dispatch on FrontendTransition != nil.
 type Event struct {
 	FrontendName string
 	BackendName  string
 	Backend      net.IP
 	Transition   health.Transition
+
+	FrontendTransition *health.FrontendTransition
 }
 
 type worker struct {
@@ -49,6 +62,13 @@ type Checker struct {
 	mu      sync.RWMutex
 	workers map[string]*worker // keyed by backend name
 
+	// frontendStates tracks the aggregated state of every configured frontend
+	// (unknown/up/down). Updated whenever a backend transition happens; a
+	// change emits a frontend-transition Event. The zero value for a missing
+	// key is FrontendStateUnknown, so initial-reference accesses behave
+	// correctly even without explicit seeding.
+	frontendStates map[string]health.FrontendState
+
 	subsMu sync.Mutex
 	nextID int
 	subs   map[int]chan Event
@@ -60,6 +80,7 @@ func New(cfg *config.Config) *Checker {
 	return &Checker{
 		cfg:     cfg,
 		workers: make(map[string]*worker),
+		frontendStates: make(map[string]health.FrontendState),
 		subs:    make(map[int]chan Event),
 		eventCh: make(chan Event, 256),
 	}
@@ -131,6 +152,13 @@ func (c *Checker) Reload(ctx context.Context, cfg *config.Config) error {
 		c.emitForBackend(name, c.workers[name].backend.Address, c.workers[name].backend.Transitions[0], cfg.Frontends)
 	}
 
+	// Drop frontendStates entries for frontends no longer in config.
+	for feName := range c.frontendStates {
+		if _, ok := cfg.Frontends[feName]; !ok {
+			delete(c.frontendStates, feName)
+		}
+	}
+
 	c.cfg = cfg
 	return nil
 }
@@ -174,6 +202,18 @@ func (c *Checker) BackendState(name string) (health.State, bool) {
 	return w.backend.State, true
 }
 
+// FrontendState returns the current aggregate state of a frontend (unknown,
+// up, or down). Returns (FrontendStateUnknown, false) when the frontend is
+// not known to the checker.
+func (c *Checker) FrontendState(name string) (health.FrontendState, bool) {
+	c.mu.RLock()
+	defer c.mu.RUnlock()
+	if _, ok := c.cfg.Frontends[name]; !ok {
+		return health.FrontendStateUnknown, false
+	}
+	return c.frontendStates[name], true
+}
+
 // ListFrontends returns the names of all configured frontends.
 func (c *Checker) ListFrontends() []string {
 	c.mu.RLock()
@@ -575,24 +615,60 @@ func (c *Checker) runProbe(ctx context.Context, name string, pos, total int) {
 	}
 }
 
-// emitForBackend emits one Event per frontend that references backendName
-// (in any pool), using the provided frontends map. Must be called with c.mu held.
+// emitForBackend emits one backend-transition Event per frontend that
+// references backendName (in any pool), using the provided frontends map.
+// After emitting the backend event for a frontend, it also re-computes that
+// frontend's aggregate state and emits a frontend-transition Event if the
+// state has changed. Must be called with c.mu held.
 func (c *Checker) emitForBackend(backendName string, addr net.IP, t health.Transition, frontends map[string]config.Frontend) {
 	for feName, fe := range frontends {
-		emitted := false
-		for _, pool := range fe.Pools {
-			if emitted {
-				break
-			}
-			for name := range pool.Backends {
-				if name == backendName {
-					c.emit(Event{FrontendName: feName, BackendName: backendName, Backend: addr, Transition: t})
-					emitted = true
-					break
-				}
-			}
+		if !frontendReferencesBackend(fe, backendName) {
+			continue
 		}
+		c.emit(Event{FrontendName: feName, BackendName: backendName, Backend: addr, Transition: t})
+		c.updateFrontendState(feName, fe)
 	}
 }
 
+// frontendReferencesBackend reports whether fe has the named backend in any
+// of its pools.
+func frontendReferencesBackend(fe config.Frontend, backendName string) bool {
+	for _, pool := range fe.Pools {
+		if _, ok := pool.Backends[backendName]; ok {
+			return true
+		}
+	}
+	return false
+}
+
+// updateFrontendState recomputes the aggregate state of fe, compares against
+// the last known state, and emits a frontend-transition Event on change.
+// Must be called with c.mu held. The current state is read from the worker
+// map — so the caller (who already holds c.mu) sees a consistent view.
+func (c *Checker) updateFrontendState(feName string, fe config.Frontend) {
+	states := make(map[string]health.State)
+	for _, pool := range fe.Pools {
+		for bName := range pool.Backends {
+			if w, ok := c.workers[bName]; ok {
+				states[bName] = w.backend.State
+			} else {
+				states[bName] = health.StateUnknown
+			}
+		}
+	}
+	newState := health.ComputeFrontendState(fe, states)
+	old := c.frontendStates[feName] // zero value (Unknown) on first access
+	if old == newState {
+		return
+	}
+	c.frontendStates[feName] = newState
+	ft := health.FrontendTransition{From: old, To: newState, At: time.Now()}
+	slog.Info("frontend-transition",
+		"frontend", feName,
+		"from", old.String(),
+		"to", newState.String(),
+	)
+	c.emit(Event{FrontendName: feName, FrontendTransition: &ft})
+}
+
 // emit sends an event to the internal fan-out channel (non-blocking).
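The two-kinds-of-Event contract above can be exercised standalone. The sketch below uses local mirror types (not the package's own) purely to show the documented consumer rule: dispatch on `FrontendTransition != nil`.

```go
package main

import "fmt"

// Local mirrors of the checker/health types, for illustration only.
type FrontendTransition struct{ From, To string }

type Event struct {
	FrontendName       string
	BackendName        string
	FrontendTransition *FrontendTransition // nil for backend events
}

// describe dispatches the way consumers of checker.Event are expected to:
// a non-nil FrontendTransition marks a frontend event, otherwise backend.
func describe(e Event) string {
	if e.FrontendTransition != nil {
		return fmt.Sprintf("frontend %s: %s -> %s",
			e.FrontendName, e.FrontendTransition.From, e.FrontendTransition.To)
	}
	return fmt.Sprintf("backend %s via frontend %s", e.BackendName, e.FrontendName)
}

func main() {
	fmt.Println(describe(Event{FrontendName: "web", BackendName: "be1"}))
	fmt.Println(describe(Event{FrontendName: "web",
		FrontendTransition: &FrontendTransition{From: "unknown", To: "up"}}))
}
```

Because backend events keep the zero-valued `FrontendTransition`, one channel can carry both kinds without a separate type tag.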
@@ -119,6 +119,7 @@ type Frontend struct {
 	Protocol string // "tcp", "udp", or "" (all traffic)
 	Port     uint16 // 0 means omitted (all ports)
 	Pools    []Pool // ordered tiers; first pool with any up backend is active
+	SrcIPSticky bool // when true, VPP LB uses src-IP-based hashing for this VIP
 }
 
 // ---- raw YAML types --------------------------------------------------------
@@ -199,6 +200,7 @@ type rawFrontend struct {
 	Protocol string    `yaml:"protocol"`
 	Port     uint16    `yaml:"port"`
 	Pools    []rawPool `yaml:"pools"`
+	SrcIPSticky bool   `yaml:"src-ip-sticky"`
 }
 
 // ---- Check / Load ----------------------------------------------------------
@@ -519,6 +521,7 @@ func convertFrontend(name string, r *rawFrontend, backends map[string]Backend) (
 		Description: r.Description,
 		Protocol:    r.Protocol,
 		Port:        r.Port,
+		SrcIPSticky: r.SrcIPSticky,
 	}
 
 	ip := net.ParseIP(r.Address)
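In the YAML, the new key sits at the frontend level. Only `src-ip-sticky` itself is defined by this change; the surrounding stanza (frontend name, address, pool layout) is a hypothetical sketch inferred from the `rawFrontend` fields above.

```yaml
frontends:
  www:                       # hypothetical frontend name
    address: 192.0.2.10      # example VIP
    protocol: tcp
    port: 443
    src-ip-sticky: true      # hash on source IP; a client sticks to one backend
    pools:
      - backends:
          be1: { weight: 100 }
```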
[File diff suppressed because it is too large]
@@ -36,6 +36,7 @@ const (
 	Maglev_GetVPPInfo_FullMethodName     = "/maglev.Maglev/GetVPPInfo"
 	Maglev_GetVPPLBState_FullMethodName  = "/maglev.Maglev/GetVPPLBState"
 	Maglev_SyncVPPLBState_FullMethodName = "/maglev.Maglev/SyncVPPLBState"
+	Maglev_GetVPPLBCounters_FullMethodName = "/maglev.Maglev/GetVPPLBCounters"
 )
 
 // MaglevClient is the client API for Maglev service.
@@ -61,6 +62,7 @@ type MaglevClient interface {
 	GetVPPInfo(ctx context.Context, in *GetVPPInfoRequest, opts ...grpc.CallOption) (*VPPInfo, error)
 	GetVPPLBState(ctx context.Context, in *GetVPPLBStateRequest, opts ...grpc.CallOption) (*VPPLBState, error)
 	SyncVPPLBState(ctx context.Context, in *SyncVPPLBStateRequest, opts ...grpc.CallOption) (*SyncVPPLBStateResponse, error)
+	GetVPPLBCounters(ctx context.Context, in *GetVPPLBCountersRequest, opts ...grpc.CallOption) (*VPPLBCounters, error)
 }
 
 type maglevClient struct {
@@ -250,6 +252,16 @@ func (c *maglevClient) SyncVPPLBState(ctx context.Context, in *SyncVPPLBStateReq
 	return out, nil
 }
 
+func (c *maglevClient) GetVPPLBCounters(ctx context.Context, in *GetVPPLBCountersRequest, opts ...grpc.CallOption) (*VPPLBCounters, error) {
+	cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
+	out := new(VPPLBCounters)
+	err := c.cc.Invoke(ctx, Maglev_GetVPPLBCounters_FullMethodName, in, out, cOpts...)
+	if err != nil {
+		return nil, err
+	}
+	return out, nil
+}
+
 // MaglevServer is the server API for Maglev service.
 // All implementations must embed UnimplementedMaglevServer
 // for forward compatibility.
@@ -273,6 +285,7 @@ type MaglevServer interface {
 	GetVPPInfo(context.Context, *GetVPPInfoRequest) (*VPPInfo, error)
 	GetVPPLBState(context.Context, *GetVPPLBStateRequest) (*VPPLBState, error)
 	SyncVPPLBState(context.Context, *SyncVPPLBStateRequest) (*SyncVPPLBStateResponse, error)
+	GetVPPLBCounters(context.Context, *GetVPPLBCountersRequest) (*VPPLBCounters, error)
 	mustEmbedUnimplementedMaglevServer()
 }
 
@@ -334,6 +347,9 @@ func (UnimplementedMaglevServer) GetVPPLBState(context.Context, *GetVPPLBStateRe
 func (UnimplementedMaglevServer) SyncVPPLBState(context.Context, *SyncVPPLBStateRequest) (*SyncVPPLBStateResponse, error) {
 	return nil, status.Error(codes.Unimplemented, "method SyncVPPLBState not implemented")
 }
+func (UnimplementedMaglevServer) GetVPPLBCounters(context.Context, *GetVPPLBCountersRequest) (*VPPLBCounters, error) {
+	return nil, status.Error(codes.Unimplemented, "method GetVPPLBCounters not implemented")
+}
 func (UnimplementedMaglevServer) mustEmbedUnimplementedMaglevServer() {}
 func (UnimplementedMaglevServer) testEmbeddedByValue()                {}
 
@@ -654,6 +670,24 @@ func _Maglev_SyncVPPLBState_Handler(srv interface{}, ctx context.Context, dec fu
 	return interceptor(ctx, in, info, handler)
 }
 
+func _Maglev_GetVPPLBCounters_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
+	in := new(GetVPPLBCountersRequest)
+	if err := dec(in); err != nil {
+		return nil, err
+	}
+	if interceptor == nil {
+		return srv.(MaglevServer).GetVPPLBCounters(ctx, in)
+	}
+	info := &grpc.UnaryServerInfo{
+		Server:     srv,
+		FullMethod: Maglev_GetVPPLBCounters_FullMethodName,
+	}
+	handler := func(ctx context.Context, req interface{}) (interface{}, error) {
+		return srv.(MaglevServer).GetVPPLBCounters(ctx, req.(*GetVPPLBCountersRequest))
+	}
+	return interceptor(ctx, in, info, handler)
+}
+
 // Maglev_ServiceDesc is the grpc.ServiceDesc for Maglev service.
 // It's only intended for direct use with grpc.RegisterService,
 // and not to be introspected or modified (even as a copy)
@@ -725,6 +759,10 @@ var Maglev_ServiceDesc = grpc.ServiceDesc{
 			MethodName: "SyncVPPLBState",
 			Handler:    _Maglev_SyncVPPLBState_Handler,
 		},
+		{
+			MethodName: "GetVPPLBCounters",
+			Handler:    _Maglev_GetVPPLBCounters_Handler,
+		},
 	},
 	Streams: []grpc.StreamDesc{
 		{
@@ -7,6 +7,7 @@ import (
 	"errors"
 	"log/slog"
 	"net"
+	"sort"
 
 	"google.golang.org/grpc/codes"
 	"google.golang.org/grpc/status"
@@ -128,14 +129,13 @@ func (s *Server) GetHealthCheck(_ context.Context, req *GetHealthCheckRequest) (
 }
 
 // WatchEvents streams events to the client. On connect, the current state of
-// all backends is sent as synthetic BackendEvents. Afterwards, live events are
-// forwarded based on the filter flags in req. An unset (nil) flag defaults to
-// true (subscribe). An empty log_level defaults to "info".
+// every backend and/or frontend is sent as a synthetic event. Afterwards,
+// live events are forwarded based on the filter flags in req. An unset (nil)
+// flag defaults to true (subscribe). An empty log_level defaults to "info".
 func (s *Server) WatchEvents(req *WatchRequest, stream Maglev_WatchEventsServer) error {
 	wantLog := req.Log == nil || *req.Log
 	wantBackend := req.Backend == nil || *req.Backend
 	wantFrontend := req.Frontend == nil || *req.Frontend
-	_ = wantFrontend // no frontend events emitted yet
 
 	logLevel := slog.LevelInfo
 	if req.LogLevel != "" {
@@ -152,8 +152,20 @@ func (s *Server) WatchEvents(req *WatchRequest, stream Maglev_WatchEventsServer)
 		defer unsub()
 	}
 
-	// Subscribe to backend events; send initial state snapshot first.
-	var backendCh <-chan checker.Event
+	// Subscribe to the checker event stream once; we demultiplex backend
+	// and frontend events in the select below. Skip the subscription if
+	// neither kind is wanted.
+	var eventCh <-chan checker.Event
+	if wantBackend || wantFrontend {
+		var unsub func()
+		eventCh, unsub = s.checker.Subscribe()
+		defer unsub()
+	}
+
+	// Send initial state snapshot: one synthetic event per existing backend
+	// (if wanted), and one per existing frontend (if wanted). Clients that
+	// connect mid-flight see the current state immediately instead of
+	// waiting for the next transition.
 	if wantBackend {
 		for _, name := range s.checker.ListBackends() {
 			snap, ok := s.checker.GetBackend(name)
@@ -172,9 +184,25 @@ func (s *Server) WatchEvents(req *WatchRequest, stream Maglev_WatchEventsServer)
 				return err
 			}
 		}
 	}
-	var unsub func()
-	backendCh, unsub = s.checker.Subscribe()
-	defer unsub()
+	if wantFrontend {
+		for _, name := range s.checker.ListFrontends() {
+			fs, ok := s.checker.FrontendState(name)
+			if !ok {
+				continue
+			}
+			ev := &Event{Event: &Event_Frontend{Frontend: &FrontendEvent{
+				FrontendName: name,
+				Transition: &TransitionRecord{
+					From:     fs.String(),
+					To:       fs.String(),
+					AtUnixNs: 0,
+				},
+			}}}
+			if err := stream.Send(ev); err != nil {
+				return err
+			}
+		}
+	}
 
 	for {
@@ -190,10 +218,29 @@ func (s *Server) WatchEvents(req *WatchRequest, stream Maglev_WatchEventsServer)
 			if err := stream.Send(&Event{Event: &Event_Log{Log: le}}); err != nil {
 				return err
 			}
-		case e, ok := <-backendCh:
+		case e, ok := <-eventCh:
 			if !ok {
 				return nil
 			}
+			if e.FrontendTransition != nil {
+				if !wantFrontend {
+					continue
+				}
+				if err := stream.Send(&Event{Event: &Event_Frontend{Frontend: &FrontendEvent{
+					FrontendName: e.FrontendName,
+					Transition: &TransitionRecord{
+						From:     e.FrontendTransition.From.String(),
+						To:       e.FrontendTransition.To.String(),
+						AtUnixNs: e.FrontendTransition.At.UnixNano(),
+					},
+				}}}); err != nil {
+					return err
+				}
+				continue
+			}
+			if !wantBackend {
+				continue
+			}
 			if err := stream.Send(&Event{Event: &Event_Backend{Backend: &BackendEvent{
 				BackendName: e.BackendName,
 				Transition:  transitionToProto(e.Transition),
@@ -302,6 +349,51 @@ func (s *Server) GetVPPLBState(_ context.Context, _ *GetVPPLBStateRequest) (*VPP
 	return lbStateToProto(state), nil
 }
 
+// GetVPPLBCounters returns the most recent per-VIP and per-backend counter
+// snapshot captured by the client's 5s scrape loop. The call is served
+// from an in-process cache and does not hit VPP. An empty response is
+// returned when VPP is disconnected or no scrape has completed yet.
+func (s *Server) GetVPPLBCounters(_ context.Context, _ *GetVPPLBCountersRequest) (*VPPLBCounters, error) {
+	if s.vppClient == nil {
+		return nil, status.Error(codes.Unavailable, "VPP integration is disabled")
+	}
+	out := &VPPLBCounters{}
+	for _, v := range s.vppClient.VIPStats() {
+		out.Vips = append(out.Vips, &VPPLBVIPCounters{
+			Prefix:          v.Prefix,
+			Protocol:        v.Protocol,
+			Port:            uint32(v.Port),
+			NextPacket:      v.NextPkt,
+			FirstPacket:     v.FirstPkt,
+			UntrackedPacket: v.Untracked,
+			NoServer:        v.NoServer,
+			Packets:         v.Packets,
+			Bytes:           v.Bytes,
+		})
+	}
+	sort.Slice(out.Vips, func(i, j int) bool {
+		if out.Vips[i].Prefix != out.Vips[j].Prefix {
+			return out.Vips[i].Prefix < out.Vips[j].Prefix
+		}
+		if out.Vips[i].Protocol != out.Vips[j].Protocol {
+			return out.Vips[i].Protocol < out.Vips[j].Protocol
+		}
+		return out.Vips[i].Port < out.Vips[j].Port
+	})
+	for _, b := range s.vppClient.BackendRouteStats() {
+		out.Backends = append(out.Backends, &VPPLBBackendCounters{
+			Backend: b.Backend,
+			Address: b.Address,
+			Packets: b.Packets,
+			Bytes:   b.Bytes,
+		})
+	}
+	sort.Slice(out.Backends, func(i, j int) bool {
+		return out.Backends[i].Backend < out.Backends[j].Backend
+	})
+	return out, nil
+}
+
 // SyncVPPLBState runs the LB reconciler. With frontend_name unset it does a
 // full sync (SyncLBStateAll), which may remove stale VIPs. With frontend_name
 // set it does a single-VIP sync (SyncLBStateVIP) that only adds/updates.
@@ -342,6 +434,7 @@ func lbStateToProto(s *vpp.LBState) *VPPLBState {
 		Port:            uint32(v.Port),
 		Encap:           v.Encap,
 		FlowTableLength: uint32(v.FlowTableLength),
+		SrcIpSticky:     v.SrcIPSticky,
 	}
 	for _, a := range v.ASes {
 		var ts int64
@@ -393,6 +486,7 @@ func frontendToProto(name string, fe config.Frontend, src vpp.StateSource) *Fron
 		Port:        uint32(fe.Port),
 		Description: fe.Description,
 		Pools:       pools,
+		SrcIpSticky: fe.SrcIPSticky,
 	}
 }
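The snapshot-then-live shape WatchEvents uses (replay current state to a late subscriber, then forward live transitions) can be shown with plain channels. A simplified standalone sketch, not the server's actual stream plumbing:

```go
package main

import "fmt"

// subscribe returns a channel that first replays a snapshot of current
// states (so late subscribers see where things stand immediately), then
// forwards live transitions until the live source closes.
func subscribe(snapshot map[string]string, live <-chan string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for name, state := range snapshot {
			out <- fmt.Sprintf("sync %s=%s", name, state)
		}
		for ev := range live {
			out <- "live " + ev
		}
	}()
	return out
}

func main() {
	live := make(chan string, 1)
	live <- "web: up->down"
	close(live)
	for msg := range subscribe(map[string]string{"web": "up"}, live) {
		fmt.Println(msg)
	}
}
```

The test change below (draining two initial Recv calls instead of one) follows directly from this pattern: the snapshot now contains both a backend and a frontend event.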
@@ -353,9 +353,11 @@ func TestWatchEventsServerShutdown(t *testing.T) {
 	if err != nil {
 		t.Fatalf("WatchEvents: %v", err)
 	}
-	// Drain the initial synthetic backend event.
-	if _, err := stream.Recv(); err != nil {
-		t.Fatalf("initial Recv: %v", err)
+	// Drain the initial synthetic snapshots (one per backend, one per frontend).
+	for i := 0; i < 2; i++ {
+		if _, err := stream.Recv(); err != nil {
+			t.Fatalf("initial Recv %d: %v", i, err)
+		}
 	}
 
 	// Cancel the server context; the stream must terminate.
@@ -64,6 +64,43 @@ type Transition struct {
 	Result ProbeResult
 }
 
+// FrontendState is the aggregated state of a frontend derived from the
+// effective weights of its member backends. Frontends do not have their
+// own rise/fall counters: they're purely a reduction over backend state.
+//
+//   - unknown: no backends, or every referenced backend is in StateUnknown
+//     (the checker has no probe data yet).
+//   - up: at least one backend has effective weight > 0 — the VIP has
+//     something to serve.
+//   - down: backends exist with real state, but none have effective
+//     weight > 0 — the VIP has nothing to serve.
+type FrontendState int
+
+const (
+	FrontendStateUnknown FrontendState = iota
+	FrontendStateUp
+	FrontendStateDown
+)
+
+func (s FrontendState) String() string {
+	switch s {
+	case FrontendStateUnknown:
+		return "unknown"
+	case FrontendStateUp:
+		return "up"
+	case FrontendStateDown:
+		return "down"
+	}
+	return "unknown"
+}
+
+// FrontendTransition records a frontend state change event.
+type FrontendTransition struct {
+	From FrontendState
+	To   FrontendState
+	At   time.Time
+}
+
 // HealthCounter is HAProxy's single-integer rise/fall model.
 //
 // Health ∈ [0, Rise+Fall-1]. Server is UP when Health >= Rise, DOWN when
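The unknown/up/down reduction documented above can be restated as a tiny pure function. This standalone sketch mirrors the documented rules only; the string states and the (state, weight) pair shape are local stand-ins for health.State and effective weights, not the package's API:

```go
package main

import "fmt"

// backend is a (state, effectiveWeight) pair as seen by the aggregator.
type backend struct {
	state  string // "unknown", "up", "down", "paused", ...
	weight int    // effective weight after pool failover
}

// frontendState reduces member backends to "unknown", "up", or "down":
// no backends or all-unknown yields unknown; any effective weight > 0
// yields up; otherwise down.
func frontendState(backends []backend) string {
	if len(backends) == 0 {
		return "unknown"
	}
	allUnknown := true
	for _, b := range backends {
		if b.weight > 0 {
			return "up"
		}
		if b.state != "unknown" {
			allUnknown = false
		}
	}
	if allUnknown {
		return "unknown"
	}
	return "down"
}

func main() {
	fmt.Println(frontendState(nil))                                   // unknown
	fmt.Println(frontendState([]backend{{"up", 100}, {"down", 0}}))   // up
	fmt.Println(frontendState([]backend{{"down", 0}, {"paused", 0}})) // down
}
```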
internal/health/weights.go (new file, 118 lines)
@@ -0,0 +1,118 @@
|
|||||||
|
// Copyright (c) 2026, Pim van Pelt <pim@ipng.ch>

package health

import (
	"git.ipng.ch/ipng/vpp-maglev/internal/config"
)

// ActivePoolIndex returns the priority-failover pool index for fe given
// the current backend states. The active pool is the first pool that
// contains at least one backend in StateUp — pool[0] is the primary,
// pool[1] the first fallback, and so on. Returns 0 when no pool has
// any up backend, in which case every backend maps to weight 0 and the
// return value is unobservable.
func ActivePoolIndex(fe config.Frontend, states map[string]State) int {
	for i, pool := range fe.Pools {
		for bName := range pool.Backends {
			if states[bName] == StateUp {
				return i
			}
		}
	}
	return 0
}

// BackendEffectiveWeight is the pure mapping from (pool index, active pool,
// backend state, config weight) to the desired VPP AS weight and flush hint.
// This is the single source of truth for the state → dataplane rule.
//
// A backend gets its configured weight iff it is up AND belongs to the
// currently-active pool. Every other case yields weight 0. Only StateDisabled
// produces flush=true (immediate session teardown).
//
//	state     in active pool  not in active pool  flush
//	--------  --------------  ------------------  -----
//	unknown   0               0                   no
//	up        configured      0 (standby)         no
//	down      0               0                   no
//	paused    0               0                   no
//	disabled  0               0                   yes
func BackendEffectiveWeight(poolIdx, activePool int, state State, cfgWeight int) (weight uint8, flush bool) {
	switch state {
	case StateUp:
		if poolIdx == activePool {
			return clampWeight(cfgWeight), false
		}
		return 0, false
	case StateDisabled:
		return 0, true
	default:
		return 0, false
	}
}

// EffectiveWeights computes per-pool per-backend effective weights for fe,
// given a snapshot of backend states. Result layout: weights[poolIdx][backendName].
func EffectiveWeights(fe config.Frontend, states map[string]State) map[int]map[string]uint8 {
	activePool := ActivePoolIndex(fe, states)
	out := make(map[int]map[string]uint8, len(fe.Pools))
	for poolIdx, pool := range fe.Pools {
		out[poolIdx] = make(map[string]uint8, len(pool.Backends))
		for bName, pb := range pool.Backends {
			w, _ := BackendEffectiveWeight(poolIdx, activePool, states[bName], pb.Weight)
			out[poolIdx][bName] = w
		}
	}
	return out
}

// ComputeFrontendState derives the FrontendState for fe from a snapshot of
// backend states. Rules:
//
//   - no backends → unknown
//   - every referenced backend is in StateUnknown → unknown
//   - any backend has effective weight > 0 → up
//   - otherwise → down
func ComputeFrontendState(fe config.Frontend, states map[string]State) FrontendState {
	// Unique set of backends referenced by this frontend (a single backend
	// may appear in multiple pools; we count it once).
	seen := make(map[string]struct{})
	for _, pool := range fe.Pools {
		for bName := range pool.Backends {
			seen[bName] = struct{}{}
		}
	}
	if len(seen) == 0 {
		return FrontendStateUnknown
	}
	allUnknown := true
	for bName := range seen {
		if states[bName] != StateUnknown {
			allUnknown = false
			break
		}
	}
	if allUnknown {
		return FrontendStateUnknown
	}
	ew := EffectiveWeights(fe, states)
	for _, poolMap := range ew {
		for _, w := range poolMap {
			if w > 0 {
				return FrontendStateUp
			}
		}
	}
	return FrontendStateDown
}

func clampWeight(w int) uint8 {
	if w < 0 {
		return 0
	}
	if w > 100 {
		return 100
	}
	return uint8(w)
}
internal/health/weights_test.go (new file, 232 lines)
@@ -0,0 +1,232 @@
// Copyright (c) 2026, Pim van Pelt <pim@ipng.ch>

package health

import (
	"testing"

	"git.ipng.ch/ipng/vpp-maglev/internal/config"
)

// TestBackendEffectiveWeight locks down the state → (weight, flush) truth
// table. This is the single source of truth for how maglevd decides what
// to program into VPP for each backend state. If this test needs updating,
// the behavior has deliberately changed.
func TestBackendEffectiveWeight(t *testing.T) {
	cases := []struct {
		name       string
		poolIdx    int
		activePool int
		state      State
		cfgWeight  int
		wantWeight uint8
		wantFlush  bool
	}{
		{"up active w100", 0, 0, StateUp, 100, 100, false},
		{"up active w50", 0, 0, StateUp, 50, 50, false},
		{"up active w0", 0, 0, StateUp, 0, 0, false},
		{"up active clamp-high", 0, 0, StateUp, 150, 100, false},
		{"up active clamp-low", 0, 0, StateUp, -5, 0, false},

		{"up standby pool0 active=1", 0, 1, StateUp, 100, 0, false},
		{"up standby pool1 active=0", 1, 0, StateUp, 100, 0, false},
		{"up standby pool2 active=0", 2, 0, StateUp, 100, 0, false},

		{"up failover pool1 active=1", 1, 1, StateUp, 100, 100, false},

		{"unknown pool0 active=0", 0, 0, StateUnknown, 100, 0, false},
		{"unknown pool1 active=0", 1, 0, StateUnknown, 100, 0, false},

		{"down pool0 active=0", 0, 0, StateDown, 100, 0, false},
		{"down pool1 active=1", 1, 1, StateDown, 100, 0, false},

		{"paused pool0 active=0", 0, 0, StatePaused, 100, 0, false},

		{"disabled pool0 active=0", 0, 0, StateDisabled, 100, 0, true},
		{"disabled pool1 active=1", 1, 1, StateDisabled, 100, 0, true},
	}

	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			w, f := BackendEffectiveWeight(tc.poolIdx, tc.activePool, tc.state, tc.cfgWeight)
			if w != tc.wantWeight {
				t.Errorf("weight: got %d, want %d", w, tc.wantWeight)
			}
			if f != tc.wantFlush {
				t.Errorf("flush: got %v, want %v", f, tc.wantFlush)
			}
		})
	}
}

// TestActivePoolIndex locks down the priority-failover selector: the first
// pool containing at least one up backend is the active pool. Default 0.
func TestActivePoolIndex(t *testing.T) {
	mkFE := func(pools ...[]string) config.Frontend {
		out := make([]config.Pool, len(pools))
		for i, p := range pools {
			out[i] = config.Pool{Name: "p", Backends: map[string]config.PoolBackend{}}
			for _, name := range p {
				out[i].Backends[name] = config.PoolBackend{Weight: 100}
			}
		}
		return config.Frontend{Pools: out}
	}

	cases := []struct {
		name   string
		fe     config.Frontend
		states map[string]State
		want   int
	}{
		{
			name:   "pool0 has up, pool1 standby",
			fe:     mkFE([]string{"a", "b"}, []string{"c", "d"}),
			states: map[string]State{"a": StateUp, "b": StateDown, "c": StateUp, "d": StateUp},
			want:   0,
		},
		{
			name:   "pool0 all down, pool1 has up → failover",
			fe:     mkFE([]string{"a", "b"}, []string{"c", "d"}),
			states: map[string]State{"a": StateDown, "b": StateDown, "c": StateUp, "d": StateUp},
			want:   1,
		},
		{
			name:   "pool0 all disabled, pool1 has up → failover",
			fe:     mkFE([]string{"a", "b"}, []string{"c"}),
			states: map[string]State{"a": StateDisabled, "b": StateDisabled, "c": StateUp},
			want:   1,
		},
		{
			name:   "pool0 all paused, pool1 has up → failover",
			fe:     mkFE([]string{"a"}, []string{"c"}),
			states: map[string]State{"a": StatePaused, "c": StateUp},
			want:   1,
		},
		{
			name:   "pool0 all unknown (startup), pool1 up → pool1",
			fe:     mkFE([]string{"a"}, []string{"c"}),
			states: map[string]State{"a": StateUnknown, "c": StateUp},
			want:   1,
		},
		{
			name:   "nothing up anywhere → default 0",
			fe:     mkFE([]string{"a"}, []string{"c"}),
			states: map[string]State{"a": StateDown, "c": StateDown},
			want:   0,
		},
		{
			name:   "1 up in pool0 is enough",
			fe:     mkFE([]string{"a", "b", "c"}, []string{"d"}),
			states: map[string]State{"a": StateDown, "b": StateDown, "c": StateUp, "d": StateUp},
			want:   0,
		},
		{
			name:   "three tiers, pool0 and pool1 both empty → pool2",
			fe:     mkFE([]string{"a"}, []string{"b"}, []string{"c"}),
			states: map[string]State{"a": StateDown, "b": StateDown, "c": StateUp},
			want:   2,
		},
	}

	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			got := ActivePoolIndex(tc.fe, tc.states)
			if got != tc.want {
				t.Errorf("got pool %d, want pool %d", got, tc.want)
			}
		})
	}
}

// TestComputeFrontendState locks down the reduction rule: frontends are
// up iff any backend has effective weight > 0, unknown iff all backends
// are still in StateUnknown (or there are no backends), and down otherwise.
func TestComputeFrontendState(t *testing.T) {
	mkFE := func(pools ...[]string) config.Frontend {
		out := make([]config.Pool, len(pools))
		for i, p := range pools {
			out[i] = config.Pool{Name: "p", Backends: map[string]config.PoolBackend{}}
			for _, name := range p {
				out[i].Backends[name] = config.PoolBackend{Weight: 100}
			}
		}
		return config.Frontend{Pools: out}
	}

	cases := []struct {
		name   string
		fe     config.Frontend
		states map[string]State
		want   FrontendState
	}{
		{
			name: "no backends → unknown",
			fe:   config.Frontend{Pools: []config.Pool{{Name: "primary", Backends: map[string]config.PoolBackend{}}}},
			want: FrontendStateUnknown,
		},
		{
			name:   "all unknown (startup) → unknown",
			fe:     mkFE([]string{"a", "b"}),
			states: map[string]State{"a": StateUnknown, "b": StateUnknown},
			want:   FrontendStateUnknown,
		},
		{
			name:   "one up in primary → up",
			fe:     mkFE([]string{"a", "b"}),
			states: map[string]State{"a": StateUp, "b": StateDown},
			want:   FrontendStateUp,
		},
		{
			name:   "all down → down",
			fe:     mkFE([]string{"a", "b"}),
			states: map[string]State{"a": StateDown, "b": StateDown},
			want:   FrontendStateDown,
		},
		{
			name:   "all disabled → down",
			fe:     mkFE([]string{"a", "b"}),
			states: map[string]State{"a": StateDisabled, "b": StateDisabled},
			want:   FrontendStateDown,
		},
		{
			name:   "all paused → down",
			fe:     mkFE([]string{"a"}),
			states: map[string]State{"a": StatePaused},
			want:   FrontendStateDown,
		},
		{
			name:   "primary down, secondary up → up (failover)",
			fe:     mkFE([]string{"a"}, []string{"b"}),
			states: map[string]State{"a": StateDown, "b": StateUp},
			want:   FrontendStateUp,
		},
		{
			name:   "primary up, secondary down → up (secondary standby ignored)",
			fe:     mkFE([]string{"a"}, []string{"b"}),
			states: map[string]State{"a": StateUp, "b": StateDown},
			want:   FrontendStateUp,
		},
		{
			name:   "primary unknown, secondary unknown → unknown",
			fe:     mkFE([]string{"a"}, []string{"b"}),
			states: map[string]State{"a": StateUnknown, "b": StateUnknown},
			want:   FrontendStateUnknown,
		},
		{
			name:   "primary down, secondary unknown → down",
			fe:     mkFE([]string{"a"}, []string{"b"}),
			states: map[string]State{"a": StateDown, "b": StateUnknown},
			want:   FrontendStateDown,
		},
	}

	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			got := ComputeFrontendState(tc.fe, tc.states)
			if got != tc.want {
				t.Errorf("got %s, want %s", got, tc.want)
			}
		})
	}
}
@@ -45,12 +45,53 @@ type VPPInfo struct {
 	ConnectedSince time.Time
 }
+
+// VIPStatEntry is a point-in-time snapshot of the per-VIP counters that
+// VPP exposes via the stats segment: four SimpleCounters from the LB
+// plugin (packets only) plus the FIB CombinedCounter at /net/route/to
+// for the VIP's own host prefix (packets + bytes). Values are summed
+// across worker threads. The labelling (prefix/protocol/port) matches
+// the gRPC VPPLBVIP representation so a Prometheus time series
+// corresponds 1:1 to a maglev frontend VIP.
+type VIPStatEntry struct {
+	Prefix   string // CIDR string, e.g. "192.0.2.1/32"
+	Protocol string // "tcp", "udp", "any"
+	Port     uint16
+	// LB plugin SimpleCounters (packets only)
+	NextPkt   uint64 // /packet from existing sessions
+	FirstPkt  uint64 // /first session packet
+	Untracked uint64 // /untracked packet
+	NoServer  uint64 // /no server configured
+	// FIB CombinedCounter from /net/route/to at the VIP prefix
+	Packets uint64
+	Bytes   uint64
+}
+
+// BackendRouteStat is a point-in-time snapshot of the FIB combined counter
+// (/net/route/to) for a single backend's host prefix. Values are summed
+// across worker threads. Labels match the backend's identity so a time
+// series corresponds 1:1 to a maglev backend entry.
+type BackendRouteStat struct {
+	Backend string // backend name from the config
+	Address string // backend IP address as a string (e.g. "192.0.2.10")
+	Packets uint64
+	Bytes   uint64
+}
+
 // VPPSource provides read-only access to the VPP client's state. vpp.Client
 // is adapted to this interface via a small shim in the collector so the
 // metrics package stays decoupled from the vpp package's concrete types.
 type VPPSource interface {
 	IsConnected() bool
 	VPPInfo() (VPPInfo, bool)
+
+	// VIPStats returns the most recent snapshot of per-VIP stats-segment
+	// counters, as captured by the LB stats loop. Returns nil when VPP is
+	// disconnected or no scrape has happened yet.
+	VIPStats() []VIPStatEntry
+
+	// BackendRouteStats returns the most recent snapshot of per-backend
+	// FIB combined counters (/net/route/to), as captured by the LB stats
+	// loop. Returns nil when VPP is disconnected, no scrape has happened
+	// yet, or the route lookup for every backend failed.
+	BackendRouteStats() []BackendRouteStat
 }

 // ---- inline metrics (updated per probe) ------------------------------------
@@ -118,6 +159,12 @@ type Collector struct {
 	vppUptimeSeconds *prometheus.Desc
 	vppConnectedFor  *prometheus.Desc
 	vppInfo          *prometheus.Desc
+
+	vipPackets       *prometheus.Desc // per-VIP LB counters from stats segment
+	vipRoutePkts     *prometheus.Desc // per-VIP FIB combined counter: packets
+	vipRouteByts     *prometheus.Desc // per-VIP FIB combined counter: bytes
+	backendRoutePkts *prometheus.Desc // per-backend FIB combined counter: packets
+	backendRouteByts *prometheus.Desc // per-backend FIB combined counter: bytes
 }

 // NewCollector creates a Collector backed by the given StateSource. vpp may
@@ -167,6 +214,31 @@ func NewCollector(src StateSource, vpp VPPSource) *Collector {
 			"Static VPP build information. Always 1; metadata is conveyed via labels.",
 			[]string{"version", "build_date", "pid"}, nil,
 		),
+		vipPackets: prometheus.NewDesc(
+			"maglev_vpp_vip_packets_total",
+			"Per-VIP packet counters from the VPP LB plugin stats segment, summed across workers. kind ∈ {next, first, untracked, no_server}.",
+			[]string{"prefix", "protocol", "port", "kind"}, nil,
+		),
+		vipRoutePkts: prometheus.NewDesc(
+			"maglev_vpp_vip_route_packets_total",
+			"Packets forwarded by VPP's FIB toward each VIP's host prefix (from /net/route/to), summed across workers.",
+			[]string{"prefix", "protocol", "port"}, nil,
+		),
+		vipRouteByts: prometheus.NewDesc(
+			"maglev_vpp_vip_route_bytes_total",
+			"Bytes forwarded by VPP's FIB toward each VIP's host prefix (from /net/route/to), summed across workers.",
+			[]string{"prefix", "protocol", "port"}, nil,
+		),
+		backendRoutePkts: prometheus.NewDesc(
+			"maglev_vpp_backend_route_packets_total",
+			"Packets forwarded by VPP's FIB toward each backend's host prefix (from /net/route/to), summed across workers.",
+			[]string{"backend", "address"}, nil,
+		),
+		backendRouteByts: prometheus.NewDesc(
+			"maglev_vpp_backend_route_bytes_total",
+			"Bytes forwarded by VPP's FIB toward each backend's host prefix (from /net/route/to), summed across workers.",
+			[]string{"backend", "address"}, nil,
+		),
 	}
 }

@@ -180,6 +252,11 @@ func (c *Collector) Describe(ch chan<- *prometheus.Desc) {
 	ch <- c.vppUptimeSeconds
 	ch <- c.vppConnectedFor
 	ch <- c.vppInfo
+	ch <- c.vipPackets
+	ch <- c.vipRoutePkts
+	ch <- c.vipRouteByts
+	ch <- c.backendRoutePkts
+	ch <- c.backendRouteByts
 }

 // Collect implements prometheus.Collector.
@@ -271,6 +348,27 @@ func (c *Collector) Collect(ch chan<- prometheus.Metric) {
 			c.vppInfo, prometheus.GaugeValue, 1.0,
 			info.Version, info.BuildDate, fmt.Sprintf("%d", info.PID),
 		)
+
+		// Per-VIP packet counters, read from the snapshot updated by the LB
+		// stats loop in internal/vpp. CounterValue so rate()/increase() work
+		// as expected; VPP counter resets (e.g. VIP recreate) are handled by
+		// Prometheus's built-in counter-reset detection.
+		for _, v := range c.vpp.VIPStats() {
+			port := fmt.Sprintf("%d", v.Port)
+			ch <- prometheus.MustNewConstMetric(c.vipPackets, prometheus.CounterValue, float64(v.NextPkt), v.Prefix, v.Protocol, port, "next")
+			ch <- prometheus.MustNewConstMetric(c.vipPackets, prometheus.CounterValue, float64(v.FirstPkt), v.Prefix, v.Protocol, port, "first")
+			ch <- prometheus.MustNewConstMetric(c.vipPackets, prometheus.CounterValue, float64(v.Untracked), v.Prefix, v.Protocol, port, "untracked")
+			ch <- prometheus.MustNewConstMetric(c.vipPackets, prometheus.CounterValue, float64(v.NoServer), v.Prefix, v.Protocol, port, "no_server")
+			ch <- prometheus.MustNewConstMetric(c.vipRoutePkts, prometheus.CounterValue, float64(v.Packets), v.Prefix, v.Protocol, port)
+			ch <- prometheus.MustNewConstMetric(c.vipRouteByts, prometheus.CounterValue, float64(v.Bytes), v.Prefix, v.Protocol, port)
+		}
+
+		// Per-backend FIB counters from /net/route/to. Same CounterValue
+		// semantics as above.
+		for _, b := range c.vpp.BackendRouteStats() {
+			ch <- prometheus.MustNewConstMetric(c.backendRoutePkts, prometheus.CounterValue, float64(b.Packets), b.Backend, b.Address)
+			ch <- prometheus.MustNewConstMetric(c.backendRouteByts, prometheus.CounterValue, float64(b.Bytes), b.Backend, b.Address)
+		}
 	}

 // Register registers all metrics with the given registry. vpp may be nil
@@ -9,6 +9,7 @@ import (
 	"context"
 	"log/slog"
 	"sync"
+	"sync/atomic"
 	"time"

 	"go.fd.io/govpp/adapter"
@@ -60,6 +61,16 @@ type Client struct {
 	info       Info        // populated on successful connect
 	stateSrc   StateSource // optional; enables periodic LB sync
 	lastLBConf *lb.LbConf  // cached last-pushed lb_conf (dedup)
+
+	// lbStatsSnap is the most recent per-VIP stats snapshot captured by
+	// lbStatsLoop. Published as an immutable slice via atomic.Pointer so
+	// Prometheus scrapes (metrics.Collector.Collect) don't take any lock.
+	lbStatsSnap atomic.Pointer[[]metrics.VIPStatEntry]
+
+	// backendRouteSnap is the most recent per-backend FIB stats snapshot
+	// captured by lbStatsLoop. Same atomic-pointer publish pattern as
+	// lbStatsSnap; see logBackendRouteStats in fibstats.go.
+	backendRouteSnap atomic.Pointer[[]metrics.BackendRouteStat]
 }

 // SetStateSource attaches a live config + health state source. When set, the
@@ -140,10 +151,11 @@ func (c *Client) Run(ctx context.Context) {
 		}
 	}

-	// Start the LB sync loop for as long as the connection is up.
-	// It exits when connCtx is cancelled (on disconnect or shutdown).
+	// Start the LB sync and stats loops for as long as the connection
+	// is up. Both exit when connCtx is cancelled.
 	connCtx, connCancel := context.WithCancel(ctx)
 	go c.lbSyncLoop(connCtx)
+	go c.lbStatsLoop(connCtx)

 	// Hold the connection, pinging periodically to detect VPP restarts.
 	c.monitor(ctx)
@@ -217,6 +229,30 @@ func (c *Client) GetInfo() (Info, error) {
 	return c.info, nil
 }
+
+// VIPStats satisfies metrics.VPPSource. It returns the latest snapshot of
+// per-VIP LB stats-segment counters captured by lbStatsLoop. Returns nil
+// until the first scrape completes, or after a disconnect (the pointer is
+// cleared when the connection drops).
+func (c *Client) VIPStats() []metrics.VIPStatEntry {
+	p := c.lbStatsSnap.Load()
+	if p == nil {
+		return nil
+	}
+	return *p
+}
+
+// BackendRouteStats satisfies metrics.VPPSource. It returns the latest
+// snapshot of per-backend FIB combined counters (/net/route/to) captured
+// by lbStatsLoop. Returns nil until the first scrape completes, or after
+// a disconnect.
+func (c *Client) BackendRouteStats() []metrics.BackendRouteStat {
+	p := c.backendRouteSnap.Load()
+	if p == nil {
+		return nil
+	}
+	return *p
+}

 // VPPInfo satisfies metrics.VPPSource. It returns a copy of the cached
 // connection info as a metrics-local struct so the metrics package doesn't
 // need to import internal/vpp. Second return is false when VPP is not
@@ -272,6 +308,8 @@ func (c *Client) disconnect() {
 	c.info = Info{}
 	c.lastLBConf = nil // force re-push of lb_conf on reconnect
 	c.mu.Unlock()
+	c.lbStatsSnap.Store(nil)
+	c.backendRouteSnap.Store(nil)

 	safeDisconnectAPI(apiConn)
 	safeDisconnectStats(statsConn)
internal/vpp/fibstats.go (new file, 87 lines)
@@ -0,0 +1,87 @@
|
// Copyright (c) 2026, Pim van Pelt <pim@ipng.ch>
|
||||||
|
|
||||||
|
package vpp
|
||||||
|
|
||||||
|
import (
|
||||||
|
"fmt"
|
||||||
|
"net"
|
||||||
|
|
||||||
|
"go.fd.io/govpp/adapter"
|
||||||
|
"go.fd.io/govpp/binapi/ip"
|
||||||
|
"go.fd.io/govpp/binapi/ip_types"
|
||||||
|
)
|
||||||
|
|
||||||
|
// routeToStatPath is the VPP stats-segment path exposing the per-FIB-entry
|
||||||
|
// "route-to" combined counter (packets + bytes), indexed by the load-
|
||||||
|
// balance index of each FIB entry. See lbm_to_counters in
|
||||||
|
// src/vnet/dpo/load_balance.c.
|
||||||
|
const routeToStatPath = "/net/route/to"
|
||||||
|
|
||||||
|
// fibStatsIndex returns the FIB entry's stats_index (load_balance index)
|
||||||
|
// for the host prefix of addr. Uses exact=0 (longest-match) so a covering
|
||||||
|
// route is returned if there is no host-prefix entry — note this means
|
||||||
|
// two maglev entities sharing a covering route will report identical
|
||||||
|
// /net/route/to counters.
|
||||||
|
func fibStatsIndex(ch *loggedChannel, addr net.IP) (uint32, error) {
|
||||||
|
var prefix ip_types.Prefix
|
||||||
|
if v4 := addr.To4(); v4 != nil {
|
||||||
|
prefix.Address.Af = ip_types.ADDRESS_IP4
|
||||||
|
copy(prefix.Address.Un.XXX_UnionData[:4], v4)
|
||||||
|
prefix.Len = 32
|
||||||
|
} else {
|
||||||
|
prefix.Address.Af = ip_types.ADDRESS_IP6
|
||||||
|
copy(prefix.Address.Un.XXX_UnionData[:], addr.To16())
|
||||||
|
prefix.Len = 128
|
||||||
|
}
|
||||||
|
req := &ip.IPRouteLookup{
|
||||||
|
TableID: 0,
|
||||||
|
Exact: 0,
|
||||||
|
Prefix: prefix,
|
||||||
|
}
|
||||||
|
reply := &ip.IPRouteLookupReply{}
|
||||||
|
if err := ch.SendRequest(req).ReceiveReply(reply); err != nil {
|
||||||
|
return 0, fmt.Errorf("ip_route_lookup: %w", err)
|
||||||
|
}
|
||||||
|
if reply.Retval != 0 {
|
||||||
|
return 0, fmt.Errorf("ip_route_lookup: retval=%d", reply.Retval)
|
||||||
|
}
|
||||||
|
return reply.Route.StatsIndex, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// findCombinedCounter returns the CombinedCounterStat matching name, or
|
||||||
|
// nil if not found or the wrong type.
|
||||||
|
func findCombinedCounter(entries []adapter.StatEntry, name string) adapter.CombinedCounterStat {
|
||||||
|
for _, e := range entries {
|
||||||
|
if string(e.Name) != name {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
if s, ok := e.Data.(adapter.CombinedCounterStat); ok {
|
||||||
|
return s
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// reduceCombinedCounter sums the (packets, bytes) CombinedCounter across
|
||||||
|
// workers at column i, tolerating short per-worker vectors.
|
||||||
|
func reduceCombinedCounter(s adapter.CombinedCounterStat, i int) (pkts, byts uint64) {
|
||||||
|
for _, thread := range s {
|
||||||
|
if i >= 0 && i < len(thread) {
|
||||||
|
pkts += thread[i][0]
|
||||||
|
byts += thread[i][1]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return pkts, byts
|
||||||
|
}
|
||||||
|
|
||||||
|
// vipKeyToIP extracts the VIP address from a vipKey's CIDR string. The
|
||||||
|
// second return is the prefix length. Used by the scrape path to feed
|
||||||
|
// a VIP prefix into fibStatsIndex.
|
||||||
|
func vipKeyToIP(k vipKey) (net.IP, int, error) {
|
||||||
|
ip, ipnet, err := net.ParseCIDR(k.prefix)
|
||||||
|
if err != nil {
|
||||||
|
return nil, 0, err
|
||||||
|
}
|
||||||
|
ones, _ := ipnet.Mask.Size()
|
||||||
|
return ip, ones, nil
|
||||||
|
}
|
||||||
@@ -30,6 +30,10 @@ type LBVIP struct {
 	Dscp            uint8
 	TargetPort      uint16
 	FlowTableLength uint16
+	// SrcIPSticky is scraped from `show lb vips verbose` via cli_inband;
+	// VPP's lb_vip_details does not carry this flag. Populated by
+	// GetLBStateAll and GetLBStateVIP; see queryLBSticky in lbsync.go.
+	SrcIPSticky bool
 	ASes []LBAS
 }

@@ -70,12 +74,17 @@ func (c *Client) GetLBStateAll() (*LBState, error) {
 	if err != nil {
 		return nil, err
 	}
+	stickyMap, err := queryLBSticky(ch)
+	if err != nil {
+		return nil, err
+	}
 	for i := range vips {
 		ases, err := dumpASesForVIP(ch, vips[i].Protocol, vips[i].Port)
 		if err != nil {
 			return nil, err
 		}
 		vips[i].ASes = ases
+		vips[i].SrcIPSticky = stickyMap[makeVIPKey(vips[i].Prefix, vips[i].Protocol, vips[i].Port)]
 	}
 	state.VIPs = vips
 	return state, nil
@@ -90,7 +99,16 @@ func (c *Client) GetLBStateVIP(prefix *net.IPNet, protocol uint8, port uint16) (
 		return nil, err
 	}
 	defer ch.Close()
-	return lookupVIP(ch, prefix, protocol, port)
+	vip, err := lookupVIP(ch, prefix, protocol, port)
+	if err != nil || vip == nil {
+		return vip, err
+	}
+	stickyMap, err := queryLBSticky(ch)
+	if err != nil {
+		return nil, err
+	}
+	vip.SrcIPSticky = stickyMap[makeVIPKey(vip.Prefix, vip.Protocol, vip.Port)]
+	return vip, nil
 }

 // ---- low-level helpers (used by both Get and Sync paths) -------------------
internal/vpp/lbstats.go (new file, 229 lines)
@@ -0,0 +1,229 @@
|
// Copyright (c) 2026, Pim van Pelt <pim@ipng.ch>
|
||||||
|
|
||||||
|
package vpp
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"fmt"
|
||||||
|
"log/slog"
|
||||||
|
"sort"
|
||||||
|
"time"
|
||||||
|
|
||||||
|
"go.fd.io/govpp/adapter"
|
||||||
|
|
||||||
|
"git.ipng.ch/ipng/vpp-maglev/internal/metrics"
|
||||||
|
)
|
||||||
|
|
||||||
|
// lbStatsInterval is how often lbStatsLoop scrapes per-VIP and per-backend
|
||||||
|
// counters from the VPP stats segment. Hard-coded for now; the scrape
|
||||||
|
// feeds both slog.Debug lines and the Prometheus collector.
|
||||||
|
const lbStatsInterval = 5 * time.Second
|
||||||
|
|
||||||
|
// LB VIP counter names as they appear in the VPP stats segment. These
|
||||||
|
// come from lb_foreach_vip_counter in src/plugins/lb/lb.h — each entry is
|
||||||
|
// registered with only .name set, so the stats segment exposes them at
|
||||||
|
// the top level (spaces and all). Replace if the VPP plugin renames them.
|
||||||
|
const (
|
||||||
|
lbStatNextPacket = "/packet from existing sessions"
|
||||||
|
lbStatFirstPacket = "/first session packet"
|
||||||
|
lbStatUntrackedPkt = "/untracked packet"
|
||||||
|
lbStatNoServer = "/no server configured"
|
||||||
|
)
|
||||||
|
|
||||||
|
// lbStatPatterns is the full list of anchored regexes passed to DumpStats
|
||||||
|
// for one scrape cycle: the four LB-plugin SimpleCounters plus the FIB
|
||||||
|
// CombinedCounter for per-route packet+byte totals. Doing it in a single
|
||||||
|
// DumpStats avoids walking the stats segment twice.
|
||||||
|
var lbStatPatterns = []string{
|
||||||
|
`^/packet from existing sessions$`,
|
||||||
|
`^/first session packet$`,
|
||||||
|
`^/untracked packet$`,
|
||||||
|
`^/no server configured$`,
|
||||||
|
`^/net/route/to$`,
|
||||||
|
}
|
||||||
|
|
||||||
|
// lbStatsLoop periodically scrapes the LB plugin's per-VIP counters and
|
||||||
|
// the FIB's /net/route/to combined counter for both VIPs and backends,
|
||||||
|
// publishes the results to the atomic snapshots read by Prometheus, and
|
||||||
|
// emits one slog.Debug line per VIP and per backend. Exits when ctx is
|
||||||
|
// cancelled.
|
||||||
|
func (c *Client) lbStatsLoop(ctx context.Context) {
|
||||||
|
ticker := time.NewTicker(lbStatsInterval)
|
||||||
|
defer ticker.Stop()
|
||||||
|
for {
|
||||||
|
select {
|
||||||
|
case <-ctx.Done():
|
||||||
|
return
|
||||||
|
case <-ticker.C:
|
||||||
|
}
|
||||||
|
if err := c.scrapeLBStats(); err != nil {
|
||||||
|
slog.Debug("vpp-lb-stats-error", "err", err)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// scrapeLBStats runs one full scrape cycle: discover VIPs via cli_inband,
|
||||||
|
// look up FIB stats_indices for every VIP and every backend via
|
||||||
|
// ip_route_lookup, dump all five stats-segment paths in one DumpStats
|
||||||
|
// call, and reduce the counters into the two published snapshots.
|
||||||
|
func (c *Client) scrapeLBStats() error {
|
||||||
|
if !c.IsConnected() {
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
ch, err := c.apiChannel()
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
defer ch.Close()
|
||||||
|
|
||||||
|
snap, err := queryLBVIPSnapshot(ch)
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("query vip snapshot: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Resolve FIB stats_indices for every VIP. The LB plugin installs a
|
||||||
|
// host-prefix FIB entry per VIP (lb.c:990), so the exact=0 lookup
|
||||||
|
// lands on it in the common case. See fibStatsIndex for the exact=0
|
||||||
|
// caveat when a covering route shadows the host prefix.
|
||||||
|
vipStatsIdx := make(map[vipKey]uint32, len(snap))
|
||||||
|
for k := range snap {
|
||||||
|
addr, _, err := vipKeyToIP(k)
|
||||||
|
if err != nil {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
idx, err := fibStatsIndex(ch, addr)
|
||||||
|
if err != nil {
|
||||||
|
slog.Debug("vpp-vip-route-lookup-failed",
|
||||||
|
"prefix", k.prefix, "err", err)
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
vipStatsIdx[k] = idx
|
||||||
|
}
|
||||||
|
|
||||||
|
// Resolve FIB stats_indices for every backend in the running config.
|
||||||
|
type backendLookup struct {
|
||||||
|
name, addr string
|
||||||
|
index uint32
|
||||||
|
}
|
||||||
|
var backends []backendLookup
|
||||||
|
if src := c.getStateSource(); src != nil {
|
||||||
|
if cfg := src.Config(); cfg != nil {
|
||||||
|
names := make([]string, 0, len(cfg.Backends))
|
||||||
|
for name := range cfg.Backends {
|
||||||
|
names = append(names, name)
|
||||||
|
}
|
||||||
|
sort.Strings(names) // stable snapshot order
|
||||||
|
for _, name := range names {
|
||||||
|
b := cfg.Backends[name]
|
||||||
|
if b.Address == nil {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
idx, err := fibStatsIndex(ch, b.Address)
|
||||||
|
if err != nil {
|
||||||
|
slog.Debug("vpp-backend-route-lookup-failed",
|
||||||
|
"backend", name, "address", b.Address.String(), "err", err)
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
backends = append(backends, backendLookup{
|
||||||
|
name: name, addr: b.Address.String(), index: idx,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
c.mu.Lock()
|
||||||
|
sc := c.statsClient
|
||||||
|
c.mu.Unlock()
|
||||||
|
if sc == nil {
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
entries, err := sc.DumpStats(lbStatPatterns...)
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("dump stats: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
nextPkt := findSimpleCounter(entries, lbStatNextPacket)
|
||||||
|
firstPkt := findSimpleCounter(entries, lbStatFirstPacket)
|
||||||
|
untracked := findSimpleCounter(entries, lbStatUntrackedPkt)
|
||||||
|
noServer := findSimpleCounter(entries, lbStatNoServer)
|
||||||
|
routeTo := findCombinedCounter(entries, routeToStatPath)
|
||||||
|
|
||||||
|
// ---- VIP snapshot ----
|
||||||
|
vipOut := make([]metrics.VIPStatEntry, 0, len(snap))
|
||||||
|
for key, info := range snap {
|
||||||
|
lbIdx := int(info.index)
|
||||||
|
entry := metrics.VIPStatEntry{
|
||||||
|
Prefix: key.prefix,
|
||||||
|
Protocol: protocolName(key.protocol),
|
||||||
|
Port: key.port,
|
||||||
|
NextPkt: reduceSimpleCounter(nextPkt, lbIdx),
|
||||||
|
FirstPkt: reduceSimpleCounter(firstPkt, lbIdx),
|
||||||
|
Untracked: reduceSimpleCounter(untracked, lbIdx),
|
||||||
|
NoServer: reduceSimpleCounter(noServer, lbIdx),
|
||||||
|
}
|
||||||
|
if fibIdx, ok := vipStatsIdx[key]; ok {
|
||||||
|
entry.Packets, entry.Bytes = reduceCombinedCounter(routeTo, int(fibIdx))
|
||||||
|
}
|
||||||
|
vipOut = append(vipOut, entry)
|
||||||
|
slog.Debug("vpp-vip-stats",
|
||||||
|
"prefix", entry.Prefix,
|
||||||
|
"protocol", entry.Protocol,
|
||||||
|
"port", entry.Port,
|
||||||
|
"next-packet", entry.NextPkt,
|
||||||
|
"first-packet", entry.FirstPkt,
|
||||||
|
"untracked", entry.Untracked,
|
||||||
|
"no-server", entry.NoServer,
|
||||||
|
"packets", entry.Packets,
|
||||||
|
"bytes", entry.Bytes,
|
||||||
|
)
|
||||||
|
}
|
||||||
|
c.lbStatsSnap.Store(&vipOut)
|
||||||
|
|
||||||
|
// ---- backend snapshot ----
|
||||||
|
backendOut := make([]metrics.BackendRouteStat, 0, len(backends))
|
||||||
|
for _, l := range backends {
|
||||||
|
pkts, byts := reduceCombinedCounter(routeTo, int(l.index))
|
||||||
|
entry := metrics.BackendRouteStat{
|
||||||
|
Backend: l.name,
|
||||||
|
Address: l.addr,
|
||||||
|
Packets: pkts,
|
||||||
|
Bytes: byts,
|
||||||
|
}
|
||||||
|
backendOut = append(backendOut, entry)
|
||||||
|
slog.Debug("vpp-backend-route-stats",
|
||||||
|
"backend", entry.Backend,
|
||||||
|
"address", entry.Address,
|
||||||
|
"packets", entry.Packets,
|
||||||
|
"bytes", entry.Bytes,
|
||||||
|
)
|
||||||
|
}
|
||||||
|
c.backendRouteSnap.Store(&backendOut)
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// findSimpleCounter returns the SimpleCounterStat matching name, or nil if
|
||||||
|
// not found. Stats segment names are byte slices, so we compare as string.
|
||||||
|
func findSimpleCounter(entries []adapter.StatEntry, name string) adapter.SimpleCounterStat {
|
||||||
|
for _, e := range entries {
|
||||||
|
if string(e.Name) != name {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
if s, ok := e.Data.(adapter.SimpleCounterStat); ok {
|
||||||
|
return s
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// reduceSimpleCounter sums per-worker values at column i. It tolerates a
|
||||||
|
// short per-worker vector (which can happen right after a VIP is added,
|
||||||
|
// before a worker has observed it) by skipping out-of-range rows.
|
||||||
|
func reduceSimpleCounter(s adapter.SimpleCounterStat, i int) uint64 {
|
||||||
|
var sum uint64
|
||||||
|
for _, thread := range s {
|
||||||
|
if i >= 0 && i < len(thread) {
|
||||||
|
sum += uint64(thread[i])
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return sum
|
||||||
|
}
|
||||||
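The per-worker reduction in the new stats file follows a simple shape: a stats-segment simple counter is a matrix of `rows[worker][counterIndex]`, and one VIP's total is the sum of its column across workers. A self-contained sketch of that reduction, using plain `[][]uint64` instead of the govpp adapter types (so `sumColumn` is a stand-in name, not the project's):

```go
package main

import "fmt"

// sumColumn adds up column i across all worker rows. Rows can be shorter
// than the requested index right after a VIP is added (a worker has not
// observed it yet), so out-of-range rows are skipped instead of panicking.
func sumColumn(rows [][]uint64, i int) uint64 {
	var sum uint64
	for _, row := range rows {
		if i >= 0 && i < len(row) {
			sum += row[i]
		}
	}
	return sum
}

func main() {
	workers := [][]uint64{
		{10, 5}, // worker 0 has counters for both VIPs
		{20, 7}, // worker 1
		{30},    // worker 2 has not seen VIP index 1 yet — skipped
	}
	fmt.Println(sumColumn(workers, 1)) // 5 + 7 = 12
}
```

The same skip-short-rows tolerance is what lets a scrape race harmlessly with VIP creation instead of needing a lock around the stats read.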
@@ -7,6 +7,11 @@ import (
     "fmt"
     "log/slog"
     "net"
+    "regexp"
+    "strconv"
+    "strings"
+
+    "go.fd.io/govpp/binapi/vlib"

     "git.ipng.ch/ipng/vpp-maglev/internal/config"
     "git.ipng.ch/ipng/vpp-maglev/internal/health"
@@ -32,6 +37,7 @@ type desiredVIP struct {
     Prefix      *net.IPNet
     Protocol    uint8 // 6=TCP, 17=UDP, 255=any
     Port        uint16
+    SrcIPSticky bool // lb_add_del_vip_v2.src_ip_sticky
     ASes map[string]desiredAS // keyed by AS IP string
 }

@@ -133,10 +139,12 @@ func (c *Client) SyncLBStateAll(cfg *config.Config) error {
     for k, d := range desByKey {
         cur, existing := curByKey[k]
         var curPtr *LBVIP
+        var curSticky bool
         if existing {
             curPtr = &cur
+            curSticky = cur.SrcIPSticky
         }
-        if err := reconcileVIP(ch, d, curPtr, &st); err != nil {
+        if err := reconcileVIP(ch, d, curPtr, curSticky, &st); err != nil {
             return err
         }
     }
@@ -189,8 +197,13 @@ func (c *Client) SyncLBStateVIP(cfg *config.Config, feName string) error {
         "protocol", protocolName(d.Protocol),
         "port", d.Port)
+
+    var curSticky bool
+    if cur != nil {
+        curSticky = cur.SrcIPSticky
+    }
+
     var st syncStats
-    if err := reconcileVIP(ch, d, cur, &st); err != nil {
+    if err := reconcileVIP(ch, d, cur, curSticky, &st); err != nil {
         return err
     }
     recordSyncStats("vip", &st)
@@ -207,7 +220,14 @@ func (c *Client) SyncLBStateVIP(cfg *config.Config, feName string) error {
 // reconcileVIP brings one VIP's state in VPP into alignment with the desired
 // state. If cur is nil the VIP is added from scratch; otherwise ASes are
 // added, removed, and reweighted individually. Stats are accumulated into st.
-func reconcileVIP(ch *loggedChannel, d desiredVIP, cur *LBVIP, st *syncStats) error {
+//
+// curSticky is the src_ip_sticky flag VPP currently has programmed for this
+// VIP, as scraped by queryLBSticky. Callers only consult it when cur != nil,
+// and the scrape reads the same live lb_main.vips pool as lb_vip_dump, so a
+// matching entry is always present. When the flag differs from the desired
+// value, the VIP is torn down (ASes del+flushed, VIP deleted) and recreated
+// — VPP has no API to mutate src_ip_sticky on an existing VIP.
+func reconcileVIP(ch *loggedChannel, d desiredVIP, cur *LBVIP, curSticky bool, st *syncStats) error {
     if cur == nil {
         if err := addVIP(ch, d); err != nil {
             return err
@@ -222,6 +242,30 @@ func reconcileVIP(ch *loggedChannel, d desiredVIP, cur *LBVIP, st *syncStats) er
         return nil
     }
+
+    if curSticky != d.SrcIPSticky {
+        slog.Info("vpp-lbsync-vip-recreate",
+            "prefix", d.Prefix.String(),
+            "protocol", protocolName(d.Protocol),
+            "port", d.Port,
+            "reason", "src-ip-sticky-changed",
+            "from", curSticky,
+            "to", d.SrcIPSticky)
+        if err := removeVIP(ch, *cur, st); err != nil {
+            return err
+        }
+        if err := addVIP(ch, d); err != nil {
+            return err
+        }
+        st.vipAdd++
+        for _, as := range d.ASes {
+            if err := addAS(ch, d.Prefix, d.Protocol, d.Port, as); err != nil {
+                return err
+            }
+            st.asAdd++
+        }
+        return nil
+    }
+
     // VIP exists in both — reconcile ASes.
     curASes := make(map[string]LBAS, len(cur.ASes))
     for _, a := range cur.ASes {
@@ -309,22 +353,12 @@ func desiredFromFrontend(cfg *config.Config, fe config.Frontend, src StateSource
         Prefix:   &net.IPNet{IP: fe.Address, Mask: net.CIDRMask(bits, bits)},
         Protocol: protocolFromConfig(fe.Protocol),
         Port:     fe.Port,
+        SrcIPSticky: fe.SrcIPSticky,
         ASes: make(map[string]desiredAS),
     }

-    // Snapshot backend states once so the active-pool computation and the
-    // per-backend weight assignment see a consistent view.
-    states := make(map[string]health.State)
-    for _, pool := range fe.Pools {
-        for bName := range pool.Backends {
-            if s, ok := src.BackendState(bName); ok {
-                states[bName] = s
-            } else {
-                states[bName] = health.StateUnknown
-            }
-        }
-    }
-    activePool := activePoolIndex(fe, states)
+    states := snapshotStates(fe, src)
+    activePool := health.ActivePoolIndex(fe, states)

     for poolIdx, pool := range fe.Pools {
         for bName, pb := range pool.Backends {
@@ -337,12 +371,13 @@ func desiredFromFrontend(cfg *config.Config, fe config.Frontend, src StateSource
             // weight=0 — they must not be deleted, otherwise a subsequent
             // enable has to re-add them and existing flow-table state (if
             // any) is lost. The state machine drives what weight to set
-            // via asFromBackend; we never filter on b.Enabled here.
+            // via health.BackendEffectiveWeight; we never filter on
+            // b.Enabled here.
             addr := b.Address.String()
             if _, already := d.ASes[addr]; already {
                 continue
             }
-            w, flush := asFromBackend(poolIdx, activePool, states[bName], pb.Weight)
+            w, flush := health.BackendEffectiveWeight(poolIdx, activePool, states[bName], pb.Weight)
             d.ASes[addr] = desiredAS{
                 Address: b.Address,
                 Weight:  w,
@@ -354,14 +389,17 @@
     }

 // EffectiveWeights returns the current effective VPP weight for every backend
-// in every pool of fe, keyed by poolIdx and backend name. It runs the same
-// failover + state-aware weight calculation that the sync path uses, but
-// produces a plain map instead of desiredVIP — intended for observability
-// (e.g. the GetFrontend gRPC handler) and for robot-testing the failover
-// logic without needing a running VPP instance.
-//
-// The returned map layout is: result[poolIdx][backendName] = effective weight.
+// in every pool of fe, keyed by poolIdx and backend name. Intended for
+// observability (e.g. the GetFrontend gRPC handler) — the sync path and the
+// checker's frontend-state logic use health.EffectiveWeights directly.
 func EffectiveWeights(fe config.Frontend, src StateSource) map[int]map[string]uint8 {
+    return health.EffectiveWeights(fe, snapshotStates(fe, src))
+}
+
+// snapshotStates builds a state map for every backend referenced by fe. It
+// takes one read-lock per backend via the StateSource interface, so the
+// caller gets a consistent view to feed into the pure health helpers.
+func snapshotStates(fe config.Frontend, src StateSource) map[string]health.State {
     states := make(map[string]health.State)
     for _, pool := range fe.Pools {
         for bName := range pool.Backends {
@@ -372,75 +410,7 @@ func EffectiveWeights(fe config.Frontend, src StateSource) map[int]map[string]ui
             }
         }
     }
-    activePool := activePoolIndex(fe, states)
-
-    out := make(map[int]map[string]uint8, len(fe.Pools))
-    for poolIdx, pool := range fe.Pools {
-        out[poolIdx] = make(map[string]uint8, len(pool.Backends))
-        for bName, pb := range pool.Backends {
-            w, _ := asFromBackend(poolIdx, activePool, states[bName], pb.Weight)
-            out[poolIdx][bName] = w
-        }
-    }
-    return out
-}
-
-// activePoolIndex returns the index of the first pool in fe that contains at
-// least one backend currently in StateUp. This is the priority-failover
-// selector: pool[0] is the primary, pool[1] is the first fallback, and so on.
-// As long as pool[0] has any up backend, it stays active. When every pool[0]
-// backend leaves StateUp (down, paused, disabled, unknown), pool[1] is
-// promoted — and so on for further fallback tiers. When no pool has any up
-// backend, returns 0 (the return value is unobservable in that case since
-// every backend maps to weight 0 regardless of the active pool).
-func activePoolIndex(fe config.Frontend, states map[string]health.State) int {
-    for i, pool := range fe.Pools {
-        for bName := range pool.Backends {
-            if states[bName] == health.StateUp {
-                return i
-            }
-        }
-    }
-    return 0
-}
-
-// asFromBackend is the pure mapping from (pool index, active pool, backend
-// state, config weight) to the desired VPP AS weight and flush hint. This is
-// the single source of truth for the state → dataplane rule — every LB change
-// flows through this function.
-//
-// A backend gets its configured weight iff it is up AND belongs to the
-// currently-active pool. Every other case yields weight 0. The only
-// state that produces flush=true is disabled.
-//
-// state        in active pool    not in active pool    flush
-// --------     --------------    -------------------   -----
-// unknown      0                 0                     no
-// up           configured        0 (standby)           no
-// down         0                 0                     no
-// paused       0                 0                     no
-// disabled     0                 0                     yes
-// removed      handled separately (AS deleted via delAS)
-//
-// Flush semantics: flush=true means "if the AS currently has a non-zero
-// weight in VPP, drop its existing flow-table entries when setting weight
-// to 0". The reconciler only acts on flush when transitioning (current
-// weight > 0), so steady-state syncs never re-flush. Failover demotion
-// (e.g. pool[1] up→standby when pool[0] recovers) does NOT flush — we
-// let those sessions drain naturally.
-func asFromBackend(poolIdx, activePool int, state health.State, cfgWeight int) (weight uint8, flush bool) {
-    switch state {
-    case health.StateUp:
-        if poolIdx == activePool {
-            return clampWeight(cfgWeight), false
-        }
-        return 0, false
-    case health.StateDisabled:
-        return 0, true
-    default:
-        // unknown, down, paused: off, drain existing flows naturally.
-        return 0, false
-    }
-}
+    return states
 }

 // ---- API call helpers ------------------------------------------------------
@@ -461,6 +431,7 @@ func addVIP(ch *loggedChannel, d desiredVIP) error {
         Encap: encap,
         Type: lb_types.LB_API_SRV_TYPE_CLUSTERIP,
         NewFlowsTableLength: defaultFlowsTableLength,
+        SrcIPSticky: d.SrcIPSticky,
         IsDel: false,
     }
     reply := &lb.LbAddDelVipV2Reply{}
@@ -474,7 +445,8 @@ func addVIP(ch *loggedChannel, d desiredVIP) error {
         "prefix", d.Prefix.String(),
         "protocol", protocolName(d.Protocol),
         "port", d.Port,
-        "encap", encapName(encap))
+        "encap", encapName(encap),
+        "src-ip-sticky", d.SrcIPSticky)
     return nil
 }

@@ -574,6 +546,136 @@ func setASWeight(ch *loggedChannel, prefix *net.IPNet, protocol uint8, port uint
     return nil
 }
+
+// ---- VIP snapshot scrape ---------------------------------------------------
+
+// TEMPORARY WORKAROUND: VPP's lb_vip_dump / lb_vip_details message does not
+// return the src_ip_sticky flag, and lb_vip_details also doesn't expose the
+// LB plugin's pool index — which is what the stats segment uses to key
+// per-VIP counters. We work around both by running `show lb vips verbose`
+// via the cli_inband API and parsing the human-readable output. This is
+// slow and fragile (the format is not a stable API) and must be replaced
+// once VPP ships an lb_vip_v2_dump that includes these fields.
+
+// lbVIPSnapshot holds the per-VIP facts that the scrape recovers. `index`
+// is the LB pool index (`vip - lbm->vips` in lb.c) used as the second
+// dimension of the vip_counters SimpleCounterStat in the stats segment.
+type lbVIPSnapshot struct {
+    index  uint32
+    sticky bool
+}
+
+// queryLBVIPSnapshot runs `show lb vips verbose` and parses it into a map
+// keyed by vipKey. The returned error is non-nil only when the cli_inband
+// RPC itself fails.
+func queryLBVIPSnapshot(ch *loggedChannel) (map[vipKey]lbVIPSnapshot, error) {
+    req := &vlib.CliInband{Cmd: "show lb vips verbose"}
+    reply := &vlib.CliInbandReply{}
+    if err := ch.SendRequest(req).ReceiveReply(reply); err != nil {
+        return nil, fmt.Errorf("cli_inband %q: %w", req.Cmd, err)
+    }
+    if reply.Retval != 0 {
+        return nil, fmt.Errorf("cli_inband %q: retval=%d", req.Cmd, reply.Retval)
+    }
+    return parseLBVIPSnapshot(reply.Reply), nil
+}
+
+// queryLBSticky is a thin projection of queryLBVIPSnapshot for callers that
+// only care about the src_ip_sticky flag (the sync path).
+func queryLBSticky(ch *loggedChannel) (map[vipKey]bool, error) {
+    snap, err := queryLBVIPSnapshot(ch)
+    if err != nil {
+        return nil, err
+    }
+    out := make(map[vipKey]bool, len(snap))
+    for k, v := range snap {
+        out[k] = v.sticky
+    }
+    return out, nil
+}
+
+// lbVIPHeaderRe matches the first line of each VIP block in the output of
+// `show lb vips verbose`. VPP produces lines like:
+//
+//     ip4-gre4 [1] 192.0.2.1/32 src_ip_sticky
+//     ip6-gre6 [2] 2001:db8::1/128
+//
+// Capture groups: 1 = pool index, 2 = prefix (CIDR), 3 = " src_ip_sticky" when present.
+var lbVIPHeaderRe = regexp.MustCompile(
+    `\b(?:ip4-gre4|ip6-gre6|ip4-gre6|ip6-gre4|ip4-l3dsr|ip4-nat4|ip6-nat6)\s+\[(\d+)\]\s+(\S+)(\s+src_ip_sticky)?`,
+)
+
+// lbVIPProtoRe matches the `protocol:<n> port:<n>` sub-line that appears
+// under each non-all-port VIP block.
+var lbVIPProtoRe = regexp.MustCompile(`protocol:(\d+)\s+port:(\d+)`)
+
+// parseLBVIPSnapshot turns the reply text of `show lb vips verbose` into a
+// map of vipKey → lbVIPSnapshot. It walks lines sequentially: each header
+// line starts a new VIP, and the following `protocol:<p> port:<port>` line
+// (if any) refines the key. All-port VIPs have no protocol/port sub-line
+// and are recorded with protocol=255 and port=0 — the same key format used
+// by the rest of the sync path (see protocolFromConfig).
+func parseLBVIPSnapshot(text string) map[vipKey]lbVIPSnapshot {
+    out := make(map[vipKey]lbVIPSnapshot)
+    var havePending bool
+    var curPrefix string
+    var curIndex uint32
+    var curSticky bool
+    var curProto uint8 = 255
+    var curPort uint16
+
+    flush := func() {
+        if !havePending || curPrefix == "" {
+            return
+        }
+        out[vipKey{prefix: curPrefix, protocol: curProto, port: curPort}] = lbVIPSnapshot{
+            index:  curIndex,
+            sticky: curSticky,
+        }
+    }
+
+    for _, line := range strings.Split(text, "\n") {
+        if m := lbVIPHeaderRe.FindStringSubmatch(line); m != nil {
+            flush()
+            havePending = true
+            if idx, err := strconv.ParseUint(m[1], 10, 32); err == nil {
+                curIndex = uint32(idx)
+            } else {
+                curIndex = 0
+            }
+            curPrefix = canonicalCIDR(m[2])
+            curSticky = m[3] != ""
+            curProto = 255
+            curPort = 0
+            continue
+        }
+        if !havePending {
+            continue
+        }
+        if m := lbVIPProtoRe.FindStringSubmatch(line); m != nil {
+            if p, err := strconv.Atoi(m[1]); err == nil {
+                curProto = uint8(p)
+            }
+            if p, err := strconv.Atoi(m[2]); err == nil {
+                curPort = uint16(p)
+            }
+        }
+    }
+    flush()
+    return out
+}
+
+// canonicalCIDR parses a CIDR string and returns it in Go's canonical
+// net.IPNet.String() form so it matches the keys produced by makeVIPKey.
+// Returns the input unchanged if parsing fails — the header regex should
+// have validated it, but a mismatch shouldn't panic the sync path.
+func canonicalCIDR(s string) string {
+    _, ipnet, err := net.ParseCIDR(s)
+    if err != nil {
+        return s
+    }
+    return ipnet.String()
+}
+
 // ---- utility ---------------------------------------------------------------

 func makeVIPKey(prefix *net.IPNet, protocol uint8, port uint16) vipKey {
@@ -618,13 +720,3 @@ func encapName(e lb_types.LbEncapType) string {
     }
     return fmt.Sprintf("%d", e)
 }
-
-func clampWeight(w int) uint8 {
-    if w < 0 {
-        return 0
-    }
-    if w > 100 {
-        return 100
-    }
-    return uint8(w)
-}
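The CLI scrape hinges on one regex pulling three facts out of each `show lb vips verbose` header line: the LB pool index in brackets, the prefix, and an optional trailing `src_ip_sticky` token. A self-contained sketch of that extraction, trimmed to two encap tokens and using the sample line format shown in the diff (the real VPP output may drift, which is why the scrape is flagged as a temporary workaround):

```go
package main

import (
	"fmt"
	"regexp"
)

// headerRe mirrors the shape of the scrape's header regex: an encap token,
// the pool index in brackets, the prefix, and an optional sticky suffix.
var headerRe = regexp.MustCompile(
	`\b(?:ip4-gre4|ip6-gre6)\s+\[(\d+)\]\s+(\S+)(\s+src_ip_sticky)?`,
)

func main() {
	line := " ip4-gre4 [3] 192.0.2.1/32 src_ip_sticky"
	m := headerRe.FindStringSubmatch(line)
	// m[1] = pool index, m[2] = prefix; m[3] is "" when the optional
	// group did not participate in the match.
	fmt.Println(m[1], m[2], m[3] != "") // 3 192.0.2.1/32 true
}
```

Note that Go reports a non-participating optional group as the empty string, which is exactly the `m[3] != ""` test the parser uses to derive the sticky flag.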
@@ -10,141 +10,45 @@ import (
     "git.ipng.ch/ipng/vpp-maglev/internal/health"
 )

-// TestAsFromBackend locks down the state → (weight, flush) truth table.
-// This is the single source of truth for how maglevd decides what to
-// program into VPP for each backend state. If this test needs updating
-// the behavior has deliberately changed.
-func TestAsFromBackend(t *testing.T) {
-    cases := []struct {
-        name       string
-        poolIdx    int
-        activePool int
-        state      health.State
-        cfgWeight  int
-        wantWeight uint8
-        wantFlush  bool
-    }{
-        // up in active pool → configured weight, no flush
-        {"up active w100", 0, 0, health.StateUp, 100, 100, false},
-        {"up active w50", 0, 0, health.StateUp, 50, 50, false},
-        {"up active w0", 0, 0, health.StateUp, 0, 0, false},
-        {"up active clamp-high", 0, 0, health.StateUp, 150, 100, false},
-        {"up active clamp-low", 0, 0, health.StateUp, -5, 0, false},
-
-        // up in non-active pool → standby (weight 0), no flush
-        {"up standby pool0 active=1", 0, 1, health.StateUp, 100, 0, false},
-        {"up standby pool1 active=0", 1, 0, health.StateUp, 100, 0, false},
-        {"up standby pool2 active=0", 2, 0, health.StateUp, 100, 0, false},
-
-        // up in secondary, promoted because pool[1] is now active
-        {"up failover pool1 active=1", 1, 1, health.StateUp, 100, 100, false},
-
-        // unknown → off, drain
-        {"unknown pool0 active=0", 0, 0, health.StateUnknown, 100, 0, false},
-        {"unknown pool1 active=0", 1, 0, health.StateUnknown, 100, 0, false},
-
-        // down → off, drain (probe might be wrong)
-        {"down pool0 active=0", 0, 0, health.StateDown, 100, 0, false},
-        {"down pool1 active=1", 1, 1, health.StateDown, 100, 0, false},
-
-        // paused → off, drain (graceful maintenance)
-        {"paused pool0 active=0", 0, 0, health.StatePaused, 100, 0, false},
-
-        // disabled → off, flush (hard stop)
-        {"disabled pool0 active=0", 0, 0, health.StateDisabled, 100, 0, true},
-        {"disabled pool1 active=1", 1, 1, health.StateDisabled, 100, 0, true},
-    }
-
-    for _, tc := range cases {
-        t.Run(tc.name, func(t *testing.T) {
-            w, f := asFromBackend(tc.poolIdx, tc.activePool, tc.state, tc.cfgWeight)
-            if w != tc.wantWeight {
-                t.Errorf("weight: got %d, want %d", w, tc.wantWeight)
-            }
-            if f != tc.wantFlush {
-                t.Errorf("flush: got %v, want %v", f, tc.wantFlush)
-            }
-        })
-    }
-}
+// TestParseLBVIPSnapshot pins the parser for `show lb vips verbose` output.
+// The text below is a synthetic sample that mirrors format_lb_vip_detailed
+// in src/plugins/lb/lb.c: a header line per VIP optionally carrying the
+// src_ip_sticky token, followed by a protocol:/port: sub-line for non all-
+// port VIPs. If VPP changes this format the test will fail loudly — the
+// scrape is a temporary workaround until lb_vip_v2_dump exists.
+func TestParseLBVIPSnapshot(t *testing.T) {
+    text := ` ip4-gre4 [1] 192.0.2.1/32 src_ip_sticky
+  new_size:1024
+  protocol:6 port:80
+  counters:
+ ip4-gre4 [2] 192.0.2.2/32
+  new_size:1024
+  protocol:17 port:53
+ ip6-gre6 [3] 2001:db8::1/128 src_ip_sticky
+  new_size:1024
+  protocol:6 port:443
+ ip4-gre4 [4] 192.0.2.3/32
+  new_size:1024
+`
+    got := parseLBVIPSnapshot(text)
+    want := map[vipKey]lbVIPSnapshot{
+        {prefix: "192.0.2.1/32", protocol: 6, port: 80}:     {index: 1, sticky: true},
+        {prefix: "192.0.2.2/32", protocol: 17, port: 53}:    {index: 2, sticky: false},
+        {prefix: "2001:db8::1/128", protocol: 6, port: 443}: {index: 3, sticky: true},
+        {prefix: "192.0.2.3/32", protocol: 255, port: 0}:    {index: 4, sticky: false}, // all-port VIP
+    }
+    if len(got) != len(want) {
+        t.Errorf("got %d entries, want %d: %#v", len(got), len(want), got)
+    }
+    for k, v := range want {
+        g, ok := got[k]
+        if !ok {
+            t.Errorf("missing key %+v", k)
+            continue
+        }
+        if g != v {
+            t.Errorf("key %+v: got %+v, want %+v", k, g, v)
+        }
+    }
+}
-
-// TestActivePoolIndex locks down the priority-failover selector: the first
-// pool containing at least one up backend is the active pool. Default 0.
-func TestActivePoolIndex(t *testing.T) {
-    mkFE := func(pools ...[]string) config.Frontend {
-        out := make([]config.Pool, len(pools))
-        for i, p := range pools {
-            out[i] = config.Pool{Name: "p", Backends: map[string]config.PoolBackend{}}
-            for _, name := range p {
-                out[i].Backends[name] = config.PoolBackend{Weight: 100}
-            }
-        }
-        return config.Frontend{Pools: out}
-    }
-
-    cases := []struct {
-        name   string
-        fe     config.Frontend
-        states map[string]health.State
-        want   int
-    }{
-        {
-            name: "pool0 has up, pool1 standby",
-            fe:   mkFE([]string{"a", "b"}, []string{"c", "d"}),
|
|
||||||
states: map[string]health.State{"a": health.StateUp, "b": health.StateDown, "c": health.StateUp, "d": health.StateUp},
|
|
||||||
want: 0,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
name: "pool0 all down, pool1 has up → failover",
|
|
||||||
fe: mkFE([]string{"a", "b"}, []string{"c", "d"}),
|
|
||||||
states: map[string]health.State{"a": health.StateDown, "b": health.StateDown, "c": health.StateUp, "d": health.StateUp},
|
|
||||||
want: 1,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
name: "pool0 all disabled, pool1 has up → failover",
|
|
||||||
fe: mkFE([]string{"a", "b"}, []string{"c"}),
|
|
||||||
states: map[string]health.State{"a": health.StateDisabled, "b": health.StateDisabled, "c": health.StateUp},
|
|
||||||
want: 1,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
name: "pool0 all paused, pool1 has up → failover",
|
|
||||||
fe: mkFE([]string{"a"}, []string{"c"}),
|
|
||||||
states: map[string]health.State{"a": health.StatePaused, "c": health.StateUp},
|
|
||||||
want: 1,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
name: "pool0 all unknown (startup), pool1 up → pool1",
|
|
||||||
fe: mkFE([]string{"a"}, []string{"c"}),
|
|
||||||
states: map[string]health.State{"a": health.StateUnknown, "c": health.StateUp},
|
|
||||||
want: 1,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
name: "nothing up anywhere → default 0",
|
|
||||||
fe: mkFE([]string{"a"}, []string{"c"}),
|
|
||||||
states: map[string]health.State{"a": health.StateDown, "c": health.StateDown},
|
|
||||||
want: 0,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
name: "1 up in pool0 is enough",
|
|
||||||
fe: mkFE([]string{"a", "b", "c"}, []string{"d"}),
|
|
||||||
states: map[string]health.State{"a": health.StateDown, "b": health.StateDown, "c": health.StateUp, "d": health.StateUp},
|
|
||||||
want: 0,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
name: "three tiers, pool0 and pool1 both empty → pool2",
|
|
||||||
fe: mkFE([]string{"a"}, []string{"b"}, []string{"c"}),
|
|
||||||
states: map[string]health.State{"a": health.StateDown, "b": health.StateDown, "c": health.StateUp},
|
|
||||||
want: 2,
|
|
||||||
},
|
|
||||||
}
|
|
||||||
|
|
||||||
for _, tc := range cases {
|
|
||||||
t.Run(tc.name, func(t *testing.T) {
|
|
||||||
got := activePoolIndex(tc.fe, tc.states)
|
|
||||||
if got != tc.want {
|
|
||||||
t.Errorf("got pool %d, want pool %d", got, tc.want)
|
|
||||||
}
|
|
||||||
})
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
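The table-driven test above pins down asFromBackend's contract: an up backend in the active pool gets its configured weight clamped to [0, 100]; any other combination drains to weight 0, and only a disabled backend additionally flushes sessions. A minimal pure function satisfying those expectations might look like the sketch below. It is reconstructed from the test rows, not taken from the repository; the State constants are local stand-ins for the health package's values.

```go
package main

import "fmt"

// State is a local stand-in for the health package's backend state enum.
type State int

const (
	StateUnknown State = iota
	StateUp
	StateDown
	StatePaused
	StateDisabled
)

// asFromBackend sketches the weight/flush decision the tests exercise:
// up + active pool → clamped configured weight; everything else drains
// to 0, and only StateDisabled also flushes existing sessions.
func asFromBackend(poolIdx, activePool int, st State, cfgWeight int) (uint8, bool) {
	if st == StateDisabled {
		return 0, true // hard stop: drain and flush sessions
	}
	if st != StateUp || poolIdx != activePool {
		return 0, false // standby / down / unknown / paused: drain, keep sessions
	}
	if cfgWeight < 0 {
		cfgWeight = 0 // clamp-low
	}
	if cfgWeight > 100 {
		cfgWeight = 100 // clamp-high
	}
	return uint8(cfgWeight), false
}

func main() {
	w, f := asFromBackend(0, 0, StateUp, 150)
	fmt.Println(w, f) // clamped to 100, no flush
}
```

The asymmetry between down (drain only, "probe might be wrong") and disabled (flush) matches the test comments: a flapping probe should not destroy session affinity, while an operator-disabled backend must stop receiving traffic immediately.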
@@ -161,8 +65,10 @@ func (f *fakeStateSource) BackendState(name string) (health.State, bool) {
 }
 
 // TestDesiredFromFrontendFailover is the end-to-end integration test for
-// priority-failover: given a frontend with two pools, the desired weights
-// flip between pools based on which has any up backends.
+// priority-failover in the VPP sync path: given a frontend with two pools,
+// the desired weights flip between pools based on which has any up backends.
+// This exercises vpp.desiredFromFrontend which wraps the pure helpers in
+// the health package; those helpers are unit-tested separately in health.
 func TestDesiredFromFrontendFailover(t *testing.T) {
 	ip := func(s string) net.IP { return net.ParseIP(s).To4() }
 	cfg := &config.Config{
@@ -63,11 +63,17 @@ func (r *Reconciler) Run(ctx context.Context) {
 	}
 }
 
-// handle reconciles one event. Operates only on events that carry a
-// frontend name (the checker emits one event per frontend that references
-// the backend, so a backend shared across multiple frontends produces
-// multiple events and all relevant VIPs are reconciled).
+// handle reconciles one event. Operates only on backend-transition events
+// that carry a frontend name (the checker emits one event per frontend that
+// references the backend, so a backend shared across multiple frontends
+// produces multiple events and all relevant VIPs are reconciled).
+// Frontend-transition events are observational only — the dataplane work
+// they would imply has already been done by the backend-transition event
+// that triggered them.
 func (r *Reconciler) handle(ev checker.Event) {
+	if ev.FrontendTransition != nil {
+		return // frontend-only event; no dataplane work
+	}
 	if ev.FrontendName == "" {
 		return
 	}
@@ -23,6 +23,7 @@ service Maglev {
   rpc GetVPPInfo(GetVPPInfoRequest) returns (VPPInfo);
   rpc GetVPPLBState(GetVPPLBStateRequest) returns (VPPLBState);
   rpc SyncVPPLBState(SyncVPPLBStateRequest) returns (SyncVPPLBStateResponse);
+  rpc GetVPPLBCounters(GetVPPLBCountersRequest) returns (VPPLBCounters);
 }
 
 // ---- requests ---------------------------------------------------------------
@@ -108,6 +109,7 @@ message VPPLBVIP {
   string encap = 4; // gre4|gre6|l3dsr|nat4|nat6
   uint32 flow_table_length = 5;
   repeated VPPLBAS application_servers = 6;
+  bool src_ip_sticky = 7; // source-IP based sticky session (scraped via cli_inband)
 }
 
 message VPPLBState {
@@ -126,6 +128,44 @@ message SyncVPPLBStateRequest {
 
 message SyncVPPLBStateResponse {}
 
+// ---- VPP load-balancer runtime counters ------------------------------------
+
+// GetVPPLBCountersRequest asks maglevd for the most recent per-VIP and
+// per-backend counter snapshot. The data is served from an in-process
+// cache that is refreshed every ~5 seconds server-side; the call itself
+// is cheap and does not hit VPP.
+message GetVPPLBCountersRequest {}
+
+// VPPLBVIPCounters is the point-in-time counter row for a single VIP.
+// The four lb_* fields are the LB plugin's SimpleCounters (packets only);
+// packets / bytes come from the VPP FIB's combined counter at the VIP's
+// host prefix (/net/route/to).
+message VPPLBVIPCounters {
+  string prefix = 1;           // CIDR, e.g. 192.0.2.1/32
+  string protocol = 2;         // tcp | udp | any
+  uint32 port = 3;
+  uint64 next_packet = 4;      // "/packet from existing sessions"
+  uint64 first_packet = 5;     // "/first session packet"
+  uint64 untracked_packet = 6; // "/untracked packet"
+  uint64 no_server = 7;        // "/no server configured"
+  uint64 packets = 8;          // /net/route/to (FIB, summed across workers)
+  uint64 bytes = 9;            // /net/route/to (FIB, summed across workers)
+}
+
+// VPPLBBackendCounters is the FIB combined counter for a single backend's
+// host prefix, summed across worker threads.
+message VPPLBBackendCounters {
+  string backend = 1; // backend name from config
+  string address = 2; // backend IP address
+  uint64 packets = 3;
+  uint64 bytes = 4;
+}
+
+message VPPLBCounters {
+  repeated VPPLBVIPCounters vips = 1;
+  repeated VPPLBBackendCounters backends = 2;
+}
+
 message SetWeightRequest {
   string frontend = 1;
   string pool = 2;
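The proto comments above note that the FIB packets/bytes figures are "summed across worker threads": the VPP stats segment reports one combined-counter row per worker for each route, and the collector must reduce them before filling VPPLBVIPCounters. A minimal sketch of that reduction is below; the combinedCounter type is a local stand-in, not the actual govpp stats API type.

```go
package main

import "fmt"

// combinedCounter mirrors one per-worker row of a VPP combined counter
// (local stand-in; real rows come from the stats-segment client).
type combinedCounter struct {
	Packets uint64
	Bytes   uint64
}

// sumWorkers reduces the per-worker rows for one /net/route/to entry
// into the single packets/bytes pair exported per VIP or backend.
func sumWorkers(rows []combinedCounter) (packets, bytes uint64) {
	for _, r := range rows {
		packets += r.Packets
		bytes += r.Bytes
	}
	return packets, bytes
}

func main() {
	// main thread plus two workers, one row each
	p, b := sumWorkers([]combinedCounter{{10, 1000}, {4, 400}, {6, 600}})
	fmt.Println(p, b) // 20 2000
}
```

Doing the reduction in the collector (rather than exporting per-worker series) keeps the Prometheus cardinality independent of the VPP worker count.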
@@ -166,6 +206,7 @@ message FrontendInfo {
   uint32 port = 4;
   repeated PoolInfo pools = 5;
   string description = 6;
+  bool src_ip_sticky = 7; // VPP LB uses src-IP-based stickiness for this VIP
 }
 
 message ListBackendsResponse {
@@ -245,8 +286,12 @@ message BackendEvent {
   TransitionRecord transition = 2;
 }
 
-// FrontendEvent is reserved for future frontend-level events.
-message FrontendEvent {}
+// FrontendEvent is emitted when a frontend's aggregate state changes.
+// Frontends have three states: unknown, up, down. See docs/healthchecks.md.
+message FrontendEvent {
+  string frontend_name = 1;
+  TransitionRecord transition = 2;
+}
 
 // Event is the envelope returned by WatchEvents.
 message Event {
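The FrontendEvent comment says a frontend has three aggregate states: unknown, up, down. One plausible fold from backend states to that three-valued frontend state is sketched below. The exact rule is an assumption (the commit only states the three states and that up requires at least one healthy backend); the type names are local stand-ins for the health package.

```go
package main

import "fmt"

// BackendState is a local stand-in for a backend's probed state.
type BackendState int

const (
	BackendUnknown BackendState = iota
	BackendUp
	BackendDown
)

// FrontendState is the three-valued aggregate: unknown, up, down.
type FrontendState int

const (
	FrontendUnknown FrontendState = iota
	FrontendUp
	FrontendDown
)

func (s FrontendState) String() string {
	return [...]string{"unknown", "up", "down"}[s]
}

// aggregate is an assumed fold: up if any backend is up, unknown while
// no backend is up but some are still unprobed, down only once every
// backend has probed down.
func aggregate(backends []BackendState) FrontendState {
	sawUnknown := false
	for _, b := range backends {
		switch b {
		case BackendUp:
			return FrontendUp
		case BackendUnknown:
			sawUnknown = true
		}
	}
	if sawUnknown || len(backends) == 0 {
		return FrontendUnknown
	}
	return FrontendDown
}

func main() {
	fmt.Println(aggregate([]BackendState{BackendDown, BackendUp}))   // up
	fmt.Println(aggregate([]BackendState{BackendDown, BackendDown})) // down
}
```

Treating "no backends probed yet" as unknown rather than down avoids emitting a spurious down→up transition at startup, which matters since FrontendEvent now reaches WatchEvents subscribers.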