vpp-maglev/tests/01-maglevd/01-healthcheck.robot
Pim van Pelt 4ab3096c8b Add Prometheus metrics endpoint; containerize integration tests
Prometheus metrics (internal/metrics/, cmd/maglevd/)
- New --metrics-addr flag (default :9091, env MAGLEV_METRICS_ADDR)
  serving /metrics via promhttp.
- Gauge metrics scraped on demand via a custom prometheus.Collector:
  maglev_backend_state, maglev_backend_health, maglev_backend_enabled,
  maglev_frontend_pool_backend_weight.
- Inline counter/histogram metrics updated per probe:
  maglev_probe_total (by backend, type, result, code),
  maglev_probe_duration_seconds (by backend, type),
  maglev_backend_transitions_total (by backend, from, to).
- StateSource interface in metrics package breaks the import cycle
  with checker; checker.Checker satisfies it via GetBackendInfo.

Integration tests
- Run maglevd inside a containerlab node (debian:trixie-slim with
  build/ bind-mounted) instead of on the host. Eliminates port
  collisions with any host maglevd.
- maglevc commands run via docker exec into the maglevd container.
- Add 6 Prometheus test cases: endpoint reachable, all backends
  report state=up, probe counters non-zero, duration histogram
  populated, pool weights correct, transition counters present.
2026-04-11 20:50:59 +02:00


*** Settings ***
Library             OperatingSystem
Resource            ../common.robot

Suite Setup         Setup Suite
Suite Teardown      Cleanup Suite

*** Variables ***
${lab-name}         maglevd-test
${lab-file}         maglevd-lab/maglevd.clab.yml
${runtime}          docker
${MAGLEVD_NODE}     clab-maglevd-test-maglevd
${METRICS_URL}      http://172.20.30.2:9091/metrics

*** Test Cases ***
Deploy maglevd-test lab
    [Documentation]    Deploy the containerlab topology. The maglevd node starts
    ...    automatically as PID 1 via start.sh and begins probing the nginx
    ...    backends immediately.
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    ${CLAB_BIN} --runtime ${runtime} deploy -t ${CURDIR}/${lab-file}
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0
    Sleep    3s    Wait for nginx containers and probes to converge

All backends reach up state
    [Template]    Backend Should Be Up
    nginx1
    nginx2
    nginx3

Health checks are reaching all backends
    [Template]    Probe Count Should Be Positive
    nginx1
    nginx2
    nginx3

Pause backend stops probing
    Maglevc    set backend nginx1 pause
    Backend Should Have State    nginx1    paused
    Sleep    1s
    ${before} =    Get Probe Count    nginx1
    Sleep    2s    Wait to confirm no new probes arrive
    ${after} =    Get Probe Count    nginx1
    Should Be True    ${after} == ${before}
    ...    Probe count for nginx1 grew while paused: ${before} → ${after}

Resume backend restarts probing
    Maglevc    set backend nginx1 resume
    ${before} =    Get Probe Count    nginx1
    Sleep    2s    Wait for resumed probes to accumulate
    ${after} =    Get Probe Count    nginx1
    Should Be True    ${after} > ${before}
    ...    Probe count for nginx1 did not grow after resume: ${before} → ${after}
    Wait Until Keyword Succeeds    5s    500ms
    ...    Backend Should Be Up    nginx1

Disable backend stops probing
    Maglevc    set backend nginx2 disable
    Backend Should Have State    nginx2    removed
    Backend Should Be Disabled    nginx2
    Sleep    1s
    ${before} =    Get Probe Count    nginx2
    Sleep    2s    Wait to confirm probes stopped
    ${after} =    Get Probe Count    nginx2
    Should Be True    ${after} == ${before}
    ...    Probe count for nginx2 grew while disabled: ${before} → ${after}

Enable backend restarts probing
    Maglevc    set backend nginx2 enable
    ${before} =    Get Probe Count    nginx2
    Sleep    2s    Wait for re-enabled probes to accumulate
    ${after} =    Get Probe Count    nginx2
    Should Be True    ${after} > ${before}
    ...    Probe count for nginx2 did not grow after enable: ${before} → ${after}
    Wait Until Keyword Succeeds    5s    500ms
    ...    Backend Should Be Up    nginx2

Prometheus endpoint is reachable
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf ${METRICS_URL}
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0
    Should Contain    ${output}    maglev_backend_state

Prometheus reports all backends up
    ${output} =    Scrape Metrics
    # Each backend should have state="up" = 1.
    Should Contain    ${output}    maglev_backend_state{address="172.20.30.11",backend="nginx1",healthcheck="http-check",state="up"} 1
    Should Contain    ${output}    maglev_backend_state{address="172.20.30.12",backend="nginx2",healthcheck="http-check",state="up"} 1
    Should Contain    ${output}    maglev_backend_state{address="172.20.30.13",backend="nginx3",healthcheck="http-check",state="up"} 1

Prometheus reports probe counters
    ${output} =    Scrape Metrics
    Should Match Regexp    ${output}    maglev_probe_total\\{backend="nginx1".*result="success".*\\}\\s+[1-9]
    Should Match Regexp    ${output}    maglev_probe_total\\{backend="nginx2".*result="success".*\\}\\s+[1-9]
    Should Match Regexp    ${output}    maglev_probe_total\\{backend="nginx3".*result="success".*\\}\\s+[1-9]

Prometheus reports probe duration histogram
    ${output} =    Scrape Metrics
    Should Match Regexp    ${output}    maglev_probe_duration_seconds_count\\{backend="nginx1".*\\}\\s+[1-9]

Prometheus reports pool weights
    ${output} =    Scrape Metrics
    Should Contain    ${output}    maglev_frontend_pool_backend_weight{backend="nginx1",frontend="http-vip",pool="primary"} 100
    Should Contain    ${output}    maglev_frontend_pool_backend_weight{backend="nginx3",frontend="http-vip",pool="fallback"} 100

Prometheus reports transition counters
    ${output} =    Scrape Metrics
    # All backends transitioned unknown → up during startup.
    Should Match Regexp    ${output}    maglev_backend_transitions_total\\{backend="nginx1",from="unknown",to="up"\\}\\s+[1-9]

*** Keywords ***
Setup Suite
    ${arch} =    Run    go env GOARCH
    Set Suite Variable    ${ARCH}    ${arch}

Cleanup Suite
    Run    docker logs ${MAGLEVD_NODE} > ${EXECDIR}/tests/out/maglevd.log 2>&1
    Run    ${CLAB_BIN} --runtime ${runtime} destroy -t ${CURDIR}/${lab-file} --cleanup

Maglevc
    [Documentation]    Run a maglevc command inside the maglevd container.
    [Arguments]    ${cmd}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    docker exec ${MAGLEVD_NODE} /opt/maglev/build/${ARCH}/maglevc --color\=false ${cmd}
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0
    RETURN    ${output}

Backend Should Be Up
    [Arguments]    ${name}
    ${output} =    Maglevc    show backends ${name}
    Should Match Regexp    ${output}    state\\s+up

Backend Should Have State
    [Arguments]    ${name}    ${expected_state}
    ${output} =    Maglevc    show backends ${name}
    Should Match Regexp    ${output}    state\\s+${expected_state}

Backend Should Be Disabled
    [Arguments]    ${name}
    ${output} =    Maglevc    show backends ${name}
    Should Match Regexp    ${output}    enabled\\s+false
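
Get Probe Count
    [Documentation]    Return the number of HTTP health-check requests seen in a backend's nginx log.
    [Arguments]    ${name}
    # grep -c already prints 0 (and exits 1) when nothing matches, so use "|| true"
    # rather than "|| echo 0", which would emit a second line and break the conversion.
    ${output} =    Run    docker logs clab-${lab-name}-${name} 2>/dev/null | grep -c "GET /" || true
    ${count} =    Convert To Integer    ${output.strip()}
    RETURN    ${count}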

Probe Count Should Be Positive
    [Arguments]    ${name}
    ${count} =    Get Probe Count    ${name}
    Should Be True    ${count} > 0
    ...    No health-check requests found in nginx logs for ${name}

Scrape Metrics
    [Documentation]    Fetch the Prometheus /metrics endpoint from the maglevd container.
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf ${METRICS_URL}
    Should Be Equal As Integers    ${rc}    0
    RETURN    ${output}
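
The Prometheus test cases above match whole sample lines or regular expressions against the raw scrape, which ties them to the exact label order in the output. As a rough sketch only (not part of this commit), a small Python helper could parse the text exposition format and look samples up by label. The names `parse_metrics` and `find_samples` are hypothetical, and the `code="200"` and `type="http"` label values and the count of 7 in the sample data are illustrative assumptions, not values taken from a real scrape.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$')
# One label pair inside the braces. Escape sequences are matched but not unescaped.
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from a text-format scrape."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:  # ignore anything that is not a sample line
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def find_samples(text, name, **wanted):
    """Return the values of all samples of `name` whose labels include `wanted`."""
    return [
        value
        for sample_name, labels, value in parse_metrics(text)
        if sample_name == name
        and all(labels.get(key) == val for key, val in wanted.items())
    ]


# Illustrative input only; the second line uses assumed label values.
SAMPLE = """\
# TYPE maglev_backend_state gauge
maglev_backend_state{address="172.20.30.11",backend="nginx1",healthcheck="http-check",state="up"} 1
maglev_probe_total{backend="nginx1",code="200",result="success",type="http"} 7
"""

if __name__ == "__main__":
    print(find_samples(SAMPLE, "maglev_backend_state", backend="nginx1", state="up"))
    print(find_samples(SAMPLE, "maglev_probe_total", backend="nginx1", result="success"))
```

Because matching is done on parsed labels, a check such as `find_samples(output, "maglev_backend_state", backend="nginx1", state="up")` would keep working if a label were added or the label order changed. Saved as a module next to the suite, it could in principle be imported with the `Library` setting so that its functions become keywords.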