Fix pause to cancel probe goroutine; add Robot Framework integration tests

Pause semantics
- PauseBackend now cancels the probe goroutine so no HTTP/TCP/ICMP
  traffic is sent while the backend is paused. Previously the goroutine
  kept running and results were silently discarded.
- ResumeBackend launches a fresh probe goroutine on the existing worker,
  preserving transition history. The backend re-enters unknown state.

Integration tests (tests/01-maglevd/)
- Containerlab topology with 3 nginx:alpine backends on a dedicated
  management network (172.20.30.0/24) with static IPs.
- maglevd config with 200ms HTTP health-check interval for fast test
  convergence (rise=2, fall=2).
- 8 test cases: deploy lab, start maglevd, all backends reach up,
  nginx logs confirm probes arriving, pause stops probes (probe count
  stable), resume restarts probes, disable stops probes, enable
  restarts probes.
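With interval=200ms and rise=fall=2, a backend changes state roughly 400ms after its health flips, which is why a short fixed sleep is enough for the suite to converge. The rise/fall hysteresis can be sketched like this (illustrative names only, not maglevd's implementation):

```go
package main

import "fmt"

// state tracks a backend through rise/fall hysteresis: `rise` consecutive
// successful probes are needed to go up and `fall` consecutive failures to
// go down, which filters out one-off probe flakes.
type state struct {
	up         bool
	okStreak   int
	failStreak int
	rise, fall int
}

func (s *state) observe(ok bool) {
	if ok {
		s.okStreak++
		s.failStreak = 0
		if !s.up && s.okStreak >= s.rise {
			s.up = true
		}
	} else {
		s.failStreak++
		s.okStreak = 0
		if s.up && s.failStreak >= s.fall {
			s.up = false
		}
	}
}

func main() {
	s := &state{rise: 2, fall: 2}
	s.observe(true)
	s.observe(true) // two successes: backend comes up
	fmt.Println("after 2 ok:", s.up)
	s.observe(false) // a single failure is not enough to go down
	fmt.Println("after 1 fail:", s.up)
	s.observe(false) // second consecutive failure: backend goes down
	fmt.Println("after 2 fails:", s.up)
}
```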

VPP dataplane test (tests/02-vpp-lb/)
- Rewrite 01-e2e-lab.robot to match the actual single-VPP topology:
  test client-to-server ping through VPP bridge domains and verify
  nginx is serving on all app servers. The previous version referenced
  a non-existent topology file and tested OSPF/BFD between two VPP
  nodes that don't exist in this lab.

Build infrastructure
- Add 'make robot-test' target with TEST= for suite selection.
- Add tests/.venv target for Robot Framework virtualenv.
- Make IMAGE optional in rf-run.sh.
- Add .gitignore entries for test output, venv, logs, and clab state.
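A hypothetical sketch of these targets; only the names `robot-test`, `TEST=`, `tests/.venv`, and `rf-run.sh` come from this commit, and the recipe details are assumptions:

```make
# Assumed recipes -- adapt paths/runtime to the actual Makefile.
tests/.venv: tests/requirements.txt
	python3 -m venv tests/.venv
	tests/.venv/bin/pip install -r tests/requirements.txt

.PHONY: robot-test
robot-test: tests/.venv
	./tests/rf-run.sh docker "tests/$(TEST)"
```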
2026-04-11 20:16:22 +02:00
parent 3bd30b69f4
commit 8bde00eb61
20 changed files with 519 additions and 7 deletions


@@ -0,0 +1,135 @@
*** Settings ***
Library             OperatingSystem
Library             Process
Resource            ../common.robot
Suite Setup         Setup Suite
Suite Teardown      Cleanup Suite

*** Variables ***
${lab-name}         maglevd-test
${lab-file}         maglevd-lab/maglevd.clab.yml
${config-file}      maglevd-lab/maglev.yaml
${runtime}          docker
${GRPC_PORT}        9091
*** Test Cases ***
Deploy maglevd-test lab
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    ${CLAB_BIN} --runtime ${runtime} deploy -t ${CURDIR}/${lab-file}
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0

Start maglevd
    ${handle} =    Start Process    ${MAGLEVD}
    ...    --config    ${CURDIR}/${config-file}
    ...    --grpc-addr    :${GRPC_PORT}
    ...    --log-level    debug
    ...    alias=maglevd    stdout=${EXECDIR}/tests/out/maglevd.log
    ...    stderr=STDOUT
    Set Suite Variable    ${MAGLEVD_HANDLE}    ${handle}
    Sleep    3s    Wait for nginx containers and probes to converge

All backends reach up state
    [Template]    Backend Should Be Up
    nginx1
    nginx2
    nginx3

Health checks are reaching all backends
    [Template]    Probe Count Should Be Positive
    nginx1
    nginx2
    nginx3

Pause backend stops probing
    Maglevc    set backend nginx1 pause
    Backend Should Have State    nginx1    paused
    Sleep    1s
    ${before} =    Get Probe Count    nginx1
    Sleep    2s    Wait to confirm no new probes arrive
    ${after} =    Get Probe Count    nginx1
    Should Be True    ${after} == ${before}
    ...    Probe count for nginx1 grew while paused: ${before} → ${after}

Resume backend restarts probing
    Maglevc    set backend nginx1 resume
    ${before} =    Get Probe Count    nginx1
    Sleep    2s    Wait for resumed probes to accumulate
    ${after} =    Get Probe Count    nginx1
    Should Be True    ${after} > ${before}
    ...    Probe count for nginx1 did not grow after resume: ${before} → ${after}
    Wait Until Keyword Succeeds    5s    500ms
    ...    Backend Should Be Up    nginx1

Disable backend stops probing
    Maglevc    set backend nginx2 disable
    Backend Should Have State    nginx2    removed
    Backend Should Be Disabled    nginx2
    Sleep    1s
    ${before} =    Get Probe Count    nginx2
    Sleep    2s    Wait to confirm probes stopped
    ${after} =    Get Probe Count    nginx2
    Should Be True    ${after} == ${before}
    ...    Probe count for nginx2 grew while disabled: ${before} → ${after}

Enable backend restarts probing
    Maglevc    set backend nginx2 enable
    ${before} =    Get Probe Count    nginx2
    Sleep    2s    Wait for re-enabled probes to accumulate
    ${after} =    Get Probe Count    nginx2
    Should Be True    ${after} > ${before}
    ...    Probe count for nginx2 did not grow after enable: ${before} → ${after}
    Wait Until Keyword Succeeds    5s    500ms
    ...    Backend Should Be Up    nginx2
*** Keywords ***
Setup Suite
    ${arch} =    Run    go env GOARCH
    Set Suite Variable    ${ARCH}    ${arch}
    Set Suite Variable    ${MAGLEVD}    ${EXECDIR}/build/${ARCH}/maglevd
    Set Suite Variable    ${MAGLEVC}    ${EXECDIR}/build/${ARCH}/maglevc

Cleanup Suite
    Run Keyword And Ignore Error    Terminate Process    maglevd    kill=true
    Run    ${CLAB_BIN} --runtime ${runtime} destroy -t ${CURDIR}/${lab-file} --cleanup

Maglevc
    [Documentation]    Run a maglevc command and return its output.
    [Arguments]    ${cmd}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    ${MAGLEVC} --server\=localhost:${GRPC_PORT} --color\=false ${cmd}
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0
    RETURN    ${output}

Backend Should Be Up
    [Arguments]    ${name}
    ${output} =    Maglevc    show backends ${name}
    Should Match Regexp    ${output}    state\\s+up

Backend Should Have State
    [Arguments]    ${name}    ${expected_state}
    ${output} =    Maglevc    show backends ${name}
    Should Match Regexp    ${output}    state\\s+${expected_state}

Backend Should Be Disabled
    [Arguments]    ${name}
    ${output} =    Maglevc    show backends ${name}
    Should Match Regexp    ${output}    enabled\\s+false
Get Probe Count
    [Documentation]    Return the number of HTTP health-check requests seen in a backend's nginx log.
    [Arguments]    ${name}
    # `grep -c` already prints "0" when nothing matches (it just exits
    # non-zero), so use `|| true`; `|| echo 0` would print a second line
    # and break Convert To Integer.
    ${output} =    Run    docker logs clab-${lab-name}-${name} 2>/dev/null | grep -c "GET /" || true
    ${count} =    Convert To Integer    ${output.strip()}
    RETURN    ${count}

Probe Count Should Be Positive
    [Arguments]    ${name}
    ${count} =    Get Probe Count    ${name}
    Should Be True    ${count} > 0
    ...    No health-check requests found in nginx logs for ${name}


@@ -0,0 +1,43 @@
maglev:
  healthchecker:
    transition-history: 5
  healthchecks:
    http-check:
      type: http
      port: 80
      params:
        path: /
        response-code: "200"
      interval: 200ms
      fast-interval: 100ms
      down-interval: 1s
      timeout: 1s
      rise: 2
      fall: 2
  backends:
    nginx1:
      address: 172.20.30.11
      healthcheck: http-check
    nginx2:
      address: 172.20.30.12
      healthcheck: http-check
    nginx3:
      address: 172.20.30.13
      healthcheck: http-check
  frontends:
    http-vip:
      description: "Test HTTP VIP"
      address: 192.0.2.1
      protocol: tcp
      port: 80
      pools:
        - name: primary
          backends:
            nginx1: {}
            nginx2: {}
        - name: fallback
          backends:
            nginx3: {}


@@ -0,0 +1,20 @@
name: maglevd-test
mgmt:
  network: maglevd-test-net
  ipv4-subnet: 172.20.30.0/24
topology:
  nodes:
    nginx1:
      kind: linux
      image: nginx:alpine
      mgmt-ipv4: 172.20.30.11
    nginx2:
      kind: linux
      image: nginx:alpine
      mgmt-ipv4: 172.20.30.12
    nginx3:
      kind: linux
      image: nginx:alpine
      mgmt-ipv4: 172.20.30.13


@@ -0,0 +1,63 @@
*** Settings ***
Library             OperatingSystem
Resource            ../common.robot
Suite Teardown      Run Keyword    Cleanup

*** Variables ***
${lab-name}         e2e-maglev
${lab-file-name}    e2e-lab/maglev.clab.yml
${runtime}          docker

*** Test Cases ***
Deploy ${lab-name} lab
    Log    ${CURDIR}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    ${CLAB_BIN} --runtime ${runtime} deploy -t ${CURDIR}/${lab-file-name}
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0

Wait for VPP dataplane startup
    Sleep    5s

Client cl1 can ping app server as1 via VPP
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    ${CLAB_BIN} --runtime ${runtime} exec -t ${CURDIR}/${lab-file-name} --label clab-node-name\=cl1 --cmd "ping -c 3 -W 2 10.82.98.82"
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0
    Should Not Contain    ${output}    0 received

Client cl2 can ping app server as2 via VPP
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    ${CLAB_BIN} --runtime ${runtime} exec -t ${CURDIR}/${lab-file-name} --label clab-node-name\=cl2 --cmd "ping -c 3 -W 2 10.82.98.83"
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0
    Should Not Contain    ${output}    0 received

App server as1 can reach app server as3 via VPP
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    ${CLAB_BIN} --runtime ${runtime} exec -t ${CURDIR}/${lab-file-name} --label clab-node-name\=as1 --cmd "ping -c 3 -W 2 10.82.98.84"
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0
    Should Not Contain    ${output}    0 received

App servers have nginx running
    [Template]    Nginx Should Be Serving
    as1    10.82.98.82
    as2    10.82.98.83
    as3    10.82.98.84

*** Keywords ***
Cleanup
    Run    ${CLAB_BIN} --runtime ${runtime} destroy -t ${CURDIR}/${lab-file-name} --cleanup

Nginx Should Be Serving
    [Arguments]    ${node}    ${ip}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    ${CLAB_BIN} --runtime ${runtime} exec -t ${CURDIR}/${lab-file-name} --label clab-node-name\=${node} --cmd "wget -q -O- http://${ip}/"
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0
    Should Contain    ${output}    ${node}


@@ -0,0 +1,9 @@
#!/bin/sh
MYIP=$(ip addr show dev eth1 | awk '/inet .*scope/ { print $2}' | cut -f1 -d/)
ip tunnel add maglev0 mode gre local $MYIP
ip link set maglev0 up mtu 1500
ip addr add 10.82.98.255/32 dev maglev0
echo "This is $(hostname -f)" >> /usr/share/nginx/html/index.html


@@ -0,0 +1 @@
../as1/rc.local


@@ -0,0 +1 @@
../as1/rc.local


@@ -0,0 +1,12 @@
comment { You can add commands here that will execute after vppcfg.vpp }
lb conf ip4-src-address 10.82.98.0 ip6-src-address 2001:db8:8298:: buckets 524288
lb vip 10.82.98.255/32 protocol tcp port 80
lb as 10.82.98.255/32 protocol tcp port 80 10.82.98.82
lb as 10.82.98.255/32 protocol tcp port 80 10.82.98.83
lb as 10.82.98.255/32 protocol tcp port 80 10.82.98.84
lb vip 10.82.98.255/32 protocol tcp port 443 src_ip_sticky
lb as 10.82.98.255/32 protocol tcp port 443 10.82.98.82
lb as 10.82.98.255/32 protocol tcp port 443 10.82.98.83
lb as 10.82.98.255/32 protocol tcp port 443 10.82.98.84


@@ -0,0 +1,45 @@
loopbacks:
  loop0:
    description: "Core: vpp1"
    lcp: loop0
    addresses: [10.82.98.0/32, 2001:db8:8298::/128]
  loop1:
    description: "Core: Maglev VIP"
    lcp: maglev0
  loop2:
    description: "BVI: clients"
    mtu: 1500
    lcp: bvi101
    addresses: [10.82.98.65/28, 2001:db8:8298:101::1/64]
  loop3:
    description: "BVI: application servers"
    mtu: 2026
    lcp: bvi102
    addresses: [10.82.98.81/28, 2001:db8:8298:102::1/64]
bridgedomains:
  bd101:
    description: "Clients"
    mtu: 1500
    bvi: loop2
    interfaces: [ eth1, eth2 ]
  bd102:
    description: "Application Servers"
    mtu: 2026
    bvi: loop3
    interfaces: [ eth3, eth4, eth5 ]
interfaces:
  eth1:
    description: "To cl1:eth1"
    mtu: 1500
  eth2:
    description: "To cl2:eth1"
    mtu: 1500
  eth3:
    description: "To as1:eth1"
    mtu: 2026
  eth4:
    description: "To as2:eth1"
    mtu: 2026
  eth5:
    description: "To as3:eth1"
    mtu: 2026


@@ -0,0 +1,64 @@
name: e2e-maglev
topology:
  kinds:
    fdio_vpp:
      image: git.ipng.ch/ipng/vpp-containerlab:latest
      startup-config: config/__clabNodeName__/vppcfg.yaml
      binds:
        - config/__clabNodeName__/manual-post.vpp:/config/vpp/config/manual-post.vpp:rw
    linux:
      image: ghcr.io/srl-labs/network-multitool:latest
      binds:
        - config/__clabNodeName__/rc.local:/config/rc.local:rw
  nodes:
    vpp1:
      kind: fdio_vpp
    cl1:
      kind: linux
      exec:
        - ip addr add 10.82.98.66/28 dev eth1
        - ip route add 10.82.98.0/24 via 10.82.98.65
        - ip addr add 2001:db8:8298:101::2/64 dev eth1
        - ip route add 2001:db8:8298::/48 via 2001:db8:8298:101::1
        - sh /config/rc.local
    cl2:
      kind: linux
      exec:
        - ip addr add 10.82.98.67/28 dev eth1
        - ip route add 10.82.98.0/24 via 10.82.98.65
        - ip addr add 2001:db8:8298:101::3/64 dev eth1
        - ip route add 2001:db8:8298::/48 via 2001:db8:8298:101::1
        - sh /config/rc.local
    as1:
      kind: linux
      exec:
        - ip addr add 10.82.98.82/28 dev eth1
        - ip route add 10.82.98.0/24 via 10.82.98.81
        - ip addr add 2001:db8:8298:102::2/64 dev eth1
        - ip route add 2001:db8:8298::/48 via 2001:db8:8298:102::1
        - sh /config/rc.local
    as2:
      kind: linux
      exec:
        - ip addr add 10.82.98.83/28 dev eth1
        - ip route add 10.82.98.0/24 via 10.82.98.81
        - ip addr add 2001:db8:8298:102::3/64 dev eth1
        - ip route add 2001:db8:8298::/48 via 2001:db8:8298:102::1
        - sh /config/rc.local
    as3:
      kind: linux
      exec:
        - ip addr add 10.82.98.84/28 dev eth1
        - ip route add 10.82.98.0/24 via 10.82.98.81
        - ip addr add 2001:db8:8298:102::4/64 dev eth1
        - ip route add 2001:db8:8298::/48 via 2001:db8:8298:102::1
        - sh /config/rc.local
  links:
    - endpoints: ["vpp1:eth1", "cl1:eth1"]
    - endpoints: ["vpp1:eth2", "cl2:eth1"]
    - endpoints: ["vpp1:eth3", "as1:eth1"]
    - endpoints: ["vpp1:eth4", "as2:eth1"]
    - endpoints: ["vpp1:eth5", "as3:eth1"]

tests/common.robot

@@ -0,0 +1,2 @@
*** Variables ***
${CLAB_BIN}    containerlab

tests/requirements.txt

@@ -0,0 +1,2 @@
robotframework
robotframework-sshlibrary

tests/rf-run.sh

@@ -0,0 +1,48 @@
#!/bin/bash
# Run Robot Framework tests for vpp-containerlab.
#
# Arguments:
#   $1 - container runtime: [docker, podman]
#   $2 - test suite path (directory or .robot file)
#
# Environment variables:
#   CLAB_BIN - path to containerlab binary (default: containerlab)
#   IMAGE    - docker image to use in topology (optional)
set -e

if [ -z "${CLAB_BIN}" ]; then
    CLAB_BIN=containerlab
fi

# IMAGE is optional — some test suites (e.g. 01-maglevd) don't need it.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
mkdir -p "${SCRIPT_DIR}/out"
source "${SCRIPT_DIR}/.venv/bin/activate"

function get_logname() {
    path=$1
    filename=$(basename "$path")
    if [[ "$filename" == *.* ]]; then
        dirname=$(dirname "$path")
        basename=$(basename "$path" | cut -d. -f1)
        echo "${dirname##*/}-${basename}"
    else
        echo "unknown"
    fi
}

IMAGE_VAR=""
if [ -n "${IMAGE}" ]; then
    IMAGE_VAR="--variable IMAGE:${IMAGE}"
fi

robot --consolecolors on -r none \
    --variable CLAB_BIN:"${CLAB_BIN}" \
    --variable runtime:"$1" \
    ${IMAGE_VAR} \
    -l "${SCRIPT_DIR}/out/$(get_logname $2)-$1-log" \
    --output "${SCRIPT_DIR}/out/$(get_logname $2)-$1-out.xml" \
    "$2"

tests/ssh.robot

@@ -0,0 +1,44 @@
*** Settings ***
Library             SSHLibrary

*** Keywords ***
Login via SSH with username and password
    [Arguments]
    ...    ${address}=${None}
    ...    ${port}=22
    ...    ${username}=${None}
    ...    ${password}=${None}
    ...    ${try_for}=4    # seconds to try and successfully login
    ...    ${conn_timeout}=3
    FOR    ${i}    IN RANGE    ${try_for}
        SSHLibrary.Open Connection    ${address}    port=${port}    timeout=${conn_timeout}
        ${status} =    Run Keyword And Return Status    SSHLibrary.Login    ${username}    ${password}
        IF    ${status}    BREAK
        Sleep    1s
    END
    IF    $status != True
        Fail    Unable to connect to ${address} via SSH in ${try_for} attempts
    END
    Log    Exited the loop.

Login via SSH with public key
    [Arguments]
    ...    ${address}=${None}
    ...    ${port}=22
    ...    ${username}=${None}
    ...    ${keyfile}=${None}
    ...    ${try_for}=4
    ...    ${conn_timeout}=3
    FOR    ${i}    IN RANGE    ${try_for}
        SSHLibrary.Open Connection    ${address}    port=${port}    timeout=${conn_timeout}
        ${status} =    Run Keyword And Return Status    SSHLibrary.Login With Public Key
        ...    ${username}    ${keyfile}
        IF    ${status}    BREAK
        Sleep    1s
    END
    IF    $status != True
        Fail    Unable to connect to ${address} via SSH in ${try_for} attempts
    END
    Log    Exited the loop.