SO_BINDTODEVICE pins both ingress *and* egress to the bound interface — the kernel uses the listening socket's device binding when choosing the output interface for the SYN-ACK, which is sent before accept() returns and therefore can't be fixed up in userspace. That's fatal for maglev / DSR deployments where the SYN arrives through a GRE tunnel but the return path has to leave via the default route; the SYN-ACK goes out the GRE and is dropped by the uplink, so every new connection times out.

Rework the listen plumbing so the module never touches SO_BINDTODEVICE. init_module now enables IP_PKTINFO and IPV6_RECVPKTINFO on every HTTP listening socket and resolves each configured `device=` name to an ifindex. At request time resolve_source calls getsockopt(IP_PKTOPTIONS) on the accepted fd to read the per-connection in(6)_pktinfo cmsg the kernel stashed during the handshake, then matches (ifindex, family) against the bindings table. The listening sockets remain plain wildcards, so the return path follows the normal routing table and DSR works.

The wrapper also no longer clones or rebinds sockets: it still dedups per (cscf, sockaddr) so multiple device-tagged listens in a single server block coexist, and dedups bindings on (device, family) so the same device can carry different tags for v4 and v6 (e.g. tag2-v4 / tag2-v6) but not pointlessly duplicate when a listen include is shared across server blocks.

Drive-by fixes to unblock `make pkg-deb` after a prior `make build-asan`:

- debian/rules overrides dh_clean to exclude build/, since nginx-asan's install creates nobody:0700 temp dirs dh_clean can't traverse.
- Makefile's build-asan removes those unused runtime temp dirs so the tree is clean afterwards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
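The per-connection lookup described above can be sketched roughly as follows. This is a hypothetical, IPv4-only illustration, not code from the module: `ingress_ifindex` is an invented name, and the buffer size and fallback behavior are assumptions. It relies on the listening socket having had IP_PKTINFO enabled before the handshake, so the kernel keeps the in_pktinfo from the SYN and hands it back through getsockopt(IP_PKTOPTIONS) on the accepted fd as a control-message buffer.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical sketch: recover the ingress ifindex of an accepted
 * IPv4 connection whose listener had IP_PKTINFO enabled. */
static int ingress_ifindex(int fd)
{
    char buf[512];
    socklen_t len = sizeof(buf);

    /* The kernel replays the handshake's stashed options as cmsgs. */
    if (getsockopt(fd, IPPROTO_IP, IP_PKTOPTIONS, buf, &len) < 0)
        return -1;  /* no pktinfo available: caller falls back */

    struct msghdr msg = { .msg_control = buf, .msg_controllen = len };

    for (struct cmsghdr *cm = CMSG_FIRSTHDR(&msg); cm != NULL;
         cm = CMSG_NXTHDR(&msg, cm)) {
        if (cm->cmsg_level == IPPROTO_IP && cm->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo pi;
            memcpy(&pi, CMSG_DATA(cm), sizeof(pi));
            return pi.ipi_ifindex;  /* match against the bindings table */
        }
    }
    return -1;
}
```

The IPv6 side would do the same dance with IPV6_RECVPKTINFO on the listener and an in6_pktinfo cmsg; the returned (ifindex, family) pair is what gets matched against the (device, family) bindings table.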
# SPDX-License-Identifier: Apache-2.0
*** Settings ***
Documentation       End-to-end tests for ngx_http_ipng_stats_module.
...                 Deploys a 3-node containerlab topology and validates
...                 attribution, counters, histograms, filters, variables,
...                 and reload semantics.
Library             OperatingSystem
Library             String
Suite Setup         Deploy Lab
Suite Teardown      Cleanup Lab

*** Variables ***
${lab-name}         ipng-stats-test
${lab-file}         lab/ipng-stats.clab.yml
${runtime}          docker
${CLAB_BIN}         sudo containerlab
${SERVER}           clab-${lab-name}-server
${CLIENT1}          clab-${lab-name}-client1
${CLIENT2}          clab-${lab-name}-client2
${SCRAPE_URL}       http://172.20.40.2:9113/.well-known/ipng/statsz
${SERVER_MGMT}      http://172.20.40.2:9180

*** Test Cases ***

# --- Basic functionality ---

Module loads
    [Documentation]    nginx -t passes with the module loaded.
    ${output} =    Docker Exec    ${SERVER}    nginx -t 2>&1
    Should Contain    ${output}    syntax is ok

Shared-listen-include across multiple server blocks
    [Documentation]    Three server blocks all pull in the same
    ...    ipng-listens.inc (see docs/user-guide.md). nginx
    ...    must start without "conflicting server name" or
    ...    "duplicate listen options" warnings, and the
    ...    module must end up with exactly one listening
    ...    socket per address family on port 8080 (one for
    ...    v4 wildcard, one for v6) — not one per (server
    ...    block × device × family), which would exhaust
    ...    the fd table on a real host.
    ${output} =    Docker Exec    ${SERVER}    nginx -t 2>&1
    Should Not Contain    ${output}    conflicting server name
    Should Not Contain    ${output}    duplicate listen
    ${listens} =    Docker Exec    ${SERVER}    ss -tlnH
    ${count} =    Get Regexp Matches    ${listens}    :8080\\s
    Length Should Be    ${count}    2
    ...    Expected 2 listening sockets on port 8080 (v4+v6 wildcards); got ${count}

Prometheus scrape
    [Documentation]    Scrape returns HELP/TYPE preamble.
    ${output} =    Scrape Prometheus
    Should Contain    ${output}    nginx-ipng-stats-plugin
    Should Contain    ${output}    nginx_ipng_requests_total

JSON scrape
    [Documentation]    Accept: application/json returns valid JSON with schema.
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf -H 'Accept: application/json' ${SCRAPE_URL} | python3 -m json.tool
    Should Be Equal As Integers    ${rc}    0
    Should Contain    ${output}    "schema": 2

# --- Per-device attribution ---

Attribute tag1 via eth1 (v4)
    [Documentation]    IPv4 traffic on server:eth1 carries source_tag=tag1.
    Send Fast Requests    ${CLIENT1}    10.0.1.1    5
    Wait For Flush
    ${output} =    Scrape Prometheus
    Should Contain    ${output}    source_tag="tag1"
    Should Contain    ${output}    vip="10.0.1.1"

Attribute tag2-v4 via eth2 (v4)
    [Documentation]    IPv4 traffic on server:eth2 carries source_tag=tag2-v4.
    Send Fast Requests    ${CLIENT2}    10.0.2.1    5
    Wait For Flush
    ${output} =    Scrape Prometheus
    Should Contain    ${output}    source_tag="tag2-v4"
    Should Contain    ${output}    vip="10.0.2.1"

Attribute tag1 via eth1 (v6)
    [Documentation]    IPv6 traffic on server:eth1 carries source_tag=tag1
    ...    — same tag as v4, demonstrating that tag= can be
    ...    shared across address families for one device.
    Send Fast Requests v6    ${CLIENT1}    2001:db8:1::1    5
    Wait For Flush
    ${output} =    Scrape With Filter    source_tag=tag1
    Should Contain    ${output}    source_tag="tag1"
    Should Contain    ${output}    vip="2001:db8:1::1"

Attribute tag2-v6 via eth2 (v6)
    [Documentation]    IPv6 traffic on server:eth2 carries source_tag=tag2-v6
    ...    — distinct from the eth2 v4 tag, demonstrating
    ...    per-(device, family) attribution.
    Send Fast Requests v6    ${CLIENT2}    2001:db8:2::1    5
    Wait For Flush
    ${output} =    Scrape Prometheus
    Should Contain    ${output}    source_tag="tag2-v6"
    Should Contain    ${output}    vip="2001:db8:2::1"

Direct traffic tagged
    [Documentation]    Mgmt-interface traffic carries source_tag=direct.
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf ${SERVER_MGMT}/
    Should Be Equal As Integers    ${rc}    0
    Wait For Flush
    ${output} =    Scrape Prometheus
    Should Contain    ${output}    source_tag="direct"

# --- Status code tracking ---

Per-class code counters
    [Documentation]    4xx and 2xx appear as class-bucketed code= labels.
    Docker Exec Ignore Rc    ${CLIENT1}    curl -s http://10.0.1.1:8080/notfound
    Docker Exec Ignore Rc    ${CLIENT1}    curl -s http://10.0.1.1:8080/notfound
    Wait For Flush
    ${output} =    Scrape With Filter    source_tag=tag1
    Should Contain    ${output}    code="4xx"
    Should Contain    ${output}    code="2xx"

# --- Duration histogram ---

Duration histogram
    [Documentation]    proxy_pass to a 50 ms backend populates sum and buckets.
    Send Slow Requests    ${CLIENT1}    10.0.1.1    3
    Wait For Flush
    ${prom} =    Scrape With Filter    source_tag=tag1
    Should Match Regexp    ${prom}    request_duration_seconds_sum\\{[^}]*\\}\\s+\\d+\\.\\d*[1-9]

    ${rc}    ${json} =    Run And Return Rc And Output
    ...    curl -sf -H 'Accept: application/json' '${SCRAPE_URL}?source_tag=tag1' | python3 -m json.tool
    Should Be Equal As Integers    ${rc}    0
    Should Contain    ${json}    request_duration_ms
    Should Contain    ${json}    buckets

# --- Scrape filters ---

Filter by source_tag
    [Documentation]    ?source_tag=tag1 returns tag1 only; tag2-v4 only.
    ${output} =    Scrape With Filter    source_tag=tag1
    Should Contain    ${output}    source_tag="tag1"
    Should Not Contain    ${output}    source_tag="tag2-v4"

    ${output} =    Scrape With Filter    source_tag=tag2-v4
    Should Contain    ${output}    source_tag="tag2-v4"
    Should Not Contain    ${output}    source_tag="tag1"

Filter by VIP
    [Documentation]    ?vip=10.0.1.1 excludes 10.0.2.1.
    ${output} =    Scrape With Filter    vip=10.0.1.1
    Should Contain    ${output}    vip="10.0.1.1"
    Should Not Contain    ${output}    vip="10.0.2.1"

Filter combined
    [Documentation]    source_tag + vip intersection.
    ${output} =    Scrape With Filter    source_tag=tag1&vip=10.0.1.1
    Should Contain    ${output}    source_tag="tag1"
    Should Contain    ${output}    vip="10.0.1.1"
    Should Not Contain    ${output}    source_tag="tag2-v4"

Filter unknown tag
    [Documentation]    Unknown source_tag returns empty data set.
    ${output} =    Scrape With Filter    source_tag=nonexistent
    Should Not Contain    ${output}    nginx_ipng_requests_total{

# --- nginx variable ---

Variable in access log
    [Documentation]    $ipng_source_tag appears as tag1, tag2-v4, direct in log.
    ${output} =    Docker Exec    ${SERVER}    cat /var/log/nginx/access.log
    Should Match Regexp    ${output}    src=tag1
    Should Match Regexp    ${output}    src=tag2-v4
    Should Match Regexp    ${output}    src=direct

UDP logtail
    [Documentation]    ipng_stats_logtail udp:// sends log lines to a local
    ...    nc listener; captured file has all sources and VIPs.
    ${output} =    Docker Exec    ${SERVER}    cat /var/log/nginx/logtail-udp.log
    Should Match Regexp    ${output}    tag1
    Should Match Regexp    ${output}    tag2-v4
    Should Match Regexp    ${output}    direct
    Should Match Regexp    ${output}    10\\.0\\.1\\.1
    Should Match Regexp    ${output}    10\\.0\\.2\\.1
    # Tab-separated format
    Should Match Regexp    ${output}    \\t

Logtail if= filter
    [Documentation]    Requests to /notfound are suppressed from logtail by
    ...    the if=$logtail_enabled condition, but still counted.
    ${output} =    Docker Exec    ${SERVER}    cat /var/log/nginx/logtail-udp.log
    Should Not Contain    ${output}    /notfound
    # But /notfound IS in the regular access log (not filtered there).
    ${access} =    Docker Exec    ${SERVER}    cat /var/log/nginx/access.log
    Should Contain    ${access}    /notfound

VIP in access log
    [Documentation]    $server_addr resolves to real IPs, not 0.0.0.0.
    ${output} =    Docker Exec    ${SERVER}    cat /var/log/nginx/access.log
    Should Contain    ${output}    vip=10.0.1.1
    Should Contain    ${output}    vip=10.0.2.1
    Should Not Contain    ${output}    vip=0.0.0.0

# --- Reload resilience ---

Counters survive reload
    [Documentation]    Shared-memory zone persists across nginx -s reload.
    ${before} =    Get Request Count    tag1
    Docker Exec    ${SERVER}    nginx -s reload
    Sleep    2s    Wait for new workers
    ${after} =    Get Request Count    tag1
    Should Be True    ${after} >= ${before}
    ...    Counters dropped after reload: before=${before} after=${after}

Traffic after reload
    [Documentation]    New requests are counted after reload.
    Send Fast Requests    ${CLIENT1}    10.0.1.1    3
    Wait For Flush
    ${output} =    Scrape With Filter    source_tag=tag1
    Should Contain    ${output}    source_tag="tag1"

# --- Counter correctness ---

Per-(device, family) request count accuracy
    [Documentation]    10 requests on each of the four (device, family)
    ...    combinations yields tag1=20, tag2-v4=10, tag2-v6=10.
    ...    Demonstrates that one device can combine v4+v6 under
    ...    a single tag while another device can split them.
    ${before_tag1} =    Get Request Count    tag1
    ${before_tag2v4} =    Get Request Count    tag2-v4
    ${before_tag2v6} =    Get Request Count    tag2-v6

    Send Fast Requests    ${CLIENT1}    10.0.1.1    10
    Send Fast Requests v6    ${CLIENT1}    2001:db8:1::1    10
    Send Fast Requests    ${CLIENT2}    10.0.2.1    10
    Send Fast Requests v6    ${CLIENT2}    2001:db8:2::1    10
    Wait For Flush

    ${after_tag1} =    Get Request Count    tag1
    ${after_tag2v4} =    Get Request Count    tag2-v4
    ${after_tag2v6} =    Get Request Count    tag2-v6

    ${delta_tag1} =    Evaluate    ${after_tag1} - ${before_tag1}
    ${delta_tag2v4} =    Evaluate    ${after_tag2v4} - ${before_tag2v4}
    ${delta_tag2v6} =    Evaluate    ${after_tag2v6} - ${before_tag2v6}

    Should Be Equal As Integers    ${delta_tag1}    20
    Should Be Equal As Integers    ${delta_tag2v4}    10
    Should Be Equal As Integers    ${delta_tag2v6}    10

*** Keywords ***

# --- Lab lifecycle ---

Deploy Lab
    Require Deb Build
    Run    ${CLAB_BIN} --runtime ${runtime} destroy -t ${CURDIR}/${lab-file} --cleanup 2>&1 || true
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    ${CLAB_BIN} --runtime ${runtime} deploy -t ${CURDIR}/${lab-file}
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0
    Wait Until Keyword Succeeds    90s    3s    Server Is Ready
    Wait Until Keyword Succeeds    60s    3s    Client Can Reach Server    ${CLIENT1}    10.0.1.1
    Wait Until Keyword Succeeds    60s    3s    Client Can Reach Server    ${CLIENT2}    10.0.2.1

Require Deb Build
    [Documentation]    Fail fast with an actionable message if the user
    ...    forgot to run `make pkg-deb` before invoking this
    ...    suite. The server container dpkg-installs the
    ...    built .deb via its bind-mount of build/.
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    bash -c 'ls ${EXECDIR}/build/libnginx-mod-http-ipng-stats_*.deb 2>/dev/null'
    Run Keyword If    ${rc} != 0
    ...    Fail    Module .deb not found — run `make pkg-deb` first.

Server Is Ready
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf ${SCRAPE_URL}
    Should Be Equal As Integers    ${rc}    0

Client Can Reach Server
    [Arguments]    ${client}    ${server_ip}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    docker exec ${client} curl -sf http://${server_ip}:8080/
    Should Be Equal As Integers    ${rc}    0

Cleanup Lab
    Run    docker logs ${SERVER} > ${EXECDIR}/tests/out/server-docker.log 2>&1
    Run    docker exec ${SERVER} cat /var/log/nginx/access.log > ${EXECDIR}/tests/out/server-access.log 2>&1
    Run    docker exec ${SERVER} cat /var/log/nginx/error.log > ${EXECDIR}/tests/out/server-error.log 2>&1
    Run    docker exec ${SERVER} cat /var/log/nginx/logtail-udp.log > ${EXECDIR}/tests/out/server-logtail-udp.log 2>&1
    Run    docker exec ${SERVER} ip addr > ${EXECDIR}/tests/out/server-ip-addr.log 2>&1
    Run    docker exec ${SERVER} ip route > ${EXECDIR}/tests/out/server-ip-route.log 2>&1
    Run    ${CLAB_BIN} --runtime ${runtime} destroy -t ${CURDIR}/${lab-file} --cleanup

# --- Traffic generation ---

Send Fast Requests
    [Arguments]    ${client}    ${server_ip}    ${count}
    FOR    ${i}    IN RANGE    ${count}
        Docker Exec    ${client}    curl -sf http://${server_ip}:8080/
    END

Send Fast Requests v6
    [Arguments]    ${client}    ${server_ip}    ${count}
    FOR    ${i}    IN RANGE    ${count}
        Docker Exec    ${client}    curl -sf http://[${server_ip}]:8080/
    END

Send Slow Requests
    [Arguments]    ${client}    ${server_ip}    ${count}
    FOR    ${i}    IN RANGE    ${count}
        Docker Exec    ${client}    curl -sf http://${server_ip}:8080/slow
    END

Wait For Flush
    Sleep    2s

# --- Scraping ---

Scrape Prometheus
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf ${SCRAPE_URL}
    Should Be Equal As Integers    ${rc}    0
    RETURN    ${output}

Scrape With Filter
    [Arguments]    ${filter}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf '${SCRAPE_URL}?${filter}'
    Should Be Equal As Integers    ${rc}    0
    RETURN    ${output}

Get Request Count
    [Arguments]    ${source}
    ${output} =    Scrape With Filter    source_tag=${source}
    ${matches} =    Get Regexp Matches    ${output}
    ...    nginx_ipng_requests_total\\{[^}]*\\}\\s+(\\d+)    1
    ${total} =    Set Variable    0
    FOR    ${m}    IN    @{matches}
        ${total} =    Evaluate    ${total} + ${m}
    END
    RETURN    ${total}

# --- Container helpers ---

Docker Exec
    [Arguments]    ${container}    ${cmd}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    docker exec ${container} ${cmd}
    Should Be Equal As Integers    ${rc}    0
    RETURN    ${output}

Docker Exec Ignore Rc
    [Arguments]    ${container}    ${cmd}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    docker exec ${container} ${cmd}
    RETURN    ${output}