nginx-ipng-stats-plugin/tests/01-module/01-e2e.robot
Pim van Pelt b3ad74cbde Reduce scrape cardinality: class codes, per-(source,vip) histograms, byte histograms
Collapses the status-code dimension of the counter key into six class
lanes (1xx..5xx/unknown) so per-(source,vip) counter cardinality no
longer grows with the number of distinct three-digit responses nginx
serves. Histogram series drop the code label entirely and aggregate
across classes. Adds nginx_ipng_latency_total with a code class label
so average latency per class can still be computed off the scrape.
Adds nginx_ipng_bytes_{in,out} histograms with configurable boundaries
via the new ipng_stats_byte_buckets directive. Bumps JSON schema to 2.

Operators who need full three-digit-code resolution should consume the
ipng_stats_logtail stream off-host; the stats zone intentionally trades
that resolution for a bounded scrape size.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:36:16 +02:00


# SPDX-License-Identifier: Apache-2.0
*** Settings ***
Documentation       End-to-end tests for ngx_http_ipng_stats_module.
...                 Deploys a 3-node containerlab topology and validates
...                 attribution, counters, histograms, filters, variables,
...                 and reload semantics.
Library             OperatingSystem
Library             String
Suite Setup         Deploy Lab
Suite Teardown      Cleanup Lab

*** Variables ***
${lab-name}         ipng-stats-test
${lab-file}         lab/ipng-stats.clab.yml
${runtime}          docker
${CLAB_BIN}         sudo containerlab
${SERVER}           clab-${lab-name}-server
${CLIENT1}          clab-${lab-name}-client1
${CLIENT2}          clab-${lab-name}-client2
${SCRAPE_URL}       http://172.20.40.2:9113/.well-known/ipng/statsz
${SERVER_MGMT}      http://172.20.40.2:8080

*** Test Cases ***
# --- Basic functionality ---
Module loads
    [Documentation]    nginx -t passes with the module loaded.
    ${output} =    Docker Exec    ${SERVER}    nginx -t 2>&1
    Should Contain    ${output}    syntax is ok

Prometheus scrape
    [Documentation]    Scrape returns HELP/TYPE preamble.
    ${output} =    Scrape Prometheus
    Should Contain    ${output}    nginx-ipng-stats-plugin
    Should Contain    ${output}    nginx_ipng_requests_total

JSON scrape
    [Documentation]    Accept: application/json returns valid JSON with schema 2.
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf -H 'Accept: application/json' ${SCRAPE_URL} | python3 -m json.tool
    Should Be Equal As Integers    ${rc}    0
    Should Contain    ${output}    "schema": 2

# --- Per-device attribution ---
Attribute cl1 via eth1
    [Documentation]    Traffic on server:eth1 carries source_tag=cl1, vip=10.0.1.1.
    Send Fast Requests    ${CLIENT1}    10.0.1.1    5
    Wait For Flush
    ${output} =    Scrape Prometheus
    Should Contain    ${output}    source_tag="cl1"
    Should Contain    ${output}    vip="10.0.1.1"

Attribute cl2 via eth2
    [Documentation]    Traffic on server:eth2 carries source_tag=cl2, vip=10.0.2.1.
    Send Fast Requests    ${CLIENT2}    10.0.2.1    5
    Wait For Flush
    ${output} =    Scrape Prometheus
    Should Contain    ${output}    source_tag="cl2"
    Should Contain    ${output}    vip="10.0.2.1"

Direct traffic tagged
    [Documentation]    Mgmt-interface traffic carries source_tag=direct.
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf ${SERVER_MGMT}/
    Should Be Equal As Integers    ${rc}    0
    Wait For Flush
    ${output} =    Scrape Prometheus
    Should Contain    ${output}    source_tag="direct"

# --- Status code tracking ---
Per-class code counters
    [Documentation]    4xx and 2xx appear as class-bucketed code= labels.
    Docker Exec Ignore Rc    ${CLIENT1}    curl -s http://10.0.1.1:8080/notfound
    Docker Exec Ignore Rc    ${CLIENT1}    curl -s http://10.0.1.1:8080/notfound
    Wait For Flush
    ${output} =    Scrape With Filter    source_tag=cl1
    Should Contain    ${output}    code="4xx"
    Should Contain    ${output}    code="2xx"
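
# A possible follow-up check, sketched from the commit description: once status
# codes are collapsed into class lanes, no three-digit code= label should remain
# in the scrape. The label rendering (for example code="404") is an assumption
# inferred from the class-label format asserted in the test above.
No three-digit code labels
    [Documentation]    Sketch: counters expose class lanes only, never raw codes.
    ${output} =    Scrape With Filter    source_tag=cl1
    Should Not Match Regexp    ${output}    code="\\d{3}"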

# --- Duration histogram ---
Duration histogram
    [Documentation]    proxy_pass to a 50 ms backend populates sum and buckets.
    Send Slow Requests    ${CLIENT1}    10.0.1.1    3
    Wait For Flush
    ${prom} =    Scrape With Filter    source_tag=cl1
    Should Match Regexp    ${prom}    request_duration_seconds_sum\\{[^}]*\\}\\s+\\d+\\.\\d*[1-9]
    ${rc}    ${json} =    Run And Return Rc And Output
    ...    curl -sf -H 'Accept: application/json' '${SCRAPE_URL}?source_tag=cl1' | python3 -m json.tool
    Should Be Equal As Integers    ${rc}    0
    Should Contain    ${json}    request_duration_ms
    Should Contain    ${json}    buckets
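
# Sketch based on the commit description, which says histogram series drop the
# code label and aggregate across classes. The _bucket suffix follows the usual
# Prometheus histogram convention and is assumed here, not confirmed by this
# file; only the _sum series name appears in the test above.
Histogram series omit code label
    [Documentation]    Sketch: duration buckets aggregate across code classes.
    ${prom} =    Scrape With Filter    source_tag=cl1
    Should Not Match Regexp    ${prom}    request_duration_seconds_bucket\\{[^}]*code=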

# --- Scrape filters ---
Filter by source_tag
    [Documentation]    ?source_tag=cl1 returns only cl1; ?source_tag=cl2 returns only cl2.
    ${output} =    Scrape With Filter    source_tag=cl1
    Should Contain    ${output}    source_tag="cl1"
    Should Not Contain    ${output}    source_tag="cl2"
    ${output} =    Scrape With Filter    source_tag=cl2
    Should Contain    ${output}    source_tag="cl2"
    Should Not Contain    ${output}    source_tag="cl1"

Filter by VIP
    [Documentation]    ?vip=10.0.1.1 excludes 10.0.2.1.
    ${output} =    Scrape With Filter    vip=10.0.1.1
    Should Contain    ${output}    vip="10.0.1.1"
    Should Not Contain    ${output}    vip="10.0.2.1"

Filter combined
    [Documentation]    source_tag and vip filters combine as an intersection.
    ${output} =    Scrape With Filter    source_tag=cl1&vip=10.0.1.1
    Should Contain    ${output}    source_tag="cl1"
    Should Contain    ${output}    vip="10.0.1.1"
    Should Not Contain    ${output}    source_tag="cl2"

Filter unknown tag
    [Documentation]    An unknown source_tag returns an empty data set.
    ${output} =    Scrape With Filter    source_tag=nonexistent
    Should Not Contain    ${output}    nginx_ipng_requests_total{

# --- nginx variable ---
Variable in access log
    [Documentation]    $ipng_source_tag appears as cl1, cl2, direct in log.
    ${output} =    Docker Exec    ${SERVER}    cat /var/log/nginx/access.log
    Should Match Regexp    ${output}    src=cl1
    Should Match Regexp    ${output}    src=cl2
    Should Match Regexp    ${output}    src=direct

UDP logtail
    [Documentation]    ipng_stats_logtail udp:// sends log lines to a local
    ...    nc listener; captured file has all sources and VIPs.
    ${output} =    Docker Exec    ${SERVER}    cat /var/log/nginx/logtail-udp.log
    Should Match Regexp    ${output}    cl1
    Should Match Regexp    ${output}    cl2
    Should Match Regexp    ${output}    direct
    Should Match Regexp    ${output}    10\\.0\\.1\\.1
    Should Match Regexp    ${output}    10\\.0\\.2\\.1
    # Tab-separated format
    Should Match Regexp    ${output}    \\t

Logtail if= filter
    [Documentation]    Requests to /notfound are suppressed from logtail by
    ...    the if=$logtail_enabled condition, but still counted.
    ${output} =    Docker Exec    ${SERVER}    cat /var/log/nginx/logtail-udp.log
    Should Not Contain    ${output}    /notfound
    # But /notfound IS in the regular access log (not filtered there).
    ${access} =    Docker Exec    ${SERVER}    cat /var/log/nginx/access.log
    Should Contain    ${access}    /notfound

VIP in access log
    [Documentation]    $server_addr resolves to real IPs, not 0.0.0.0.
    ${output} =    Docker Exec    ${SERVER}    cat /var/log/nginx/access.log
    Should Contain    ${output}    vip=10.0.1.1
    Should Contain    ${output}    vip=10.0.2.1
    Should Not Contain    ${output}    vip=0.0.0.0

# --- Reload resilience ---
Counters survive reload
    [Documentation]    Shared-memory zone persists across nginx -s reload.
    ${before} =    Get Request Count    cl1
    Docker Exec    ${SERVER}    nginx -s reload
    Sleep    2s    Wait for new workers
    ${after} =    Get Request Count    cl1
    Should Be True    ${after} >= ${before}
    ...    Counters dropped after reload: before=${before} after=${after}

Traffic after reload
    [Documentation]    New requests are counted after reload.
    Send Fast Requests    ${CLIENT1}    10.0.1.1    3
    Wait For Flush
    ${output} =    Scrape With Filter    source_tag=cl1
    Should Contain    ${output}    source_tag="cl1"

# --- Counter correctness ---
Request count accuracy
    [Documentation]    10 requests per client yield a counter delta of exactly 10.
    ${before_cl1} =    Get Request Count    cl1
    ${before_cl2} =    Get Request Count    cl2
    Send Fast Requests    ${CLIENT1}    10.0.1.1    10
    Send Fast Requests    ${CLIENT2}    10.0.2.1    10
    Wait For Flush
    ${after_cl1} =    Get Request Count    cl1
    ${after_cl2} =    Get Request Count    cl2
    ${delta_cl1} =    Evaluate    ${after_cl1} - ${before_cl1}
    ${delta_cl2} =    Evaluate    ${after_cl2} - ${before_cl2}
    Should Be Equal As Integers    ${delta_cl1}    10
    Should Be Equal As Integers    ${delta_cl2}    10

*** Keywords ***
# --- Lab lifecycle ---
Deploy Lab
    Run    ${CLAB_BIN} --runtime ${runtime} destroy -t ${CURDIR}/${lab-file} --cleanup 2>&1 || true
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    ${CLAB_BIN} --runtime ${runtime} deploy -t ${CURDIR}/${lab-file}
    Log    ${output}
    Should Be Equal As Integers    ${rc}    0
    Wait Until Keyword Succeeds    90s    3s    Server Is Ready
    Wait Until Keyword Succeeds    60s    3s    Client Can Reach Server    ${CLIENT1}    10.0.1.1
    Wait Until Keyword Succeeds    60s    3s    Client Can Reach Server    ${CLIENT2}    10.0.2.1

Server Is Ready
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf ${SCRAPE_URL}
    Should Be Equal As Integers    ${rc}    0

Client Can Reach Server
    [Arguments]    ${client}    ${server_ip}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    docker exec ${client} curl -sf http://${server_ip}:8080/
    Should Be Equal As Integers    ${rc}    0

Cleanup Lab
    Run    docker logs ${SERVER} > ${EXECDIR}/tests/out/server-docker.log 2>&1
    Run    docker exec ${SERVER} cat /var/log/nginx/access.log > ${EXECDIR}/tests/out/server-access.log 2>&1
    Run    docker exec ${SERVER} cat /var/log/nginx/error.log > ${EXECDIR}/tests/out/server-error.log 2>&1
    Run    docker exec ${SERVER} cat /var/log/nginx/logtail-udp.log > ${EXECDIR}/tests/out/server-logtail-udp.log 2>&1
    Run    docker exec ${SERVER} ip addr > ${EXECDIR}/tests/out/server-ip-addr.log 2>&1
    Run    docker exec ${SERVER} ip route > ${EXECDIR}/tests/out/server-ip-route.log 2>&1
    Run    ${CLAB_BIN} --runtime ${runtime} destroy -t ${CURDIR}/${lab-file} --cleanup

# --- Traffic generation ---
Send Fast Requests
    [Arguments]    ${client}    ${server_ip}    ${count}
    FOR    ${i}    IN RANGE    ${count}
        Docker Exec    ${client}    curl -sf http://${server_ip}:8080/
    END

Send Slow Requests
    [Arguments]    ${client}    ${server_ip}    ${count}
    FOR    ${i}    IN RANGE    ${count}
        Docker Exec    ${client}    curl -sf http://${server_ip}:8080/slow
    END

Wait For Flush
    Sleep    2s

# --- Scraping ---
Scrape Prometheus
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf ${SCRAPE_URL}
    Should Be Equal As Integers    ${rc}    0
    RETURN    ${output}

Scrape With Filter
    [Arguments]    ${filter}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    curl -sf '${SCRAPE_URL}?${filter}'
    Should Be Equal As Integers    ${rc}    0
    RETURN    ${output}

Get Request Count
    [Arguments]    ${source}
    ${output} =    Scrape With Filter    source_tag=${source}
    ${matches} =    Get Regexp Matches    ${output}
    ...    nginx_ipng_requests_total\\{[^}]*\\}\\s+(\\d+)    1
    ${total} =    Set Variable    0
    FOR    ${m}    IN    @{matches}
        ${total} =    Evaluate    ${total} + ${m}
    END
    RETURN    ${total}
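
# A hypothetical helper, not part of the original suite: it sums the request
# counters for a single code class (for example 4xx), mirroring Get Request
# Count above. It assumes the class label renders as code="4xx", which is what
# the Per-class code counters test checks for.
Get Request Count For Class
    [Arguments]    ${source}    ${class}
    ${output} =    Scrape With Filter    source_tag=${source}
    ${matches} =    Get Regexp Matches    ${output}
    ...    nginx_ipng_requests_total\\{[^}]*code="${class}"[^}]*\\}\\s+(\\d+)    1
    ${total} =    Set Variable    0
    FOR    ${m}    IN    @{matches}
        ${total} =    Evaluate    ${total} + ${m}
    END
    RETURN    ${total}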

# --- Container helpers ---
Docker Exec
    [Arguments]    ${container}    ${cmd}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    docker exec ${container} ${cmd}
    Should Be Equal As Integers    ${rc}    0
    RETURN    ${output}

Docker Exec Ignore Rc
    [Arguments]    ${container}    ${cmd}
    ${rc}    ${output} =    Run And Return Rc And Output
    ...    docker exec ${container} ${cmd}
    RETURN    ${output}