Self-heal device= → ifindex attribution and expose plugin meta
counters in the scrape.
ipng_stats_rescan_interval (default 60s, 0 to disable) runs a
per-worker timer that re-resolves every binding via if_nametoindex,
so interface teardown/recreate (e.g. GRE tunnel reprovision) picks
up the new ifindex without requiring an nginx reload.
nginx_ipng_ifindex_misses_total increments whenever a cmsg-reported
ingress ifindex doesn't match any binding — making stale mappings
observable. Also expose the existing zone_full_events and
flushes_total shared-memory counters, which were tracked but never
emitted. JSON output gains a top-level "meta" object; schema stays
at 2 (additive change).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SO_BINDTODEVICE pins both ingress *and* egress to the bound
interface — the kernel uses the listening socket's device
binding when choosing the output interface for the SYN-ACK,
which is sent before accept() returns and therefore can't be
fixed up in userspace. That's fatal for maglev / DSR
deployments where the SYN arrives through a GRE tunnel but the
return path has to leave via the default route; the SYN-ACK
goes out the GRE and is dropped by the uplink, so every new
connection times out.
Rework the listen plumbing so the module never touches
SO_BINDTODEVICE. init_module now enables IP_PKTINFO and
IPV6_RECVPKTINFO on every HTTP listening socket and resolves
each configured `device=` name to an ifindex. At request time
resolve_source calls getsockopt(IP_PKTOPTIONS) on the accepted
fd to read the per-connection in(6)_pktinfo cmsg the kernel
stashed during the handshake, then matches (ifindex, family)
against the bindings table. The listening sockets remain plain
wildcards, so the return path follows the normal routing table
and DSR works.
The wrapper also no longer clones or rebinds sockets: it still
dedups per (cscf, sockaddr) so multiple device-tagged listens
in a single server block coexist, and dedups bindings on
(device, family) so the same device can carry different tags
for v4 and v6 (e.g. tag2-v4 / tag2-v6) but not pointlessly
duplicate when a listen include is shared across server blocks.
Drive-by fixes to unblock `make pkg-deb` after a prior
`make build-asan`:
- debian/rules overrides dh_clean to exclude build/, since
nginx-asan's install creates nobody:0700 temp dirs dh_clean
can't traverse.
- Makefile's build-asan removes those unused runtime temp dirs
so the tree is clean afterwards.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nginx's config-level duplicate-listen check rejected the
documented pattern of `listen 80 device=X ipng_source_tag=A;
listen 80 device=Y ipng_source_tag=B;` with "a duplicate listen
0.0.0.0:80", and even when the dedup was bypassed the kernel
refused the second bind() because the first socket was already
holding the port without SO_BINDTODEVICE.
The listen wrapper now detects same-sockaddr duplicates before
the core handler sees them and records them with `needs_clone=1`.
In init_module, phase 1 clones an ngx_listening_t for each such
duplicate, phase 3 closes every inherited naked fd, and phase 4
rebinds every target with SO_REUSEADDR + SO_REUSEPORT +
SO_BINDTODEVICE set before bind(). SO_REUSEPORT keeps
`nginx -s reload` from colliding with the still-bound sockets
held by old workers during graceful drain; IPV6_V6ONLY matches
nginx's default so the IPv6 listen doesn't claim the IPv4
wildcard and collide with sibling IPv4-specific listens.
Restructure 01-module to cover the pattern end-to-end: four
device-pinned listens on port 8080 (eth1 shares tag `tag1`
across v4 and v6; eth2 splits into `tag2-v4` / `tag2-v6`),
clients and server both get IPv6 addresses, and a new
"Per-(device, family) request count accuracy" case proves that
10 requests on each of the four combinations yields tag1=20,
tag2-v4=10, tag2-v6=10. Mgmt/direct traffic moves to port 9180
so it no longer clashes with the shared-port wildcards.
Document the constraint in docs/user-guide.md: all listens on
a given port must carry `device=`, and direct traffic belongs
on a separate port.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces a VERSION variable in the top-level Makefile as the
authoritative source for the module's reported version. A new
version-header target writes src/version.h only when the content
would change, so no-op rebuilds don't rewrite the file. The C source
#includes that header in place of a hardcoded #define; the
user-guide's install example is wildcarded
(libnginx-mod-http-ipng-stats_*_amd64.deb) so it doesn't drift.
The design doc still references v0.2.0 by name — operators read it as
a point-in-time description, not a moving target.
debian/changelog keeps its own 0.2.0-1 entry because dpkg reads the
package version from there directly; the e2e test is updated to match
the JSON schema bump to 2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collapses the status-code dimension of the counter key into six class
lanes (1xx..5xx/unknown) so per-(source,vip) counter cardinality no
longer grows with the number of distinct three-digit responses nginx
serves. Histogram series drop the code label entirely and aggregate
across classes. Adds nginx_ipng_latency_total with a code class label
so average latency per class can still be computed off the scrape.
Adds nginx_ipng_bytes_{in,out} histograms with configurable boundaries
via the new ipng_stats_byte_buckets directive. Bumps JSON schema to 2.
Operators who need full three-digit-code resolution should consume the
ipng_stats_logtail stream off-host; the stats zone intentionally trades
that resolution for a bounded scrape size.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- ipng_stats_logtail now accepts an optional if=$variable parameter
that suppresses log lines when the variable is empty or "0",
following the same semantics as nginx's access_log if=. The
condition is checked before format rendering for zero overhead on
filtered requests. Filtered requests are still counted by stats.
- Log format examples updated to include $scheme for http/https
visibility, and renamed to ipng_stats_logtail to match production.
- Robot test added for the if= filter (19 tests, 19 pass).
- FR-8.5 added to design doc for the if= semantics.
Full implementation of the nginx dynamic module with:
- SO_BINDTODEVICE-based per-interface traffic attribution
- Per-worker lock-free counters flushed to shared memory
- Prometheus text and JSON scrape endpoint at configurable location
- UDP-only global logtail (ipng_stats_logtail) for fire-and-forget
access log streaming
- $ipng_source_tag nginx variable for use in log_format/map
- Histogram buckets, EWMA rate gauges, zone meta-metrics
- Debian packaging (libnginx-mod-http-ipng-stats)
- Robot Framework end-to-end tests via containerlab
- SPDX Apache-2.0 headers on all source files