Commit Graph

6 Commits

Author SHA1 Message Date
Pim van Pelt
450391af6b Switch per-device attribution from SO_BINDTODEVICE to IP_PKTINFO
SO_BINDTODEVICE pins both ingress *and* egress to the bound
interface — the kernel uses the listening socket's device
binding when choosing the output interface for the SYN-ACK,
which is sent before accept() returns and therefore can't be
fixed up in userspace. That's fatal for maglev / DSR
deployments where the SYN arrives through a GRE tunnel but the
return path has to leave via the default route; the SYN-ACK
goes out the GRE and is dropped by the uplink, so every new
connection times out.

Rework the listen plumbing so the module never touches
SO_BINDTODEVICE. init_module now enables IP_PKTINFO and
IPV6_RECVPKTINFO on every HTTP listening socket and resolves
each configured `device=` name to an ifindex. At request time
resolve_source calls getsockopt(IP_PKTOPTIONS) on the accepted
fd to read the per-connection in(6)_pktinfo cmsg the kernel
stashed during the handshake, then matches (ifindex, family)
against the bindings table. The listening sockets remain plain
wildcards, so the return path follows the normal routing table
and DSR works.

The wrapper also no longer clones or rebinds sockets: it still
dedups per (cscf, sockaddr) so multiple device-tagged listens
in a single server block coexist, and dedups bindings on
(device, family) so the same device can carry different tags
for v4 and v6 (e.g. tag2-v4 / tag2-v6) but not pointlessly
duplicate when a listen include is shared across server blocks.

Drive-by fixes to unblock `make pkg-deb` after a prior
`make build-asan`:
- debian/rules overrides dh_clean to exclude build/, since
  nginx-asan's install creates nobody:0700 temp dirs dh_clean
  can't traverse.
- Makefile's build-asan removes those unused runtime temp dirs
  so the tree is clean afterwards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:00:46 +02:00
Pim van Pelt
cf7a538ee6 PRE-RELEASE v0.4.0
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:45:12 +02:00
Pim van Pelt
ef821e577b PRE-RELEASE v0.3.0
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:49:49 +02:00
Pim van Pelt
fdef2a552b Harden scrape rendering and add AddressSanitizer test suite
Move all heap allocation out of the slab-mutex critical section in
render_prom/render_json: snapshot cardinality under a brief lock,
allocate aggs/snaps/string tables outside the lock, then re-acquire
only to deep-copy strings and walk the LRU into the pre-allocated
buffers. A worker crash during output buffer allocation can no
longer leave the shared-memory zone locked, and a corrupt cardinality
count is caught by a 10k sanity cap rather than causing a runaway
ngx_pcalloc.

Add build-asan and tests/02-asan/: a full sanitizer-instrumented
nginx + module built via apt-source, and a 2-node containerlab
Robot suite that drives reload storms, concurrent scrape-during-reload,
and intern-table growth, failing if AddressSanitizer or UBSan
reports anything on stderr. The two Robot suites now check for
their required build artifacts up front so `make robot-test` no
longer rebuilds them on every invocation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 10:58:51 +02:00
Pim van Pelt
cdcbb07c9a Release v0.2.0: single-source the version, wildcard it in docs
Introduces a VERSION variable in the top-level Makefile as the
authoritative source for the module's reported version. A new
version-header target writes src/version.h only when the content
would change, so no-op rebuilds don't rewrite the file. The C source
#includes that header in place of a hardcoded #define; the
user-guide's install example is wildcarded
(libnginx-mod-http-ipng-stats_*_amd64.deb) so it doesn't drift.

The design doc still references v0.2.0 by name — operators read it as
a point-in-time description, not a moving target.

debian/changelog keeps its own 0.2.0-1 entry because dpkg reads the
package version from there directly; the e2e test is updated to match
the JSON schema bump to 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:45:15 +02:00
Pim van Pelt
5a7e2f77f1 Add ngx_http_ipng_stats_module: per-VIP, per-device traffic counters
Full implementation of the nginx dynamic module with:
- SO_BINDTODEVICE-based per-interface traffic attribution
- Per-worker lock-free counters flushed to shared memory
- Prometheus text and JSON scrape endpoint at configurable location
- UDP-only global logtail (ipng_stats_logtail) for fire-and-forget
  access log streaming
- $ipng_source_tag nginx variable for use in log_format/map
- Histogram buckets, EWMA rate gauges, zone meta-metrics
- Debian packaging (libnginx-mod-http-ipng-stats)
- Robot Framework end-to-end tests via containerlab
- SPDX Apache-2.0 headers on all source files
2026-04-16 17:41:23 +02:00