Add ngx_http_ipng_stats_module: per-VIP, per-device traffic counters

Full implementation of the nginx dynamic module with:

- SO_BINDTODEVICE-based per-interface traffic attribution
- Per-worker lock-free counters flushed to shared memory
- Prometheus text and JSON scrape endpoint at configurable location
- UDP-only global logtail (ipng_stats_logtail) for fire-and-forget access log streaming
- $ipng_source_tag nginx variable for use in log_format/map
- Histogram buckets, EWMA rate gauges, zone meta-metrics
- Debian packaging (libnginx-mod-http-ipng-stats)
- Robot Framework end-to-end tests via containerlab
- SPDX Apache-2.0 headers on all source files

384
docs/user-guide.md
Normal file
@@ -0,0 +1,384 @@
<!-- SPDX-License-Identifier: Apache-2.0 -->
# nginx-ipng-stats-plugin: User Guide

This document walks an operator through installing the plugin, deploying it on a single nginx host serving traffic that arrives on
distinct interfaces (GRE tunnels, VLANs, bonded links, or plain ethernet), verifying that counters are flowing, and hooking up the
scrape endpoint to Prometheus and other consumers.

It covers (NFR-7.1):

1. Installing the Debian package.
2. Setting up interfaces for per-device attribution (GRE tunnel example).
3. Writing a minimal nginx configuration.
4. Verifying with `curl`.
5. Scraping from Prometheus.
6. Setting up a global logtail access log.
7. Integrating with scrape consumers.

For a directive-by-directive reference, read [`config-guide.md`](config-guide.md) alongside this guide.

## 1. Install the package

On Debian Trixie (and newer), the module is distributed as `libnginx-mod-http-ipng-stats`. The package depends on the stock `nginx`
package and loads cleanly into it without recompiling nginx itself.

```
sudo apt install ./libnginx-mod-http-ipng-stats_0.1.0-1_amd64.deb
```

The package will:

- Drop `ngx_http_ipng_stats_module.so` into `/usr/lib/nginx/modules/`.
- Place a `load_module` stanza in `/etc/nginx/modules-available/50-mod-http-ipng-stats.conf`.
- Symlink it into `/etc/nginx/modules-enabled/` so nginx picks it up on the next reload.
- Run `nginx -t` and, if the test fails, remove the `modules-enabled` symlink and print a warning, so a broken upgrade never leaves
  you with an nginx that cannot start.

Confirm the module is loaded. `nginx -V` only lists modules compiled into the binary, so check the effective configuration for the
`load_module` line instead:

```
sudo nginx -T 2>/dev/null | grep ngx_http_ipng_stats_module
```

## 2. Set up interfaces for per-device attribution

The plugin attributes traffic by watching which interface the request came in on, using `SO_BINDTODEVICE` on per-interface listening
sockets. For this to work, each traffic source that should be tracked separately MUST arrive on its own interface.

This works with any kind of Linux interface: GRE tunnels, VLANs, VXLANs, bonded links, or plain ethernet. This guide uses GRE
tunnels as the example, but the module does not care about the interface type.

This guide doesn't prescribe a specific networking layer; use whatever your host already uses (`systemd-networkd`, Netplan,
`/etc/network/interfaces`, or a hand-rolled script). The hard requirements are:

- Each traffic source that should be separately attributed gets its own interface on the nginx host.
- Interfaces follow a consistent naming pattern. For GRE tunnels we recommend `gre-<tag>`, e.g. `gre-mg1`, `gre-mg2`.
- The VIPs are bound to a local dummy or loopback interface so the kernel accepts packets destined for them.

For example, with `systemd-networkd`, a GRE tunnel to a remote peer at `2001:db8::1` from this host at `2001:db8::100` looks like:

```
# /etc/systemd/network/10-gre-mg1.netdev
[NetDev]
Name=gre-mg1
Kind=ip6gre

[Tunnel]
Local=2001:db8::100
Remote=2001:db8::1
TTL=64
```

```
# /etc/systemd/network/10-gre-mg1.network
[Match]
Name=gre-mg1

[Network]
LinkLocalAddressing=no
```

Depending on your setup, `systemd-networkd` may only create the tunnel once the underlay's `.network` file references it with
`Tunnel=gre-mg1`, or once the `[Tunnel]` section sets `Independent=yes`.

Repeat for each additional tunnel. A trimmed-down variant of this scheme is what IPng uses in production.

Verify the interfaces exist and carry traffic:

```
ip -6 tunnel show | grep gre-mg
ip -6 -s link show gre-mg1
```

## 3. Write the nginx configuration

The plugin needs three things in `nginx.conf`:

1. A shared-memory zone for counters (`ipng_stats_zone`).
2. A set of `listen` directives: a wildcard fallback plus one device-bound listener per attributed interface.
3. A scrape location serving the `ipng_stats` handler.

A minimal working configuration looks like this:

```nginx
# Omit this line if your nginx.conf already includes /etc/nginx/modules-enabled/*.conf
# (the Debian default); loading the same module twice is an error.
load_module modules/ngx_http_ipng_stats_module.so;

events {
    worker_connections 4096;
}

http {
    ipng_stats_zone ipng:4m;
    ipng_stats_flush_interval 1s;
    ipng_stats_default_source direct;

    # A normal vhost. The fallback listen lines serve direct web traffic;
    # the included file adds one device-bound listen per attributed interface.
    server {
        listen 80;
        listen [::]:80;
        include /etc/nginx/ipng-stats/listens.conf;

        server_name _;
        root /var/www/html;
    }

    # A second server block exposing the scrape endpoint on a locked-down port.
    server {
        listen 127.0.0.1:9113;
        listen [::1]:9113;

        location = /.well-known/ipng/statsz {
            ipng_stats;
            allow 127.0.0.1;
            allow ::1;
            allow 2001:db8::/48; # your scrape consumers
            deny all;
        }
    }
}
```

And `/etc/nginx/ipng-stats/listens.conf`, the hand-maintained include file, is two lines per attributed interface (one per address
family):

```nginx
listen 80 device=gre-mg1 ipng_source_tag=mg1;
listen [::]:80 device=gre-mg1 ipng_source_tag=mg1;
listen 80 device=gre-mg2 ipng_source_tag=mg2;
listen [::]:80 device=gre-mg2 ipng_source_tag=mg2;
listen 80 device=gre-mg3 ipng_source_tag=mg3;
listen [::]:80 device=gre-mg3 ipng_source_tag=mg3;
listen 80 device=gre-mg4 ipng_source_tag=mg4;
listen [::]:80 device=gre-mg4 ipng_source_tag=mg4;
```
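
Because `listens.conf` is so regular, it can be generated rather than typed. The following is a minimal sketch, not something shipped with the package: the function name `render_listens` and the `prefix` argument are hypothetical, and it assumes the `gre-<tag>` naming pattern recommended in section 2.

```python
def render_listens(interfaces, port=80, prefix="gre-"):
    """Return listens.conf text: one IPv4 and one IPv6 listen line per interface."""
    lines = []
    for ifname in interfaces:
        # Derive the source tag from the naming pattern, e.g. "gre-mg1" -> "mg1".
        # Interfaces that do not follow the pattern keep their full name as tag.
        tag = ifname[len(prefix):] if ifname.startswith(prefix) else ifname
        for addr in (str(port), f"[::]:{port}"):
            lines.append(f"listen {addr} device={ifname} ipng_source_tag={tag};")
    return "\n".join(lines) + "\n"
```

Writing the result of `render_listens(["gre-mg1", "gre-mg2"])` to `/etc/nginx/ipng-stats/listens.conf` yields the first four lines of the file above.
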

Test and reload:

```
sudo nginx -t
sudo nginx -s reload
```

If `nginx -t` complains about an unknown `listen` parameter (`device=` or `ipng_source_tag=`), the module isn't loaded; check step 1.

### Why wildcard listens?

You do not need to enumerate VIPs in `listen`. A wildcard `listen 80 device=gre-mg1 ipng_source_tag=mg1;` accepts any local address
served through the `gre-mg1` interface, and nginx routes per-request to the right vhost by `server_name` / `Host:` header. Adding a new
VIP is a `server_name` change; adding a new interface is an append to `listens.conf`.

### Why both a wildcard and device-bound listens?

The fallback `listen 80;` / `listen [::]:80;` catches traffic arriving on any interface that isn't one of your attributed interfaces,
for example real clients hitting your host directly over `eth0`. The kernel's TCP socket lookup prefers the most-specific
(device-matching) listener, so a SYN on `gre-mg1` always lands on the `mg1` socket, and a SYN on `eth0` always lands on the fallback.
No races, no stealing. Direct traffic is counted under the tag set by `ipng_stats_default_source` (`direct` by default).
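
The selection rule can be written down as a toy model. This is only an illustration of the behaviour described above, not kernel or module code; `pick_listener` and the `listeners` mapping are hypothetical names, with `None` standing for the wildcard fallback socket.

```python
from typing import Optional

def pick_listener(ingress_if: str, listeners: dict[Optional[str], str]) -> str:
    """Return the source tag of the listener that would accept a new connection.

    listeners maps a bound device name (None = not bound to any device)
    to the source tag configured on that listen line.
    """
    # A listener bound to the ingress interface is more specific and wins.
    if ingress_if in listeners:
        return listeners[ingress_if]
    # Otherwise the connection falls through to the unbound wildcard listener.
    return listeners[None]

listeners = {"gre-mg1": "mg1", "gre-mg2": "mg2", None: "direct"}
```

With this table, `pick_listener("gre-mg1", listeners)` returns `"mg1"` and `pick_listener("eth0", listeners)` returns `"direct"`.
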

## 4. Verify with curl

Generate some traffic (or wait for real traffic), then scrape the endpoint locally:

```
curl -s http://127.0.0.1:9113/.well-known/ipng/statsz
```

Default output is Prometheus text format:

```
# HELP nginx_ipng_requests_total Total HTTP requests, per (source_tag, vip, code).
# TYPE nginx_ipng_requests_total counter
nginx_ipng_requests_total{source_tag="mg1",vip="192.0.2.10",code="200"} 12345
nginx_ipng_requests_total{source_tag="mg1",vip="192.0.2.10",code="404"} 17
nginx_ipng_requests_total{source_tag="mg2",vip="192.0.2.10",code="200"} 9876
nginx_ipng_requests_total{source_tag="direct",vip="192.0.2.10",code="200"} 42
# HELP nginx_ipng_bytes_in_total Request bytes received, per (source_tag, vip, code).
# TYPE nginx_ipng_bytes_in_total counter
nginx_ipng_bytes_in_total{source_tag="mg1",vip="192.0.2.10",code="200"} 9876543
# ... and so on
```
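
For a quick sanity check without Prometheus, the text output can be parsed by hand. The sketch below is a deliberately small parser that handles only the subset of the exposition format shown above (no timestamps, no unescaping of label values); `parse_samples` and `requests_by_source` are illustrative names, not part of the plugin.

```python
import re
from collections import defaultdict

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

def parse_samples(text):
    """Yield (metric_name, labels_dict, float_value) for each sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if m is None:
            continue  # anything outside the supported subset is ignored
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        yield m.group("name"), labels, float(m.group("value"))

def requests_by_source(text):
    """Sum nginx_ipng_requests_total over VIPs and status codes, per source_tag."""
    totals = defaultdict(float)
    for name, labels, value in parse_samples(text):
        if name == "nginx_ipng_requests_total":
            totals[labels.get("source_tag", "")] += value
    return dict(totals)
```

Feeding it the sample output above gives one total per source tag, which makes an unexpected `direct` share easy to spot.
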

For JSON output instead, set the `Accept` header:

```
curl -s -H 'Accept: application/json' http://127.0.0.1:9113/.well-known/ipng/statsz | jq .
```

To filter server-side by source tag, optionally narrowed further to a single VIP:

```
curl -s 'http://127.0.0.1:9113/.well-known/ipng/statsz?source_tag=mg1'
curl -s 'http://127.0.0.1:9113/.well-known/ipng/statsz?source_tag=mg1&vip=192.0.2.10'
```
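
The same requests are easy to issue from a script. This is a sketch using only the Python standard library; `scrape_url` and `fetch_json` are hypothetical helper names, and the only things taken from this guide are the `source_tag` and `vip` query parameters and the `Accept: application/json` header.

```python
import json
import urllib.request
from urllib.parse import urlencode

def scrape_url(base, source_tag=None, vip=None):
    """Build the scrape URL, adding only the filters that were given."""
    params = {k: v for k, v in (("source_tag", source_tag), ("vip", vip)) if v is not None}
    return base + ("?" + urlencode(params) if params else "")

def fetch_json(base, **filters):
    """GET the endpoint with Accept: application/json and decode the body."""
    req = urllib.request.Request(scrape_url(base, **filters),
                                 headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)
```

For example, `fetch_json("http://127.0.0.1:9113/.well-known/ipng/statsz", source_tag="mg1")` mirrors the first filtered `curl` call above, with JSON instead of text output.
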

If you see `source_tag="direct"` entries with non-zero counts and you expected all traffic to come in via attributed interfaces,
something is routing around them, typically an interface that isn't in `listens.conf` or an interface that's down.

## 5. Scrape from Prometheus

The same endpoint serves Prometheus text by default. Add a scrape job:

```yaml
# /etc/prometheus/prometheus.yml
scrape_configs:
  - job_name: nginx-ipng
    scrape_interval: 15s
    static_configs:
      - targets:
          - 'nginx-backend-1.example.com:9113'
          - 'nginx-backend-2.example.com:9113'
    metrics_path: /.well-known/ipng/statsz
```

The scrape server block in section 3 listens on loopback only, so for a remote Prometheus you'll need to add a `listen` on an address
it can reach and put the Prometheus server's address (not the `nginx-backend-*` targets) in the `allow` rules, or front the plugin
with a TLS-terminating reverse proxy. The module does not ship its own auth; the nginx `allow`/`deny` ACL is your access control.

Typical PromQL queries:

```
# Requests per second per source, per VIP:
sum by (source_tag, vip) (rate(nginx_ipng_requests_total[1m]))

# 5xx error rate per VIP, aggregated across all sources:
sum by (vip) (rate(nginx_ipng_requests_total{code=~"5.."}[5m]))
  /
sum by (vip) (rate(nginx_ipng_requests_total[5m]))

# p95 request duration per (source_tag, vip):
histogram_quantile(0.95,
  sum by (source_tag, vip, le) (rate(nginx_ipng_request_duration_seconds_bucket[5m])))
```
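
To see what the 5xx query computes, here is the same arithmetic on two raw scrapes. It is a simplified sketch: PromQL's `rate()` also extrapolates to the window edges, which this version leaves out, and `counter_rate` and `error_ratio` are illustrative names.

```python
def counter_rate(prev, curr, seconds):
    """Per-second increase of a counter between two scrapes."""
    # A counter that went backwards was reset (e.g. nginx restarted),
    # so everything in `curr` accumulated since the reset.
    delta = curr - prev if curr >= prev else curr
    return delta / seconds

def error_ratio(prev, curr, seconds):
    """Share of 5xx responses for one VIP.

    prev and curr map status code (as a string) to the requests_total
    value seen in the earlier and later scrape.
    """
    rates = {code: counter_rate(prev.get(code, 0), value, seconds)
             for code, value in curr.items()}
    total = sum(rates.values())
    errors = sum(r for code, r in rates.items()
                 if len(code) == 3 and code.startswith("5"))
    return errors / total if total else 0.0
```

With `prev = {"200": 1000, "502": 10}` and `curr = {"200": 1900, "502": 110}` taken 60 seconds apart, 100 of the 1000 new requests are 5xx, so the ratio is 0.1.
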

## 6. Set up a global logtail access log

An `access_log` set at the `http` level is inherited only by `server {}` and `location {}` blocks that declare no `access_log` of
their own. Operators who want a single unified access log covering all traffic, regardless of which `server` block handled the
request, therefore normally have to repeat `access_log` in every block that overrides it or rely on a catch-all virtual host. The
`ipng_stats_logtail` directive removes that requirement: one line at the `http` level registers a global log-phase writer that fires
unconditionally for every request (FR-8.1).

The logtail sends each buffer flush as a single UDP datagram to a `host:port`. There is no disk I/O, no backpressure, and no blocking
if the receiver is down. This makes it ideal for fire-and-forget analytics pipelines where delivery guarantees are unnecessary and
disk writes would add unwanted I/O pressure. For file-based access logging, use nginx's built-in `access_log` directive.

### Define the log format

Add a `log_format` declaration inside the `http { ... }` block, **before** the `ipng_stats_logtail` directive that references it:

```nginx
log_format logtail '$host\t$remote_addr\t$ipng_source_tag\t$server_addr\t'
                   '$request_method\t$request_uri\t$status\t$body_bytes_sent\t'
                   '$request_time';
```

Any nginx variable is usable here, including `$ipng_source_tag` (the device attribution tag, FR-6.1) and `$server_addr` (the VIP
that received the request).

### Configuration

```nginx
http {
    ipng_stats_zone ipng:4m;

    log_format logtail '$host\t$remote_addr\t$ipng_source_tag\t$server_addr\t'
                       '$request_method\t$request_uri\t$status\t$body_bytes_sent\t'
                       '$request_time';

    ipng_stats_logtail logtail udp://127.0.0.1:9514 buffer=16k flush=1s;

    server { ... }
}
```

- **`logtail`** (first argument): the `log_format` name.
- **`udp://127.0.0.1:9514`**: destination as a `udp://host:port` URI. `host` must be a literal IPv4 address (no hostnames, no IPv6
  in v0.1).
- **`buffer=16k`**: per-worker write buffer. Lines are held in memory until the buffer fills, the flush timer fires, or the worker
  exits. Default is `64k`; minimum is `1k` (FR-8.3).
- **`flush=1s`**: maximum age of buffered data before it is sent. Default is `1s`; minimum is `100ms` (FR-8.3).

Each buffer flush becomes a single `sendto()` on a per-worker `SOCK_DGRAM` socket. When the flush timer fires (or the buffer fills),
the entire buffered payload is sent as one datagram: no file open, no `write()`, no `fsync()`. If no receiver is listening, the kernel
drops the datagram silently and the worker carries on. This is by design: the logtail exists for non-critical analytics pipes where
lost datagrams are acceptable and disk I/O is not.

**Constraints (v0.1):**

- `host` must be a literal IPv4 address. Hostnames and IPv6 are not supported yet.
- Large `buffer=` values produce large datagrams. A single IPv4 UDP datagram carries at most 65,507 bytes of payload, so ~64 KB is
  the practical ceiling even on the loopback interface. On routed paths the path MTU applies as well: larger datagrams are fragmented
  at the IP layer, and losing one fragment loses the whole datagram.
- There is no acknowledgment, retry, or sequence number. If the receiver is down, the data is gone.

**Starting a receiver** is trivial:

```bash
# Quick one-shot inspection:
nc -u -l 127.0.0.1 9514
```
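
A scripted receiver is barely longer than the `nc` one-liner. The sketch below binds a UDP socket and splits each datagram back into log lines; it assumes that lines inside a datagram are newline-terminated, which this guide does not state explicitly, and the function names are hypothetical.

```python
import socket

def split_datagram(payload: bytes) -> list[str]:
    """One datagram carries a whole buffer flush, i.e. zero or more log lines."""
    # Assumption: lines are separated by "\n"; empty trailing pieces are dropped.
    return [line for line in payload.decode("utf-8", errors="replace").split("\n") if line]

def open_receiver(host="127.0.0.1", port=9514):
    """Bind a UDP socket on the address the logtail sends to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock

def receive_once(sock, bufsize=65535):
    """Block until one datagram arrives and return the log lines it contains."""
    payload, _peer = sock.recvfrom(bufsize)
    return split_datagram(payload)
```

Calling `receive_once(open_receiver())` in a loop and printing each returned line gives roughly what `nc` shows. The 65535-byte receive buffer is large enough for any single UDP datagram.
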

For a production-ready logtail consumer, see [`nginx-logtail`](https://git.ipng.ch/ipng/nginx-logtail), which receives the UDP
datagram stream and processes it into structured log output.

A typical received log line (with the format above, tab-separated) looks like:

```
example.com 203.0.113.42 mg1 192.0.2.10 GET /index.html 200 4321 0.003
```

The third field (`mg1`) comes from `$ipng_source_tag`: free per-device attribution in every log line.
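
Turning such a line into a typed record takes only a few lines of code. This sketch is tied to the nine-field `logtail` format defined above and would need adjusting for any other `log_format`; `LogLine` and `parse_line` are illustrative names.

```python
from dataclasses import dataclass

@dataclass
class LogLine:
    host: str
    remote_addr: str
    source_tag: str
    server_addr: str
    method: str
    uri: str
    status: int
    body_bytes_sent: int
    request_time: float

def parse_line(line: str) -> LogLine:
    """Split one tab-separated logtail line and convert the numeric fields."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(parts)}")
    host, remote, tag, vip, method, uri, status, sent, rtime = parts
    return LogLine(host, remote, tag, vip, method, uri,
                   int(status), int(sent), float(rtime))
```

Grouping the resulting records by `source_tag` or `server_addr` then gives per-device or per-VIP views of the log stream.
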

### Why this complements per-server `access_log`

A conventional nginx access log is inherited from the `http` level only by blocks that set no `access_log` of their own, so the
operator has to repeat `access_log /path/to/file logtail;` in every `server {}` block that declares its own log and should still be
captured. This is error-prone: adding such a vhost and forgetting the shared directive means that vhost's traffic is silently absent
from the log. `ipng_stats_logtail` is installed at the module's log-phase hook, which nginx calls for every request with no per-server
configuration required.

See [`config-guide.md`](config-guide.md#ipng_stats_logtail-format_name-udphostport-buffersize-flushduration) for the full directive
reference and FR-8 for the requirements behind this feature.

## 7. Integrate with scrape consumers

The scrape endpoint (`ipng_stats;`) serves both Prometheus text and JSON from a single location. Any HTTP client that can issue a GET
request can consume it. Two integration patterns are common:

### Prometheus

See section 5 above. Prometheus scrapes the endpoint at a configured interval and stores the time series. This is the simplest
integration and covers most monitoring and alerting use cases.

### Custom consumers

The `?source_tag=<tag>` query parameter lets a consumer filter the scrape response to only the traffic attributed to a specific source.
This is useful when multiple consumers share the same nginx backends: each consumer scrapes with its own tag and never sees the
others' traffic.

The JSON output (`Accept: application/json`) includes a top-level `schema` field for versioning, making it straightforward to parse
from any language.

Once wired, a consumer can derive from the scrape data:

- Live QPS per backend (from the EWMA gauges).
- Status-code mix per backend (from the counter families).
- p50/p95 latency per backend (from the duration histogram).
- Traffic volume per backend (from the bytes counters).
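
As an illustration of the latency item, a consumer can estimate a quantile from cumulative histogram buckets with linear interpolation, the same idea PromQL's `histogram_quantile` uses. This is a sketch under assumptions: the bucket bounds in the usage note are made up, since this guide does not list the module's actual buckets.

```python
import math

def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs whose last
    entry is (math.inf, total_count), as in a Prometheus `le` series.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile lies beyond the last finite bucket.
                return lower_bound
            if count == lower_count:
                return upper_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound
```

With hypothetical buckets `[(0.01, 50), (0.1, 90), (1.0, 100), (math.inf, 100)]`, the p95 falls halfway into the 0.1 to 1.0 second bucket, so the estimate is 0.55 seconds.
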

For an example of this pattern in a GRE tunnel fleet, see [`vpp-maglev`](https://git.ipng.ch/ipng/vpp-maglev), whose frontend scrapes
each nginx backend filtered by source tag to show per-backend traffic alongside health state.

## Troubleshooting

**`nginx -t` reports "unknown listen parameter: device=" or "unknown listen parameter: ipng_source_tag=".** The module isn't loaded.
Check `/etc/nginx/modules-enabled/` for the `50-mod-http-ipng-stats.conf` symlink and re-run `nginx -t`.

**All traffic is attributed to `direct` even though device-bound interfaces exist.** The interface names don't match the `device=`
values in `listens.conf`, or the interfaces aren't up. Run `ip -br link` and confirm the interface names match.
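
The name comparison can also be scripted. The sketch below extracts the `device=` values from `listens.conf` and reports those with no entry under `/sys/class/net`; it checks only that an interface exists, not that it is up, and the function names are hypothetical.

```python
import os
import re

DEVICE_RE = re.compile(r"\bdevice=([^\s;]+)")

def devices_in_conf(conf_text):
    """Return the set of device= values used on listen lines (comments ignored)."""
    devices = set()
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line.startswith("listen"):
            devices.update(DEVICE_RE.findall(line))
    return devices

def missing_devices(conf_text, sysfs="/sys/class/net"):
    """List configured devices that have no matching network interface."""
    present = set(os.listdir(sysfs)) if os.path.isdir(sysfs) else set()
    return sorted(devices_in_conf(conf_text) - present)
```

Running `missing_devices(open("/etc/nginx/ipng-stats/listens.conf").read())` on the nginx host should return an empty list when every configured interface exists.
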

**Counters reset after every reload.** They should survive `nginx -s reload`. If they don't, check that the `ipng_stats_zone` name in
`nginx.conf` is stable across reloads; renaming the zone forces a new shared-memory segment.

**`nginx_ipng_zone_full_events_total` is non-zero.** The shared-memory zone is too small for your VIP count. Increase the size in
`ipng_stats_zone ipng:<size>` (the default of 4 MB is enough for hundreds of VIPs with the full status-code set).

**`curl http://127.0.0.1:9113/.well-known/ipng/statsz` returns "403 Forbidden".** The `allow`/`deny` ACL is blocking your source
address. Either add your address to the `allow` list or scrape from a host that is already in it.

## Where to go next

- [`config-guide.md`](config-guide.md): every directive and `listen` parameter with contexts, allowed values, and defaults.
- [`design.md`](design.md): full design document, including the attribution model, hot-path cost analysis, and failure modes.