Merge remote-tracking branch 'origin/main'

# Conflicts:
#	Makefile
#	debian/changelog
#	tests/01-module/01-e2e.robot
This commit is contained in:
2026-04-23 09:28:56 +02:00
6 changed files with 230 additions and 39 deletions

122
docs/nginx-issues.md Normal file
View File

@@ -0,0 +1,122 @@
# Upstream nginx quirks that affect this plugin
This file catalogs behaviors of nginx itself (not bugs in this plugin) that
have a visible effect on the counters the plugin exposes. It exists so a
future reader can page the context back in quickly instead of re-deriving it
from the nginx source.
All `path:line` citations below are into the **nginx source tree** (checked
against tip as of 2026-04), not into this repository.
## TL;DR
- `r->connection->sent` is **per-request** (reset between HTTP/1.1 keepalive
requests, per-stream on HTTP/2). Safe to `+=` in the log phase.
- `r->request_length` is **not** a faithful count of bytes received from the
client. It is only updated on specific code paths inside nginx, and the
discard-body path — which is taken on every request that nginx rejects
before a handler reads the body — bypasses it entirely. On a workload with
lots of early rejections (rate-limit 403, auth_request denial, 413 with
fixed Content-Length, `return 4xx;` on POST), `$request_length` and
therefore `nginx_ipng_bytes_in_total` under-count the wire traffic
substantially.
## `r->request_length` — where it is, and is not, updated
| Path | Updates `request_length`? | Source |
| ----------------------------------------------------- | ------------------------- | --------------------------------------------------- |
| Buffered body (`ngx_http_do_read_client_request_body`) | yes | `ngx_http_request_body.c:397` (after every `recv`), preread at `:120` |
| Unbuffered body (`ngx_http_read_unbuffered_request_body`) | yes (HTTP/1 — delegates to the buffered path) | `ngx_http_request_body.c:265` |
| Discard body (`ngx_http_discard_request_body``ngx_http_read_discarded_request_body`) | **NO** | `ngx_http_request_body.c:782-841` — reads into a stack buffer, never touches `request_length` |
| Chunked HTTP/1 | yes (indirect) | Wire bytes counted at `:397` before the chunked parser runs |
| HTTP/2 DATA payload | yes (payload only) | `ngx_http_v2_filter_request_body` at `v2/ngx_http_v2.c:4201` |
| HTTP/2 DATA via `stream->skip_data` | **NO** | `v2/ngx_http_v2.c:1077-1082` — skipped stream data is silently dropped |
| HTTP/2 frame headers, SETTINGS, WINDOW_UPDATE, PING, PRIORITY, RST_STREAM, GOAWAY, padding | **NO** | Not counted anywhere |
| HTTP/2 HEADERS / CONTINUATION | yes (decoded size only) | `v2/ngx_http_v2.c:1333, 1968` |
| HTTP/3 DATA | yes (payload only) | `v3/ngx_http_v3_request.c:1559, 1614, 1660, 1668` |
| Pipelined-request carry-over in `header_in` | **subtracted** back out | `ngx_http_request_body.c:541` (`r->request_length -= n`) |
| TLS handshake / record framing | never visible to nginx | Below `c->recv` |
### 413 `client_max_body_size` specifically
- **Fixed Content-Length:** `ngx_http_core_module.c:1006-1018` detects the
overrun right after headers are parsed and calls
`ngx_http_discard_request_body()` before finalizing 413. That path does
not bump `request_length`, so `$request_length` ends up reflecting only
the request line + headers. Body bytes that actually streamed in before
nginx closed are invisible.
- **Chunked HTTP/1:** `ngx_http_request_body.c:1142-1154` trips 413 inside
the body filter after the wire bytes have already been counted at `:397`,
so they are included — but only up to what nginx had `recv`ed before
returning.
- **HTTP/2 chunked:** `v2/ngx_http_v2.c:4216-4223` returns 413 after the
`r->request_length +=` at `:4201`, so partial-body bytes do count.
### The common "rate-limit 403" pattern
`limit_req` rejects in PREACCESS. On rejection, nginx finalizes with 403,
which calls `ngx_http_discard_request_body()`. On HTTP/1 this sends the
body into the stack-buffer discard reader (not counted). On HTTP/2 it sets
the stream's `skip_data` flag at `v2/ngx_http_v2.c:1077-1082`, so the
client's DATA frames are parsed and their payload silently dropped (also
not counted). Either way, `$request_length` reflects headers only.
This is the dominant source of under-counting on rate-limited abusive
workloads. Observed example: a Certificate Transparency client hammering
`POST /ct/v1/add-chain` at ~460 req/s, each carrying a ~10 KB cert chain
in the body, all hitting 403. 460 × 10 KB = ~4.6 MB/s of inbound bytes
completely absent from `$request_length`.
## `r->connection->sent` — this one is fine
- HTTP/1.1 keepalive: reset at `ngx_http_request.c:3493` (in
`ngx_http_keepalive_handler`, which begins at `:3356`) when a new request
arrives on an idle keepalive connection, and at
`ngx_http_request.c:3226` (in `ngx_http_set_keepalive`, which begins at
`:3128`) for the pipelined-next-request case.
- HTTP/2: each request has a fake per-stream connection `fc` initialized
with `fc->sent = 0` at `v2/ngx_http_v2.c:3039-3044`, so
`r->connection->sent` is per-stream. Framing overhead on the real TCP
connection (HEADERS, SETTINGS, WINDOW_UPDATE, PING, GOAWAY) is invisible
to `r->connection->sent` — that is consistent with how
`$bytes_sent` behaves and is fine for per-request accounting.
- `ngx_http_log_module.c:880` (`ngx_http_log_bytes_sent`) just prints
`r->connection->sent`, confirming this is the same value `$bytes_sent`
exposes.
The plugin's `slot->bytes_out += r->connection->sent` in the log handler is
therefore correct.
## What the plugin does about it
### Option A (implemented): Content-Length fallback on discard
In the log handler, when `r->discard_body` is set and
`r->headers_in.content_length_n > 0`, the plugin adds `content_length_n` to
`bin_sz` instead of trusting `r->request_length`.
- **Covers** the rate-limit-403 / auth-denial / 413-fixed-CL / early
`return 4xx;` cases on any request where the client advertised a
Content-Length (HTTP/1 or HTTP/2).
- **Does not cover** chunked requests with no Content-Length header (those
already count whatever bytes nginx `recv`ed before closing; the
post-close bytes are gone in any accounting scheme).
- **Over-counts** when a client announces a Content-Length then hangs up
partway through (nginx never received the remainder). In practice,
abusive clients like the CT log hammer case send the full body before
nginx's RST propagates, so the over-count is small.
- **Under-counts** HTTP/2 framing overhead (frame headers, control frames).
That overhead is small (~1% at typical DATA frame sizes) and can be
ignored for a first cut.
See `src/ngx_http_ipng_stats_module.c` — search for `discard_body`.
### Option B (not implemented): per-connection recv wrapping
Replace `c->recv` on each accepted HTTP connection with a wrapper that
increments a per-connection byte counter, then delegates to the original.
At log phase, use `(counter_at_log - counter_at_request_start)` as the
authoritative `bytes_in`. Captures discarded bodies, HTTP/2 framing, HTTP/2
`skip_data` bodies, partial uploads, and chunked — everything that crossed
nginx's `c->recv`. TLS/TCP overhead remains invisible (below the recv
abstraction). Revisit if the A-residual gap turns out to matter.