nginx-ipng-stats-plugin/docs/nginx-issues.md
Pim van pelt 05d405aba5 Count discarded POST bodies against bytes_in
Counter coverage for nginx_ipng_bytes_in_total was missing most of
the wire for rate-limited POST workloads. Observed on a CT-log-style
box: the plugin was reporting ~300 KB/s inbound while btop showed
6+ MB/s on the NIC, a ~20x gap. Root cause is in nginx itself: when
a request is rejected before a handler reads the body (limit_req
403, auth_request denial, 413 on fixed Content-Length, early
`return 4xx;`), nginx routes the body through
ngx_http_discard_request_body / HTTP/2 skip_data. Both paths drop
the bytes without ever incrementing r->request_length, so
$request_length (and therefore the plugin's bytes_in counter)
reflects only the request line and headers.

Log handler now adds r->headers_in.content_length_n to bin_sz when
r->discard_body is set and the client advertised a Content-Length.
That's a tight upper bound on the body bytes actually received —
abusive clients like the CT-log hammer case typically send the
full declared body before nginx's RST propagates. Normal 200 POSTs
are unchanged (they go through the buffered body path, which does
update r->request_length, and r->discard_body stays 0).

docs/nginx-issues.md catalogs the full analysis — which nginx paths
update r->request_length and which don't, how 413 differs between
fixed-CL and chunked and HTTP/2, why r->connection->sent is safe to
accumulate (it's reset per-request on keepalive and per-stream on
HTTP/2), and what the residual gap looks like after this fix
(HTTP/2 framing overhead, TLS/TCP, chunked requests with no
Content-Length header). Option B (per-connection recv wrapping) is
sketched there for later if the residual matters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 09:57:45 +02:00


Upstream nginx quirks that affect this plugin

This file catalogs behaviors of nginx itself (not bugs in this plugin) that have a visible effect on the counters the plugin exposes. It exists so a future reader can page the context back in quickly instead of re-deriving it from the nginx source.

All path:line citations below are into the nginx source tree (checked against tip as of 2026-04), not into this repository.

TL;DR

  • r->connection->sent is per-request (reset between HTTP/1.1 keepalive requests, per-stream on HTTP/2). Safe to += in the log phase.
  • r->request_length is not a faithful count of bytes received from the client. It is only updated on specific code paths inside nginx, and the discard-body path — which is taken on every request that nginx rejects before a handler reads the body — bypasses it entirely. On a workload with lots of early rejections (rate-limit 403, auth_request denial, 413 with fixed Content-Length, return 4xx; on POST), $request_length and therefore nginx_ipng_bytes_in_total under-count the wire traffic substantially.

r->request_length — where it is, and is not, updated

| Path | Updates request_length? | Source |
|---|---|---|
| Buffered body (ngx_http_do_read_client_request_body) | yes | ngx_http_request_body.c:397 (after every recv), preread at :120 |
| Unbuffered body (ngx_http_read_unbuffered_request_body) | yes (HTTP/1: delegates to the buffered path) | ngx_http_request_body.c:265 |
| Discard body (ngx_http_discard_request_body → ngx_http_read_discarded_request_body) | NO | ngx_http_request_body.c:782-841; reads into a stack buffer, never touches request_length |
| Chunked HTTP/1 | yes (indirect) | Wire bytes counted at :397 before the chunked parser runs |
| HTTP/2 DATA payload | yes (payload only) | ngx_http_v2_filter_request_body at v2/ngx_http_v2.c:4201 |
| HTTP/2 DATA via stream->skip_data | NO | v2/ngx_http_v2.c:1077-1082; skipped stream data is silently dropped |
| HTTP/2 frame headers, SETTINGS, WINDOW_UPDATE, PING, PRIORITY, RST_STREAM, GOAWAY, padding | NO | Not counted anywhere |
| HTTP/2 HEADERS / CONTINUATION | yes (decoded size only) | v2/ngx_http_v2.c:1333, 1968 |
| HTTP/3 DATA | yes (payload only) | v3/ngx_http_v3_request.c:1559, 1614, 1660, 1668 |
| Pipelined-request carry-over in header_in | subtracted back out | ngx_http_request_body.c:541 (r->request_length -= n) |
| TLS handshake / record framing | never visible to nginx | Below c->recv |

413 client_max_body_size specifically

  • Fixed Content-Length: ngx_http_core_module.c:1006-1018 detects the overrun right after headers are parsed and calls ngx_http_discard_request_body() before finalizing 413. That path does not bump request_length, so $request_length ends up reflecting only the request line + headers. Body bytes that actually streamed in before nginx closed are invisible.
  • Chunked HTTP/1: ngx_http_request_body.c:1142-1154 trips 413 inside the body filter after the wire bytes have already been counted at :397, so they are included, but only up to what nginx had received before returning.
  • HTTP/2 without Content-Length (HTTP/2 has no chunked encoding; this is the streamed-body equivalent): v2/ngx_http_v2.c:4216-4223 returns 413 after the r->request_length += at :4201, so partial-body bytes do count.

The common "rate-limit 403" pattern

limit_req rejects in PREACCESS. On rejection, nginx finalizes with the status set by limit_req_status (403 in the workload described here; the nginx default is 503), which calls ngx_http_discard_request_body(). On HTTP/1 this sends the body into the stack-buffer discard reader (not counted). On HTTP/2 it sets the stream's skip_data flag at v2/ngx_http_v2.c:1077-1082, so the client's DATA frames are parsed and their payload silently dropped (also not counted). Either way, $request_length reflects headers only.

This is the dominant source of under-counting on rate-limited abusive workloads. Observed example: a Certificate Transparency client hammering POST /ct/v1/add-chain at ~460 req/s, each carrying a ~10 KB cert chain in the body, all hitting 403. 460 × 10 KB = ~4.6 MB/s of inbound bytes completely absent from $request_length.
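The arithmetic above can be written as a small helper for plugging in other workloads. This is only an illustrative sketch: the function name is hypothetical, and it assumes every request is rejected before its body is read and that "10 KB" means 10,000 bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the plugin: bytes per second that never
 * show up in $request_length when every request is rejected before a
 * handler reads the body (so the whole body goes down the discard path).
 */
static uint64_t
missing_bytes_per_sec(uint64_t req_per_sec, uint64_t body_bytes)
{
    return req_per_sec * body_bytes;
}
```

With the observed numbers, missing_bytes_per_sec(460, 10000) gives 4,600,000 bytes/s, which matches the ~4.6 MB/s figure.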

r->connection->sent — this one is fine

  • HTTP/1.1 keepalive: reset at ngx_http_request.c:3493 (in ngx_http_keepalive_handler, which begins at :3356) when a new request arrives on an idle keepalive connection, and at ngx_http_request.c:3226 (in ngx_http_set_keepalive, which begins at :3128) for the pipelined-next-request case.
  • HTTP/2: each request has a fake per-stream connection fc initialized with fc->sent = 0 at v2/ngx_http_v2.c:3039-3044, so r->connection->sent is per-stream. Framing overhead on the real TCP connection (HEADERS, SETTINGS, WINDOW_UPDATE, PING, GOAWAY) is invisible to r->connection->sent — that is consistent with how $bytes_sent behaves and is fine for per-request accounting.
  • ngx_http_log_module.c:880 (ngx_http_log_bytes_sent) just prints r->connection->sent, confirming this is the same value $bytes_sent exposes.

The plugin's slot->bytes_out += r->connection->sent in the log handler is therefore correct.

What the plugin does about it

Option A (implemented): Content-Length fallback on discard

In the log handler, when r->discard_body is set and r->headers_in.content_length_n > 0, the plugin adds content_length_n to bin_sz rather than relying on r->request_length alone.

  • Covers the rate-limit-403 / auth-denial / 413-fixed-CL / early return 4xx; cases on any request where the client advertised a Content-Length (HTTP/1 or HTTP/2).
  • Does not cover chunked requests with no Content-Length header (those already count whatever bytes nginx received before closing; the post-close bytes are gone in any accounting scheme).
  • Over-counts when a client announces a Content-Length then hangs up partway through (nginx never received the remainder). In practice, abusive clients like the CT log hammer case send the full body before nginx's RST propagates, so the over-count is small.
  • Under-counts HTTP/2 framing overhead (frame headers, control frames). That overhead is small (~1% at typical DATA frame sizes) and can be ignored for a first cut.

See src/ngx_http_ipng_stats_module.c — search for discard_body.
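As a rough illustration of the Option A rule, the sketch below reproduces the decision with a stand-in struct so that it compiles without the nginx headers. It is not the plugin's code: mock_request_t and bytes_in_for_request are hypothetical names, and it assumes bin_sz starts from r->request_length before the fallback is applied.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for the few ngx_http_request_t fields the fallback reads.
 * The real struct lives in the nginx source tree; this mock exists only
 * so the sketch is self-contained.
 */
typedef struct {
    int64_t   request_length;    /* mirrors r->request_length */
    int64_t   content_length_n;  /* mirrors r->headers_in.content_length_n,
                                    -1 when no Content-Length was sent */
    unsigned  discard_body;      /* mirrors the r->discard_body flag */
} mock_request_t;

/* Hypothetical helper: inbound byte count to record for one request. */
static int64_t
bytes_in_for_request(const mock_request_t *r)
{
    /* Assumption: the counter starts from what nginx itself counted. */
    int64_t bin_sz = r->request_length;

    /*
     * Discarded body with an advertised length: request_length holds only
     * the request line and headers, so add the declared body size.
     */
    if (r->discard_body && r->content_length_n > 0) {
        bin_sz += r->content_length_n;
    }

    return bin_sz;
}
```

A rejected POST with 600 bytes of headers and a declared 10,000-byte body is recorded as 10,600 bytes, while a normally read POST (discard_body == 0) and a discarded request without Content-Length both keep request_length unchanged.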

Option B (not implemented): per-connection recv wrapping

Replace c->recv on each accepted HTTP connection with a wrapper that increments a per-connection byte counter, then delegates to the original. At log phase, use (counter_at_log - counter_at_request_start) as the authoritative bytes_in. Captures discarded bodies, HTTP/2 framing, HTTP/2 skip_data bodies, partial uploads, and chunked — everything that crossed nginx's c->recv. TLS/TCP overhead remains invisible (below the recv abstraction). Revisit if the A-residual gap turns out to matter.
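
The wrapping idea can be sketched with a mock connection type. Everything here is an assumption made for illustration: mock_connection_t, orig_recv, bytes_recv, counting_recv, wrap_recv and fake_recv are hypothetical names, and a real implementation would have to keep the saved handler and counter in per-connection module state rather than in the connection struct itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

typedef struct mock_connection_s mock_connection_t;

/* Same shape as a recv handler: returns bytes read, 0 on EOF, <0 on error. */
typedef ssize_t (*mock_recv_pt)(mock_connection_t *c, unsigned char *buf,
    size_t size);

struct mock_connection_s {
    mock_recv_pt  recv;        /* active handler, plays the role of c->recv */
    mock_recv_pt  orig_recv;   /* saved original handler (hypothetical) */
    uint64_t      bytes_recv;  /* running byte count (hypothetical) */
};

/* Wrapper: delegate to the original handler, count only successful reads. */
static ssize_t
counting_recv(mock_connection_t *c, unsigned char *buf, size_t size)
{
    ssize_t n = c->orig_recv(c, buf, size);

    if (n > 0) {
        c->bytes_recv += (uint64_t) n;
    }

    return n;
}

/* Install the wrapper once per accepted connection. */
static void
wrap_recv(mock_connection_t *c)
{
    c->orig_recv = c->recv;
    c->recv = counting_recv;
    c->bytes_recv = 0;
}

/* Fake transport for the example: delivers at most 4096 bytes per call. */
static ssize_t
fake_recv(mock_connection_t *c, unsigned char *buf, size_t size)
{
    (void) c;
    (void) buf;

    return (ssize_t) (size < 4096 ? size : 4096);
}
```

At log phase, the per-request figure would be the counter value minus the value snapshotted when the request started. The sketch counts only positive return values, so errors and EOF do not disturb the total.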