Strip socket options on cross-cscf repeat listens (v0.7.2)

Make the shared-listen-include pattern work with `reuseport` and the
other socket-level listen options. Nginx core enforces at-most-once
per sockaddr on options that set lsopt.set=1 (reuseport, bind,
backlog=, rcvbuf=, sndbuf=, setfib=, fastopen=, accept_filter=,
deferred, ipv6only=, so_keepalive=) and emits "duplicate listen
options for <addr>" otherwise. That rule collides with a single
listens.conf included from every vhost — each vhost's include
re-submits the same options.

The listen wrapper now detects the cross-cscf case, strips those
options from cf->args before delegating to the core handler, and
logs one notice per stripped listen. The first cscf owns the
options on the kernel socket; later cscfs merge cleanly via
ngx_http_add_server. Protocol-level flags (ssl, http2, quic,
proxy_protocol) pass through untouched since nginx OR-merges those
across cscfs.

This unblocks `reuseport` for deployments that want better
new-connection spread across workers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:23:58 +02:00
parent badb684431
commit 7ed77f5b22
8 changed files with 168 additions and 18 deletions


@@ -199,6 +199,31 @@ register bindings without tripping nginx's duplicate-listen check. Traffic arriv
back to `ipng_stats_default_source` (`direct` by default). Keeping "direct" traffic on its own port — e.g.
`listen 198.51.100.1:8081;` — remains a fine pattern when you want a hard split, but it's no longer required.
### Shared includes with `reuseport` (or other socket-level options)
Socket-level `listen` options — `reuseport`, `bind`, `backlog=`, `rcvbuf=`, `sndbuf=`, `setfib=`, `fastopen=`, `accept_filter=`,
`deferred`, `ipv6only=`, `so_keepalive=` — belong to the one kernel socket that backs a given sockaddr, not to a particular
`server { ... }` block. Stock nginx enforces this by accepting them on at most one `listen` per sockaddr and rejecting any
repeat with `duplicate listen options for <addr>`. That rule collides with the common deployment pattern of a single
`listens.conf` included from every vhost, because each vhost's `include` re-submits the same options.
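As a minimal illustration of the collision on stock nginx (no wrapper involved), consider two vhosts that both repeat a socket-level option on the same sockaddr. The port and server names are placeholders chosen for this sketch:

```nginx
# Stock nginx: socket-level options may appear on only one listen per sockaddr.
server {
    listen 80 reuseport;           # first listen on 0.0.0.0:80 sets the socket options
    server_name a.example.org;     # hypothetical name
}

server {
    listen 80 reuseport;           # repeat: "duplicate listen options for 0.0.0.0:80"
    server_name b.example.org;     # hypothetical name
}
```

On stock nginx the usual workaround is to keep the socket-level options on exactly one `listen` per sockaddr, which a shared include cannot do on its own.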
The wrapper resolves this transparently. When a sockaddr recurs under a different `server` block than the one that first
registered it, the wrapper strips socket-level options from the incoming `cf->args` before delegating to nginx's core listen
handler. The first `server` block owns the options on the kernel socket (including `reuseport`, which triggers per-worker
socket cloning); later blocks merge cleanly via `ngx_http_add_server` and inherit the same socket. The wrapper logs one
`[notice] ipng_stats: stripped socket options from duplicate listen on <addr>` per stripped listen — informational, not an
error. So this include works unchanged across as many vhosts as you like:
```nginx
listen 443 ssl reuseport device=gre-mg1 ipng_source_tag=mg1;
listen [::]:443 ssl reuseport device=gre-mg1 ipng_source_tag=mg1;
```
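A sketch of how the include might be wired in, assuming the two lines above are saved as `/etc/nginx/listens.conf`. The path and the server names are illustrative and not taken from the project:

```nginx
server {
    include /etc/nginx/listens.conf;   # first server block: owns reuseport on the kernel socket
    server_name a.example.org;         # hypothetical name
    # ssl_certificate / ssl_certificate_key omitted for brevity
}

server {
    include /etc/nginx/listens.conf;   # repeat: socket-level options are stripped by the wrapper
    server_name b.example.org;         # hypothetical name
    # ssl_certificate / ssl_certificate_key omitted for brevity
}
```

Since the wrapper logs one notice per stripped listen, the second block would be expected to produce two notices here, one for each address family.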
`reuseport` noticeably helps worker load-balancing on busy hosts: without it, a single shared listening socket forces workers
to compete for accepts and traffic routinely concentrates on one or two workers. HTTP/2 and long-lived keepalive connections
can still skew CPU toward whichever worker holds a few heavy clients — `reuseport` does not reshuffle existing connections —
but new-connection distribution across workers becomes kernel-hashed, not first-ready-wins.
## 4. Verify with curl
Generate some traffic (or wait for real traffic), then scrape the endpoint locally: