Add logtail if=$variable filtering and update log format examples

- ipng_stats_logtail now accepts an optional if=$variable parameter that suppresses log lines when the variable is empty or "0", following the same semantics as nginx's access_log if=. The condition is checked before format rendering, so filtered requests incur no formatting cost. Filtered requests are still counted by stats.
- Log format examples updated to include $scheme for http/https visibility, and the format renamed to ipng_stats_logtail to match production.
- Robot test added for the if= filter (19 tests, 19 pass).
- FR-8.5 added to design doc for the if= semantics.
@@ -122,20 +122,21 @@ endpoint does not inflate its own counters.

 See FR-5.5.

-### `ipng_stats_logtail <format_name> udp://<host>:<port> [buffer=<size>] [flush=<duration>]`
+### `ipng_stats_logtail <format_name> udp://<host>:<port> [buffer=<size>] [flush=<duration>] [if=<$variable>]`

 **Context:** `http`.

 **Value:** `<format_name>` is the name of an existing `log_format` defined earlier in the same `http` block. The destination MUST be a
 `udp://host:port` URI. `buffer=<size>` is an optional nginx size spec (default `64k`, minimum `1k`). `flush=<duration>` is an optional
-nginx duration string (default `1s`, minimum `100ms`).
+nginx duration string (default `1s`, minimum `100ms`). `if=<$variable>` is an optional condition variable — when set, the log line is
+only emitted if the variable evaluates to a non-empty value other than `"0"`.

 **Default:** not set — the directive is optional. When absent, no global logtail output is written.

-**Effect:** registers a global log-phase writer that fires unconditionally for every request, regardless of `server` or `location`
-context. The named `log_format` is looked up from nginx's log module at configuration time; nginx's standard variable-expansion
-machinery renders each line, so any variable usable in a regular `log_format` — including `$ipng_source_tag` and `$server_addr` — is
-available here.
+**Effect:** registers a global log-phase writer that fires unconditionally for every request (unless suppressed by `if=`), regardless
+of `server` or `location` context. The named `log_format` is looked up from nginx's log module at configuration time; nginx's standard
+variable-expansion machinery renders each line, so any variable usable in a regular `log_format` — including `$ipng_source_tag` and
+`$server_addr` — is available here.

 Each worker maintains a private in-memory write buffer of `buffer=<size>` bytes. Each buffer flush is transmitted as a single
 `sendto()` call on a per-worker `SOCK_DGRAM` socket that is opened at worker init and closed at worker exit. The address is resolved
@@ -152,12 +153,31 @@ in every log line at no extra configuration cost.
 File-based access logging is intentionally not supported by this directive — use nginx's built-in `access_log` directive for that.

 ```nginx
-log_format logtail '$host\t$remote_addr\t$ipng_source_tag\t$server_addr\t'
-                   '$request_method\t$request_uri\t$status\t$body_bytes_sent\t'
-                   '$request_time';
-ipng_stats_logtail logtail udp://127.0.0.1:9514 buffer=16k flush=1s;
+log_format ipng_stats_logtail '$host\t$remote_addr\t$request_method\t$request_uri\t'
+                              '$status\t$body_bytes_sent\t'
+                              '$ipng_source_tag\t$server_addr\t$scheme';
+ipng_stats_logtail ipng_stats_logtail udp://127.0.0.1:9514 buffer=16k flush=1s;
 ```

+#### Conditional logging with `if=`
+
+The `if=$variable` parameter suppresses log lines for requests where the variable is empty, not found, or `"0"`. This uses the same
+semantics as nginx's built-in `access_log ... if=` and works well with `map` blocks:
+
+```nginx
+# Suppress health checks from the logtail stream:
+map $request_uri $logtail_enabled {
+    ~^/\.well-known/ipng/healthz 0;
+    default 1;
+}
+
+ipng_stats_logtail ipng_stats_logtail udp://127.0.0.1:9514 if=$logtail_enabled;
+```
+
+The `map` is compiled at configuration time and evaluated lazily, only when the variable is read. Literal keys cost a single hash
+probe; regex keys such as the one above are tried in order after the hash lookup misses. The condition is checked before the log
+format is rendered, so filtered requests skip format rendering entirely.
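
The check-then-render ordering described above can be illustrated with a minimal Python sketch. This is not the module's code: `should_emit` and `log_request` are hypothetical names, a plain dictionary stands in for nginx's request variables, and a missing key models a variable that is not found.

```python
from typing import Callable, Mapping, Optional


def should_emit(value: Optional[str]) -> bool:
    """Apply the documented if= rule: suppress when not found, empty, or "0"."""
    # None models a variable that was not found (an assumption of this sketch).
    return value is not None and value != "" and value != "0"


def log_request(
    variables: Mapping[str, str],
    condition: Optional[str],
    render: Callable[[Mapping[str, str]], str],
) -> Optional[str]:
    """Return the rendered log line, or None when the request is filtered."""
    # The condition is tested first, so render() never runs for filtered requests.
    if condition is not None and not should_emit(variables.get(condition)):
        return None
    return render(variables)
```

Passing `condition=None` models a directive without `if=`, in which case every request is rendered.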
+
 **Constraints and behavior:**

 - `host` MUST be a literal IPv4 address. Hostnames and IPv6 addresses are not supported in v0.1.
@@ -175,10 +175,10 @@ Each requirement carries a unique identifier (`FR-X.Y` or `NFR-X.Y`) so that lat

 **FR-8 Logtail**

-- **FR-8.1** The module MUST support an `ipng_stats_logtail <format_name> udp://host:port [buffer=<size>] [flush=<duration>]` directive
-at the `http` level that registers a global log-phase writer which fires unconditionally for every request, regardless of which
-`server` or `location` block handled it. One directive at the `http` level is sufficient to cover all vhosts — operators MUST NOT be
-required to repeat an `access_log` directive in every `server` block to achieve a single global access log.
+- **FR-8.1** The module MUST support an `ipng_stats_logtail <format_name> udp://host:port [buffer=<size>] [flush=<duration>] [if=$var]`
+directive at the `http` level that registers a global log-phase writer which fires for every request (unless suppressed by `if=`),
+regardless of which `server` or `location` block handled it. One directive at the `http` level is sufficient to cover all vhosts —
+operators MUST NOT be required to repeat an `access_log` directive in every `server` block to achieve a single global access log.
 - **FR-8.2** The `<format_name>` argument MUST be the name of an existing nginx `log_format` defined in the same `http` block before
 this directive. The module MUST look up the compiled log format from nginx's log module at configuration time and use it to render each
 log line at request time. The module MUST NOT define its own format language; all `$variable` expansion is handled by nginx's standard
@@ -194,6 +194,11 @@ Each requirement carries a unique identifier (`FR-X.Y` or `NFR-X.Y`) so that lat
 present are intentional; the UDP transport is designed for fire-and-forget analytics pipelines where delivery guarantees are
 unnecessary and zero disk I/O is preferred over persistence. File-based access logging is not supported by this directive — operators
 should use nginx's built-in `access_log` for that purpose.
+- **FR-8.5** The directive MAY include an `if=$variable` parameter. When present, the logtail writer MUST evaluate the named nginx
+variable at log phase and MUST suppress the log line if the variable is not found, is empty, or equals the string `"0"`. The
+condition MUST be checked before the log format is rendered, so that filtered requests incur no formatting cost. This follows the same
+semantics as nginx's built-in `access_log ... if=` and is intended for suppressing high-frequency requests (e.g. health checks) from
+the logtail stream. Filtered requests MUST still be counted by the stats module — the `if=` condition affects only logtail output.

 ### Non-Functional Requirements

@@ -261,13 +261,13 @@ would add unwanted I/O pressure. For file-based access logging, use nginx's buil
 Add a `log_format` declaration inside the `http { ... }` block, **before** the `ipng_stats_logtail` directive that references it:

 ```nginx
-log_format logtail '$host\t$remote_addr\t$ipng_source_tag\t$server_addr\t'
-                   '$request_method\t$request_uri\t$status\t$body_bytes_sent\t'
-                   '$request_time';
+log_format ipng_stats_logtail '$host\t$remote_addr\t$request_method\t$request_uri\t'
+                              '$status\t$body_bytes_sent\t'
+                              '$ipng_source_tag\t$server_addr\t$scheme';
 ```

-Any nginx variable is usable here, including `$ipng_source_tag` (the device attribution tag, FR-6.1) and `$server_addr` (the VIP
-that received the request).
+Any nginx variable is usable here, including `$ipng_source_tag` (the device attribution tag, FR-6.1), `$server_addr` (the VIP
+that received the request), and `$scheme` (`http` or `https` — useful since `$server_addr` alone doesn't distinguish ports).

 ### Configuration

@@ -275,17 +275,17 @@ that received the request).
 http {
     ipng_stats_zone ipng:4m;

-    log_format logtail '$host\t$remote_addr\t$ipng_source_tag\t$server_addr\t'
-                       '$request_method\t$request_uri\t$status\t$body_bytes_sent\t'
-                       '$request_time';
+    log_format ipng_stats_logtail '$host\t$remote_addr\t$request_method\t$request_uri\t'
+                                  '$status\t$body_bytes_sent\t'
+                                  '$ipng_source_tag\t$server_addr\t$scheme';

-    ipng_stats_logtail logtail udp://127.0.0.1:9514 buffer=16k flush=1s;
+    ipng_stats_logtail ipng_stats_logtail udp://127.0.0.1:9514 buffer=16k flush=1s;

     server { ... }
 }
 ```

-- **`logtail`** (first argument) — the `log_format` name.
+- **`ipng_stats_logtail`** (first argument) — the `log_format` name.
 - **`udp://127.0.0.1:9514`** — destination as a `udp://host:port` URI. `host` must be a literal IPv4 address (no hostnames, no IPv6
 in v0.1).
 - **`buffer=16k`** — per-worker write buffer. Lines are held in memory until the buffer fills, the flush timer fires, or the worker
@@ -304,6 +304,25 @@ lost datagrams are acceptable and disk I/O is not.
 configured buffer sizes. On routed paths, path MTU applies.
 - There is no acknowledgment, retry, or sequence number. If the receiver is down, the data is gone.

+### Filtering with `if=`
+
+High-frequency requests like health checks can be suppressed from the logtail stream using the `if=$variable` parameter. Use a `map`
+block to define which requests should be logged:
+
+```nginx
+map $request_uri $logtail_enabled {
+    ~^/\.well-known/ipng/healthz 0;
+    default 1;
+}
+
+ipng_stats_logtail ipng_stats_logtail udp://127.0.0.1:9514 buffer=16k flush=1s if=$logtail_enabled;
+```
+
+Filtered requests are still counted by the stats module; only the logtail output is suppressed. The condition is checked before the
+log format is rendered, so filtered requests incur no formatting cost. Multiple conditions can be combined by chaining `map` blocks,
+with one map's result used as a value in the next.
+
+See [`config-guide.md`](config-guide.md#conditional-logging-with-if) for the full semantics.
+
 **Starting a receiver** is trivial:

 ```bash
@@ -317,10 +336,11 @@ datagram stream and processes it into structured log output.
 A typical received log line (with the format above, tab-separated) looks like:

 ```
-example.com 203.0.113.42 mg1 192.0.2.10 GET /index.html 200 4321 0.003
+example.com 203.0.113.42 GET /index.html 200 4321 mg1 192.0.2.10 https
 ```

-The third field (`mg1`) comes from `$ipng_source_tag` — free per-device attribution in every log line.
+The `mg1` field comes from `$ipng_source_tag` and `https` from `$scheme` — free per-device attribution and protocol visibility in
+every log line.
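
On the receiving side, a line in the updated format can be split back into named fields. The following is a minimal Python sketch rather than part of the project: `FIELDS`, `parse_line`, and `parse_datagram` are hypothetical names, the field order is copied from the `ipng_stats_logtail` format shown above, and it assumes that one datagram carries newline-separated lines.

```python
from typing import Dict, List

# Field order copied from the ipng_stats_logtail log_format shown above.
FIELDS = [
    "host", "remote_addr", "request_method", "request_uri",
    "status", "body_bytes_sent", "ipng_source_tag", "server_addr", "scheme",
]


def parse_line(line: str) -> Dict[str, str]:
    """Split one tab-separated log line into a dict keyed by variable name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def parse_datagram(payload: bytes) -> List[Dict[str, str]]:
    """Parse every non-empty line of one datagram (assumes newline-separated lines)."""
    text = payload.decode("utf-8", errors="replace")
    return [parse_line(line) for line in text.split("\n") if line]
```

All values are kept as strings; a consumer that needs numeric `status` or `body_bytes_sent` would convert them after parsing. If the `log_format` changes, `FIELDS` has to change with it.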

 ### Why this complements per-server `access_log`
