Reduce scrape cardinality: class codes, per-(source,vip) histograms, byte histograms

Collapses the status-code dimension of the counter key into six class
lanes (1xx..5xx/unknown) so per-(source,vip) counter cardinality no
longer grows with the number of distinct three-digit responses nginx
serves. Histogram series drop the code label entirely and aggregate
across classes. Adds nginx_ipng_latency_total with a code class label
so average latency per class can still be computed off the scrape.
Adds nginx_ipng_bytes_{in,out} histograms with configurable boundaries
via the new ipng_stats_byte_buckets directive. Bumps JSON schema to 2.

Operators who need full three-digit-code resolution should consume the
ipng_stats_logtail stream off-host; the stats zone intentionally trades
that resolution for a bounded scrape size.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -108,6 +108,21 @@ same set applies to every `(source, vip)` key in the module (v0.1 does not suppo
 
 See FR-2.3, FR-5.4.
 
+### `ipng_stats_byte_buckets <size> <size> ...`
+
+**Context:** `http`.
+
+**Value:** two or more strictly increasing sizes (nginx size spec: `100`, `1k`, `1m`, ...) representing byte-size histogram upper
+bounds.
+
+**Default:** `100 1000 10000 100000 1000000 10000000`, plus an implicit `+Inf` bucket.
+
+**Effect:** overrides the default bucket boundaries for the `nginx_ipng_bytes_in` and `nginx_ipng_bytes_out` histograms. Pick values
+that match your traffic mix; these bucket bounds feed the scrape output only, not the per-`(source, vip, class)` byte counters, which
+are exact.
+
+See FR-2.3.
+
 ### `ipng_stats on | off`
 
 **Context:** `http`, `server`, `location`.
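To make the `ipng_stats_byte_buckets` value rule in the hunk above concrete ("two or more strictly increasing sizes"), here is a minimal Python sketch of the validation. It is not the module's code. The helper names `parse_size` and `parse_byte_buckets` are hypothetical, and the sketch assumes nginx's usual size convention, where a `k` suffix multiplies by 1024 and an `m` suffix by 1024 * 1024.

```python
def parse_size(token: str) -> int:
    """Parse an nginx-style size such as '100', '1k' or '1m' into bytes.

    Assumption: 'k' and 'm' are binary multipliers, as in nginx's own
    size parsing; the module's actual parser may differ.
    """
    multipliers = {"k": 1024, "m": 1024 * 1024}
    suffix = token[-1:].lower()
    if suffix in multipliers:
        number, scale = token[:-1], multipliers[suffix]
    else:
        number, scale = token, 1
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"invalid size: {token!r}")
    return int(number) * scale


def parse_byte_buckets(args: list[str]) -> list[int]:
    """Return bucket upper bounds in bytes, enforcing the documented rules."""
    bounds = [parse_size(a) for a in args]
    if len(bounds) < 2:
        raise ValueError("need two or more bucket bounds")
    # Each bound must be strictly greater than the one before it.
    if any(prev >= cur for prev, cur in zip(bounds, bounds[1:])):
        raise ValueError("bucket bounds must be strictly increasing")
    return bounds
```

Under these assumptions, `parse_byte_buckets(["100", "1k", "1m"])` returns `[100, 1024, 1048576]`, while `["1k", "1000"]` is rejected because 1024 is not below 1000. The implicit `+Inf` bucket is not part of the list.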
@@ -231,17 +246,29 @@ See FR-3.1, FR-3.2, FR-3.3, FR-3.4, FR-3.5.
 
 For Prometheus, the module exports under the `nginx_ipng_` prefix.
 
+The `code` label is a class bucket: one of `1xx`, `2xx`, `3xx`, `4xx`, `5xx`, or `unknown` (for codes outside `[100, 599]`). This
+keeps per-`(source, vip)` counter cardinality bounded at six lanes regardless of how many distinct three-digit responses nginx serves.
+Histogram series do not carry `code`; they aggregate across all classes for a given `(source, vip)`. Operators who need a full
+per-three-digit-code breakdown should enable `ipng_stats_logtail` and derive it from the access-log stream off the hot path.
+
 | metric | type | labels | meaning |
 | --- | --- | --- | --- |
-| `nginx_ipng_requests_total` | counter | `source_tag`, `vip`, `code` | Request count per `(source, vip, status_code)`. |
+| `nginx_ipng_requests_total` | counter | `source_tag`, `vip`, `code` | Request count per `(source, vip, class)`. |
 | `nginx_ipng_bytes_in_total` | counter | `source_tag`, `vip`, `code` | Request bytes received (request line + headers + body). |
 | `nginx_ipng_bytes_out_total` | counter | `source_tag`, `vip`, `code` | Response bytes sent (status line + headers + body). |
-| `nginx_ipng_request_duration_seconds_bucket` | histogram bucket | `source_tag`, `vip`, `le` | Request duration histogram (Prometheus shape). |
+| `nginx_ipng_latency_total` | counter | `source_tag`, `vip`, `code` | Sum of request durations, in seconds. Divide by `_requests_total` for mean latency per class. |
+| `nginx_ipng_request_duration_seconds_bucket` | histogram bucket | `source_tag`, `vip`, `le` | Request duration histogram, aggregated across classes. |
 | `nginx_ipng_request_duration_seconds_sum` | histogram sum | `source_tag`, `vip` | Sum of observed durations in seconds. |
 | `nginx_ipng_request_duration_seconds_count` | histogram count | `source_tag`, `vip` | Count of observations. |
 | `nginx_ipng_upstream_response_seconds_bucket` | histogram bucket | `source_tag`, `vip`, `le` | Upstream response time histogram. |
 | `nginx_ipng_upstream_response_seconds_sum` | histogram sum | `source_tag`, `vip` | |
 | `nginx_ipng_upstream_response_seconds_count` | histogram count | `source_tag`, `vip` | |
+| `nginx_ipng_bytes_in_bucket` | histogram bucket | `source_tag`, `vip`, `le` | Request-size histogram (bytes). |
+| `nginx_ipng_bytes_in_sum` | histogram sum | `source_tag`, `vip` | Sum of request bytes (equals `bytes_in_total` summed over classes). |
+| `nginx_ipng_bytes_in_count` | histogram count | `source_tag`, `vip` | Observations. |
+| `nginx_ipng_bytes_out_bucket` | histogram bucket | `source_tag`, `vip`, `le` | Response-size histogram (bytes). |
+| `nginx_ipng_bytes_out_sum` | histogram sum | `source_tag`, `vip` | Sum of response bytes. |
+| `nginx_ipng_bytes_out_count` | histogram count | `source_tag`, `vip` | Observations. |
 | `nginx_ipng_rate_1s` | gauge | `source_tag`, `vip` | EWMA requests/sec, 1-second decay. |
 | `nginx_ipng_rate_10s` | gauge | `source_tag`, `vip` | EWMA requests/sec, 10-second decay. |
 | `nginx_ipng_rate_60s` | gauge | `source_tag`, `vip` | EWMA requests/sec, 60-second decay. |
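The class-bucket rule described in the hunk above (codes in `[100, 599]` map to `1xx`..`5xx`, everything else to `unknown`) can be illustrated with a short Python sketch. The module itself is not written in Python, and the names `code_class` and `collapse` are hypothetical; the sketch only shows why the `code` label can take at most six values per `(source, vip)`.

```python
from collections import Counter


def code_class(status: int) -> str:
    """Map a three-digit HTTP status to its class lane."""
    if 100 <= status <= 599:
        return f"{status // 100}xx"
    return "unknown"


def collapse(per_code_requests: dict[int, int]) -> Counter:
    """Fold per-status request counts into per-class counts."""
    lanes: Counter = Counter()
    for status, requests in per_code_requests.items():
        lanes[code_class(status)] += requests
    return lanes
```

For example, `collapse({200: 10, 204: 2, 404: 3, 0: 1})` yields three lanes (`2xx` with 12 requests, `4xx` with 3, `unknown` with 1), however many distinct statuses went in.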
@@ -258,39 +285,31 @@ See FR-2.*, FR-3.7.
 
 ```json
 {
-  "schema": 1,
-  "by_source": {
-    "mg1": {
-      "vips": {
-        "192.0.2.10": {
-          "rate_1s": 42.3,
-          "rate_10s": 40.1,
-          "rate_60s": 39.8,
-          "codes": {
-            "200": { "requests": 12345, "bytes_in": 9876543, "bytes_out": 54321098 },
-            "404": { "requests": 17, "bytes_in": 2048, "bytes_out": 9216 }
-          },
-          "request_duration_ms": {
-            "buckets": { "1": 10, "5": 40, "10": 120, "25": 350, "50": 870, "100": 2100,
-                         "250": 3400, "500": 4000, "1000": 4100, "2500": 4120,
-                         "5000": 4123, "10000": 4124, "+Inf": 4124 },
-            "sum_ms": 87654,
-            "count": 4124
-          },
-          "upstream_response_ms": { "...": "..." }
-        }
-      }
+  "schema": 2,
+  "records": [
+    {
+      "source_tag": "mg1",
+      "vip": "192.0.2.10",
+      "classes": {
+        "2xx": { "requests": 12345, "bytes_in": 9876543, "bytes_out": 54321098,
+                 "latency_ms": 87654, "upstream_latency_ms": 61234 },
+        "4xx": { "requests": 17, "bytes_in": 2048, "bytes_out": 9216,
+                 "latency_ms": 102, "upstream_latency_ms": 0 }
+      },
+      "request_duration_ms": {
+        "sum": 87756, "count": 12362,
+        "buckets": { "1": 10, "5": 40, "10": 120, "+Inf": 12362 }
+      },
+      "upstream_response_ms": { "sum": 61234, "count": 12345, "buckets": { "...": "..." } },
+      "bytes_in": { "count": 12362, "buckets": { "100": 200, "1000": 9000, "+Inf": 12362 } },
+      "bytes_out": { "count": 12362, "buckets": { "...": "..." } }
     }
-  },
-  "meta": {
-    "zone_bytes_used": 131072,
-    "zone_bytes_total": 4194304,
-    "zone_full_events": 0
-  }
+  ]
 }
 ```
 
-The top-level `schema` field is versioned: breaking changes bump it, additive changes don't. Consumers SHOULD check `schema`
+The top-level `schema` field is versioned: breaking changes bump it, additive changes don't. Schema `2` collapses status codes to
+class buckets and moves histograms out of the per-class records to a per-`(source, vip)` record. Consumers SHOULD check `schema`
 before parsing.
 
 See FR-3.6.
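As a rough illustration of "check `schema` before parsing", the following Python sketch reads a schema-2 document shaped like the example in the hunk above and derives mean latency per class by dividing `latency_ms` by `requests`. The function name `mean_latency_ms` is hypothetical, and the field names are taken from the sample payload only; a real consumer should follow the module's actual output.

```python
import json


def mean_latency_ms(payload: str) -> dict[tuple[str, str, str], float]:
    """Return mean request latency in ms per (source_tag, vip, class)."""
    doc = json.loads(payload)
    # Refuse anything other than the layout this sketch was written for.
    if doc.get("schema") != 2:
        raise ValueError(f"unsupported schema: {doc.get('schema')!r}")
    means = {}
    for record in doc["records"]:
        for cls, counters in record["classes"].items():
            if counters["requests"] > 0:  # skip empty lanes, avoid dividing by zero
                key = (record["source_tag"], record["vip"], cls)
                means[key] = counters["latency_ms"] / counters["requests"]
    return means


SAMPLE = json.dumps({
    "schema": 2,
    "records": [{
        "source_tag": "mg1",
        "vip": "192.0.2.10",
        "classes": {
            "2xx": {"requests": 12345, "latency_ms": 87654},
            "4xx": {"requests": 17, "latency_ms": 102},
        },
    }],
})
```

With the sample values, the `4xx` lane averages 102 / 17 = 6.0 ms. A schema-1 payload raises `ValueError` instead of being misread.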
@@ -303,6 +322,7 @@ See FR-3.6.
 | `ipng_stats_flush_interval` | ✅ | — | — | — |
 | `ipng_stats_default_source` | ✅ | — | — | — |
 | `ipng_stats_buckets` | ✅ | — | — | — |
+| `ipng_stats_byte_buckets` | ✅ | — | — | — |
 | `ipng_stats_logtail` | ✅ | — | — | — |
 | `ipng_stats on\|off` | ✅ | ✅ | ✅ | — |
 | `ipng_stats;` (handler) | — | — | ✅ | — |