Add prometheus exporter on :9100

2026-03-24 03:49:22 +01:00
parent c7f8455188
commit 91eb56a64c
8 changed files with 486 additions and 20 deletions


@@ -76,6 +76,7 @@ windows, and exposes a gRPC interface for the aggregator (and directly for the C
| Flag | Default | Description |
|-------------------|--------------|-----------------------------------------------------------|
| `--listen` | `:9090` | gRPC listen address |
| `--prom-listen` | `:9100` | Prometheus metrics address; empty string to disable |
| `--logs` | — | Comma-separated log file paths or glob patterns |
| `--logs-file` | — | File containing one log path/glob per line |
| `--source` | hostname | Name for this collector in query responses |
@@ -123,6 +124,73 @@ The collector handles logrotate automatically. On `RENAME`/`REMOVE` events it dr
descriptor to EOF (so no lines are lost), then retries opening the original path with backoff until
the new file appears. No restart or SIGHUP required.

### Prometheus metrics

The collector exposes a Prometheus-compatible `/metrics` endpoint on `--prom-listen` (default
`:9100`). Set `--prom-listen ""` to disable it entirely.

Three metrics are exported:

**`nginx_http_requests_total`**: counter, labeled `{host, method, status}`. Sample output:
```
nginx_http_requests_total{host="example.com",method="GET",status="200"} 18432
nginx_http_requests_total{host="example.com",method="POST",status="201"} 304
nginx_http_requests_total{host="api.example.com",method="GET",status="429"} 57
```
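
To check the counter from a script, sample lines like the ones above can be parsed with a few lines of Python. This is a minimal sketch and not part of the collector: `parse_sample` is a hypothetical helper, and the regular expressions only handle the simple `name{label="value",...} number` shape shown above (no escaped quotes, timestamps, or comment lines).

```python
import re

# Matches: metric_name{k="v",k2="v2"} value   (simple cases only)
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_sample(line):
    """Split one exposition line into (name, labels dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unsupported sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels")))
    return match.group("name"), labels, float(match.group("value"))


name, labels, value = parse_sample(
    'nginx_http_requests_total{host="api.example.com",method="GET",status="429"} 57'
)
print(name, labels["host"], labels["status"], value)
```

For anything beyond a quick check, a real exposition-format parser is the safer choice, since label values may contain escaped characters that this sketch ignores.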

**`nginx_http_response_body_bytes`**: histogram, labeled `{host}`. Observes the
`$body_bytes_sent` value for every request. Bucket upper bounds (bytes):
`256, 1024, 4096, 16384, 65536, 262144, 1048576, +Inf`.

**`nginx_http_request_duration_seconds`**: histogram, labeled `{host}`. Observes the
`$request_time` value for every request. Bucket upper bounds (seconds):
`0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, +Inf`.
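
Prometheus histogram buckets are cumulative: each `le` bucket counts every observation less than or equal to its upper bound. The sketch below illustrates this with the request-duration bounds listed above. It is illustrative Python, not the collector's code, and `cumulative_buckets` is a hypothetical helper.

```python
from bisect import bisect_left

# Finite upper bounds in seconds; the +Inf bucket is implied as the last slot.
DURATION_BOUNDS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]


def cumulative_buckets(observations, bounds):
    """Return cumulative counts, one per bound, with the +Inf bucket last."""
    per_bucket = [0] * (len(bounds) + 1)
    for value in observations:
        # bisect_left finds the first bound >= value, matching "le" (inclusive).
        per_bucket[bisect_left(bounds, value)] += 1

    cumulative, running = [], 0
    for count in per_bucket:
        running += count
        cumulative.append(running)
    return cumulative


# Four requests: 3 ms, exactly 5 ms, 200 ms, and 12 s.
counts = cumulative_buckets([0.003, 0.005, 0.2, 12], DURATION_BOUNDS)
for bound, count in zip(DURATION_BOUNDS + ["+Inf"], counts):
    print(f'le="{bound}" {count}')
```

The 12-second request only appears in the `+Inf` bucket, which is why the last bucket always equals the total number of observations.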

The body-size and request-duration histograms carry only the `host` label (not `method` or
`status`) to keep cardinality bounded: their series count stays proportional to the number of
virtual hosts rather than to the number of host × method × status combinations.
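
As a rough worked example of that trade-off, assume each histogram exposes one `_bucket` series per upper bound (including `+Inf`) plus a `_sum` and a `_count` series, as is standard for Prometheus histograms. The host, method, and status counts below are made-up inputs for illustration only.

```python
BODY_BUCKETS = 8       # 256 ... 1048576, +Inf
DURATION_BUCKETS = 12  # 0.005 ... 10, +Inf


def histogram_series(label_combinations, buckets):
    """Series for one histogram: bucket series plus _sum and _count per label set."""
    return label_combinations * (buckets + 2)


hosts, methods, statuses = 50, 5, 20  # hypothetical traffic mix

# Histograms labeled by host only, as described above.
host_only = (histogram_series(hosts, BODY_BUCKETS)
             + histogram_series(hosts, DURATION_BUCKETS))

# The same histograms if they also carried method and status labels.
all_labels = hosts * methods * statuses
full_labels = (histogram_series(all_labels, BODY_BUCKETS)
               + histogram_series(all_labels, DURATION_BUCKETS))

print(host_only)    # 1200
print(full_labels)  # 120000
```

With these assumed numbers, adding `method` and `status` to the histograms would multiply the series count by 100.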

The counter map is capped at 100 000 distinct `{host, method, status}` tuples. Once the cap is
reached, new entries are silently dropped for the current scrape interval, so memory stays
bounded regardless of traffic patterns.
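
One way such a cap can work is sketched below. This is an illustration in Python under stated assumptions, not the collector's implementation: the class name, the `dropped` tally, and the choice to keep counting already-known tuples after the cap is reached are all hypothetical, and whatever happens at the scrape boundary is not modeled.

```python
MAX_SERIES = 100_000  # cap on distinct (host, method, status) tuples


class CappedCounter:
    """Counter map that refuses new keys once it holds max_series entries."""

    def __init__(self, max_series=MAX_SERIES):
        self.max_series = max_series
        self.counts = {}
        self.dropped = 0  # hypothetical bookkeeping, for illustration only

    def inc(self, host, method, status):
        key = (host, method, status)
        if key in self.counts:
            # Known tuples keep counting even when the map is full.
            self.counts[key] += 1
        elif len(self.counts) < self.max_series:
            self.counts[key] = 1
        else:
            # Map is full: the new tuple is dropped without an error.
            self.dropped += 1


counter = CappedCounter(max_series=2)
for host in ["a.example.com", "b.example.com", "c.example.com", "a.example.com"]:
    counter.inc(host, "GET", "200")
print(len(counter.counts), counter.dropped)
```

Because the map never grows past `max_series` keys, its memory footprint has a fixed upper bound no matter how many distinct hosts or status codes show up in the logs.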

**Prometheus scrape config:**

```yaml
scrape_configs:
  - job_name: nginx_logtail
    static_configs:
      - targets:
          - nginx1:9100
          - nginx2:9100
          - nginx3:9100
```

Service discovery works as well: the collector has no special requirements beyond a reachable
TCP port.

**Example queries:**

```promql
# Request rate per host over the last 5 minutes
sum by (host) (rate(nginx_http_requests_total[5m]))

# Fraction of requests answered with a 5xx status, per host
  sum by (host) (rate(nginx_http_requests_total{status=~"5.."}[5m]))
/
  sum by (host) (rate(nginx_http_requests_total[5m]))

# 95th percentile request duration per host
histogram_quantile(0.95,
  sum by (host, le) (rate(nginx_http_request_duration_seconds_bucket[5m]))
)

# Median response body size per host
histogram_quantile(0.50,
  sum by (host, le) (rate(nginx_http_response_body_bytes_bucket[5m]))
)
```
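
`histogram_quantile` estimates a quantile by interpolating linearly inside the bucket that contains the target rank, so the result is only as precise as the bucket bounds. The Python sketch below mirrors that idea for classic (non-native) histograms. It is a simplified illustration rather than Prometheus's source, and the function name and sample counts are made up.

```python
def estimate_quantile(q, bounds, cumulative):
    """Estimate quantile q from cumulative bucket counts.

    bounds:     finite upper bounds, ascending
    cumulative: cumulative counts, one per bound, plus the +Inf bucket last
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = cumulative[-1]
    if total == 0:
        return float("nan")

    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    idx = next(i for i, count in enumerate(cumulative) if count >= rank)
    if idx == len(bounds):
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return bounds[-1]

    lower = bounds[idx - 1] if idx > 0 else 0.0
    upper = bounds[idx]
    below = cumulative[idx - 1] if idx > 0 else 0
    in_bucket = cumulative[idx] - below
    # Assume observations are spread evenly between lower and upper.
    return lower + (upper - lower) * (rank - below) / in_bucket


# Hypothetical counts: 50 requests <= 0.1 s, 90 <= 0.5 s, 100 <= 1 s.
bounds = [0.1, 0.5, 1.0]
cumulative = [50, 90, 100, 100]
print(estimate_quantile(0.95, bounds, cumulative))
```

Here the 95th-percentile rank falls in the bucket between 0.5 s and 1 s, so the estimate lies in that range regardless of where the real observations were. Quantiles that land in the `+Inf` bucket are reported as the highest finite bound (10 s for the duration histogram).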
### Memory usage
The collector is designed to stay well under 1 GB: