Add prometheus exporter on :9100
@@ -76,6 +76,7 @@ windows, and exposes a gRPC interface for the aggregator (and directly for the C
| Flag              | Default      | Description                                               |
|-------------------|--------------|-----------------------------------------------------------|
| `--listen`        | `:9090`      | gRPC listen address                                       |
| `--prom-listen`   | `:9100`      | Prometheus metrics address; empty string to disable       |
| `--logs`          | —            | Comma-separated log file paths or glob patterns           |
| `--logs-file`     | —            | File containing one log path/glob per line                |
| `--source`        | hostname     | Name for this collector in query responses                |
@@ -123,6 +124,73 @@ The collector handles logrotate automatically. On `RENAME`/`REMOVE` events it dr
descriptor to EOF (so no lines are lost), then retries opening the original path with backoff until
the new file appears. No restart or SIGHUP required.

### Prometheus metrics

The collector exposes a Prometheus-compatible `/metrics` endpoint on `--prom-listen` (default
`:9100`). Set `--prom-listen ""` to disable it entirely.

Three metrics are exported:

**`nginx_http_requests_total`** — counter, labeled `{host, method, status}`:
```
nginx_http_requests_total{host="example.com",method="GET",status="200"} 18432
nginx_http_requests_total{host="example.com",method="POST",status="201"} 304
nginx_http_requests_total{host="api.example.com",method="GET",status="429"} 57
```

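To make the sample format concrete, here is a minimal Python sketch that renders counter samples in the Prometheus text exposition format. It is illustrative only: `render_counter`, `escape_label_value`, and the in-memory `counts` dictionary are hypothetical names and say nothing about how the collector itself is implemented.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline, as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name: str, counts: dict) -> str:
    """Render {(host, method, status): count} as exposition-format lines.

    `counts` is a hypothetical in-memory map used only for this sketch.
    """
    lines = [f"# TYPE {name} counter"]
    for (host, method, status), value in sorted(counts.items()):
        labels = (
            f'host="{escape_label_value(host)}",'
            f'method="{escape_label_value(method)}",'
            f'status="{escape_label_value(status)}"'
        )
        # Doubled braces emit literal { and } around the label set.
        lines.append(f"{name}{{{labels}}} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_counter(
        "nginx_http_requests_total",
        {("example.com", "GET", "200"): 18432},
    ), end="")
```

Each distinct label tuple becomes its own line, which is why the number of tuples determines the size of a scrape response.
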
**`nginx_http_response_body_bytes`** — histogram, labeled `{host}`. Observes the
`$body_bytes_sent` value for every request. Bucket upper bounds (bytes):
`256, 1024, 4096, 16384, 65536, 262144, 1048576, +Inf`.

**`nginx_http_request_duration_seconds`** — histogram, labeled `{host}`. Observes the
`$request_time` value for every request. Bucket upper bounds (seconds):
`0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, +Inf`.

The body-size and request-time histograms carry only the `host` label (not method or status) to
keep cardinality bounded: the number of series grows with the number of virtual hosts, not with
every host × method × status combination.

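As a rough illustration of how an observation lands in the buckets listed above, and of what the host-only labeling costs per virtual host, consider the following Python sketch. The `Histogram` class and `series_per_host` helper are hypothetical and are not the collector's code. The series count assumes the standard Prometheus layout of one `_bucket` series per bound (including `+Inf`) plus `_sum` and `_count`.

```python
import bisect

# Finite bucket upper bounds from the README; +Inf is implicit.
BODY_BYTES_BOUNDS = [256, 1024, 4096, 16384, 65536, 262144, 1048576]
DURATION_BOUNDS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]


class Histogram:
    """Hypothetical per-host histogram, for illustration only."""

    def __init__(self, bounds):
        self.bounds = list(bounds)
        # One slot per finite bound plus a final slot for +Inf.
        self.bucket_counts = [0] * (len(self.bounds) + 1)
        self.total = 0.0
        self.count = 0

    def observe(self, value):
        # bisect_left returns the first bound >= value, so `le` is inclusive.
        idx = bisect.bisect_left(self.bounds, value)
        self.bucket_counts[idx] += 1
        self.total += value
        self.count += 1

    def cumulative(self):
        """Cumulative counts as exposed in *_bucket series (last is +Inf)."""
        out, running = [], 0
        for c in self.bucket_counts:
            running += c
            out.append(running)
        return out


def series_per_host(bounds):
    # _bucket series for every finite bound and +Inf, plus _sum and _count.
    return len(bounds) + 1 + 2


if __name__ == "__main__":
    h = Histogram(BODY_BYTES_BOUNDS)
    for size in (256, 300, 2_000_000):
        h.observe(size)
    print(h.cumulative())
    print(series_per_host(BODY_BYTES_BOUNDS) + series_per_host(DURATION_BOUNDS))
```

Under these assumptions each virtual host contributes 10 body-size series and 14 duration series, independent of how many methods or status codes it serves.
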
The counter map is capped at 100 000 distinct `{host, method, status}` tuples. Entries beyond
the cap are silently dropped for the current scrape interval, so memory is bounded regardless
of traffic patterns.

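The cap described above can be sketched as a map that refuses new keys once it is full. This is a minimal Python sketch under stated assumptions: `CappedCounter` and its `dropped` tally are hypothetical (the README says drops are silent), and because the README does not say how or when the map is reset between scrapes, the sketch does not model that.

```python
MAX_SERIES = 100_000  # cap stated in the README


class CappedCounter:
    """Hypothetical counter map that stops admitting new label tuples at a cap."""

    def __init__(self, max_series=MAX_SERIES):
        self.max_series = max_series
        self.counts = {}
        self.dropped = 0  # illustration only; the real collector drops silently

    def inc(self, host, method, status):
        """Count one request; return False if the tuple was dropped."""
        key = (host, method, status)
        if key in self.counts:
            # Existing tuples keep counting even when the map is full.
            self.counts[key] += 1
            return True
        if len(self.counts) >= self.max_series:
            # New tuple beyond the cap: drop it so memory stays bounded.
            self.dropped += 1
            return False
        self.counts[key] = 1
        return True


if __name__ == "__main__":
    c = CappedCounter(max_series=2)
    c.inc("a.example", "GET", "200")
    c.inc("b.example", "GET", "200")
    print(c.inc("c.example", "GET", "200"))  # third distinct tuple is refused
```

The point of the design is that the map size, and therefore memory, depends on the cap rather than on how many distinct hosts or status codes appear in the logs.
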
**Prometheus scrape config:**

```yaml
scrape_configs:
  - job_name: nginx_logtail
    static_configs:
      - targets:
          - nginx1:9100
          - nginx2:9100
          - nginx3:9100
```

Service discovery works as well; the collector has no special requirements beyond a reachable
TCP port.

**Example queries:**

```promql
# Request rate per host over the last 5 minutes
sum by (host) (rate(nginx_http_requests_total[5m]))

# Fraction of requests returning 5xx, per host
sum by (host) (rate(nginx_http_requests_total{status=~"5.."}[5m]))
  /
sum by (host) (rate(nginx_http_requests_total[5m]))

# 95th percentile response time per host
histogram_quantile(0.95,
  sum by (host, le) (rate(nginx_http_request_duration_seconds_bucket[5m]))
)

# Median response body size per host
histogram_quantile(0.50,
  sum by (host, le) (rate(nginx_http_response_body_bytes_bucket[5m]))
)
```

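The percentile queries return estimates, not exact values: `histogram_quantile` interpolates linearly inside the bucket that contains the requested rank, so precision is limited by the bucket bounds listed earlier. The Python sketch below is a simplified model of that behaviour for classic histograms, not Prometheus's actual implementation; `estimate_quantile` is a hypothetical name, and the lowest bucket is assumed to start at 0.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    bound, with the last bound being +Inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0]
            # Linear interpolation between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return buckets[-2][0]


if __name__ == "__main__":
    sample = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
    print(estimate_quantile(0.95, sample))
```

One practical consequence is that a quantile falling beyond the largest finite bound (10 s or 1 MiB here) is reported as that bound, so values pinned at the top bucket should be read as "at least this much".
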
### Memory usage

The collector is designed to stay well under 1 GB: