PRE-RELEASE 0.9.1: Makefile, Debian packaging, versioned UDP
Build and release tooling:
- Makefile with help as default; targets: build/build-amd64/build-arm64,
test, lint, proto, pkg-deb, docker, docker-push, clean, plus
install-deps (+ three sub-targets for apt / Go toolchain / Go tools).
- internal/version package; -ldflags -X injects Version/Commit/Date into
every binary. -version flag on all four binaries (nginx-logtail version
for the CLI).
- Dockerfile takes VERSION/COMMIT/DATE build-args and forwards them.
- .deb output lands in build/; .gitignore ignores /build/.
Debian package:
- debian/build-deb.sh packages all four static binaries into a single
nginx-logtail_<ver>_<arch>.deb using dpkg-deb.
- Binary layout: /usr/sbin/nginx-logtail-{collector,aggregator,frontend}
and /usr/bin/nginx-logtail.
- nginx-logtail(8) manpage.
- Three systemd units (collector, aggregator, frontend) shipped under
/lib/systemd/system/. Installed but never enabled or started — the
operator opts in per host.
- Collector runs as _logtail:www-data (log access); aggregator and
frontend as _logtail:_logtail. postinst creates the system user/group
idempotently.
- Single shared env file /etc/default/nginx-logtail rendered from a
template at first install with %HOSTNAME% substituted. Sensible
defaults for every COLLECTOR_*, AGGREGATOR_*, FRONTEND_* variable;
plus COLLECTOR_ARGS / AGGREGATOR_ARGS / FRONTEND_ARGS escape hatches
appended to ExecStart. Not a dpkg conffile: operator edits survive
upgrades and dpkg --purge removes it.
Versioned UDP wire format:
- ParseUDPLine dispatches on a leading "v<N>\t" tag; v1 routes to the
existing 12-field parser. Unknown/missing versions fail closed so
future v2 parsers can land before emitters are upgraded.
- Tests updated; design.md FR-2.2 rewritten to make the version tag
normative.
Docs:
- README.md gains a Quick Start (Debian / Docker Compose / from source).
- user-guide.md rewritten around Installation and Configuration: full
env-var table, UDP-only default explained, precise file/UDP log_format
layouts, note that operators can emit "0" for unknown $is_tor / $asn.
- Drilldown cycle, frontend filter table, and CLI --group-by list all
include source_tag. UDP counters documented in the Prometheus section.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -127,15 +127,18 @@ Each requirement carries a unique identifier (`FR-X.Y` or `NFR-X.Y`) so that lat
|
||||
| 8 | `$is_tor` | `is_tor` (optional) |
|
||||
| 9 | `$asn` | `asn` (optional) |
|
||||
|
||||
- **FR-2.2 UDP format.** The collector MUST accept datagrams in the following tab-separated layout, as emitted by
|
||||
`nginx-ipng-stats-plugin`'s `ipng_stats_logtail` directive:
|
||||
- **FR-2.2 UDP format.** The collector MUST accept datagrams in a versioned tab-separated layout, as emitted by
|
||||
`nginx-ipng-stats-plugin`'s `ipng_stats_logtail` directive. Every datagram MUST begin with a literal version tag
|
||||
(`v<N>\t`) so the collector can route each packet to the appropriate parser. Only `v1` is defined in this revision;
|
||||
unknown versions MUST be counted as parse failures and dropped.
|
||||
|
||||
```nginx
|
||||
log_format ipng_stats_logtail '$host\t$remote_addr\t$request_method\t$request_uri\t$status\t$body_bytes_sent\t$request_time\t$is_tor\t$asn\t$ipng_source_tag\t$server_addr\t$scheme';
|
||||
log_format ipng_stats_logtail 'v1\t$host\t$remote_addr\t$request_method\t$request_uri\t$status\t$body_bytes_sent\t$request_time\t$is_tor\t$asn\t$ipng_source_tag\t$server_addr\t$scheme';
|
||||
```
|
||||
|
||||
Exactly 12 tab-separated fields are required. `$server_addr` and `$scheme` MUST be parsed but dropped; they are reserved for
|
||||
future use. Malformed datagrams MUST be counted (FR-8.5) and silently dropped.
|
||||
The v1 payload MUST have exactly 12 tab-separated fields after the `v1` tag (13 fields total). `$server_addr` and
|
||||
`$scheme` MUST be parsed but dropped; they are reserved for future use. Malformed datagrams (wrong version, wrong
|
||||
field count, bad IP) MUST be counted (FR-8.5) and silently dropped.
|
||||
|
||||
- **FR-2.3** The file tailer MUST set `source_tag="direct"` on every record it parses. The UDP listener MUST propagate
|
||||
`$ipng_source_tag` verbatim. This is the only difference in downstream processing between the two ingest paths.
|
||||
@@ -556,7 +559,8 @@ transitions. No per-request logging.
|
||||
- **UDP datagram loss.** Any datagram dropped in-kernel (socket buffer full, network drop) does not register as a parse failure; it
|
||||
is simply invisible. Operators should size `SO_RCVBUF` appropriately; the collector already requests 4 MiB.
|
||||
- **Malformed log lines.** File format: lines with <8 tab-separated fields are silently skipped; an invalid IP also drops the line.
|
||||
UDP: packets without exactly 12 fields are counted as received-but-not-success and dropped.
|
||||
UDP: packets without a recognised `v<N>\t` prefix, or with the wrong field count for the claimed version, or with a bad IP, are
|
||||
counted as received-but-not-success and dropped.
|
||||
- **Clock skew between collectors.** Trend sparklines derived from merged data assume collectors are roughly NTP-synced. Per-bucket
|
||||
alignment is to the local minute / 5-minute boundary of each collector.
|
||||
- **gRPC traffic over untrusted links.** The system does not ship TLS; operators should front the gRPC ports with a TLS-terminating
|
||||
|
||||
@@ -14,16 +14,126 @@ Components:
|
||||
|
||||
| Binary | Runs on | Role |
|
||||
|---------------|------------------|----------------------------------------------------|
|
||||
| `collector` | each nginx host | Tails log files, aggregates in memory, serves gRPC |
|
||||
| `collector` | each nginx host | Tails log files and/or UDP datagrams, aggregates in memory, serves gRPC |
|
||||
| `aggregator` | central host | Merges all collectors, serves unified gRPC |
|
||||
| `frontend` | central host | HTTP dashboard with drilldown UI |
|
||||
| `cli` | operator laptop | Shell queries against collector or aggregator |
|
||||
|
||||
Every binary accepts `-version` (or `nginx-logtail version` for the CLI) and prints its version,
|
||||
git commit, and build date.
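As a rough illustration of how link-time version injection of this kind usually works in Go, the sketch below keeps three string variables with fallback defaults and formats them for a `-version` style output. The variable names mirror the Version/Commit/Date trio mentioned in the commit message, but the placement in `package main`, the helper name `VersionString`, and the output format are assumptions made for this example, not the project's actual `internal/version` code.

```go
package main

import "fmt"

// Build metadata. The defaults apply when nothing is injected at link time.
// With the variables in package main, an injection could look like:
//
//	go build -ldflags "-X main.Version=0.9.1 -X main.Commit=abc1234 -X main.Date=2025-01-01"
//
// The values above are placeholders, not taken from the project.
var (
	Version = "dev"
	Commit  = "unknown"
	Date    = "unknown"
)

// VersionString renders the three values on one line for a given binary name.
// The format is hypothetical.
func VersionString(binary string) string {
	return fmt.Sprintf("%s %s (commit %s, built %s)", binary, Version, Commit, Date)
}

func main() {
	fmt.Println(VersionString("collector"))
}
```

The linker's `-X` flag only overrides package-level string variables, which is why all three values are plain strings rather than a struct.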
|
||||
|
||||
---
|
||||
|
||||
## nginx Configuration
|
||||
## Installation
|
||||
|
||||
Add the `logtail` log format to your `nginx.conf` and apply it to each `server` block:
|
||||
Three flavors. `make help` lists every target; `make install-deps` sets up a fresh build box
|
||||
(apt deps, Go toolchain, `protoc-gen-go`, `golangci-lint`).
|
||||
|
||||
### Debian package
|
||||
|
||||
```bash
|
||||
make pkg-deb # produces nginx-logtail_<ver>_{amd64,arm64}.deb
|
||||
sudo dpkg -i build/nginx-logtail_*_amd64.deb
|
||||
```
|
||||
|
||||
The package installs:
|
||||
|
||||
| Path                                                      | Contents                                         |
|-----------------------------------------------------------|--------------------------------------------------|
| `/usr/sbin/nginx-logtail-{collector,aggregator,frontend}` | Service binaries                                 |
| `/usr/bin/nginx-logtail`                                  | CLI                                              |
| `/lib/systemd/system/nginx-logtail-*.service`             | Three systemd units                              |
| `/usr/share/man/man8/nginx-logtail.8.gz`                  | Manpage (`man 8 nginx-logtail`)                  |
| `/usr/share/nginx-logtail/default.template`               | Defaults template                                |
| `/etc/default/nginx-logtail`                              | **Generated on first install** from the template |
|
||||
|
||||
The postinst creates a system user/group `_logtail` if absent and renders the template into
|
||||
`/etc/default/nginx-logtail` with the short hostname substituted. **None of the services are
|
||||
enabled or started automatically** — installing the package is safe on any host. Operators
|
||||
opt in per service:
|
||||
|
||||
```bash
sudo systemctl enable --now nginx-logtail-collector.service   # on each nginx host
sudo systemctl enable --now nginx-logtail-aggregator.service  # on the central host
sudo systemctl enable --now nginx-logtail-frontend.service    # on the central host
```
|
||||
|
||||
The collector runs as `_logtail:www-data` so it can read nginx access logs that are
|
||||
group-readable by `www-data`; aggregator and frontend run as `_logtail:_logtail`.
|
||||
|
||||
### Docker / Docker Compose
|
||||
|
||||
The repo's `docker-compose.yml` runs the aggregator and frontend together from a single image
|
||||
that contains all four binaries.
|
||||
|
||||
```bash
make docker        # builds git.ipng.ch/ipng/nginx-logtail:v<ver> + :latest, native arch
make docker-push   # multi-arch (amd64+arm64) buildx push

AGGREGATOR_COLLECTORS=nginx1:9090,nginx2:9090 docker compose up -d
# frontend on :8080, aggregator gRPC on :9091
```
|
||||
|
||||
Each container explicitly selects its binary via `command: ["/usr/local/bin/<binary>"]`.
|
||||
|
||||
### From source
|
||||
|
||||
```bash
git clone https://git.ipng.ch/ipng/nginx-logtail
cd nginx-logtail
make build   # -> build/<arch>/{collector,aggregator,frontend,cli}
make test
./build/*/cli version
```
|
||||
|
||||
Requires Go ≥ 1.24 (see `go.mod`). No CGO, no external runtime dependencies.
|
||||
|
||||
---
|
||||
|
||||
## Configuration
|
||||
|
||||
### /etc/default/nginx-logtail
|
||||
|
||||
The Debian package ships one shared environment file read by all three systemd units via
|
||||
`EnvironmentFile=-/etc/default/nginx-logtail`. It enumerates every flag the three daemons
|
||||
accept as a `COLLECTOR_*`, `AGGREGATOR_*`, or `FRONTEND_*` env var. Defaults on first install
|
||||
are sensible for a single-host deployment:
|
||||
|
||||
| Variable                  | First-install default       | Purpose                                        |
|---------------------------|-----------------------------|------------------------------------------------|
| `COLLECTOR_LISTEN`        | `:9090`                     | gRPC listen address                            |
| `COLLECTOR_PROM_LISTEN`   | `:9100`                     | Prometheus metrics; set `""` to disable        |
| `COLLECTOR_LOGS`          | *(empty, UDP-only)*         | Comma-sep log paths/globs                      |
| `COLLECTOR_LOGS_FILE`     | *(empty)*                   | File with one path/glob per line               |
| `COLLECTOR_SOURCE`        | `$(hostname -s)` at install | Display name in query responses                |
| `COLLECTOR_V4PREFIX`      | `24`                        | IPv4 bucket prefix                             |
| `COLLECTOR_V6PREFIX`      | `48`                        | IPv6 bucket prefix                             |
| `COLLECTOR_SCAN_INTERVAL` | `10s`                       | Log-glob rescan cadence                        |
| `COLLECTOR_LOGTAIL_PORT`  | `9514`                      | UDP port for `ipng_stats_logtail` (0 disables) |
| `COLLECTOR_LOGTAIL_BIND`  | `127.0.0.1`                 | UDP bind address                               |
| `AGGREGATOR_LISTEN`       | `:9091`                     | gRPC listen address                            |
| `AGGREGATOR_COLLECTORS`   | `localhost:9090`            | Comma-sep collectors (mandatory)               |
| `AGGREGATOR_SOURCE`       | `$(hostname -s)` at install | Display name                                   |
| `FRONTEND_LISTEN`         | `:8080`                     | HTTP dashboard address                         |
| `FRONTEND_TARGET`         | `localhost:9091`            | Default gRPC endpoint                          |
| `FRONTEND_N`              | `25`                        | Default table row count                        |
| `FRONTEND_REFRESH`        | `30`                        | Meta-refresh seconds; `0` disables             |
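To make the `COLLECTOR_V4PREFIX` / `COLLECTOR_V6PREFIX` rows concrete, here is a minimal sketch of bucketing a client address into a prefix with Go's standard `net/netip` package. The function name `ClientPrefix` and the decision to unmap IPv4-mapped IPv6 addresses are assumptions for illustration; the collector's real bucketing code may differ.

```go
package main

import (
	"fmt"
	"net/netip"
)

// ClientPrefix masks addr down to v4Bits or v6Bits, depending on the address
// family, and returns the resulting prefix in CIDR notation.
// Hypothetical helper, not the collector's actual function.
func ClientPrefix(addr string, v4Bits, v6Bits int) (string, error) {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return "", err
	}
	ip = ip.Unmap() // assumption: treat ::ffff:a.b.c.d as plain IPv4
	bits := v6Bits
	if ip.Is4() {
		bits = v4Bits
	}
	p, err := ip.Prefix(bits) // zeroes the host bits
	if err != nil {
		return "", err
	}
	return p.String(), nil
}

func main() {
	for _, a := range []string{"192.0.2.77", "2001:db8:1:2::1"} {
		p, err := ClientPrefix(a, 24, 48)
		fmt.Println(a, "->", p, err)
	}
}
```

With the shipped defaults of 24 and 48, all clients in the same IPv4 /24 or IPv6 /48 land in one bucket, which keeps the top-K tables compact.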
|
||||
|
||||
At least one of `COLLECTOR_LOGS`, `COLLECTOR_LOGS_FILE`, or `COLLECTOR_LOGTAIL_PORT > 0` must
|
||||
be set, otherwise the collector refuses to start. The shipped default (`COLLECTOR_LOGS=` empty
|
||||
plus `COLLECTOR_LOGTAIL_PORT=9514`) makes the collector UDP-only — no file tailer goroutine
|
||||
is launched when no log patterns are supplied.
|
||||
|
||||
Three escape-hatch variables — `COLLECTOR_ARGS`, `AGGREGATOR_ARGS`, `FRONTEND_ARGS` — are
|
||||
appended verbatim to each unit's `ExecStart` argv. Use them for flags without an env-var form,
|
||||
or for temporary overrides, without editing the unit.
|
||||
|
||||
The file is **not a dpkg conffile**: postinst writes it only when absent, so operator edits
|
||||
survive upgrades, and `dpkg --purge` removes it.
|
||||
|
||||
### nginx — file-based ingest
|
||||
|
||||
Add the `logtail` format and attach it to whichever `server` blocks you want tracked:
|
||||
|
||||
```nginx
|
||||
http {
|
||||
@@ -37,64 +147,128 @@ http {
|
||||
}
|
||||
```
|
||||
|
||||
The format is tab-separated with fixed field positions. Query strings are stripped from the URI
|
||||
by the collector at ingest time — only the path is tracked.
|
||||
Tab-separated, fixed field order, ten fields. The precise layout:
|
||||
|
||||
`$is_tor` must be set to `1` when the client IP is a TOR exit node and `0` otherwise (typically
|
||||
populated by a custom nginx variable or a Lua script that checks the IP against a TOR exit list).
|
||||
The field is optional for backward compatibility — log lines without it are accepted and treated
|
||||
as `is_tor=0`.
|
||||
| # | Field              | Ingested into                       |
|---|--------------------|-------------------------------------|
| 0 | `$host`            | `website`                           |
| 1 | `$remote_addr`     | `client_prefix` (truncated)         |
| 2 | `$msec`            | *(discarded)*                       |
| 3 | `$request_method`  | Prom `method` label                 |
| 4 | `$request_uri`     | `http_request_uri` (query stripped) |
| 5 | `$status`          | `http_response`                     |
| 6 | `$body_bytes_sent` | Prom body histogram                 |
| 7 | `$request_time`    | Prom duration histogram             |
| 8 | `$is_tor`          | `is_tor` (optional)                 |
| 9 | `$asn`             | `asn` (optional)                    |
|
||||
|
||||
`$asn` must be set to the client's AS number as a decimal integer (e.g. from MaxMind GeoIP2's
|
||||
`$geoip2_data_autonomous_system_number`). The field is optional — log lines without it default
|
||||
to `asn=0`.
|
||||
`$is_tor` is `1` if the client IP is a TOR exit node and `0` otherwise (typically populated
|
||||
via a Lua script or `$geoip2_data_*`). `$asn` is the client AS number as a decimal integer
|
||||
(e.g. MaxMind GeoIP2's `$geoip2_data_autonomous_system_number`).
|
||||
|
||||
---
|
||||
**If either is unknown, emit `0`.** A literal `0` in `$is_tor` parses as `false`; a literal
|
||||
`0` in `$asn` parses as ASN `0`, which you can exclude at query time with `--asn '!=0'` / the
|
||||
`asn!=0` filter expression. Operators who don't have TOR or GeoIP data can simply emit `0` for
|
||||
both columns and everything works.
|
||||
|
||||
## Building
|
||||
Both fields are also **positionally optional** for backward compatibility — older 8-field
|
||||
lines are accepted and default to `false` / `0`. Records from the file tailer are always
|
||||
tagged `source_tag="direct"`.
|
||||
|
||||
```bash
|
||||
git clone https://git.ipng.ch/ipng/nginx-logtail
|
||||
cd nginx-logtail
|
||||
go build ./cmd/collector/
|
||||
go build ./cmd/aggregator/
|
||||
go build ./cmd/frontend/
|
||||
go build ./cmd/cli/
|
||||
Then point the collector at the log files via `COLLECTOR_LOGS` — comma-separated paths or
|
||||
glob patterns. Make sure the files are group-readable by `www-data` (the collector's primary
|
||||
group in the systemd unit).
|
||||
|
||||
### nginx — UDP ingest (`nginx-ipng-stats-plugin`)
|
||||
|
||||
If the nginx host runs [`nginx-ipng-stats-plugin`](https://git.ipng.ch/ipng/nginx-ipng-stats-plugin),
|
||||
the plugin's `ipng_stats_logtail` directive emits one UDP datagram per request directly to
|
||||
the collector, no log file involved. The wire format is **versioned** — every datagram starts
|
||||
with a literal `v1\t` prefix so the collector can ship new parser versions (v2, v3, …) before
|
||||
emitters are upgraded and route each packet accordingly.
|
||||
|
||||
```nginx
http {
    log_format ipng_stats_logtail
        'v1\t$host\t$remote_addr\t$request_method\t$request_uri\t$status\t$body_bytes_sent\t$request_time\t$is_tor\t$asn\t$ipng_source_tag\t$server_addr\t$scheme';

    ipng_stats_logtail ipng_stats_logtail udp://127.0.0.1:9514 buffer=64k flush=1s;
}
```
|
||||
|
||||
Requires Go 1.21+. No CGO, no external runtime dependencies.
|
||||
Precise v1 layout — 13 tab-separated fields total (version prefix + 12 payload fields):
|
||||
|
||||
| #  | Field              | Ingested into                       |
|----|--------------------|-------------------------------------|
| 0  | `v1`               | version tag                         |
| 1  | `$host`            | `website`                           |
| 2  | `$remote_addr`     | `client_prefix` (truncated)         |
| 3  | `$request_method`  | Prom `method` label                 |
| 4  | `$request_uri`     | `http_request_uri` (query stripped) |
| 5  | `$status`          | `http_response`                     |
| 6  | `$body_bytes_sent` | Prom body histogram                 |
| 7  | `$request_time`    | Prom duration histogram             |
| 8  | `$is_tor`          | `is_tor`                            |
| 9  | `$asn`             | `asn`                               |
| 10 | `$ipng_source_tag` | `source_tag`                        |
| 11 | `$server_addr`     | *(parsed and discarded)*            |
| 12 | `$scheme`          | *(parsed and discarded)*            |
|
||||
|
||||
Compared to the file format: the version tag is added, `$msec` is dropped, and three fields
|
||||
are appended — `$ipng_source_tag` (propagated into the data model), `$server_addr` and
|
||||
`$scheme` (reserved for future use).
|
||||
|
||||
**Unknown `$is_tor` / `$asn`: emit `0`.** Same convention as the file format — operators
|
||||
without TOR or GeoIP data can emit `0` for both columns and everything works. A literal `0`
|
||||
in `$is_tor` is `false`; a literal `0` in `$asn` is ASN `0`, filterable at query time.
|
||||
|
||||
All 13 fields are required for v1 — malformed packets (wrong version, wrong field count, bad
|
||||
IP) are silently dropped and counted via `logtail_udp_packets_received_total` minus
|
||||
`logtail_udp_loglines_success_total`. Both paths (file + UDP) can feed the same collector
|
||||
simultaneously; they converge on the same aggregation pipeline.
|
||||
|
||||
---
|
||||
|
||||
## Collector
|
||||
|
||||
Runs on each nginx machine. Tails log files, maintains in-memory top-K counters across six time
|
||||
Runs on each nginx machine. Ingests logs from files (via `fsnotify`) and/or UDP datagrams
|
||||
(from `nginx-ipng-stats-plugin`), maintains in-memory top-K counters across six time
|
||||
windows, and exposes a gRPC interface for the aggregator (and directly for the CLI).
|
||||
|
||||
### Flags
|
||||
|
||||
| Flag | Default | Description |
|
||||
|-------------------|--------------|-----------------------------------------------------------|
|
||||
| `--listen` | `:9090` | gRPC listen address |
|
||||
| `--prom-listen` | `:9100` | Prometheus metrics address; empty string to disable |
|
||||
| `--logs` | — | Comma-separated log file paths or glob patterns |
|
||||
| `--logs-file` | — | File containing one log path/glob per line |
|
||||
| `--source` | hostname | Name for this collector in query responses |
|
||||
| `--v4prefix` | `24` | IPv4 prefix length for client bucketing (e.g. /24 → /23) |
|
||||
| `--v6prefix` | `48` | IPv6 prefix length for client bucketing |
|
||||
| `--scan-interval` | `10s` | How often to rescan glob patterns for new/removed files |
|
||||
| Flag | Default | Description |
|
||||
|-------------------|---------------|-------------------------------------------------------------------|
|
||||
| `--listen` | `:9090` | gRPC listen address |
|
||||
| `--prom-listen` | `:9100` | Prometheus metrics address; empty string to disable |
|
||||
| `--logs` | — | Comma-separated log file paths or glob patterns |
|
||||
| `--logs-file` | — | File containing one log path/glob per line |
|
||||
| `--source` | hostname | Name for this collector in query responses |
|
||||
| `--v4prefix` | `24` | IPv4 prefix length for client bucketing |
|
||||
| `--v6prefix` | `48` | IPv6 prefix length for client bucketing |
|
||||
| `--scan-interval` | `10s` | How often to rescan glob patterns for new/removed files |
|
||||
| `--logtail-port` | `0` (off) | UDP port receiving `ipng_stats_logtail` datagrams |
|
||||
| `--logtail-bind` | `127.0.0.1` | UDP bind address |
|
||||
| `--version` | — | Print version, commit, build date and exit |
|
||||
|
||||
At least one of `--logs` or `--logs-file` is required.
|
||||
At least one of `--logs`, `--logs-file`, or `--logtail-port > 0` is required; otherwise the
|
||||
collector refuses to start.
|
||||
|
||||
### Examples
|
||||
|
||||
```bash
|
||||
# UDP-only (nginx-ipng-stats-plugin feed)
|
||||
./collector --logtail-port 9514
|
||||
|
||||
# Single file
|
||||
./collector --logs /var/log/nginx/access.log
|
||||
|
||||
# Multiple files via glob (one inotify instance regardless of count)
|
||||
./collector --logs "/var/log/nginx/*/access.log"
|
||||
|
||||
# Files and UDP at the same time
|
||||
./collector --logs "/var/log/nginx/*.log" --logtail-port 9514
|
||||
|
||||
# Many files via a config file
|
||||
./collector --logs-file /etc/nginx-logtail/logs.conf
|
||||
|
||||
@@ -129,30 +303,30 @@ the new file appears. No restart or SIGHUP required.
|
||||
The collector exposes a Prometheus-compatible `/metrics` endpoint on `--prom-listen` (default
|
||||
`:9100`). Set `--prom-listen ""` to disable it entirely.
|
||||
|
||||
Three metrics are exported:
|
||||
**Per-host series:**
|
||||
|
||||
**`nginx_http_requests_total`** — counter, labeled `{host, method, status}`:
|
||||
```
|
||||
nginx_http_requests_total{host="example.com",method="GET",status="200"} 18432
|
||||
nginx_http_requests_total{host="example.com",method="POST",status="201"} 304
|
||||
nginx_http_requests_total{host="api.example.com",method="GET",status="429"} 57
|
||||
```
|
||||
- `nginx_http_requests_total{host, method, status}` — counter. Map capped at 250 000 distinct
|
||||
label sets; new entries beyond the cap are dropped until the map is rolled over.
|
||||
- `nginx_http_response_body_bytes_{bucket,count,sum}{host, le}` — histogram of
|
||||
`$body_bytes_sent`. Buckets (bytes): `256, 1024, 4096, 16384, 65536, 262144, 1048576, +Inf`.
|
||||
- `nginx_http_request_duration_seconds_{bucket,count,sum}{host, le}` — histogram of
|
||||
`$request_time`. Buckets (seconds): `0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5,
|
||||
10, +Inf`. Not split by `source_tag` (duration histogram stays per-host to avoid cardinality
|
||||
blow-up).
|
||||
|
||||
**`nginx_http_response_body_bytes`** — histogram, labeled `{host}`. Observes the
|
||||
`$body_bytes_sent` value for every request. Bucket upper bounds (bytes):
|
||||
`256, 1024, 4096, 16384, 65536, 262144, 1048576, +Inf`.
|
||||
**Per-`source_tag` roll-ups** (parallel series, not a cross-product with `host`):
|
||||
|
||||
**`nginx_http_request_duration_seconds`** — histogram, labeled `{host}`. Observes the
|
||||
`$request_time` value for every request. Bucket upper bounds (seconds):
|
||||
`0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, +Inf`.
|
||||
- `nginx_http_requests_by_source_total{source_tag}` — counter.
|
||||
- `nginx_http_response_body_bytes_by_source_{bucket,count,sum}{source_tag, le}` — histogram.
|
||||
|
||||
Body and request-time histograms use only the `host` label (not method/status) to keep
|
||||
cardinality bounded — the label sets stay proportional to the number of virtual hosts, not
|
||||
the number of unique method × status combinations.
|
||||
**UDP ingest counters** let operators distinguish parse failures from back-pressure drops:
|
||||
|
||||
The counter map is capped at 100 000 distinct `{host, method, status}` tuples. Entries beyond
|
||||
the cap are silently dropped for the current scrape interval, so memory is bounded regardless
|
||||
of traffic patterns.
|
||||
- `logtail_udp_packets_received_total` — datagrams read off the socket.
|
||||
- `logtail_udp_loglines_success_total` — parsed OK.
|
||||
- `logtail_udp_loglines_consumed_total` — forwarded to the store (not dropped).
|
||||
|
||||
`received - success` counts parse failures; `success - consumed` counts back-pressure
drops. Both are cumulative, so alert when either difference increases.
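The arithmetic on the three UDP counters can be sketched as below, comparing two scrapes of the counters. The struct, method names, and sample values are made up for illustration; in practice the same subtraction would more likely be written as a Prometheus query or alert rule.

```go
package main

import "fmt"

// UDPCounters holds one scrape of the three UDP ingest counters.
type UDPCounters struct {
	Received uint64 // logtail_udp_packets_received_total
	Success  uint64 // logtail_udp_loglines_success_total
	Consumed uint64 // logtail_udp_loglines_consumed_total
}

// sub returns a-b, clamped at zero so a racy scrape cannot underflow.
func sub(a, b uint64) uint64 {
	if a < b {
		return 0
	}
	return a - b
}

// ParseFailures is the cumulative number of datagrams that failed to parse.
func (c UDPCounters) ParseFailures() uint64 { return sub(c.Received, c.Success) }

// BackPressureDrops is the cumulative number of parsed lines not stored.
func (c UDPCounters) BackPressureDrops() uint64 { return sub(c.Success, c.Consumed) }

func main() {
	// Two hypothetical scrapes, one interval apart.
	prev := UDPCounters{Received: 1000, Success: 990, Consumed: 990}
	cur := UDPCounters{Received: 2000, Success: 1985, Consumed: 1980}
	fmt.Println("new parse failures:", sub(cur.ParseFailures(), prev.ParseFailures()))
	fmt.Println("new back-pressure drops:", sub(cur.BackPressureDrops(), prev.BackPressureDrops()))
}
```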
|
||||
|
||||
**Prometheus scrape config:**
|
||||
|
||||
@@ -221,25 +395,22 @@ Data is served from two tiered ring buffers:
|
||||
History is lost on restart — the collector resumes tailing immediately but all ring buffers start
|
||||
empty. The fine ring fills in 1 hour; the coarse ring fills in 24 hours.
|
||||
|
||||
### Systemd unit example
|
||||
### Running under systemd
|
||||
|
||||
```ini
|
||||
[Unit]
|
||||
Description=nginx-logtail collector
|
||||
After=network.target
|
||||
The Debian package ships `nginx-logtail-collector.service` ready to run under the `_logtail`
|
||||
system user with `Group=www-data` (for log-file access). Every flag comes from
|
||||
`/etc/default/nginx-logtail`. To operate it:
|
||||
|
||||
[Service]
|
||||
ExecStart=/usr/local/bin/collector \
|
||||
--logs-file /etc/nginx-logtail/logs.conf \
|
||||
--listen :9090 \
|
||||
--source %H
|
||||
Restart=on-failure
|
||||
RestartSec=5
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
```bash
|
||||
sudo $EDITOR /etc/default/nginx-logtail # set COLLECTOR_LOGS / COLLECTOR_LOGTAIL_PORT
|
||||
sudo systemctl enable --now nginx-logtail-collector.service
|
||||
sudo systemctl status nginx-logtail-collector.service
|
||||
sudo journalctl -u nginx-logtail-collector.service -f
|
||||
```
|
||||
|
||||
If you run from source without the package, compose a unit from the packaged template at
|
||||
`debian/nginx-logtail-collector.service`.
|
||||
|
||||
---
|
||||
|
||||
## Aggregator
|
||||
@@ -326,13 +497,13 @@ the selected dimension and time window.
|
||||
**Window tabs** — switch between `1m / 5m / 15m / 60m / 6h / 24h`. Only the window changes;
|
||||
all active filters are preserved.
|
||||
|
||||
**Dimension tabs** — switch between grouping by `website / asn / prefix / status / uri`.
|
||||
**Dimension tabs** — switch between grouping by `website / asn / prefix / status / uri / source`.
|
||||
|
||||
**Drilldown** — click any table row to add that value as a filter and advance to the next
|
||||
dimension in the hierarchy:
|
||||
|
||||
```
|
||||
website → client prefix → request URI → HTTP status → ASN → website (cycles)
|
||||
website → client prefix → request URI → HTTP status → ASN → source_tag → website (cycles)
|
||||
```
|
||||
|
||||
Example: click `example.com` in the website view to see which client prefixes are hitting it;
|
||||
@@ -364,6 +535,7 @@ Supported fields and operators:
|
||||
| `prefix` | `=` | `prefix=1.2.3.0/24` |
|
||||
| `is_tor` | `=` `!=` | `is_tor=1`, `is_tor!=0` |
|
||||
| `asn` | `=` `!=` `>` `>=` `<` `<=` | `asn=8298`, `asn>=1000` |
|
||||
| `source_tag` | `=` | `source_tag=direct`, `source_tag=cdn` |
|
||||
|
||||
`is_tor=1` and `is_tor!=0` are equivalent (TOR traffic only). `is_tor=0` and `is_tor!=1` are
|
||||
equivalent (non-TOR traffic only).
|
||||
@@ -389,8 +561,9 @@ accept RE2 regular expressions. The breadcrumb strip shows them as `website~=gou
|
||||
`uri~=^/api/` with the usual `×` remove link.
|
||||
|
||||
**URL sharing** — all filter state is in the URL query string (`w`, `by`, `f_website`,
|
||||
`f_prefix`, `f_uri`, `f_status`, `f_website_re`, `f_uri_re`, `f_is_tor`, `f_asn`, `n`). Copy
|
||||
the URL to share an exact view with another operator, or bookmark a recurring query.
|
||||
`f_prefix`, `f_uri`, `f_status`, `f_website_re`, `f_uri_re`, `f_is_tor`, `f_asn`,
|
||||
`f_source_tag`, `n`). Copy the URL to share an exact view with another operator, or bookmark
|
||||
a recurring query.
|
||||
|
||||
**JSON output** — append `&raw=1` to any URL to receive the TopN result as JSON instead of
|
||||
HTML. Useful for scripting without the CLI binary:
|
||||
@@ -447,14 +620,15 @@ logtail-cli targets [flags] list targets known to the queried endpoint
|
||||
| `--uri-re` | — | Filter: RE2 regex against request URI |
|
||||
| `--is-tor` | — | Filter: `1` or `!=0` = TOR only; `0` or `!=1` = non-TOR only |
|
||||
| `--asn` | — | Filter: ASN expression (`12345`, `!=65000`, `>=1000`, `<64512`, …) |
|
||||
| `--source-tag`| — | Filter: exact `ipng_source_tag` (e.g. `direct`, `cdn`) |
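For the `--asn` expression syntax listed in the table above (`12345`, `!=65000`, `>=1000`, `<64512`), a parser could be sketched as follows. `ASNFilter`, `ParseASNExpr`, and `Match` are hypothetical names, and defaulting to `=` when no operator is given is inferred from the bare `12345` example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ASNFilter is a parsed comparison against an AS number.
type ASNFilter struct {
	Op  string // one of = != > >= < <=
	ASN uint32
}

// ParseASNExpr parses expressions such as "8298", "!=0" or ">=1000".
// A bare number is treated as an equality test.
func ParseASNExpr(expr string) (ASNFilter, error) {
	op, rest := "=", expr
	// Two-character operators are checked first so ">=" is not read as ">".
	for _, cand := range []string{"!=", ">=", "<=", "=", ">", "<"} {
		if strings.HasPrefix(expr, cand) {
			op, rest = cand, expr[len(cand):]
			break
		}
	}
	n, err := strconv.ParseUint(strings.TrimSpace(rest), 10, 32)
	if err != nil {
		return ASNFilter{}, fmt.Errorf("invalid ASN expression %q: %w", expr, err)
	}
	return ASNFilter{Op: op, ASN: uint32(n)}, nil
}

// Match reports whether asn satisfies the filter.
func (f ASNFilter) Match(asn uint32) bool {
	switch f.Op {
	case "!=":
		return asn != f.ASN
	case ">":
		return asn > f.ASN
	case ">=":
		return asn >= f.ASN
	case "<":
		return asn < f.ASN
	case "<=":
		return asn <= f.ASN
	default: // "="
		return asn == f.ASN
	}
}

func main() {
	for _, e := range []string{"8298", "!=0", ">=1000"} {
		f, err := ParseASNExpr(e)
		fmt.Println(e, f, err, f.Match(8298))
	}
}
```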
|
||||
|
||||
### `topn` flags
|
||||
|
||||
| Flag | Default | Description |
|
||||
|---------------|------------|----------------------------------------------------------|
|
||||
| `--n` | `10` | Number of entries |
|
||||
| `--window` | `5m` | `1m` `5m` `15m` `60m` `6h` `24h` |
|
||||
| `--group-by` | `website` | `website` `prefix` `uri` `status` `asn` |
|
||||
| Flag | Default | Description |
|
||||
|---------------|------------|-----------------------------------------------------------------------|
|
||||
| `--n` | `10` | Number of entries |
|
||||
| `--window` | `5m` | `1m` `5m` `15m` `60m` `6h` `24h` |
|
||||
| `--group-by` | `website` | `website` `prefix` `uri` `status` `asn` `source_tag` |
|
||||
|
||||
### `trend` flags
|
||||
|
||||
|
||||