Build and release tooling:
- Makefile with help as default; targets: build/build-amd64/build-arm64,
test, lint, proto, pkg-deb, docker, docker-push, clean, plus
install-deps (+ three sub-targets for apt / Go toolchain / Go tools).
- internal/version package; -ldflags -X injects Version/Commit/Date into
every binary. -version flag on all four binaries (nginx-logtail version
for the CLI).
- Dockerfile takes VERSION/COMMIT/DATE build-args and forwards them.
- .deb output lands in build/; .gitignore ignores /build/.
Debian package:
- debian/build-deb.sh packages all four static binaries into a single
nginx-logtail_<ver>_<arch>.deb using dpkg-deb.
- Binary layout: /usr/sbin/nginx-logtail-{collector,aggregator,frontend}
and /usr/bin/nginx-logtail.
- nginx-logtail(8) manpage.
- Three systemd units (collector, aggregator, frontend) shipped under
/lib/systemd/system/. Installed but never enabled or started — the
operator opts in per host.
- Collector runs as _logtail:www-data (log access); aggregator and
frontend as _logtail:_logtail. postinst creates the system user/group
idempotently.
- Single shared env file /etc/default/nginx-logtail rendered from a
template at first install with %HOSTNAME% substituted. Sensible
defaults for every COLLECTOR_*, AGGREGATOR_*, FRONTEND_* variable;
plus COLLECTOR_ARGS / AGGREGATOR_ARGS / FRONTEND_ARGS escape hatches
appended to ExecStart. Not a dpkg conffile: operator edits survive
upgrades and dpkg --purge removes it.
Versioned UDP wire format:
- ParseUDPLine dispatches on a leading "v<N>\t" tag; v1 routes to the
existing 12-field parser. Unknown/missing versions fail closed so
future v2 parsers can land before emitters are upgraded.
- Tests updated; design.md FR-2.2 rewritten to make the version tag
normative.
Docs:
- README.md gains a Quick Start (Debian / Docker Compose / from source).
- user-guide.md rewritten around Installation and Configuration: full
env-var table, UDP-only default explained, precise file/UDP log_format
layouts, note that operators can emit "0" for unknown $is_tor / $asn.
- Drilldown cycle, frontend filter table, and CLI --group-by list all
include source_tag. UDP counters documented in the Prometheus section.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nginx-logtail User Guide
Overview
nginx-logtail is a four-component system for real-time traffic analysis across a cluster of nginx machines. It answers questions like:
- Which client prefix is causing the most HTTP 429s right now?
- Which website is getting the most 503s over the last 24 hours?
- Which nginx machine is the busiest?
- Is there a DDoS in progress, and from where?
Components:
| Binary | Runs on | Role |
|---|---|---|
| collector | each nginx host | Tails log files and/or UDP datagrams, aggregates in memory, serves gRPC |
| aggregator | central host | Merges all collectors, serves unified gRPC |
| frontend | central host | HTTP dashboard with drilldown UI |
| cli | operator laptop | Shell queries against collector or aggregator |
Every binary accepts -version (or nginx-logtail version for the CLI) and prints its version,
git commit, and build date.
Installation
Three flavors. make help lists every target; make install-deps sets up a fresh build box
(apt deps, Go toolchain, protoc-gen-go, golangci-lint).
Debian package
```shell
make pkg-deb                 # produces nginx-logtail_<ver>_{amd64,arm64}.deb
sudo dpkg -i nginx-logtail_*_amd64.deb
```
The package installs:
| Path | Contents |
|---|---|
| /usr/sbin/nginx-logtail-{collector,aggregator,frontend} | Service binaries |
| /usr/bin/nginx-logtail | CLI |
| /lib/systemd/system/nginx-logtail-*.service | Three systemd units |
| /usr/share/man/man8/nginx-logtail.8.gz | Manpage (man 8 nginx-logtail) |
| /usr/share/nginx-logtail/default.template | Defaults template |
| /etc/default/nginx-logtail | Generated on first install from the template |
The postinst creates a system user/group _logtail if absent and renders the template into
/etc/default/nginx-logtail with the short hostname substituted. None of the services are
enabled or started automatically — installing the package is safe on any host. Operators
opt in per service:
```shell
sudo systemctl enable --now nginx-logtail-collector.service    # on each nginx host
sudo systemctl enable --now nginx-logtail-aggregator.service   # on the central host
sudo systemctl enable --now nginx-logtail-frontend.service     # on the central host
```
The collector runs as _logtail:www-data so it can read nginx access logs that are
group-readable by www-data; aggregator and frontend run as _logtail:_logtail.
Docker / Docker Compose
The repo's docker-compose.yml runs the aggregator and frontend together from a single image
that contains all four binaries.
```shell
make docker        # builds git.ipng.ch/ipng/nginx-logtail:v<ver> + :latest, native arch
make docker-push   # multi-arch (amd64+arm64) buildx push

AGGREGATOR_COLLECTORS=nginx1:9090,nginx2:9090 docker compose up -d
# frontend on :8080, aggregator gRPC on :9091
```
Each container explicitly selects its binary via command: ["/usr/local/bin/<binary>"].
From source
```shell
git clone https://git.ipng.ch/ipng/nginx-logtail
cd nginx-logtail
make build    # -> build/<arch>/{collector,aggregator,frontend,cli}
make test
./build/*/cli version
```
Requires Go ≥ 1.24 (see go.mod). No CGO, no external runtime dependencies.
Configuration
/etc/default/nginx-logtail
The Debian package ships one shared environment file read by all three systemd units via
EnvironmentFile=-/etc/default/nginx-logtail. It enumerates every flag the three daemons
accept as a COLLECTOR_*, AGGREGATOR_*, or FRONTEND_* env var. Defaults on first install
are sensible for a single-host deployment:
| Variable | First-install default | Purpose |
|---|---|---|
| COLLECTOR_LISTEN | :9090 | gRPC listen address |
| COLLECTOR_PROM_LISTEN | :9100 | Prometheus metrics; set "" to disable |
| COLLECTOR_LOGS | (empty — UDP-only) | Comma-sep log paths/globs |
| COLLECTOR_LOGS_FILE | (empty) | File with one path/glob per line |
| COLLECTOR_SOURCE | $(hostname -s) at install | Display name in query responses |
| COLLECTOR_V4PREFIX | 24 | IPv4 bucket prefix |
| COLLECTOR_V6PREFIX | 48 | IPv6 bucket prefix |
| COLLECTOR_SCAN_INTERVAL | 10s | Log-glob rescan cadence |
| COLLECTOR_LOGTAIL_PORT | 9514 | UDP port for ipng_stats_logtail (0 disables) |
| COLLECTOR_LOGTAIL_BIND | 127.0.0.1 | UDP bind address |
| AGGREGATOR_LISTEN | :9091 | gRPC listen address |
| AGGREGATOR_COLLECTORS | localhost:9090 | Comma-sep collectors (mandatory) |
| AGGREGATOR_SOURCE | $(hostname -s) at install | Display name |
| FRONTEND_LISTEN | :8080 | HTTP dashboard address |
| FRONTEND_TARGET | localhost:9091 | Default gRPC endpoint |
| FRONTEND_N | 25 | Default table row count |
| FRONTEND_REFRESH | 30 | Meta-refresh seconds; 0 disables |
At least one of COLLECTOR_LOGS, COLLECTOR_LOGS_FILE, or COLLECTOR_LOGTAIL_PORT > 0 must
be set, otherwise the collector refuses to start. The shipped default (COLLECTOR_LOGS= empty
plus COLLECTOR_LOGTAIL_PORT=9514) makes the collector UDP-only — no file tailer goroutine
is launched when no log patterns are supplied.
Three escape-hatch variables — COLLECTOR_ARGS, AGGREGATOR_ARGS, FRONTEND_ARGS — are
appended verbatim to each unit's ExecStart argv. Use them for flags without an env-var form,
or for temporary overrides, without editing the unit.
The file is not a dpkg conffile: postinst writes it only when absent, so operator edits
survive upgrades, and dpkg --purge removes it.
nginx — file-based ingest
Add the logtail format and attach it to whichever server blocks you want tracked:
```nginx
http {
    log_format logtail '$host\t$remote_addr\t$msec\t$request_method\t$request_uri\t$status\t$body_bytes_sent\t$request_time\t$is_tor\t$asn';

    server {
        access_log /var/log/nginx/access.log logtail;
        # or per-vhost:
        access_log /var/log/nginx/www.example.com.access.log logtail;
    }
}
```
Tab-separated, fixed field order, ten fields. The precise layout:
| # | Field | Ingested into |
|---|---|---|
| 0 | $host | website |
| 1 | $remote_addr | client_prefix (truncated) |
| 2 | $msec | (discarded) |
| 3 | $request_method | Prom method label |
| 4 | $request_uri | http_request_uri (query stripped) |
| 5 | $status | http_response |
| 6 | $body_bytes_sent | Prom body histogram |
| 7 | $request_time | Prom duration histogram |
| 8 | $is_tor | is_tor (optional) |
| 9 | $asn | asn (optional) |
$is_tor is 1 if the client IP is a TOR exit node and 0 otherwise (typically populated
via a Lua script or $geoip2_data_*). $asn is the client AS number as a decimal integer
(e.g. MaxMind GeoIP2's $geoip2_data_autonomous_system_number).
If either is unknown, emit 0. A literal 0 in $is_tor parses as false; a literal
0 in $asn parses as ASN 0, which you can exclude at query time with --asn '!=0' / the
asn!=0 filter expression. Operators who don't have TOR or GeoIP data can simply emit 0 for
both columns and everything works.
Both fields are also positionally optional for backward compatibility — older 8-field
lines are accepted and default to false / 0. Records from the file tailer are always
tagged source_tag="direct".
Then point the collector at the log files via COLLECTOR_LOGS — comma-separated paths or
glob patterns. Make sure the files are group-readable by www-data (the collector's primary
group in the systemd unit).
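As a concrete illustration of the field order and the optional trailing columns, here is a minimal sketch of a line parser in Go. The Record type, the parseFileLine name, and the error handling are illustrative assumptions, not the collector's actual code:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Record holds the fields kept from one log line (illustrative names).
type Record struct {
	Website string
	Client  string
	Method  string
	URI     string
	Status  int
	IsTor   bool
	ASN     uint32
}

// parseFileLine parses the ten-field logtail format. The trailing
// $is_tor and $asn columns are positionally optional: eight-field
// lines from older configs default to false / ASN 0.
func parseFileLine(line string) (Record, error) {
	f := strings.Split(line, "\t")
	if len(f) != 8 && len(f) != 10 {
		return Record{}, fmt.Errorf("want 8 or 10 fields, got %d", len(f))
	}
	status, err := strconv.Atoi(f[5])
	if err != nil {
		return Record{}, fmt.Errorf("bad status %q: %v", f[5], err)
	}
	uri, _, _ := strings.Cut(f[4], "?") // query string is stripped
	r := Record{Website: f[0], Client: f[1], Method: f[3], URI: uri, Status: status}
	if len(f) == 10 {
		r.IsTor = f[8] == "1" // "0" (the unknown convention) parses as false
		asn, err := strconv.ParseUint(f[9], 10, 32)
		if err != nil {
			return Record{}, fmt.Errorf("bad asn %q: %v", f[9], err)
		}
		r.ASN = uint32(asn)
	}
	return r, nil
}

func main() {
	r, err := parseFileLine("example.com\t192.0.2.7\t1700000000.123\tGET\t/api/x?q=1\t429\t512\t0.034\t0\t8298")
	fmt.Println(r, err)
}
```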
nginx — UDP ingest (nginx-ipng-stats-plugin)
If the nginx host runs nginx-ipng-stats-plugin,
the plugin's ipng_stats_logtail directive emits one UDP datagram per request directly to
the collector, no log file involved. The wire format is versioned — every datagram starts
with a literal v1\t prefix so the collector can ship new parser versions (v2, v3, …) before
emitters are upgraded and route each packet accordingly.
```nginx
http {
    log_format ipng_stats_logtail
        'v1\t$host\t$remote_addr\t$request_method\t$request_uri\t$status\t$body_bytes_sent\t$request_time\t$is_tor\t$asn\t$ipng_source_tag\t$server_addr\t$scheme';

    ipng_stats_logtail ipng_stats_logtail udp://127.0.0.1:9514 buffer=64k flush=1s;
}
```
Precise v1 layout — 13 tab-separated fields total (version prefix + 12 payload fields):
| # | Field | Ingested into |
|---|---|---|
| 0 | v1 | version tag |
| 1 | $host | website |
| 2 | $remote_addr | client_prefix (truncated) |
| 3 | $request_method | Prom method label |
| 4 | $request_uri | http_request_uri (query stripped) |
| 5 | $status | http_response |
| 6 | $body_bytes_sent | Prom body histogram |
| 7 | $request_time | Prom duration histogram |
| 8 | $is_tor | is_tor |
| 9 | $asn | asn |
| 10 | $ipng_source_tag | source_tag |
| 11 | $server_addr | (parsed and discarded) |
| 12 | $scheme | (parsed and discarded) |
Compared to the file format: the version tag is added, $msec is dropped, and three fields
are appended — $ipng_source_tag (propagated into the data model), $server_addr and
$scheme (reserved for future use).
Unknown $is_tor / $asn: emit 0. Same convention as the file format — operators
without TOR or GeoIP data can emit 0 for both columns and everything works. A literal 0
in $is_tor is false; a literal 0 in $asn is ASN 0, filterable at query time.
All 13 fields are required for v1 — malformed packets (wrong version, wrong field count, bad
IP) are silently dropped and counted via logtail_udp_packets_received_total minus
logtail_udp_loglines_success_total. Both paths (file + UDP) can feed the same collector
simultaneously; they converge on the same aggregation pipeline.
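The version dispatch can be sketched as follows. The parseUDPLine, parseV1, and UDPRecord names are hypothetical, but the fail-closed behavior on unknown or missing version tags mirrors the description above:

```go
package main

import (
	"fmt"
	"strings"
)

// UDPRecord is an illustrative stand-in for the parsed datagram.
type UDPRecord struct {
	Website, Prefix, Method, URI, Status, SourceTag string
}

// parseUDPLine splits off the leading "v<N>\t" tag and dispatches on
// it. Only v1 is known; anything else fails closed, so a future v2
// parser can ship before emitters are upgraded.
func parseUDPLine(pkt string) (UDPRecord, error) {
	version, rest, ok := strings.Cut(pkt, "\t")
	if !ok {
		return UDPRecord{}, fmt.Errorf("missing version tag")
	}
	switch version {
	case "v1":
		return parseV1(rest)
	default:
		return UDPRecord{}, fmt.Errorf("unknown wire version %q", version)
	}
}

func parseV1(payload string) (UDPRecord, error) {
	f := strings.Split(payload, "\t")
	if len(f) != 12 { // $host .. $scheme: all 12 payload fields required
		return UDPRecord{}, fmt.Errorf("v1: want 12 fields, got %d", len(f))
	}
	// f[10] ($server_addr) and f[11] ($scheme) are parsed and discarded.
	return UDPRecord{
		Website: f[0], Prefix: f[1], Method: f[2], URI: f[3],
		Status: f[4], SourceTag: f[9],
	}, nil
}

func main() {
	rec, err := parseUDPLine("v1\texample.com\t192.0.2.7\tGET\t/\t200\t512\t0.01\t0\t8298\tcdn\t203.0.113.1\thttps")
	fmt.Println(rec, err)
}
```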
Collector
Runs on each nginx machine. Ingests logs from files (via fsnotify) and/or UDP datagrams
(from nginx-ipng-stats-plugin), maintains in-memory top-K counters across six time
windows, and exposes a gRPC interface for the aggregator (and directly for the CLI).
Flags
| Flag | Default | Description |
|---|---|---|
| --listen | :9090 | gRPC listen address |
| --prom-listen | :9100 | Prometheus metrics address; empty string to disable |
| --logs | — | Comma-separated log file paths or glob patterns |
| --logs-file | — | File containing one log path/glob per line |
| --source | hostname | Name for this collector in query responses |
| --v4prefix | 24 | IPv4 prefix length for client bucketing |
| --v6prefix | 48 | IPv6 prefix length for client bucketing |
| --scan-interval | 10s | How often to rescan glob patterns for new/removed files |
| --logtail-port | 0 (off) | UDP port receiving ipng_stats_logtail datagrams |
| --logtail-bind | 127.0.0.1 | UDP bind address |
| --version | — | Print version, commit, build date and exit |
At least one of --logs, --logs-file, or --logtail-port > 0 is required; otherwise the
collector refuses to start.
Examples
```shell
# UDP-only (nginx-ipng-stats-plugin feed)
./collector --logtail-port 9514

# Single file
./collector --logs /var/log/nginx/access.log

# Multiple files via glob (one inotify instance regardless of count)
./collector --logs "/var/log/nginx/*/access.log"

# Files and UDP at the same time
./collector --logs "/var/log/nginx/*.log" --logtail-port 9514

# Many files via a config file
./collector --logs-file /etc/nginx-logtail/logs.conf

# Custom prefix lengths and listen address
./collector \
    --logs "/var/log/nginx/*.log" \
    --listen :9091 \
    --source nginx3.prod \
    --v4prefix 24 \
    --v6prefix 48
```
logs-file format
One path or glob pattern per line. Lines starting with # are ignored.
```
# /etc/nginx-logtail/logs.conf
/var/log/nginx/access.log
/var/log/nginx/*/access.log
/var/log/nginx/api.example.com.access.log
```
Log rotation
The collector handles logrotate automatically. On RENAME/REMOVE events it drains the old file
descriptor to EOF (so no lines are lost), then retries opening the original path with backoff until
the new file appears. No restart or SIGHUP required.
Prometheus metrics
The collector exposes a Prometheus-compatible /metrics endpoint on --prom-listen (default
:9100). Set --prom-listen "" to disable it entirely.
Per-host series:
- nginx_http_requests_total{host, method, status} — counter. Map capped at 250 000 distinct label sets; new entries beyond the cap are dropped until the map is rolled over.
- nginx_http_response_body_bytes_{bucket,count,sum}{host, le} — histogram of $body_bytes_sent. Buckets (bytes): 256, 1024, 4096, 16384, 65536, 262144, 1048576, +Inf.
- nginx_http_request_duration_seconds_{bucket,count,sum}{host, le} — histogram of $request_time. Buckets (seconds): 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, +Inf. Not split by source_tag (the duration histogram stays per-host to avoid cardinality blow-up).
Per-source_tag roll-ups (parallel series, not a cross-product with host):
- nginx_http_requests_by_source_total{source_tag} — counter.
- nginx_http_response_body_bytes_by_source_{bucket,count,sum}{source_tag, le} — histogram.
UDP ingest counters — lets operators distinguish parse failures from back-pressure drops:
- logtail_udp_packets_received_total — datagrams read off the socket.
- logtail_udp_loglines_success_total — parsed OK.
- logtail_udp_loglines_consumed_total — forwarded to the store (not dropped).
received - success is the parse-failure rate; success - consumed is the back-pressure
drop rate. Alert on either being non-zero.
Prometheus scrape config:
```yaml
scrape_configs:
  - job_name: nginx_logtail
    static_configs:
      - targets:
          - nginx1:9100
          - nginx2:9100
          - nginx3:9100
```
Or with service discovery — the collector has no special requirements beyond a reachable TCP port.
Example queries:
```promql
# Request rate per host over last 5 minutes
rate(nginx_http_requests_total[5m])

# 5xx error rate fraction per host
sum by (host) (rate(nginx_http_requests_total{status=~"5.."}[5m]))
  /
sum by (host) (rate(nginx_http_requests_total[5m]))

# 95th percentile response time per host
histogram_quantile(0.95,
  sum by (host, le) (rate(nginx_http_request_duration_seconds_bucket[5m]))
)

# Median response body size per host
histogram_quantile(0.50,
  sum by (host, le) (rate(nginx_http_response_body_bytes_bucket[5m]))
)
```
Memory usage
The collector is designed to stay well under 1 GB:
| Structure | Max entries | Approx size |
|---|---|---|
| Live map (current minute) | 100 000 | ~19 MB |
| Fine ring (60 × 1-min) | 60 × 50 000 | ~558 MB |
| Coarse ring (288 × 5-min) | 288 × 5 000 | ~268 MB |
| Total | | ~845 MB |
When the live map reaches 100 000 distinct 6-tuples, new keys are dropped for the rest of that minute. Existing keys continue to accumulate counts. The cap resets at each minute rotation.
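A sketch of the cap semantics, assuming a plain map guarded elsewhere by the collector's locking; cappedMap and its methods are illustrative names, not the actual store:

```go
package main

import "fmt"

// cappedMap: once maxKeys distinct keys exist, new keys are dropped
// for the remainder of the minute while existing keys keep counting;
// Rotate resets the cap at the minute boundary.
type cappedMap struct {
	maxKeys int
	counts  map[string]uint64
}

func newCappedMap(maxKeys int) *cappedMap {
	return &cappedMap{maxKeys: maxKeys, counts: make(map[string]uint64)}
}

// Inc returns false when a brand-new key is dropped at capacity.
func (c *cappedMap) Inc(key string) bool {
	if _, seen := c.counts[key]; !seen && len(c.counts) >= c.maxKeys {
		return false
	}
	c.counts[key]++
	return true
}

// Rotate hands the finished minute to the fine ring and starts fresh.
func (c *cappedMap) Rotate() map[string]uint64 {
	done := c.counts
	c.counts = make(map[string]uint64)
	return done
}

func main() {
	m := newCappedMap(2)
	m.Inc("a")
	m.Inc("b")
	fmt.Println(m.Inc("c"), m.Inc("a")) // new key dropped, existing key still counts
}
```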
Time windows
Data is served from two tiered ring buffers:
| Window | Source ring | Resolution |
|---|---|---|
| 1 min | fine | 1 minute |
| 5 min | fine | 1 minute |
| 15 min | fine | 1 minute |
| 60 min | fine | 1 minute |
| 6 h | coarse | 5 minutes |
| 24 h | coarse | 5 minutes |
History is lost on restart — the collector resumes tailing immediately but all ring buffers start empty. The fine ring fills in 1 hour; the coarse ring fills in 24 hours.
Running under systemd
The Debian package ships nginx-logtail-collector.service ready to run under the _logtail
system user with Group=www-data (for log-file access). Every flag comes from
/etc/default/nginx-logtail. To operate it:
```shell
sudo $EDITOR /etc/default/nginx-logtail   # set COLLECTOR_LOGS / COLLECTOR_LOGTAIL_PORT
sudo systemctl enable --now nginx-logtail-collector.service
sudo systemctl status nginx-logtail-collector.service
sudo journalctl -u nginx-logtail-collector.service -f
```
If you run from source without the package, compose a unit from the packaged template at
debian/nginx-logtail-collector.service.
Aggregator
Runs on a central machine. Subscribes to the StreamSnapshots push stream from every configured
collector, merges their snapshots into a unified in-memory cache, and serves the same gRPC
interface as the collector. The frontend and CLI query the aggregator exactly as they would query
a single collector.
Flags
| Flag | Default | Description |
|---|---|---|
| --listen | :9091 | gRPC listen address |
| --collectors | — | Comma-separated host:port addresses of collectors |
| --source | hostname | Name for this aggregator in query responses |
--collectors is required; the aggregator exits immediately if it is not set.
Example
```shell
./aggregator \
    --collectors nginx1:9090,nginx2:9090,nginx3:9090 \
    --listen :9091 \
    --source agg-prod
```
Fault tolerance
The aggregator reconnects to each collector independently with exponential backoff (100 ms → doubles → cap 30 s). After 3 consecutive failures to a collector it marks that collector degraded: its last-known contribution is subtracted from the merged view so stale counts do not accumulate. When the collector recovers and sends a new snapshot, it is automatically reintegrated. The remaining collectors continue serving queries throughout.
Memory
The aggregator's merged cache uses the same tiered ring-buffer structure as the collector (60 × 1-min fine, 288 × 5-min coarse) but holds at most top-50 000 entries per fine bucket and top-5 000 per coarse bucket across all collectors combined. Memory footprint is roughly the same as one collector (~845 MB worst case).
Systemd unit example
```ini
[Unit]
Description=nginx-logtail aggregator
After=network.target

[Service]
ExecStart=/usr/local/bin/aggregator \
    --collectors nginx1:9090,nginx2:9090,nginx3:9090 \
    --listen :9091 \
    --source %H
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```
Frontend
HTTP dashboard. Connects to the aggregator (or directly to a single collector for debugging). Zero JavaScript — server-rendered HTML with inline SVG sparklines.
Flags
| Flag | Default | Description |
|---|---|---|
| --listen | :8080 | HTTP listen address |
| --target | localhost:9091 | Default gRPC endpoint (aggregator or collector) |
| --n | 25 | Default number of table rows |
| --refresh | 30 | Auto-refresh interval in seconds; 0 to disable |
Usage
Navigate to http://your-host:8080. The dashboard shows a ranked table of the top entries for
the selected dimension and time window.
Window tabs — switch between 1m / 5m / 15m / 60m / 6h / 24h. Only the window changes;
all active filters are preserved.
Dimension tabs — switch between grouping by website / asn / prefix / status / uri / source.
Drilldown — click any table row to add that value as a filter and advance to the next dimension in the hierarchy:
website → client prefix → request URI → HTTP status → ASN → source_tag → website (cycles)
Example: click example.com in the website view to see which client prefixes are hitting it;
click a prefix there to see which URIs it is requesting; and so on.
Breadcrumb strip — shows all active filters above the table. Click × next to any token
to remove just that filter, keeping the others.
Sparkline — inline SVG trend chart showing total request count per time bucket for the current filter state. Useful for spotting sudden spikes or sustained DDoS ramps.
Filter expression box — a text input above the table accepts a mini filter language that lets you type expressions directly without editing the URL:
```
status>=400
status>=400 AND website~=gouda.*
status>=400 AND website~=gouda.* AND uri~="^/api/"
website=example.com AND prefix=1.2.3.0/24
```
Supported fields and operators:
| Field | Operators | Example |
|---|---|---|
| status | = != > >= < <= | status>=400 |
| website | = ~= | website~=gouda.* |
| uri | = ~= | uri~=^/api/ |
| prefix | = | prefix=1.2.3.0/24 |
| is_tor | = != | is_tor=1, is_tor!=0 |
| asn | = != > >= < <= | asn=8298, asn>=1000 |
| source_tag | = | source_tag=direct, source_tag=cdn |
is_tor=1 and is_tor!=0 are equivalent (TOR traffic only). is_tor=0 and is_tor!=1 are
equivalent (non-TOR traffic only).
asn accepts the same comparison expressions as status. Use asn=8298 to match a single AS,
asn>=64512 to match the private-use ASN range, or asn!=0 to exclude unresolved entries.
~= means RE2 regex match. Values with spaces or quotes may be wrapped in double or single
quotes: uri~="^/search\?q=".
The box pre-fills with the current active filter (including filters set by drilldown clicks),
so you can see and extend what is applied. Submitting redirects to a clean URL with the
individual filter params; × clear removes all filters at once.
On a parse error the page re-renders with the error shown below the input and the current data and filters unchanged.
Status expressions — the f_status URL param (and status in the expression box) accepts
comparison expressions: 200, !=200, >=400, <500, etc.
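The comparison grammar can be sketched as a small matcher. The statusMatcher name is hypothetical, but the operator set matches the filter table above and applies to asn as well:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// statusMatcher compiles a status expression ("200", "!=200", ">=400",
// "<500", ...) into a predicate over HTTP status codes.
func statusMatcher(expr string) (func(int) bool, error) {
	op := "="
	// Check two-character operators before their one-character prefixes.
	for _, o := range []string{">=", "<=", "!=", ">", "<", "="} {
		if strings.HasPrefix(expr, o) {
			op, expr = o, expr[len(o):]
			break
		}
	}
	n, err := strconv.Atoi(expr)
	if err != nil {
		return nil, fmt.Errorf("bad status value %q", expr)
	}
	switch op {
	case "=":
		return func(s int) bool { return s == n }, nil
	case "!=":
		return func(s int) bool { return s != n }, nil
	case ">":
		return func(s int) bool { return s > n }, nil
	case ">=":
		return func(s int) bool { return s >= n }, nil
	case "<":
		return func(s int) bool { return s < n }, nil
	default: // "<="
		return func(s int) bool { return s <= n }, nil
	}
}

func main() {
	m, _ := statusMatcher(">=400")
	fmt.Println(m(503), m(200))
}
```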
Regex filters — f_website_re and f_uri_re URL params (and ~= in the expression box)
accept RE2 regular expressions. The breadcrumb strip shows them as website~=gouda.* and
uri~=^/api/ with the usual × remove link.
URL sharing — all filter state is in the URL query string (w, by, f_website,
f_prefix, f_uri, f_status, f_website_re, f_uri_re, f_is_tor, f_asn,
f_source_tag, n). Copy the URL to share an exact view with another operator, or bookmark
a recurring query.
JSON output — append &raw=1 to any URL to receive the TopN result as JSON instead of
HTML. Useful for scripting without the CLI binary:
```shell
# All 429s by prefix
curl -s 'http://frontend:8080/?f_status=429&by=prefix&w=1m&raw=1' | jq '.entries[0]'

# All errors (>=400) on gouda hosts
curl -s 'http://frontend:8080/?f_status=%3E%3D400&f_website_re=gouda.*&by=uri&w=5m&raw=1'
```
Target override — append ?target=host:port to point the frontend at a different gRPC
endpoint for that request (useful for comparing a single collector against the aggregator):
http://frontend:8080/?target=nginx3:9090&w=5m
Source picker — when the frontend is pointed at an aggregator, a source: tab row appears
below the dimension tabs listing each individual collector alongside an all tab (the default
merged view). Clicking a collector tab switches the frontend to query that collector directly for
the current request, letting you answer "which nginx machine is the busiest?" without leaving the
dashboard. The picker is hidden when querying a collector directly (it has no sub-sources to list).
CLI
A shell companion for one-off queries and debugging. Works with any LogtailService endpoint —
collector or aggregator. Accepts multiple targets, fans out concurrently, and labels each result.
Default output is a human-readable table; add --json for machine-readable NDJSON.
Subcommands
```
logtail-cli topn [flags]      ranked label → count table
logtail-cli trend [flags]     per-bucket time series
logtail-cli stream [flags]    live snapshot feed (runs until Ctrl-C)
logtail-cli targets [flags]   list targets known to the queried endpoint
```
Shared flags (all subcommands)
| Flag | Default | Description |
|---|---|---|
| --target | localhost:9090 | Comma-separated host:port list; queries fan out to all |
| --json | false | Emit newline-delimited JSON instead of a table |
| --website | — | Filter to this website |
| --prefix | — | Filter to this client prefix |
| --uri | — | Filter to this request URI |
| --status | — | Filter: HTTP status expression (200, !=200, >=400, <500, …) |
| --website-re | — | Filter: RE2 regex against website |
| --uri-re | — | Filter: RE2 regex against request URI |
| --is-tor | — | Filter: 1 or !=0 = TOR only; 0 or !=1 = non-TOR only |
| --asn | — | Filter: ASN expression (12345, !=65000, >=1000, <64512, …) |
| --source-tag | — | Filter: exact ipng_source_tag (e.g. direct, cdn) |
topn flags
| Flag | Default | Description |
|---|---|---|
| --n | 10 | Number of entries |
| --window | 5m | 1m 5m 15m 60m 6h 24h |
| --group-by | website | website prefix uri status asn source_tag |
trend flags
| Flag | Default | Description |
|---|---|---|
| --window | 5m | 1m 5m 15m 60m 6h 24h |
Output format
Table (default — single target, no header):
```
RANK  COUNT   LABEL
   1  18 432  example.com
   2   4 211  other.com
```

Multi-target — each target gets a labeled section:

```
=== col-1 (nginx1:9090) ===
RANK  COUNT   LABEL
   1  10 000  example.com

=== agg-prod (agg:9091) ===
RANK  COUNT   LABEL
   1  18 432  example.com
```

JSON (--json) — a single JSON array with one object per target, suitable for jq:

```
[{"source":"agg-prod","target":"agg:9091","entries":[{"label":"example.com","count":18432},...]}]
```

stream JSON — one object per snapshot received (NDJSON), runs until interrupted:

```
{"ts":1773516180,"source":"col-1","target":"nginx1:9090","total_entries":823,"top_label":"example.com","top_count":10000}
```
targets subcommand
Lists the targets (collectors) known to the queried endpoint. When querying an aggregator, returns
all configured collectors with their display names and addresses. When querying a collector,
returns the collector itself (address shown as (self)).
```shell
# List collectors behind the aggregator
logtail-cli targets --target agg:9091

# Machine-readable output
logtail-cli targets --target agg:9091 --json
```

Table output example:

```
nginx1.prod   nginx1:9090
nginx2.prod   nginx2:9090
nginx3.prod   (self)
```

JSON output (--json) — one object per target:

```
{"query_target":"agg:9091","name":"nginx1.prod","addr":"nginx1:9090"}
```
Examples
```shell
# Top 20 client prefixes sending 429s right now
logtail-cli topn --target agg:9091 --window 1m --group-by prefix --status 429 --n 20

# Same query, pipe to jq for scripting
logtail-cli topn --target agg:9091 --window 1m --group-by prefix --status 429 --n 20 \
    --json | jq '.[0].entries[0]'

# Which website has the most errors (4xx or 5xx) over the last 24h?
logtail-cli topn --target agg:9091 --window 24h --group-by website --status '>=400'

# Which client prefixes are NOT getting 200s? (anything non-success)
logtail-cli topn --target agg:9091 --window 5m --group-by prefix --status '!=200'

# Drill: top URIs on one website over the last 60 minutes
logtail-cli topn --target agg:9091 --window 60m --group-by uri --website api.example.com

# Filter by website regex: all gouda hosts
logtail-cli topn --target agg:9091 --window 5m --website-re 'gouda.*'

# Filter by URI regex: all /api/ paths
logtail-cli topn --target agg:9091 --window 5m --group-by uri --uri-re '^/api/'

# Show only TOR traffic — which websites are TOR clients hitting?
logtail-cli topn --target agg:9091 --window 5m --is-tor 1

# Show non-TOR traffic only — exclude exit nodes from the view
logtail-cli topn --target agg:9091 --window 5m --is-tor 0

# Top ASNs by request count over the last 5 minutes
logtail-cli topn --target agg:9091 --window 5m --group-by asn

# Which ASNs are generating the most 429s?
logtail-cli topn --target agg:9091 --window 5m --group-by asn --status 429

# Filter to traffic from a specific ASN
logtail-cli topn --target agg:9091 --window 5m --asn 8298

# Filter to traffic from private-use / unallocated ASNs
logtail-cli topn --target agg:9091 --window 5m --group-by prefix --asn '>=64512'

# Exclude unresolved entries (ASN 0) and show top source ASNs
logtail-cli topn --target agg:9091 --window 5m --group-by asn --asn '!=0'

# Compare two collectors side by side in one command
logtail-cli topn --target nginx1:9090,nginx2:9090 --window 5m

# Query both a collector and the aggregator at once
logtail-cli topn --target nginx3:9090,agg:9091 --window 5m --group-by prefix

# Trend of total traffic over 6h (for a quick sparkline in the terminal)
logtail-cli trend --target agg:9091 --window 6h --json | jq '.[0].points | [.[].count]'

# Watch live merged snapshots from the aggregator
logtail-cli stream --target agg:9091

# Watch two collectors simultaneously; each snapshot is labeled by source
logtail-cli stream --target nginx1:9090,nginx2:9090
```
The stream subcommand reconnects automatically after errors (5 s backoff) and runs until
interrupted with Ctrl-C. The topn and trend subcommands exit immediately after one response.
Operational notes
No persistence. All data is in-memory. A collector restart loses ring buffer history but resumes tailing the log file from the current position immediately.
No TLS. Designed for trusted internal networks. If you need encryption in transit, put a TLS-terminating proxy (e.g. stunnel, nginx stream) in front of the gRPC port.
inotify limits. The collector uses a single inotify instance regardless of how many files it
tails. If you tail files across many different directories, check
/proc/sys/fs/inotify/max_user_watches (default 8192); increase it if needed:
```shell
echo 65536 | sudo tee /proc/sys/fs/inotify/max_user_watches
```
High-cardinality attacks. If a DDoS sends traffic from thousands of unique /24 prefixes with unique URIs, the live map will hit its 100 000 entry cap and drop new keys for the rest of that minute. The top-K entries already tracked continue accumulating counts. This is by design — the cap prevents memory exhaustion under attack conditions.
Clock skew. Trend sparklines are based on the collector's local clock. If collectors have significant clock skew, trend buckets from different collectors may not align precisely in the aggregator. NTP sync is recommended.