PRE-RELEASE 0.9.1: Makefile, Debian packaging, versioned UDP

Build and release tooling:
- Makefile with help as default; targets: build/build-amd64/build-arm64,
  test, lint, proto, pkg-deb, docker, docker-push, clean, plus
  install-deps (+ three sub-targets for apt / Go toolchain / Go tools).
- internal/version package; -ldflags -X injects Version/Commit/Date into
  every binary. -version flag on all four binaries (nginx-logtail version
  for the CLI).
- Dockerfile takes VERSION/COMMIT/DATE build-args and forwards them.
- .deb output lands in build/; .gitignore ignores /build/.

Debian package:
- debian/build-deb.sh packages all four static binaries into a single
  nginx-logtail_<ver>_<arch>.deb using dpkg-deb.
- Binary layout: /usr/sbin/nginx-logtail-{collector,aggregator,frontend}
  and /usr/bin/nginx-logtail.
- nginx-logtail(8) manpage.
- Three systemd units (collector, aggregator, frontend) shipped under
  /lib/systemd/system/. Installed but never enabled or started — the
  operator opts in per host.
- Collector runs as _logtail:www-data (log access); aggregator and
  frontend as _logtail:_logtail. postinst creates the system user/group
  idempotently.
- Single shared env file /etc/default/nginx-logtail rendered from a
  template at first install with %HOSTNAME% substituted. Sensible
  defaults for every COLLECTOR_*, AGGREGATOR_*, FRONTEND_* variable;
  plus COLLECTOR_ARGS / AGGREGATOR_ARGS / FRONTEND_ARGS escape hatches
  appended to ExecStart. Not a dpkg conffile: operator edits survive
  upgrades and dpkg --purge removes it.

Versioned UDP wire format:
- ParseUDPLine dispatches on a leading "v<N>\t" tag; v1 routes to the
  existing 12-field parser. Unknown/missing versions fail closed so
  future v2 parsers can land before emitters are upgraded.
- Tests updated; design.md FR-2.2 rewritten to make the version tag
  normative.

Docs:
- README.md gains a Quick Start (Debian / Docker Compose / from source).
- user-guide.md rewritten around Installation and Configuration: full
  env-var table, UDP-only default explained, precise file/UDP log_format
  layouts, note that operators can emit "0" for unknown \$is_tor / \$asn.
- Drilldown cycle, frontend filter table, and CLI --group-by list all
  include source_tag. UDP counters documented in the Prometheus section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 10:35:08 +02:00
parent 577ed3dad5
commit 143aad9063
23 changed files with 1214 additions and 114 deletions

@@ -13,17 +13,63 @@ You have been warned :)
## What is this?
This project consists of four components:
1. A log collector that tails NGINX (or Apache) logs of a certain format, and aggregates
information per website, client address, status, and so on. It buckets these into windows
of 1min, 5min, 15min, 60min, 6hrs and 24hrs. It exposes this on a gRPC endpoint.
1. An aggregator that can scrape any number of collectors into a merged regional (or global)
view. The aggregator exposes the same gRPC endpoint as the collectors.
1. A Frontend that allows to query this data structure very quickly.
1. A CLI that allows to query this data also, returning JSON for further processing.
1. A log **collector** that tails NGINX (or Apache) logs and/or receives logs over UDP from
[`nginx-ipng-stats-plugin`](https://git.ipng.ch/ipng/nginx-ipng-stats-plugin), aggregating
counts per website, client address, URI, status, ASN, and source tag. It buckets these into
windows of 1m, 5m, 15m, 60m, 6h, and 24h and exposes them over gRPC.
1. An **aggregator** that subscribes to any number of collectors and serves a merged view on
the same gRPC surface.
1. An HTTP **frontend** that renders a drilldown dashboard (zero JavaScript, server-side SVG
sparklines) against any collector or the aggregator.
1. A **CLI** for shell queries, returning tables or JSON.
It's written in Go, and is meant to deploy collectors on any number of webservers, and central
aggregation and frontend logic. It's released under [[APACHE](LICENSE)] license. It can be run
either as `systemd` units, or in Docker, or any combination of the two.
Written in Go, released under [[APACHE](LICENSE)]. Runs as `systemd` units, in Docker, or any
combination.
## Quick Start
Three deployment flavors. Pick whichever suits the host.
**Debian package.** Build once, install the `.deb` on every nginx host (for the collector) and
on one central host (for the aggregator + frontend):
```bash
make install-deps # one-time: apt deps, Go toolchain, go tools
make pkg-deb # produces nginx-logtail_<ver>_{amd64,arm64}.deb
# on each nginx host:
sudo dpkg -i nginx-logtail_*_amd64.deb
sudo $EDITOR /etc/default/nginx-logtail # defaults to UDP-only on :9514; set COLLECTOR_LOGS=... to also tail files
sudo systemctl enable --now nginx-logtail-collector.service
# on the central host:
sudo dpkg -i nginx-logtail_*_amd64.deb
sudo systemctl enable --now nginx-logtail-aggregator.service nginx-logtail-frontend.service
# dashboard now at http://<central>:8080
```
Binaries land at `/usr/sbin/nginx-logtail-{collector,aggregator,frontend}` and the CLI at
`/usr/bin/nginx-logtail`. All three services run as the `_logtail` system user (collector uses
`Group=www-data` for log access). None are auto-enabled, so installing the package is safe on
any host.
**Docker Compose.** Runs the aggregator and frontend in one stack; point collectors (on each
nginx host) at the aggregator:
```bash
AGGREGATOR_COLLECTORS=nginx1:9090,nginx2:9090 docker compose up -d
# frontend on :8080, aggregator gRPC on :9091
```
**From source (`make`).**
```bash
make build # build/<arch>/{collector,aggregator,frontend,cli}
make test
./build/*/nginx-logtail -version
```
`make help` lists every target.
See [[User Guide](docs/user-guide.md)] for operator-facing documentation, or
[[Design](docs/design.md)] for the normative requirements and architectural rationale.