ctlog-uptime-exporter

A Prometheus exporter for Certificate Transparency log uptime data published by Google at:

https://www.gstatic.com/ct/compliance/endpoint_uptime_24h.csv

The CSV reports the 24-hour uptime percentage for each CT log URL, broken down by operation type (add-chain, get-entries, get-sth, etc.). This exporter fetches that CSV on a configurable schedule and exposes the data as Prometheus metrics.

Metrics

| Metric | Labels | Description |
| --- | --- | --- |
| `ct_log_uptime_ratio` | `log_url`, `endpoint` | 24h uptime as a ratio (0-1) |
| `ct_log_uptime_fetch_success` | (none) | 1 if the last fetch succeeded, 0 otherwise |
| `ct_log_uptime_fetch_timestamp_seconds` | (none) | Unix timestamp of the last fetch attempt |

Standard go_* and process_* metrics are also exposed.

Usage

go build -o ctlog-uptime-exporter .
./ctlog-uptime-exporter

By default, metrics are served at http://localhost:9781/metrics.

Flags

| Flag | Default | Description |
| --- | --- | --- |
| `-listen` | `:9781` | Address to listen on |
| `-url` | `https://www.gstatic.com/ct/compliance/endpoint_uptime_24h.csv` | URL of the uptime CSV |
| `-interval` | `12h` | How often to fetch the CSV |
| `-jitter` | `5m` | Maximum +/- jitter applied to the fetch interval |

The exporter fetches the CSV once on startup, then repeats every -interval, plus or minus a random offset of up to -jitter. The jitter avoids thundering-herd effects when multiple instances run in parallel.

Example Prometheus scrape config

scrape_configs:
  - job_name: ctlog_uptime
    static_configs:
      - targets: ['localhost:9781']

systemd

A unit file and defaults file are included.

# install binary
go build -o /usr/local/bin/ctlog-uptime-exporter .

# install defaults (edit to taste)
cp ctlog-uptime-exporter.default /etc/default/ctlog-uptime-exporter

# install and start the service
cp ctlog-uptime-exporter.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable --now ctlog-uptime-exporter

Runtime flags are controlled via the ARGS variable in /etc/default/ctlog-uptime-exporter.

Grafana dashboard

dashboard.json can be imported directly into Grafana (Dashboards -> Import). It expects a Prometheus datasource and provides:

- Summary stats: number of logs, endpoint types, average uptime, degraded count, fetch status, last fetch time
- Variable selectors for Log URL and Endpoint (both multi-select with All)
- Time series panel showing the rolling 24h uptime ratio over the chosen time range
- Table of the top N least-available log/endpoint pairs (N is selectable: 5, 10, 25, 50)

License

Apache 2.0 - see LICENSE.
