---
date: "2023-03-24T10:56:54Z"
title: 'Case Study: Let''s Encrypt DNS-01'
aliases:
- /s/articles/2023/03/24/lego-dns01.html
---
Last week I shared how IPng Networks deployed a loadbalanced frontend cluster of NGINX webservers
that have public IPv4 / IPv6 addresses, but talk to a bunch of internal webservers that are in a
private network which isn't directly connected to the internet, so-called _IPng Site Local_
[[ref]({{< ref "2023-03-11-mpls-core" >}})] with addresses **198.19.0.0/16** and
**2001:678:d78:500::/56**.
I wrote in [[that article]({{< ref "2023-03-17-ipng-frontends" >}})] that IPng will be using
_ACME_ HTTP-01 validation, which asks the certificate authority, in this case Let's Encrypt, to
contact the webserver on a well-known URI for each domain that I'm requesting a certificate for.
Unsurprisingly, several folks reached out to me asking "well what about DNS-01", and one sentence
caught their eye:
> Some SSL certificate providers allow for wildcards (ie. `*.ipng.ch`), but I'm going to keep it
> relatively simple and use [[Let's Encrypt](https://letsencrypt.org/)] which offers free
> certificates with a validity of three months.
I could've seen this one coming! The sentence can be read to imply it doesn't, but **of course**
Let's Encrypt offers wildcard certificates. It just doesn't satisfy my _relatively simple_ qualifier
of the second part of the sentence ... So here I go, down the rabbit hole that is understanding
(for myself, and possibly for readers of this article), how the DNS-01 challenge works, in greater
detail. Hopefully after writing this (me) and reading this (you), we can all agree that I was
wrong, and that using DNS-01 ***is*** relatively simple after all.
## Overview
I've installed three frontend NGINX servers (running at Coloclue AS8283, IPng AS8298 and IP-Max
AS25091), and one LEGO certificate machine (running in the internal _IPng Site Local_ network).
In the [[previous article]({{< ref "2023-03-17-ipng-frontends" >}})], I described the setup and
the use of Let's Encrypt with HTTP-01 challenges. I'll skip that here.
#### HTTP-01 vs DNS-01
{{< image width="200px" float="right" src="/assets/ipng-frontends/lego-logo.min.svg" alt="LEGO" >}}
Today, most SSL authorities and their customers use the Automatic Certificate Management Environment
or _ACME protocol_ which is described in [[RFC8555](https://www.rfc-editor.org/rfc/rfc8555)]. It
defines a way for certificate authorities to check the websites that they are asked to issue a
certificate for using so-called challenges. One popular challenge is the so-called `HTTP-01`, in
which the certificate authority will visit a well-known URI on the website domain for which the
certificate is being requested, namely `/.well-known/acme-challenge/`, which is described in
[[RFC5785](https://www.rfc-editor.org/rfc/rfc5785)]. The CA will expect the webserver to respond
with an agreed upon string of numbers at that location, in which case proof of ownership is
established and a certificate is issued.
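To make the `HTTP-01` exchange a bit more concrete, here's a minimal Python sketch of what the CA fetches and what it expects to get back, following my reading of RFC8555 section 8.3. The token and the account key thumbprint are made-up placeholders: a real ACME client gets the token from the CA and derives the thumbprint from its own account key.

```python
from typing import Tuple


def http01_challenge(domain: str, token: str, thumbprint: str) -> Tuple[str, str]:
    """Return (URL the CA will fetch, body the webserver has to serve there)."""
    url = f"http://{domain}/.well-known/acme-challenge/{token}"
    # The expected body is the "key authorization": token and thumbprint joined by a dot.
    key_authorization = f"{token}.{thumbprint}"
    return url, key_authorization


# Placeholder values, purely for illustration.
url, body = http01_challenge("go.ipng.ch", "hypothetical-token", "hypothetical-thumbprint")
print(url)
print(body)
```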
In some situations, this `HTTP-01` challenge can be difficult to perform:
* If the webserver is not reachable from the internet, or not reachable from the Let's Encrypt
servers, for example if it is on an intranet, such as _IPng Site Local_ itself.
* If the operator would prefer a wildcard certificate, proving ownership of all possible
sub-domains is no longer feasible with `HTTP-01` but proving ownership of the parent domain is.
One possible solution for these cases is to use the ACME challenge `DNS-01`, which doesn't use the
webserver running on `go.ipng.ch` to prove ownership, but the _nameserver_ that serves `ipng.ch`
instead. The Let's Encrypt GO client [[ref](https://go-acme.github.io/lego/)] supports both
challenge types.
The flow of requests in a `DNS-01` challenge is as follows:
{{< image width="400px" float="right" src="/assets/ipng-frontends/acme-flow-dns01.svg" alt="ACME Flow DNS01" >}}
1. First, the _LEGO_ client registers itself with the ACME-DNS server running on `auth.ipng.ch`.
After successful registration, _LEGO_ is given a username, password, and access to one DNS
recordname $(RRNAME).
It is expected that the operator sets up a CNAME for a well-known record `_acme-challenge.ipng.ch`
which points to that `$(RRNAME).auth.ipng.ch`. This happens only once.
1. When a certificate is needed, the _LEGO_ client contacts the Certificate Authority and requests
validation for the hostname `go.ipng.ch`. The CA will inform the client of a random
number $(RANDOM) that it expects to see in a well-known TXT record for `_acme-challenge.ipng.ch`
(which is the CNAME set up previously).
1. The _LEGO_ client now uses the username and password it received in step 1, to update the TXT
record of its `$(RRNAME).auth.ipng.ch` record to contain the $(RANDOM) number it learned in step 2.
1. The CA will issue a TXT query for `_acme-challenge.ipng.ch`, which is a CNAME to
`$(RRNAME).auth.ipng.ch`, which ultimately responds to the TXT query with the $(RANDOM) number.
1. After validating that the response on the TXT records contains the agreed upon random number, the
CA knows that the operator of the nameserver is the same as the certificate requestor for the domain.
It issues a certificate to the _LEGO_ client, which stores it on its local filesystem.
1. Similar to any other challenge, the _LEGO_ machine can now distribute the private key and
certificate to all NGINX machines, which are now capable of serving SSL traffic under the given names.
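The `$(RANDOM)` value in the flow above is a simplification on my part. As far as I understand RFC8555 section 8.4, the TXT record holds the unpadded base64url encoding of the SHA-256 digest of the key authorization. A small Python sketch, using a made-up key authorization as input:

```python
import base64
import hashlib


def dns01_txt_value(key_authorization: str) -> str:
    """Compute the TXT record contents for a DNS-01 challenge."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # base64url without the trailing '=' padding.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


# Hypothetical key authorization; a real one comes from the ACME client.
txt = dns01_txt_value("hypothetical-token.hypothetical-thumbprint")
print(len(txt), txt)
```

A SHA-256 digest is 32 bytes, so the resulting string is always 43 characters long.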
One thing worth noting is that for a wildcard such as `*.ipng.ch`, the CA validates the parent
_domain_ rather than individual _hostnames_: the TXT query goes to `_acme-challenge.ipng.ch`, and a
single answer there covers every name directly under `ipng.ch`. It is for this reason that the
`DNS-01` challenge allows for wildcard certificates, which can greatly reduce operational complexity
and the total number of certificates needed.
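As a rough illustration of why the apex and the wildcard share one challenge record, here's a sketch of how a requested name maps onto the TXT record the CA will query. The function name is my own invention, not something taken from _LEGO_:

```python
def challenge_record(domain: str) -> str:
    """Map a requested certificate name to its DNS-01 challenge record."""
    name = domain.rstrip(".")
    # A wildcard is validated against its parent domain.
    if name.startswith("*."):
        name = name[2:]
    return f"_acme-challenge.{name}"


for requested in ("ipng.ch", "*.ipng.ch"):
    print(requested, "->", challenge_record(requested))
```

Both names end up at `_acme-challenge.ipng.ch`, which is why one CNAME per domain is enough.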
### ACME DNS
Originally, DNS providers were expected to give the ability for their clients to _directly_ update the
well-known `_acme-challenge` TXT record, and while many commercial providers allow for this, IPng
Networks runs just plain-old [[NSD](https://nlnetlabs.nl/projects/nsd/about)] as authoritative
nameservers (shown above as `nsd0`, `nsd1` and `nsd2`). So what to do? Luckily, it was quickly
understood by the community that if there is a lookup for TXT record of `_acme-challenge.ipng.ch`,
that it would be absolutely OK to make some form of DNS-symlink by means of a CNAME.
One really great solution that leverages this ability is written by Joona Hoikkala, called
[[ACME-DNS](https://github.com/joohoi/acme-dns)]. Its sole purpose is to allow for an API, served
over https, to register new clients, let those clients update their TXT record(s), and then serve
them out in DNS. It's meant to be a multi-tenant system, by which I mean one ACME-DNS instance can
host millions of domains from thousands of distinct users.
#### Installing
I noticed that ACME-DNS relies on features in relatively modern Go, and the standard version that
comes with Debian Bullseye is a tad old, so first I need to install Go v1.19 from backports, before
I can continue with the build of the binary:
```
lego@lego:~$ sudo apt -t bullseye-backports install golang
lego@lego:~/src$ git clone https://github.com/joohoi/acme-dns
lego@lego:~/src/acme-dns$ export GOPATH=/tmp/acme-dns
lego@lego:~/src/acme-dns$ go build
lego@lego:~/src/acme-dns$ sudo cp acme-dns /usr/local/bin/acme-dns
lego@lego:~/src/acme-dns$ cat << EOF | sudo tee /lib/systemd/system/acme-dns.service
[Unit]
Description=Limited DNS server with RESTful HTTP API to handle ACME DNS challenges easily and securely
After=network.target
[Service]
User=lego
Group=lego
AmbientCapabilities=CAP_NET_BIND_SERVICE
WorkingDirectory=~
ExecStart=/usr/local/bin/acme-dns -c /home/lego/acme-dns/config.cfg
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
```
This authoritative nameserver will want to listen on UDP and TCP ports 53, for which it either needs to
run as root, or perhaps better, run as non-privileged user with the `CAP_NET_BIND_SERVICE`
capability. The only other difference with the provided unit file, is that I'll be running this as
the `lego` user, with a configuration file and working path in its home-directory.
#### Configuring
***Step 1. Delegate auth.ipng.ch***
The first thing I should do is configure the subdomain for ACME-DNS, which I decide will be hosted on
`auth.ipng.ch`. I assign it an NS, an A and a AAAA record, and then update the `ipng.ch` domain:
```
$ORIGIN ipng.ch.
$TTL 86400
@        IN SOA ns.paphosting.net. hostmaster.ipng.ch. ( 2023032401 28800 7200 604800 86400)
            NS ns.paphosting.nl.
            NS ns.paphosting.net.
            NS ns.paphosting.eu.
; ACME DNS
auth        NS auth.ipng.ch.
            A 194.1.163.93
            AAAA 2001:678:d78:3::93
```
This snippet will make a DNS delegation for sub-domain `auth.ipng.ch` to the server also called
`auth.ipng.ch` and because the downstream delegation is in the same domain, I need to provide _glue_
records, that tell clients who are querying for `auth.ipng.ch` where to find that nameserver. At
this point, any request for `*.auth.ipng.ch` will end up being referred to the authoritative
nameserver, which can be found at either 194.1.163.93 or 2001:678:d78:3::93.
***Step 2. Start ACME DNS***
After having built the acme-dns server and given it a suitable systemd unit file, and knowing that
it's going to be responsible for the sub-domain `auth.ipng.ch`, I give it the following
straightforward configuration file:
```
lego@lego:~$ mkdir ~/acme-dns/
lego@lego:~$ cat << EOF > acme-dns/config.cfg
[general]
listen = "[::]:53"
protocol = "both"
domain = "auth.ipng.ch"
nsname = "auth.ipng.ch"
nsadmin = "hostmaster.ipng.ch"
records = [
"auth.ipng.ch. NS auth.ipng.ch.",
"auth.ipng.ch. A 194.1.163.93",
"auth.ipng.ch. AAAA 2001:678:d78:3::93",
]
debug = false
[database]
engine = "sqlite3"
connection = "/home/lego/acme-dns/acme-dns.db"
[api]
ip = "[::]"
disable_registration = false
port = "443"
tls = "letsencrypt"
acme_cache_dir = "/home/lego/acme-dns/api-certs"
notification_email = "hostmaster+dns-auth@ipng.ch"
corsorigins = [ "*" ]
use_header = false
header_name = "X-Forwarded-For"
[logconfig]
loglevel = "debug"
logtype = "stdout"
logformat = "text"
EOF
lego@lego:~$ sudo systemctl enable acme-dns
lego@lego:~$ sudo systemctl start acme-dns
```
The first part of this tells the server how to construct the SOA record (domain, nsname and
nsadmin), and which records to put in the apex, nominally the NS/A/AAAA records that describe the
nameserver which is authoritative for the `auth.ipng.ch` domain. Then, the database part is where
user credentials will be stored, and the API portion shows how users will be able to interact with
the controlplane part of the service, notably registering new clients, and updating nameserver TXT
records for existing clients.
{{< image width="200px" float="right" src="/assets/ipng-frontends/turtles.png" alt="Turtles" >}}
Interestingly, the API is served on HTTPS port 443, and for that it needs, you guessed it, a
certificate! ACME-DNS eats its own dogfood, which I can appreciate: it will use `DNS-01` validation
to get a certificate for `auth.ipng.ch` _itself_, by serving the challenge for well-known record
`_acme-challenge.auth.ipng.ch`, so it's turtles all the way down!
***Step 3. Register a new client***
Seeing as many public DNS providers allow programmatic setting of the contents of the zonefiles, for
them it's a matter of directly being driven by _LEGO_. But for me, running NSD, I am going to be using
the ACME DNS server to fulfill that purpose, so I have to configure it to do that for me.
In the explanation of `DNS-01` challenges above, you'll remember I made a mention of registering. Here's
a closer look at what that means:
```
lego@lego:~$ curl -s -X POST https://auth.ipng.ch/register | json_pp
{
"allowfrom" : [],
"fulldomain" : "76f88564-740b-4483-9bc0-86d1fb531e20.auth.ipng.ch",
"password" : "<redacted>",
"subdomain" : "76f88564-740b-4483-9bc0-86d1fb531e20",
"username" : "e4608fdf-9a69-4930-8cf1-57218738792d"
}
```
What happened here is that, using the HTTPS endpoint, I asked the ACME-DNS server to create for me an empty
DNS record, which it did on `76f88564-740b-4483-9bc0-86d1fb531e20.auth.ipng.ch`. Further, if I offer
the given username and password, I am able to update that record's value. Let's take a look:
```
lego@lego:~$ dig +short TXT 02e3acfc-bbca-46bb-9cee-8eab52c73c30.auth.ipng.ch
lego@lego:~$ curl -s -X POST -H "X-Api-User: 5f3591d1-0d13-4816-a329-7965a8639ab5" \
-H "X-Api-Key: <redacted>" \
-d '{"subdomain": "02e3acfc-bbca-46bb-9cee-8eab52c73c30",
"txt": "___Hello_World_token_______________________"}' \
https://auth.ipng.ch/update
```
Numbers everywhere, but I learned a lot here! Notice how the first time I sent the `dig` request for
`02e3acfc-bbca-46bb-9cee-8eab52c73c30.auth.ipng.ch`, it returned nothing (an empty
record). But then, using the username/password, I could update the record with a 43-character
string. The registration response also told me the `fulldomain` key, which is the one that I should
be configuring in the domain(s) for which I want to get a certificate.
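For those who'd rather script this than type `curl` commands, here's a hedged Python sketch that builds the same two API calls with only the standard library. The endpoint paths and header names are the ones from the `curl` examples above; the credentials and the TXT value are placeholders, and the sketch only constructs the requests without sending them.

```python
import json
import urllib.request

API_BASE = "https://auth.ipng.ch"


def register_request() -> urllib.request.Request:
    """Build the POST that asks ACME-DNS for a new subdomain and credentials."""
    return urllib.request.Request(f"{API_BASE}/register", data=b"", method="POST")


def update_request(username: str, password: str, subdomain: str, txt: str) -> urllib.request.Request:
    """Build the POST that sets the TXT record for an already registered subdomain."""
    body = json.dumps({"subdomain": subdomain, "txt": txt}).encode("utf-8")
    headers = {"X-Api-User": username, "X-Api-Key": password}
    return urllib.request.Request(f"{API_BASE}/update", data=body, headers=headers, method="POST")


# Placeholder credentials and a 43-character dummy token, for illustration only.
req = update_request("hypothetical-user", "hypothetical-password",
                     "02e3acfc-bbca-46bb-9cee-8eab52c73c30", "_" * 43)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending it would be a matter of `json.load(urllib.request.urlopen(req))`, which I've left out so the sketch doesn't touch the live server.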
I configure it in the `ipng.ch` and `ipng.nl` domain as follows (taking `ipng.nl` as an example):
```
$ORIGIN ipng.nl.
$TTL 86400
@        IN SOA ns.paphosting.net. hostmaster.ipng.nl. ( 2023032401 28800 7200 604800 86400)
         IN NS ns.paphosting.nl.
         IN NS ns.paphosting.net.
         IN NS ns.paphosting.eu.
         CAA 0 issue "letsencrypt.org"
         CAA 0 issuewild "letsencrypt.org"
         CAA 0 iodef "mailto:hostmaster@ipng.ch"
_acme-challenge CNAME 8ee2969b-571c-4b3a-b6a0-6d6221130c96.auth.ipng.ch.
```
The `CAA` records here are a type of DNS record that tells Certificate Authorities which of them
are authorized to issue SSL certificates for the domain (`issue`), including wildcards
(`issuewild`), and where to report policy violations (`iodef`). Then, the
well-known `_acme-challenge.ipng.nl` record merely tells the client, by means of a `CNAME`, to go
ask for `8ee2969b-571c-4b3a-b6a0-6d6221130c96.auth.ipng.ch` instead.
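If there are many domains to set up, the CNAME line can be generated from the registration response. Here's a tiny sketch, where the helper name is hypothetical and the `fulldomain` value is the one used in the zonefile above:

```python
def acme_cname_record(fulldomain: str) -> str:
    """Render the zonefile line that points _acme-challenge at ACME-DNS."""
    # The trailing dot makes the target absolute; without it, the zone's
    # $ORIGIN would be appended to the name.
    target = fulldomain.rstrip(".") + "."
    return f"_acme-challenge CNAME {target}"


registration = {"fulldomain": "8ee2969b-571c-4b3a-b6a0-6d6221130c96.auth.ipng.ch"}
line = acme_cname_record(registration["fulldomain"])
print(line)
```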
Putting this part all together now, I can issue a query for that ipng.nl domain ...
```
lego@lego:~$ dig +short TXT _acme-challenge.ipng.nl.
"___Hello_World_token_______________________"
```
... and would you look at that! The query for the ipng.nl domain, is a CNAME to the specific uuid
record in the auth.ipng.ch domain, where ACME-DNS is serving it with the response that I can
programmatically set to different values, yee-haw!
***Step 4. Run LEGO***
The _LEGO_ client has all sorts of challenge providers linked in. Once again, Debian is a bit behind
on things, shipping version 3.2.0-3.1+b5 in Bullseye, although upstream is much further along. So I
purge the Debian package and download the v4.10.2 amd64 package directly from its
[[GitHub](https://github.com/go-acme/lego/releases)] releases page. The ACME-DNS handler was only
added in v4 of the client. But now all that's left for me to do is run it:
```
lego@lego:~$ export ACME_DNS_API_BASE=https://auth.ipng.ch/
lego@lego:~$ export ACME_DNS_STORAGE_PATH=/home/lego/acme-dns/credentials.json
lego@lego:~$ /home/lego/bin/lego --path /etc/lego/ --email noc@ipng.ch --accept-tos --dns acme-dns \
--domains ipng.ch --domains '*.ipng.ch' \
--domains ipng.nl --domains '*.ipng.nl' \
run
```
The LEGO client goes through the ACME flow that I described at the top of this article, and ends up
spitting out a certificate \o/
```
lego@lego:~$ openssl x509 -noout -text -in /etc/lego/certificates/ipng.ch.crt
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
03:58:8f:c1:25:00:e2:f3:d3:3f:d6:ed:ba:bc:1d:0d:54:ea
Signature Algorithm: sha256WithRSAEncryption
Issuer: C = US, O = Let's Encrypt, CN = R3
Validity
Not Before: Mar 21 20:24:08 2023 GMT
Not After : Jun 19 20:24:07 2023 GMT
Subject: CN = ipng.ch
X509v3 extensions:
X509v3 Subject Alternative Name:
DNS:*.ipng.ch, DNS:*.ipng.nl, DNS:ipng.ch, DNS:ipng.nl
```
Et voila! Wildcard certificates for multiple domains using ACME-DNS.
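One caveat that's easy to overlook: a wildcard in a certificate covers exactly one label. Here's a simplified Python sketch of that matching rule, applied to the Subject Alternative Names from the certificate above. It's an approximation for illustration, not how any particular TLS library implements it:

```python
SANS = ["*.ipng.ch", "*.ipng.nl", "ipng.ch", "ipng.nl"]


def covered_by(hostname: str, san: str) -> bool:
    """Check one hostname against one SAN; '*' stands in for a single label."""
    host = hostname.lower().rstrip(".").split(".")
    pattern = san.lower().rstrip(".").split(".")
    if len(host) != len(pattern):
        return False
    if pattern[0] == "*":
        return host[1:] == pattern[1:]
    return host == pattern


def covered(hostname: str) -> bool:
    """True if any SAN on the certificate covers the hostname."""
    return any(covered_by(hostname, san) for san in SANS)


for name in ("go.ipng.ch", "ipng.ch", "a.b.ipng.ch"):
    print(name, covered(name))
```

So `go.ipng.ch` and the apex `ipng.ch` are fine, but something like `a.b.ipng.ch` would need its own name (or its own wildcard) on the certificate.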