---
date: "2023-08-27T08:56:54Z"
title: 'Case Study: NGINX + Certbot with Ansible'
---

# About this series

{{< image width="200px" float="right" src="/assets/ansible/Ansible_logo.svg" alt="Ansible" >}}

In the distant past (to be precise, in November of 2009) I wrote a little piece of automation together with my buddy Paul, called
_PaPHosting_. The goal was to be able to configure common attributes like servername, config files, webserver and DNS configs in a
consistent way, tracked in Subversion. Despite the project deriving its name from its first two authors, our mutual buddy Jeroen also
started using it, has written lots of additional cool stuff in the repo, and helped move from Subversion to Git a few years ago.

Michael DeHaan [[ref](https://www.ansible.com/blog/author/michael-dehaan)] founded Ansible in 2012, and by then our little _PaPHosting_
project, written as a set of bash scripts, had sufficiently solved our automation needs. But, as is the case with most home-grown
systems, over time I kept seeing more and more interesting features and integrations emerge, along with solid documentation and a large
user community. Eventually I had to reconsider our 1.5K LOC of Bash and ~16.5K files under maintenance, and in the end I settled on
Ansible.

```
commit c986260040df5a9bf24bef6bfc28e1f3fa4392ed
Author: Pim van Pelt <pim@ipng.nl>
Date:   Thu Nov 26 23:13:21 2009 +0000

pim@squanchy:~/src/paphosting$ find * -type f | wc -l
16541

pim@squanchy:~/src/paphosting/scripts$ wc -l *push.sh funcs
  132 apache-push.sh
  148 dns-push.sh
   92 files-push.sh
  100 nagios-push.sh
  178 nginx-push.sh
  271 pkg-push.sh
  100 sendmail-push.sh
   76 smokeping-push.sh
  371 funcs
 1468 total
```

In a [[previous article]({% post_url 2023-03-17-ipng-frontends %})], I talked about having not one but a cluster of NGINX servers that
each share a set of SSL certificates and act as a reverse proxy for a bunch of websites. At the bottom of that article, I wrote:

> The main thing that's next is to automate a bit more of this. IPng Networks has an Ansible controller, which I'd like to add ...
> but considering Ansible is its whole own elaborate bundle of joy, I'll leave that for maybe another article.

**Tadaah.wav** that article is here! This is by no means an introduction or how-to for Ansible. For that, please take a look at the
incomparable Jeff Geerling [[ref](https://www.jeffgeerling.com/)] and his book [[Ansible for Devops](https://www.ansiblefordevops.com/)].
I bought and read this book, and I highly recommend it.

## Ansible: Playbook Anatomy

The first thing I do is install four Debian Bookworm virtual machines: two in Amsterdam, one in Geneva and one in Zurich. These will be my
first group of NGINX servers, forming my geo-distributed frontend pool. I don't do any specific configuration or installation of packages;
I just leave whatever debootstrap gives me, which is a relatively lean install with 8 vCPUs, 16GB of memory, a 20GB boot disk and a 30GB
second disk for caching and static websites.

Ansible is a simple, but powerful, server and configuration management tool (with a few other tricks up its sleeve). It consists of an
_inventory_ (the hosts I'll manage) that is organized into one or more _groups_, a registry of _variables_ (telling me things about those
hosts and groups), and an elaborate system to run small bits of automation, called _tasks_, organized into things called _Playbooks_.

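To make that anatomy concrete, here's a minimal sketch of a playbook; the file name and contents are illustrative, not taken from the
IPng repository:

```yaml
# site.yml -- apply the 'nginx' role to every host in the 'nginx'
# inventory group, escalating privileges to root with 'become'.
- name: Configure NGINX frontends
  hosts: nginx
  become: true
  roles:
    - nginx
```

Running it is then a matter of `ansible-playbook -i inventory/nodes.yml site.yml`.
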
### NGINX Cluster: Group Basics

First of all, I create an Ansible _group_ called **nginx** and I add the following four freshly installed virtual machine hosts to it:

```
pim@squanchy:~/src/ipng-ansible$ cat << EOF | tee -a inventory/nodes.yml
nginx:
  hosts:
    nginx0.chrma0.net.ipng.ch:
    nginx0.chplo0.net.ipng.ch:
    nginx0.nlams1.net.ipng.ch:
    nginx0.nlams2.net.ipng.ch:
EOF
```

I have a mixture of Debian and OpenBSD machines at IPng Networks, so I add this group **nginx** as a child of another group called
**debian**, so that I can run "common debian tasks", such as installing Debian packages that I want all of my servers to have, adding
users and their SSH keys for folks who need access, and installing and configuring the firewall and things like Borgmatic backups.

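As a sketch, and assuming the same `inventory/nodes.yml` file as above, that parent/child relationship would look something like this:

```yaml
debian:
  children:
    nginx:
    # ... other Debian-based groups go here
```
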
I'm not going to go into all the details here for the **debian** playbook, though. It's just there to make the base system consistent
across all servers (bare metal or virtual). The one thing I'll mention, though, is that the **debian** playbook will see to it that the
correct users are created, with their SSH pubkey, and I'm going to first use this feature by creating two users:

1.  `lego`: As I described in a [[post on DNS-01]({% post_url 2023-03-24-lego-dns01 %})], IPng has a certificate machine that answers
    Let's Encrypt DNS-01 challenges, and its job is to regularly prove ownership of my domains, and then request a (wildcard!)
    certificate. Once a certificate renews, it copies it to all NGINX machines. To do that copy, `lego` needs an account on these
    machines, with the ability to write the certs and issue a reload of the NGINX server.
1.  `drone`: Most of my websites are static, for example `ipng.ch` is generated by Jekyll. I typically write an article on my laptop,
    and once I'm happy with it, I'll git commit and push it, after which a _Continuous Integration_ system called
    [[Drone](https://drone.io)] gets triggered, builds the website, runs some tests, and ultimately copies it out to the NGINX machines.
    Similar to the first user, this second user must have an account and the ability to write its web data to the NGINX server in the
    right spot.

That explains the following:

```yaml
pim@squanchy:~/src/ipng-ansible$ cat << EOF | tee group_vars/nginx.yml
---
users:
  lego:
    comment: Lets Encrypt
    password: "!"
    groups: [ lego ]
  drone:
    comment: Drone CI
    password: "!"
    groups: [ www-data ]

sshkeys:
  lego:
    - key: ecdsa-sha2-nistp256 <hidden>
      comment: lego@lego.net.ipng.ch
  drone:
    - key: ecdsa-sha2-nistp256 <hidden>
      comment: drone@git.net.ipng.ch
EOF
```

I note that the `users` and `sshkeys` used here are dictionaries, and that the `users` role defines a few default accounts, like my own
account `pim`. Writing this to the **group_vars** means that these new entries are applied to all machines that belong to the group
**nginx**, so they'll get these users created _in addition to_ the other users in the dictionary. Nifty!

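The `users` role itself is beyond the scope of this article, but a minimal sketch of the task that consumes the `users` dictionary might
look like this; the task is my reconstruction, only the variable shapes come from the group_vars above:

```yaml
- name: Create user accounts
  ansible.builtin.user:
    name: "{{ item.key }}"
    comment: "{{ item.value.comment }}"
    password: "{{ item.value.password }}"
    groups: "{{ item.value.groups }}"
    append: true
  loop: "{{ users | dict2items }}"
  loop_control:
    label: "{{ item.key }}"
```
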
### NGINX Cluster: Config

I wanted to be able to conserve IP addresses. Just a few months ago, I had a discussion with some folks at Coloclue where we shared the
frustration that what was hip in the 90s (go to RIPE NCC and ask for a /20, justifying that with "I run SSL websites") is somehow still
being used today, even though that's no longer required, or in fact, desirable. So I take one IPv4 and one IPv6 address and will use a
TLS extension called _Server Name Indication_ or [[SNI](https://en.wikipedia.org/wiki/Server_Name_Indication)], designed in 2003 (**20
years old today**), which you can see described in [[RFC 3546](https://datatracker.ietf.org/doc/html/rfc3546)].

Folks who try to argue that they need multiple IPv4 addresses because they run multiple SSL websites are somewhat of a trigger for me,
so this article doubles up as a "how to do SNI and conserve IPv4 addresses".

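For illustration, this is roughly what SNI looks like from NGINX's side: two `server` blocks with different certificates, listening on
the same address and port, selected by the server name the client sends. The certificate paths follow the `/etc/nginx/certs/` convention
used later in this article; the blocks themselves are a sketch, not IPng's actual config:

```
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name www.ipng.ch;
    ssl_certificate     /etc/nginx/certs/ipng.ch.crt;
    ssl_certificate_key /etc/nginx/certs/ipng.ch.key;
}
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name www.frys-ix.net;
    ssl_certificate     /etc/nginx/certs/frys-ix.net.crt;
    ssl_certificate_key /etc/nginx/certs/frys-ix.net.key;
}
```
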
I will group my websites that share the same SSL certificate, and I'll call these things _clusters_. An IPng NGINX Cluster:

* is identified by a name, for example `ipng` or `frysix`
* is served by one or more NGINX servers, for example `nginx0.chplo0.ipng.ch` and `nginx0.nlams1.ipng.ch`
* serves one or more distinct websites, for example `www.ipng.ch` and `nagios.ipng.ch` and `go.ipng.ch`
* has exactly one SSL certificate, which should cover all of the website(s), preferably using wildcard certs, for example `*.ipng.ch, ipng.ch`

And then, I define several clusters this way, in the following configuration file:

```yaml
pim@squanchy:~/src/ipng-ansible$ cat << EOF | tee vars/nginx.yml
---
nginx:
  clusters:
    ipng:
      members: [ nginx0.chrma0.net.ipng.ch, nginx0.chplo0.net.ipng.ch, nginx0.nlams1.net.ipng.ch, nginx0.nlams2.net.ipng.ch ]
      ssl_common_name: ipng.ch
      sites:
        ipng.ch:
        nagios.ipng.ch:
        go.ipng.ch:
    frysix:
      members: [ nginx0.nlams1.net.ipng.ch, nginx0.nlams2.net.ipng.ch ]
      ssl_common_name: frys-ix.net
      sites:
        frys-ix.net:
EOF
```

This way I can neatly group the websites (e.g. the **ipng** websites) together, call them by name, and immediately see which servers are
going to be serving them using which certificate common name. For future expansion (hint: an upcoming article on monitoring), I decide to
make the **sites** element here a _dictionary_ with only keys and no values, as opposed to a _list_, because later I will want to add
some bits and pieces of information for each website.

### NGINX Cluster: Sites

As is common with NGINX, I will keep a list of websites in the directory `/etc/nginx/sites-available/` and once I need a given machine to
actually serve that website, I'll symlink it from `/etc/nginx/sites-enabled/`. In addition, I decide to add a few common configuration
snippets, such as logging and SSL/TLS parameter files and options, which allow the webserver to score relatively high on SSL certificate
checker sites. It helps to keep the security buffs off my case.

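Done by hand, enabling a site on a given machine is just a symlink and a reload, which is exactly what the Ansible task later in this
article automates; a quick sketch, with `ipng.ch` as an example sitename:

```bash
ln -s /etc/nginx/sites-available/ipng.ch.conf /etc/nginx/sites-enabled/ipng.ch.conf
nginx -t && systemctl reload nginx    # validate the config before reloading
```
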
So I decide on the following structure, each file to be copied to all nginx machines in `/etc/nginx/`:

```
roles/nginx/files/conf.d/http-log.conf
roles/nginx/files/conf.d/ipng-headers.inc
roles/nginx/files/conf.d/options-ssl-nginx.inc
roles/nginx/files/conf.d/ssl-dhparams.inc
roles/nginx/files/sites-available/ipng.ch.conf
roles/nginx/files/sites-available/nagios.ipng.ch.conf
roles/nginx/files/sites-available/go.ipng.ch.conf
roles/nginx/files/sites-available/go.ipng.ch.htpasswd
roles/nginx/files/sites-available/...
```

In order:
* `conf.d/http-log.conf` defines a custom logline type called `upstream` that contains a few interesting additional items that show me
  the performance of NGINX:
  > log_format upstream '$remote_addr - $remote_user [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent" ' 'rt=$request_time uct=$upstream_connect_time uht=$upstream_header_time urt=$upstream_response_time';
* `conf.d/ipng-headers.inc` adds a header, served to end-users by this NGINX, that reveals the instance that served the request.
  Debugging a cluster becomes a lot easier if you know which server served what:
  > add_header X-IPng-Frontend $hostname always;
* `conf.d/options-ssl-nginx.inc` and `conf.d/ssl-dhparams.inc` are files borrowed from Certbot's NGINX configuration, and ensure the
  best TLS and SSL session parameters are used.
* `sites-available/*.conf` are the configuration blocks for the port-80 (HTTP) and port-443 (SSL certificate) websites. In the interest
  of brevity I won't copy them here, but if you're curious I showed a bunch of these in a
  [[previous article]({% post_url 2023-03-17-ipng-frontends %})]. These per-website config files sensibly include the SSL defaults,
  custom IPng headers and `upstream` log format.

### NGINX Cluster: Let's Encrypt

I figure the single most important thing to get right is how to enable multiple groups of websites, including SSL certificates, in
multiple _Clusters_ (say `ipng` and `frysix`), to be served using different SSL certificates, but on the same IPv4 and IPv6 address,
using _Server Name Indication_ or SNI. Let's first take a look at building two of these certificates, one for
[[IPng Networks](https://ipng.ch)] and one for [[FrysIX](https://frys-ix.net/)], the internet exchange with Frysian roots, which
incidentally offers free 1G, 10G, 40G and 100G ports all over the Amsterdam metro. My buddy Arend and I are running that exchange, so
please do join it!

I described the usual `HTTP-01` certificate challenge a while ago in [[this article]({% post_url 2023-03-17-ipng-frontends %})], but I
rarely use it because I've found that, once installed, `DNS-01` is vastly superior. I wrote about the ability to request a single
certificate with multiple _wildcard_ entries in a [[DNS-01 article]({% post_url 2023-03-24-lego-dns01 %})], so I'm going to save you the
repetition and simply use `certbot`, `acme-dns` and the `DNS-01` challenge type to request the following _two_ certificates:

```bash
lego@lego:~$ certbot certonly --config-dir /home/lego/acme-dns --logs-dir /home/lego/logs \
    --work-dir /home/lego/workdir --manual --manual-auth-hook /home/lego/acme-dns/acme-dns-auth.py \
    --preferred-challenges dns --debug-challenges \
    -d ipng.ch -d *.ipng.ch -d *.net.ipng.ch \
    -d ipng.nl -d *.ipng.nl \
    -d ipng.eu -d *.ipng.eu \
    -d ipng.li -d *.ipng.li \
    -d ublog.tech -d *.ublog.tech \
    -d as8298.net -d *.as8298.net \
    -d as50869.net -d *.as50869.net

lego@lego:~$ certbot certonly --config-dir /home/lego/acme-dns --logs-dir /home/lego/logs \
    --work-dir /home/lego/workdir --manual --manual-auth-hook /home/lego/acme-dns/acme-dns-auth.py \
    --preferred-challenges dns --debug-challenges \
    -d frys-ix.net -d *.frys-ix.net
```

First off, while I showed how to get these certificates by hand, actually generating these two commands is easily doable in Ansible
(which I'll show at the end of this article!). I defined which cluster has which main certificate name, and which websites it wants to
serve. Looking at `vars/nginx.yml`, it quickly becomes obvious how I can automate this. Using a relatively straightforward construct, I
can let Ansible create a list of commandline arguments for me programmatically:

1.  Initialize a variable `CERT_ALTNAMES` as a list of `nginx.clusters.ipng.ssl_common_name` and its wildcard, in other words
    `[ipng.ch, *.ipng.ch]`.
1.  As a convenience, tack onto the `CERT_ALTNAMES` list any entries in `nginx.clusters.ipng.ssl_altname`, such as `[*.net.ipng.ch]`.
1.  Then, looping over each entry in the `nginx.clusters.ipng.sites` dictionary, use `fnmatch` to match it against the entries in the
    `CERT_ALTNAMES` list (see the short `fnmatch` sketch below):
    * If it matches, for example with `go.ipng.ch`, skip and continue. This website is covered already by an altname.
    * If it doesn't match, for example with `ublog.tech`, simply add it and its wildcard to the `CERT_ALTNAMES` list: `[ublog.tech, *.ublog.tech]`.

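The wildcard matching relies on Python's standard `fnmatch` module, which also powers the lookup plugin shown later in this article. A
quick sketch of its behavior on the names used here:

```python
from fnmatch import fnmatch

# go.ipng.ch is already covered by the wildcard altname, so it gets skipped:
print(fnmatch("go.ipng.ch", "*.ipng.ch"))   # True
# ublog.tech is not covered, so it (and *.ublog.tech) gets appended:
print(fnmatch("ublog.tech", "*.ipng.ch"))   # False
```
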
Now, the first time I run this for a new cluster (which has never had a certificate issued before), `certbot` will ask me to ensure that
the correct `_acme-challenge` records are in each respective DNS zone. After doing that, it will issue two separate certificates and
install a cronjob that will periodically check their age and renew the certificate(s) when they are up for renewal. In a post-renewal
hook, I will create a script that copies the new certificate to the NGINX cluster (using the `lego` user + SSH key that I defined above).

```bash
lego@lego:~$ find /home/lego/acme-dns/live/ -type f
/home/lego/acme-dns/live/README
/home/lego/acme-dns/live/frys-ix.net/README
/home/lego/acme-dns/live/frys-ix.net/chain.pem
/home/lego/acme-dns/live/frys-ix.net/privkey.pem
/home/lego/acme-dns/live/frys-ix.net/cert.pem
/home/lego/acme-dns/live/frys-ix.net/fullchain.pem
/home/lego/acme-dns/live/ipng.ch/README
/home/lego/acme-dns/live/ipng.ch/chain.pem
/home/lego/acme-dns/live/ipng.ch/privkey.pem
/home/lego/acme-dns/live/ipng.ch/cert.pem
/home/lego/acme-dns/live/ipng.ch/fullchain.pem
```

The crontab entry that Certbot normally installs makes some assumptions about directories and about which user runs the renewal. I am
not a fan of having the `root` user do this, so I've changed it to this:

```bash
lego@lego:~$ cat /etc/cron.d/certbot
0 */12 * * * lego perl -e 'sleep int(rand(43200))' && certbot -q renew \
    --config-dir /home/lego/acme-dns --logs-dir /home/lego/logs \
    --work-dir /home/lego/workdir \
    --deploy-hook "/home/lego/bin/certbot-distribute"
```

And some pretty cool magic happens with this `certbot-distribute` script. When `certbot` has successfully received a new certificate,
it'll set a few environment variables and execute the deploy hook with them:

* ***RENEWED_LINEAGE***: will point to the config live subdirectory (e.g. `/home/lego/acme-dns/live/ipng.ch`) containing the new
  certificates and keys
* ***RENEWED_DOMAINS***: will contain a space-delimited list of renewed certificate domains (e.g. `ipng.ch *.ipng.ch *.net.ipng.ch`)

Using the first of those two things, it becomes straightforward to distribute the new certs:

```bash
#!/bin/sh

# Derive the certificate name (eg. 'ipng.ch') from the lineage directory
# that certbot hands us in $RENEWED_LINEAGE.
CERT=$(basename $RENEWED_LINEAGE)
CERTFILE=$RENEWED_LINEAGE/fullchain.pem
KEYFILE=$RENEWED_LINEAGE/privkey.pem

if [ "$CERT" = "ipng.ch" ]; then
  MACHS="nginx0.chrma0.ipng.ch nginx0.chplo0.ipng.ch nginx0.nlams1.ipng.ch nginx0.nlams2.ipng.ch"
elif [ "$CERT" = "frys-ix.net" ]; then
  MACHS="nginx0.nlams1.ipng.ch nginx0.nlams2.ipng.ch"
else
  echo "Unknown certificate $CERT, do not know which machines to copy to"
  exit 3
fi

for MACH in $MACHS; do
  # Skip machines that are down; the next renewal run will catch them up.
  fping -q $MACH 2>/dev/null || {
    echo "$MACH: Skipping (unreachable)"
    continue
  }
  echo $MACH: Copying $CERT
  scp -q $CERTFILE $MACH:/etc/nginx/certs/$CERT.crt
  scp -q $KEYFILE $MACH:/etc/nginx/certs/$CERT.key
  echo $MACH: Reloading nginx
  ssh $MACH 'sudo systemctl reload nginx'
done
```

There are a few things to note, if you look at my little shell script. I already kind of know which `CERT` belongs to which `MACHS`,
because this was configured in `vars/nginx.yml`, where I have a cluster name, say `ipng`, which conveniently has two variables: one
called `members`, which is a list of machines, and a second called `ssl_common_name`, which is `ipng.ch`. I think I can find a way to let
Ansible generate this file for me as well, whoot!

### Ansible: NGINX

Tying it all together (frankly, a tiny bit surprised you're still reading this!), I can now offer an Ansible role that automates all of
this.

```yaml
{%- raw %}
pim@squanchy:~/src/ipng-ansible$ cat << EOF | tee roles/nginx/tasks/main.yml
- name: Install Debian packages
  ansible.builtin.apt:
    update_cache: true
    pkg: [ nginx, ufw, net-tools, apache2-utils, mtr-tiny, rsync ]

- name: Copy config files
  ansible.builtin.copy:
    src: "{{ item }}"
    dest: "/etc/nginx/"
    owner: root
    group: root
    mode: u=rw,g=r,o=r
    directory_mode: u=rwx,g=rx,o=rx
  loop: [ conf.d, sites-available ]
  notify: Reload nginx

- name: Add cluster
  ansible.builtin.include_tasks:
    file: cluster.yml
  loop: "{{ nginx.clusters | dict2items }}"
  loop_control:
    label: "{{ item.key }}"
EOF

pim@squanchy:~/src/ipng-ansible$ cat << EOF > roles/nginx/handlers/main.yml
- name: Reload nginx
  ansible.builtin.service:
    name: nginx
    state: reloaded
EOF
{% endraw %}
```

The first task installs the Debian packages I'll want to use. The `apache2-utils` package is there to create and maintain `htpasswd`
files, among some other useful things. The `rsync` package is needed to accept both website data from the `drone` continuous integration
user and certificate data from the `lego` user.

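To illustrate how those two users deliver their payloads (the web root path here is an assumption for the sake of the example; the certs
path matches the `certbot-distribute` script above):

```bash
# Drone CI pushing a freshly built static website to a frontend:
rsync -av --delete public/ drone@nginx0.nlams1.net.ipng.ch:/var/www/ipng.ch/

# The lego user pushing a renewed certificate and reloading NGINX:
scp -q fullchain.pem lego@nginx0.nlams1.net.ipng.ch:/etc/nginx/certs/ipng.ch.crt
ssh lego@nginx0.nlams1.net.ipng.ch 'sudo systemctl reload nginx'
```
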
The second task copies all of the (static) configuration files onto the machine, populating `/etc/nginx/conf.d/` and
`/etc/nginx/sites-available/`. It uses a `notify` stanza to make note if any of these files (notably the ones in `conf.d/`) have changed,
and if so, to remember to invoke a _handler_ later on that reloads the running NGINX so it picks up those changes.

Finally, the third task branches out and executes the tasks defined in `tasks/cluster.yml`, once for each NGINX cluster (in my case,
`ipng` and then `frysix`):

```yaml
{%- raw %}
pim@squanchy:~/src/ipng-ansible$ cat << EOF | tee roles/nginx/tasks/cluster.yml
- name: "Enable sites for cluster {{ item.key }}"
  ansible.builtin.file:
    src: "/etc/nginx/sites-available/{{ sites_item.key }}.conf"
    dest: "/etc/nginx/sites-enabled/{{ sites_item.key }}.conf"
    owner: root
    group: root
    state: link
  loop: "{{ (nginx.clusters[item.key].sites | default({}) | dict2items) }}"
  when: inventory_hostname in nginx.clusters[item.key].members | default([])
  loop_control:
    loop_var: sites_item
    label: "{{ sites_item.key }}"
  notify: Reload nginx
EOF
{% endraw %}
```

This task is a bit more complicated, so let me go over it from the outside facing in. The thing that called it already has a loop
variable called `item`, which has a key (`ipng`) and a value (the whole cluster defined under `nginx.clusters.ipng`). Now, if I take that
`item.key` variable and look at its `sites` dictionary (in other words: `nginx.clusters.ipng.sites`), I can create another loop over all
the sites belonging to that cluster. Iterating over a dictionary in Ansible is done with a filter called `dict2items`, and because
technically the cluster could have zero sites, I ensure the `sites` dictionary defaults to the empty dictionary `{}`. Phew!

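For reference, `dict2items` turns a mapping into a list of `{key, value}` pairs, which is why the task above can reference
`sites_item.key`. A sketch with the `ipng` sites from `vars/nginx.yml` (the values are empty because the sites carry no attributes yet):

```yaml
# "{{ nginx.clusters.ipng.sites | dict2items }}" evaluates to:
- key: ipng.ch
  value:
- key: nagios.ipng.ch
  value:
- key: go.ipng.ch
  value:
```
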
Ansible is running this for each machine, and of course I only want to execute this block if the given machine (which is referenced as
`inventory_hostname`) occurs in the cluster's `members` list. If not: skip; if yes: go! Which is what the `when` line does.

The loop itself then runs for each site in the `sites` dictionary, allowing the `loop_control` to give that loop variable a unique name
called `sites_item`, and, when printing information on the CLI, to use a `label` set to the `sites_item.key` variable (e.g.
`frys-ix.net`) rather than the whole dictionary belonging to it.

With all of that said, the inner loop is easy: create a (sym)link for each website config file from `sites-available` to
`sites-enabled`, and if new links are created, invoke the _Reload nginx_ handler.

### Ansible: Certbot

***But what about that LEGO stuff?*** Fair question. The two scripts I described above (one to create the certbot certificate, and
another to copy it to the correct machines) both need to be generated and copied to the right places, so here I go, appending to the
tasks:

```yaml
{%- raw %}
pim@squanchy:~/src/ipng-ansible$ cat << EOF | tee -a roles/nginx/tasks/main.yml
- name: Create LEGO directory
  ansible.builtin.file:
    path: "/etc/nginx/certs/"
    owner: lego
    group: lego
    mode: u=rwx,g=rx,o=

- name: Add sudoers.d
  ansible.builtin.copy:
    src: sudoers
    dest: "/etc/sudoers.d/lego-ipng"
    owner: root
    group: root

- name: Generate Certbot Distribute script
  delegate_to: lego.net.ipng.ch
  run_once: true
  ansible.builtin.template:
    src: certbot-distribute.j2
    dest: "/home/lego/bin/certbot-distribute"
    owner: lego
    group: lego
    mode: u=rwx,g=rx,o=

- name: Generate Certbot Cluster scripts
  delegate_to: lego.net.ipng.ch
  run_once: true
  ansible.builtin.template:
    src: certbot-cluster.j2
    dest: "/home/lego/bin/certbot-{{ item.key }}"
    owner: lego
    group: lego
    mode: u=rwx,g=rx,o=
  loop: "{{ nginx.clusters | dict2items }}"
EOF

pim@squanchy:~/src/ipng-ansible$ cat << EOF | tee roles/nginx/files/sudoers
## *** Managed by IPng Ansible ***
#
%lego ALL=(ALL) NOPASSWD: /usr/bin/systemctl reload nginx
EOF
{% endraw -%}
```

The first task creates `/etc/nginx/certs`, which will be owned by the user `lego`, and that's where Certbot will rsync the certificates
after renewal. The second task then allows the `lego` user to issue a `systemctl reload nginx` so that NGINX can pick up the
certificates once they've changed on disk.

The third task generates the `certbot-distribute` script, which, depending on the common name of the certificate (for example `ipng.ch`
or `frys-ix.net`), knows which NGINX machines to copy it to. Its logic is pretty similar to the plain-old shellscript I started with,
but it does have a few variable expansions. If you'll recall, that script had a hard-coded way to assemble the MACHS variable, which can
now be replaced:

```bash
{%- raw %}
# ...
{% for cluster_name, cluster in nginx.clusters.items() | default({}) %}
{% if not loop.first %}el{% endif %}if [ "$CERT" = "{{ cluster.ssl_common_name }}" ]; then
  MACHS="{{ cluster.members | join(' ') }}"
{% endfor %}
else
  echo "Unknown certificate $CERT, do not know which machines to copy to"
  exit 3
fi
{% endraw %}
```

One common Ansible trick here is to detect whether a given loop has just begun (in which case `loop.first` will be true), or whether
this is the last element in the loop (in which case `loop.last` will be true). I can use this to emit the `if` (first) versus `elif`
(not first) statements.

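For the two clusters defined in `vars/nginx.yml` earlier, the rendered fragment would come out roughly like this:

```bash
if [ "$CERT" = "ipng.ch" ]; then
  MACHS="nginx0.chrma0.net.ipng.ch nginx0.chplo0.net.ipng.ch nginx0.nlams1.net.ipng.ch nginx0.nlams2.net.ipng.ch"
elif [ "$CERT" = "frys-ix.net" ]; then
  MACHS="nginx0.nlams1.net.ipng.ch nginx0.nlams2.net.ipng.ch"
else
  echo "Unknown certificate $CERT, do not know which machines to copy to"
  exit 3
fi
```
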
Looking back at what I wrote in this _Certbot Distribute_ task, you'll see I used two additional configuration elements:
1.  ***run_once***: Since there are potentially many machines in the **nginx** _Group_, by default Ansible will run this task for each
    machine. However, the Certbot cluster and distribute scripts really only need to be generated once per _Playbook_ execution, which
    is what the `run_once` field arranges.
1.  ***delegate_to***: This task should be executed not on an NGINX machine, but rather on the `lego.net.ipng.ch` machine, which is
    specified by the `delegate_to` field.

#### Ansible: lookup example

And now for the _pièce de résistance_: the fourth and final task generates a shell script that captures, for each cluster, the primary
name (called `ssl_common_name`) and the list of alternate names, and turns them into a full commandline to request a certificate with
all wildcard domains added (e.g. `ipng.ch` and `*.ipng.ch`). To do this, I decide to create an Ansible [[Lookup
Plugin](https://docs.ansible.com/ansible/latest/plugins/lookup.html)]. This lookup will simply return **true** if a given sitename is
covered by any of the existing certificate altnames, including wildcard domains, for which I can use the standard Python `fnmatch`.

First, I create the lookup plugin in a well-known directory, so Ansible can discover it:

```
pim@squanchy:~/src/ipng-ansible$ cat << EOF | tee roles/nginx/lookup_plugins/altname_match.py
from ansible.plugins.lookup import LookupBase
import fnmatch


class LookupModule(LookupBase):
    def __init__(self, basedir=None, **kwargs):
        self.basedir = basedir

    def run(self, terms, variables=None, **kwargs):
        # terms[0] is the website name, terms[1] is the list of altnames
        # already on the certificate (literals and wildcards).
        sitename = terms[0]
        cert_altnames = terms[1]
        for altname in cert_altnames:
            if sitename == altname:
                return [True]
            if fnmatch.fnmatch(sitename, altname):
                return [True]
        return [False]
EOF
```

The Python class here will compare the website name in `terms[0]` with the list of altnames given in `terms[1]`, and will return True
either if a literal match occurred, or if the altname `fnmatch`es with the sitename. It will return False otherwise. Dope! Here's how I
use it in the `certbot-cluster` script, which is starting to get pretty fancy:

```bash
{%- raw %}
pim@squanchy:~/src/ipng-ansible$ cat << EOF | tee roles/nginx/templates/certbot-cluster.j2
#!/bin/sh
###
### {{ ansible_managed }}
###
{% set cluster_name = item.key %}
{% set cluster = item.value %}
{% set sites = nginx.clusters[cluster_name].sites | default({}) %}
#
# This script generates a certbot commandline to initialize (or re-initialize) a given certificate for an NGINX cluster.
#
### Metadata for this cluster:
#
# {{ cluster_name }}: {{ cluster }}
{% set cert_altname = [ cluster.ssl_common_name, '*.' + cluster.ssl_common_name ] %}
{% do cert_altname.extend(cluster.ssl_altname|default([])) %}
{% for sitename, site in sites.items() %}
{% set altname_matched = lookup('altname_match', sitename, cert_altname) %}
{% if not altname_matched %}
{% do cert_altname.append(sitename) %}
{% do cert_altname.append("*."+sitename) %}
{% endif %}
{% endfor %}
# CERT_ALTNAME: {{ cert_altname | join(' ') }}
#
###

certbot certonly --config-dir /home/lego/acme-dns --logs-dir /home/lego/logs --work-dir /home/lego/workdir \
  --manual --manual-auth-hook /home/lego/acme-dns/acme-dns-auth.py \
  --preferred-challenges dns --debug-challenges \
{% for domain in cert_altname %}
  -d {{ domain }}{% if not loop.last %} \{% endif %}

{% endfor %}
EOF
{% endraw %}
```

Ansible provides a lot of templating and logic evaluation in its Jinja2 templating language, but it isn't really a programming language.
That said, from the top, here's what happens:
* I set three variables: `cluster_name`, `cluster` (the dictionary with the cluster config), and, as a shorthand, `sites`, which is a
  dictionary of sites, defaulting to `{}` if it doesn't exist.
* I print the cluster name and the cluster config for posterity. Who knows, eventually I'll be debugging this anyway :-)
* Then comes the main thrust, the simple loop that I described above, but in Jinja2:
  * Initialize the `cert_altname` list with the `ssl_common_name` and its wildcard variant, optionally extending it with the list of
    altnames in `ssl_altname`, if it's set.
  * For each site in the sites dictionary, invoke the lookup and capture its (boolean) result in `altname_matched`.
  * If the match failed, we have a new domain, so add it and its wildcard variant to the `cert_altname` list. I use the `do` Jinja2
    statement there, which comes from the `jinja2.ext.do` extension.
* At the end of this, all of these website names have been reduced to their domain+wildcard variants, which I can loop over to emit the
  `-d` flags to `certbot` at the bottom of the file.

And with that, I can generate both the certificate request command, and distribute the resulting certificates to those NGINX servers
that need them.

## Results

{{< image src="/assets/ansible/ansible-run.png" alt="Ansible Run" >}}

I'm very pleased with the results. I can clearly see that the two servers that I assigned to this NGINX cluster (the two in Amsterdam)
got their sites enabled, whereas the other two (Zurich and Geneva) were skipped. I can also see that the new certbot request script was
generated and that the existing certbot-distribute script was updated (to be aware of where to copy a renewed cert for this cluster).
And, in the end, only the two relevant NGINX servers were reloaded, reducing overall risk.

One other way to show that the very same IPv4 and IPv6 address can be used to serve multiple distinct multi-domain/wildcard SSL
certificates, using this _Server Name Indication_ (SNI, which, I repeat, has been available **since 2003** or so), is this:

```bash
pim@squanchy:~$ HOST=nginx0.nlams1.ipng.ch
pim@squanchy:~$ PORT=443
pim@squanchy:~$ SERVERNAME=www.ipng.ch
pim@squanchy:~$ openssl s_client -connect $HOST:$PORT -servername $SERVERNAME </dev/null 2>/dev/null \
                | openssl x509 -text | grep DNS: | sed -e 's,^ *,,'
DNS:*.ipng.ch, DNS:*.ipng.eu, DNS:*.ipng.li, DNS:*.ipng.nl, DNS:*.net.ipng.ch, DNS:*.ublog.tech,
DNS:as50869.net, DNS:as8298.net, DNS:ipng.ch, DNS:ipng.eu, DNS:ipng.li, DNS:ipng.nl, DNS:ublog.tech

pim@squanchy:~$ SERVERNAME=www.frys-ix.net
pim@squanchy:~$ openssl s_client -connect $HOST:$PORT -servername $SERVERNAME </dev/null 2>/dev/null \
                | openssl x509 -text | grep DNS: | sed -e 's,^ *,,'
DNS:*.frys-ix.net, DNS:frys-ix.net
```

Ansible is really powerful, and now that I've gotten to know it a little bit, I'll readily admit it's way cooler than PaPhosting ever
was :)

## What's Next

If you remember, I wrote that the `nginx.clusters.*.sites` would not be a list but rather a dictionary, because I'd like to be able to
carry other bits of information. And if you take a close look at my screenshot above, you'll see I revealed something about Nagios... so
in an upcoming post I'd like to share how IPng Networks arranges its Nagios environment, and I'll use the NGINX configs here to show how
I automatically monitor all servers participating in an NGINX _Cluster_, both for pending certificate expiry, which should not generally
happen precisely due to the automation here, and in case any backend server takes the day off.

Stay tuned! Oh, and if you're good at Ansible and would like to point out how silly my approach is, please do drop me a line on
Mastodon, where you can reach me at [[@IPngNetworks@ublog.tech](https://ublog.tech/@IPngNetworks)].