---
date: 2025-05-28T22:07:23Z
title: "Case Study: Minio S3 - Part 1"
---

{{< image float="right" src="/assets/minio/minio-logo.png" alt="MinIO Logo" width="6em" >}}

## Introduction

Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading scalability, data availability, security, and performance. Millions of customers of all sizes and industries store, manage, analyze, and protect any amount of data for virtually any use case, such as data lakes, cloud-native applications, and mobile apps. With cost-effective storage classes and easy-to-use management features, you can optimize costs, organize and analyze data, and configure fine-tuned access controls to meet specific business and compliance requirements.

Amazon's S3 became the de facto standard object storage system, and there exist several fully open source implementations of the protocol. One of them is MinIO: designed to allow enterprises to consolidate all of their data on a single, private cloud namespace. Architected using the same principles as the hyperscalers, AIStor delivers performance at scale at a fraction of the cost compared to the public cloud.

IPng Networks is an Internet Service Provider, but I also dabble in self-hosting things, for example [PeerTube], [Mastodon], [Immich], [Pixelfed] and of course [Hugo]. These services all have one thing in common: they tend to use lots of storage when they grow. At IPng Networks, all hypervisors ship with enterprise SAS flash drives, mostly 1.92TB and 3.84TB. Scaling up each of these services, and backing them up safely, can be quite the headache.

This article is for the storage buffs. I'll set up a set of distributed MinIO nodes from scratch.

## Physical

{{< image float="right" src="/assets/minio/disks.png" alt="MinIO Disks" width="16em" >}}

I'll start with the basics. I still have a few Dell R720 servers lying around; they are getting a bit older but still have 24 cores and 64GB of memory. First I need to get me some disks. I order 36 pieces of 16TB SATA enterprise disks, a mixture of Seagate EXOS and Toshiba MG series drives. I once learned (the hard way) that buying a big stack of disks from one production run is a risk - so I'll mix and match the drives.

Three trays of caddies and a melted credit card later, I have 576TB of SATA disks safely in hand. Each machine will carry 192TB of raw storage. The nice thing about this chassis is that Dell can ship them with 12x 3.5" SAS slots in the front, and 2x 2.5" SAS slots in the rear of the chassis.

So I'll install Debian Bookworm on one small 480G SSD in software RAID1.

### Cloning an install

I have three identical machines, so in total I'll want six of these SSDs. I temporarily screw the other five into 3.5" drive caddies and plug them into the first installed Dell, which I've called minio-proto:

pim@minio-proto:~$ for i in b c d e f; do
  sudo dd if=/dev/sda of=/dev/sd${i} bs=512 count=1;
  sudo mdadm --manage /dev/md0 --add /dev/sd${i}1
done
pim@minio-proto:~$ sudo mdadm --grow /dev/md0 -n 6
pim@minio-proto:~$ watch cat /proc/mdstat
pim@minio-proto:~$ for i in a b c d e f; do
  sudo grub-install /dev/sd$i
done

{{< image float="right" src="/assets/minio/rack.png" alt="MinIO Rack" width="16em" >}}

The first command takes my installed disk, /dev/sda, and copies the first sector over to the other five. This will give them the same partition table. Next, I'll add the first partition of each disk to the raidset. Then, I'll expand the raidset to have six members, after which the kernel starts a recovery process that syncs the newly added partitions into /dev/md0 (by copying from /dev/sda to all other disks at once). Finally, I'll watch this exciting movie and grab a cup of tea.

Once the disks are fully copied, I'll shut down the machine and distribute the disks to their respective Dell R720s, two each. Once they boot, they will all be identical. I'll need to make sure their hostnames and machine/host-ids are unique, otherwise things like bridges will have overlapping MAC addresses - ask me how I know:

pim@minio-proto:~$ sudo mdadm --grow /dev/md0 -n 2
pim@minio-proto:~$ sudo rm /etc/ssh/ssh_host*
pim@minio-proto:~$ sudo hostname minio0-chbtl0
pim@minio-proto:~$ sudo dpkg-reconfigure openssh-server
pim@minio-proto:~$ sudo dd if=/dev/random of=/etc/hostid bs=4 count=1
pim@minio-proto:~$ /usr/bin/dbus-uuidgen | sudo tee /etc/machine-id
pim@minio-proto:~$ sudo reboot

After which I have three beautiful and unique machines:

  • minio0.chbtl0.net.ipng.ch: which will go into my server rack at the IPng office.
  • minio0.ddln0.net.ipng.ch: which will go to [[Daedalean]({{< ref 2022-02-24-colo >}})], doing AI since before it was all about vibe coding.
  • minio0.chrma0.net.ipng.ch: which will go to [IP-Max], one of the best ISPs on the planet. 🥰

## Deploying Minio

The user guide that MinIO provides [ref] is super good; this is arguably one of the best documented open source projects I've ever seen. It shows me that I can do three types of install: a 'Standalone' with one disk, a 'Standalone Multi-Drive', and a 'Distributed' deployment. I decide to make three independent standalone multi-drive installs. This way, I have less shared fate, and will be immune to network partitions (as these are going to be in three different physical locations). I've also read about per-bucket replication, which will be an excellent way to get geographical distribution and active/active instances to work together - see the sketch below.
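That replication setup is something for a later article, but as a hedged sketch (the bucket name and credentials are placeholders, and the exact flags may differ between mc versions), adding a per-bucket replication rule between two of these deployments would look roughly like this:

pim@summer:~$ mc replicate add chbtl0/some-bucket \
    --remote-bucket 'https://<someuser>:<somepass>@s3.ddln0.ipng.ch/some-bucket' \
    --replicate "delete,delete-marker,existing-objects"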

I feel good about the single-machine multi-drive decision. I follow the install guide [ref] for this deployment type.

### IPng Frontends

At IPng I use a private IPv4/IPv6/MPLS network that is not connected to the internet. I call this network [[IPng Site Local]({{< ref 2023-03-11-mpls-core.md >}})]. But how will users reach my Minio install? I have four redundant, geographically distributed frontends: two in the Netherlands and two in Switzerland. I've described the frontend setup in a [[previous article]({{< ref 2023-03-17-ipng-frontends >}})] and the certificate management in [[this article]({{< ref 2023-03-24-lego-dns01 >}})].

I've decided to run the service on these three regionalized endpoints:

  1. s3.chbtl0.ipng.ch which will back into minio0.chbtl0.net.ipng.ch
  2. s3.ddln0.ipng.ch which will back into minio0.ddln0.net.ipng.ch
  3. s3.chrma0.ipng.ch which will back into minio0.chrma0.net.ipng.ch

The first thing I take note of is that S3 buckets can be addressed either by path, in other words something like s3.chbtl0.ipng.ch/my-bucket/README.md, or by virtual host, like so: my-bucket.s3.chbtl0.ipng.ch/README.md. A subtle difference, but from the docs I understand that Minio needs to have control of the whole space under its main domain.

There's a small implication to this requirement -- the Web Console that ships with MinIO (eh, well, maybe that's going to change, more on that later), will want to have its own domain-name, so I choose something simple: cons0-s3.chbtl0.ipng.ch and so on. This way, somebody might still be able to have a bucket name called cons0 :)

### Let's Encrypt Certificates

Alright, so I will be kneading nine domains into this new certificate, which I'll simply call s3.ipng.ch. I configure it in Ansible:

certbot:
  certs:
...
    s3.ipng.ch:
      groups: [ 'nginx', 'minio' ]
      altnames:
        - 's3.chbtl0.ipng.ch'
        - 'cons0-s3.chbtl0.ipng.ch'
        - '*.s3.chbtl0.ipng.ch'
        - 's3.ddln0.ipng.ch'
        - 'cons0-s3.ddln0.ipng.ch'
        - '*.s3.ddln0.ipng.ch'
        - 's3.chrma0.ipng.ch'
        - 'cons0-s3.chrma0.ipng.ch'
        - '*.s3.chrma0.ipng.ch'

I run the certbot playbook and it does two things:

  1. On the machines from group nginx and minio, it will ensure there exists a user lego with an SSH key and write permissions to /etc/lego/; this is where the automation will write (and update) the certificate keys.
  2. On the lego machine, it'll create two files. One is the certificate requestor, and the other is a certificate distribution script that will copy the cert to the right machine(s) when it renews.

On the lego machine, I'll run the cert request for the first time:

lego@lego:~$ bin/certbot:s3.ipng.ch
lego@lego:~$ RENEWED_LINEAGE=/home/lego/acme-dns/live/s3.ipng.ch bin/certbot-distribute

The first script asks me to add the _acme-challenge DNS entries, which I'll do, for example on the s3.chbtl0.ipng.ch instance (and similarly for the ddln0 and chrma0 ones):

$ORIGIN chbtl0.ipng.ch.
_acme-challenge.s3        CNAME 51f16fd0-8eb6-455c-b5cd-96fad12ef8fd.auth.ipng.ch.
_acme-challenge.cons0-s3  CNAME 450477b8-74c9-4b9e-bbeb-de49c3f95379.auth.ipng.ch.
s3                        CNAME nginx0.ipng.ch.
*.s3                      CNAME nginx0.ipng.ch.
cons0-s3                  CNAME nginx0.ipng.ch.

I push and reload the ipng.ch zonefile with these changes, after which the certificate gets requested and a cronjob is added to check for renewals. The second script will copy the newly created cert to all three minio machines, and all four nginx machines. From now on, every 90 days, a new cert will be automatically generated and distributed. Slick!
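For illustration, a minimal sketch of what such a distribution script could do - the host list and destination path here are placeholders, not the actual IPng script:

#!/bin/sh
# Hypothetical sketch: push the renewed certificate to the machines that need it.
LINEAGE=${RENEWED_LINEAGE:-/home/lego/acme-dns/live/s3.ipng.ch}
HOSTS="minio0.chbtl0.net.ipng.ch minio0.ddln0.net.ipng.ch minio0.chrma0.net.ipng.ch"  # plus the four nginx frontends
for H in $HOSTS; do
  rsync -az "$LINEAGE"/ "lego@$H:/etc/lego/s3.ipng.ch/"
done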

### NGINX Configs

With the LE wildcard certs in hand, I can create an NGINX frontend for these minio deployments.

First, a simple redirector service that punts people on port 80 to port 443:

server {
  listen [::]:80;
  listen 0.0.0.0:80;

  server_name cons0-s3.chbtl0.ipng.ch s3.chbtl0.ipng.ch *.s3.chbtl0.ipng.ch;
  access_log /var/log/nginx/s3.chbtl0.ipng.ch-access.log;
  include /etc/nginx/conf.d/ipng-headers.inc;

  location / {
    return 301 https://$host$request_uri;
  }
}

Next, the Minio API service itself which runs on port 9000, with a configuration snippet inspired by the MinIO [docs]:

server {
  listen [::]:443 ssl http2;
  listen 0.0.0.0:443 ssl http2;
  ssl_certificate /etc/certs/s3.ipng.ch/fullchain.pem;
  ssl_certificate_key /etc/certs/s3.ipng.ch/privkey.pem;
  include /etc/nginx/conf.d/options-ssl-nginx.inc;
  ssl_dhparam /etc/nginx/conf.d/ssl-dhparams.inc;

  server_name s3.chbtl0.ipng.ch *.s3.chbtl0.ipng.ch;
  access_log /var/log/nginx/s3.chbtl0.ipng.ch-access.log upstream;
  include /etc/nginx/conf.d/ipng-headers.inc;

  add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

  ignore_invalid_headers off;
  client_max_body_size 0;
  # Disable buffering
  proxy_buffering off;
  proxy_request_buffering off;

  location / {
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    proxy_connect_timeout 300;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    chunked_transfer_encoding off;

    proxy_pass http://minio0.chbtl0.net.ipng.ch:9000;
  }
}

Finally, the Minio Console service which runs on port 9090:

include /etc/nginx/conf.d/geo-ipng-trusted.inc;

server {
  listen [::]:443 ssl http2;
  listen 0.0.0.0:443 ssl http2;
  ssl_certificate /etc/certs/s3.ipng.ch/fullchain.pem;
  ssl_certificate_key /etc/certs/s3.ipng.ch/privkey.pem;
  include /etc/nginx/conf.d/options-ssl-nginx.inc;
  ssl_dhparam /etc/nginx/conf.d/ssl-dhparams.inc;

  server_name cons0-s3.chbtl0.ipng.ch;
  access_log /var/log/nginx/cons0-s3.chbtl0.ipng.ch-access.log upstream;
  include /etc/nginx/conf.d/ipng-headers.inc;

  add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

  ignore_invalid_headers off;
  client_max_body_size 0;
  # Disable buffering
  proxy_buffering off;
  proxy_request_buffering off;

  location / {
    if ($geo_ipng_trusted = 0) { rewrite ^ https://ipng.ch/ break; }
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-NginX-Proxy true;

    real_ip_header X-Real-IP;
    proxy_connect_timeout 300;
    chunked_transfer_encoding off;

    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";

    proxy_pass http://minio0.chbtl0.net.ipng.ch:9090;
  }
}

This last one has an NGINX trick. It will only allow users in if they are in the map called geo_ipng_trusted, which contains a set of IPv4 and IPv6 prefixes. Visitors who are not in this map will receive an HTTP redirect back to the [IPng.ch] homepage instead.
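For reference, such a map is built with NGINX's geo module; a minimal sketch of what geo-ipng-trusted.inc could contain - the prefixes below are documentation examples, not IPng's actual ranges:

geo $geo_ipng_trusted {
  default          0;
  192.0.2.0/24     1;   # example trusted IPv4 prefix
  2001:db8:40::/48 1;   # example trusted IPv6 prefix
}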

I run the Ansible Playbook which contains the NGINX changes to all frontends, but of course nothing runs yet, because I haven't yet started MinIO backends.

### MinIO Backends

The first thing I need to do is get those disks mounted. MinIO likes using XFS, so I'll install that and prepare the disks as follows:

pim@minio0-chbtl0:~$ sudo apt install xfsprogs
pim@minio0-chbtl0:~$ sudo modprobe xfs
pim@minio0-chbtl0:~$ echo xfs | sudo tee -a /etc/modules
pim@minio0-chbtl0:~$ sudo update-initramfs -k all -u
pim@minio0-chbtl0:~$ for i in a b c d  e f g h  i j k l; do sudo mkfs.xfs /dev/sd$i; done
pim@minio0-chbtl0:~$ blkid | awk 'BEGIN {i=1} /TYPE="xfs"/ {
       printf "%s /minio/disk%d   xfs defaults 0 2\n",$2,i; i++;
    }' | sudo tee -a /etc/fstab
pim@minio0-chbtl0:~$ for i in `seq 1 12`; do sudo mkdir -p /minio/disk$i; done
pim@minio0-chbtl0:~$ sudo mount -t xfs -a
pim@minio0-chbtl0:~$ sudo chown -R minio-user: /minio/

From the top: I'll install xfsprogs, which contains the tools I need to manipulate XFS filesystems in Debian. Then I'll load the xfs kernel module, and make sure it gets loaded on subsequent startups by adding it to /etc/modules and regenerating the initrd for the installed kernels.

Next, I'll format all twelve 16TB disks (which are /dev/sda - /dev/sdl on these machines), and add their resulting block device IDs to /etc/fstab so they get persistently mounted on reboot.

Finally, I'll create their mountpoints, mount all XFS filesystems, and chown them to the user that MinIO is running as. End result:

pim@minio0-chbtl0:~$ df -T
Filesystem     Type       1K-blocks      Used   Available Use% Mounted on
udev           devtmpfs    32950856         0    32950856   0% /dev
tmpfs          tmpfs        6595340      1508     6593832   1% /run
/dev/md0       ext4       114695308   5423976   103398948   5% /
tmpfs          tmpfs       32976680         0    32976680   0% /dev/shm
tmpfs          tmpfs           5120         4        5116   1% /run/lock
/dev/sda       xfs      15623792640 121505936 15502286704   1% /minio/disk1
/dev/sde       xfs      15623792640 121505968 15502286672   1% /minio/disk12
/dev/sdi       xfs      15623792640 121505968 15502286672   1% /minio/disk11
/dev/sdl       xfs      15623792640 121505904 15502286736   1% /minio/disk10
/dev/sdd       xfs      15623792640 121505936 15502286704   1% /minio/disk4
/dev/sdb       xfs      15623792640 121505968 15502286672   1% /minio/disk3
/dev/sdk       xfs      15623792640 121505936 15502286704   1% /minio/disk5
/dev/sdc       xfs      15623792640 121505936 15502286704   1% /minio/disk9
/dev/sdf       xfs      15623792640 121506000 15502286640   1% /minio/disk2
/dev/sdj       xfs      15623792640 121505968 15502286672   1% /minio/disk7
/dev/sdg       xfs      15623792640 121506000 15502286640   1% /minio/disk8
/dev/sdh       xfs      15623792640 121505968 15502286672   1% /minio/disk6
tmpfs          tmpfs        6595336         0     6595336   0% /run/user/0

MinIO likes to be configured using environment variables - likely because it's a popular thing to run in a containerized environment like Kubernetes. The maintainers also ship it as a Debian package, which will read its environment from /etc/default/minio, and I'll prepare that file as follows:

pim@minio0-chbtl0:~$ cat << EOF | sudo tee /etc/default/minio
MINIO_DOMAIN="s3.chbtl0.ipng.ch,minio0.chbtl0.net.ipng.ch"
MINIO_ROOT_USER="XXX"
MINIO_ROOT_PASSWORD="YYY"
MINIO_VOLUMES="/minio/disk{1...12}"
MINIO_OPTS="--console-address :9001"
EOF
pim@minio0-chbtl0:~$ sudo systemctl enable --now minio
pim@minio0-chbtl0:~$ sudo journalctl -u minio
May 31 10:44:11 minio0-chbtl0 minio[690420]: MinIO Object Storage Server
May 31 10:44:11 minio0-chbtl0 minio[690420]: Copyright: 2015-2025 MinIO, Inc.
May 31 10:44:11 minio0-chbtl0 minio[690420]: License: GNU AGPLv3 - https://www.gnu.org/licenses/agpl-3.0.html
May 31 10:44:11 minio0-chbtl0 minio[690420]: Version: RELEASE.2025-05-24T17-08-30Z (go1.24.3 linux/amd64)
May 31 10:44:11 minio0-chbtl0 minio[690420]: API: http://198.19.4.11:9000  http://127.0.0.1:9000
May 31 10:44:11 minio0-chbtl0 minio[690420]: WebUI: https://cons0-s3.chbtl0.ipng.ch/
May 31 10:44:11 minio0-chbtl0 minio[690420]: Docs: https://docs.min.io

pim@minio0-chbtl0:~$ sudo ipmitool sensor | grep Watts
Pwr Consumption  | 154.000    | Watts

Incidentally - I am pretty pleased with this 192TB disk tank, sporting 24 cores, 64GB memory and 2x10G network, casually hanging out at 154 Watts of power all up. Slick!

{{< image float="right" src="/assets/minio/minio-ec.svg" alt="MinIO Erasure Coding" width="22em" >}}

MinIO implements erasure coding as a core component in providing availability and resiliency during drive or node-level failure events. MinIO partitions each object into data and parity shards and distributes those shards across a single so-called erasure set. Under the hood, it uses a [Reed-Solomon] erasure coding implementation to partition the object for distribution. From the MinIO website, I'll borrow a diagram to show how it looks on a single node like mine, to the right.

Anyway, MinIO detects 12 disks and creates an erasure set with 8 data disks and 4 parity disks, which it calls EC:4 encoding, also known in the industry as RS8.4. Just like that, the thing shoots to life. Awesome!
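A quick back-of-the-envelope check of what EC:4 costs in usable space: with 8 data and 4 parity shards per stripe, only 8/12 of the raw capacity holds data:

usable = 12 x 16 TB x 8/12 = 128 TB ≈ 116 TiB

which lines up with the 116 TiB total that mc admin info reports below.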

### MinIO Client

On Summer, I'll install the MinIO Client called mc. This is easy because the maintainers ship a Linux binary which I can just download. On OpenBSD, they don't do that. Not a problem though: on Squanchy, Pencilvester and Glootie, I will just go install the client. Using the mc commandline, I can call any of the S3 APIs on my new MinIO instance:

pim@summer:~$ set +o history
pim@summer:~$ mc alias set chbtl0 https://s3.chbtl0.ipng.ch/ <rootuser> <rootpass>
pim@summer:~$ set -o history
pim@summer:~$ mc admin info chbtl0/
●  s3.chbtl0.ipng.ch
   Uptime: 22 hours 
   Version: 2025-05-24T17:08:30Z
   Network: 1/1 OK 
   Drives: 12/12 OK 
   Pool: 1

┌──────┬───────────────────────┬─────────────────────┬──────────────┐
│ Pool │ Drives Usage          │ Erasure stripe size │ Erasure sets │
│ 1st  │ 0.8% (total: 116 TiB) │ 12                  │ 1            │
└──────┴───────────────────────┴─────────────────────┴──────────────┘

95 GiB Used, 5 Buckets, 5,859 Objects, 318 Versions, 1 Delete Marker
12 drives online, 0 drives offline, EC:4

Cool beans. I think I should get rid of this root account though. I've installed those credentials into the /etc/default/minio environment file, but I don't want to keep using them out in the open. So I'll make an account for myself and assign it reasonable privileges, using the policy called consoleAdmin in the default install:

pim@summer:~$ set +o history
pim@summer:~$ mc admin user add chbtl0/ <someuser> <somepass>
pim@summer:~$ mc admin policy info chbtl0 consoleAdmin
pim@summer:~$ mc admin policy attach chbtl0 consoleAdmin --user=<someuser>
pim@summer:~$ mc alias set chbtl0 https://s3.chbtl0.ipng.ch/ <someuser> <somepass>
pim@summer:~$ set -o history

OK, I feel less gross now that I'm not operating as root on the MinIO deployment. Using my new user-powers, let me set some metadata on my new minio server:

pim@summer:~$ mc admin config set chbtl0/ site name=chbtl0 region=switzerland
Successfully applied new settings.
Please restart your server 'mc admin service restart chbtl0/'.
pim@summer:~$ mc admin service restart chbtl0/
Service status: ▰▰▱ [DONE]
Summary:
    ┌───────────────┬─────────────────────────────┐
    │ Servers:      │ 1 online, 0 offline, 0 hung │
    │ Restart Time: │ 61.322886ms                 │
    └───────────────┴─────────────────────────────┘
pim@summer:~$ mc admin config get chbtl0/ site 
site name=chbtl0 region=switzerland 

By the way, what's really cool about these open standards is that the Amazon aws client works with MinIO, and mc also works with AWS!
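For example - assuming the aws CLI is installed and configured with the credentials created above - pointing it at this deployment is just a matter of overriding the endpoint URL:

pim@summer:~$ aws configure set aws_access_key_id <someuser>
pim@summer:~$ aws configure set aws_secret_access_key <somepass>
pim@summer:~$ aws --endpoint-url https://s3.chbtl0.ipng.ch s3 ls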

### MinIO Console

Although I'm pretty good with APIs and command line tools, there's some benefit also in using a Graphical User Interface. MinIO ships with one, but there was a bit of a kerfuffle in the MinIO community. Unfortunately, these are pretty common -- Redis (an open source key/value storage system) changed their offering abruptly. Terraform (an open source infrastructure-as-code tool) changed their licensing at some point. Ansible (an open source machine management tool) changed their offering also. MinIO developers decided to strip their console of ~all features recently. The gnarly bits are discussed on [reddit], but suffice it to say: the same thing that happened in literally 100% of the other cases also happened here. Somebody decided to simply fork the code from before it was changed.

Enter OpenMaxIO. A cringe-worthy name, but it gets the job done. Reading up on the [GitHub], reviving the fully working console is pretty straightforward -- that is, once somebody spent a few days figuring it out. Thank you, icesvz, for this excellent pointer. With this, I can create a systemd service for the console and start it:

pim@minio0-chbtl0:~$ cat << EOF | sudo tee -a /etc/default/minio
## NOTE(pim): For openmaxio console service
CONSOLE_MINIO_SERVER="http://localhost:9000"
MINIO_BROWSER_REDIRECT_URL="https://cons0-s3.chbtl0.ipng.ch/"
EOF
pim@minio0-chbtl0:~$ cat << EOF | sudo tee /lib/systemd/system/minio-console.service
[Unit]
Description=OpenMaxIO Console Service
Wants=network-online.target
After=network-online.target
AssertFileIsExecutable=/usr/local/bin/minio-console

[Service]
Type=simple

WorkingDirectory=/usr/local

User=minio-user
Group=minio-user
ProtectProc=invisible

EnvironmentFile=-/etc/default/minio
ExecStart=/usr/local/bin/minio-console server
Restart=always
LimitNOFILE=1048576
MemoryAccounting=no
TasksMax=infinity
TimeoutSec=infinity
OOMScoreAdjust=-1000
SendSIGKILL=no

[Install]
WantedBy=multi-user.target
EOF
pim@minio0-chbtl0:~$ sudo systemctl enable --now minio-console
pim@minio0-chbtl0:~$ sudo systemctl restart minio

The first snippet is an update to the MinIO configuration that instructs it to redirect users who are not trying to use the API to the console endpoint on cons0-s3.chbtl0.ipng.ch. The console server, in turn, needs to know where to find the API, which from its vantage point is running on localhost:9000. Hello, beautiful fully featured console:

{{< image src="/assets/minio/console-1.png" alt="MinIO Console" >}}

### MinIO Prometheus

MinIO ships with a prometheus metrics endpoint, and I notice on its console that it has a nice metrics tab, which is fully greyed out. This is most likely because, well, I don't have a Prometheus install here yet. I decide to keep the storage nodes self-contained and start a Prometheus server on the local machine. I can always plumb that to IPng's Grafana instance later.

For now, I'll install Prometheus as follows:

pim@minio0-chbtl0:~$ cat << EOF | sudo tee -a /etc/default/minio
## NOTE(pim): Metrics for minio-console
MINIO_PROMETHEUS_AUTH_TYPE="public"
CONSOLE_PROMETHEUS_URL="http://localhost:19090/"
CONSOLE_PROMETHEUS_JOB_ID="minio-job"
EOF

pim@minio0-chbtl0:~$ sudo apt install prometheus
pim@minio0-chbtl0:~$ cat << EOF | sudo tee /etc/default/prometheus
ARGS="--web.listen-address='[::]:19090' --storage.tsdb.retention.size=16GB"
EOF
pim@minio0-chbtl0:~$ cat << EOF | sudo tee /etc/prometheus/prometheus.yml
global:
  scrape_interval: 60s

scrape_configs:
  - job_name: minio-job
    metrics_path: /minio/v2/metrics/cluster
    static_configs:
      - targets: ['localhost:9000']
        labels: 
          cluster: minio0-chbtl0

  - job_name: minio-job-node
    metrics_path: /minio/v2/metrics/node
    static_configs:
      - targets: ['localhost:9000']
        labels: 
          cluster: minio0-chbtl0

  - job_name: minio-job-bucket
    metrics_path: /minio/v2/metrics/bucket
    static_configs:
      - targets: ['localhost:9000']
        labels: 
          cluster: minio0-chbtl0

  - job_name: minio-job-resource
    metrics_path: /minio/v2/metrics/resource
    static_configs:
      - targets: ['localhost:9000']
        labels: 
          cluster: minio0-chbtl0
  
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']
        labels: 
          cluster: minio0-chbtl0
pim@minio0-chbtl0:~$ sudo systemctl restart minio prometheus

In the first snippet, I'll tell MinIO where it should find its Prometheus instance. Since the MinIO console service is running on port 9090, and this is also the default port for Prometheus, I will run Prometheus on port 19090 instead. From reading the MinIO docs, I can see that normally MinIO will want Prometheus to authenticate to it before it'll allow the endpoints to be scraped. I'll turn that off by making these public. On the IPng Frontends, I can always remove access to /minio/v2 and simply use the IPng Site Local access for local Prometheus scrapers instead.
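Removing that access would be a small addition to the API server block shown earlier; a sketch, assuming nothing outside IPng Site Local needs to scrape these metrics:

  # Keep the (now public) metrics endpoints off the internet-facing frontends.
  location /minio/v2/ {
    return 403;
  }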

After telling Prometheus its runtime arguments (in /etc/default/prometheus) and its scraping endpoints (in /etc/prometheus/prometheus.yml), I can restart minio and prometheus. A few minutes later, I can see the Metrics tab in the console come to life.

But now that I have this prometheus running on the MinIO node, I can also add it to IPng's Grafana configuration, by adding a new data source on minio0.chbtl0.net.ipng.ch:19090 and pointing the default Grafana [Dashboard] at it:

{{< image src="/assets/minio/console-2.png" alt="Grafana Dashboard" >}}

A two-for-one: I will be able to see metrics directly in the console, and I will also be able to hook these per-node Prometheus instances into IPng's alertmanager; I've read some [docs] on the concepts. I'm really liking the experience so far!
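As a taste of what that hookup could look like, here's a hedged sketch of an alerting rule for offline drives -- the metric name comes from MinIO's v2 cluster metrics, while the threshold, duration and labels are my own assumptions:

groups:
  - name: minio
    rules:
      - alert: MinioDrivesOffline
        expr: minio_cluster_drive_offline_total > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "MinIO reports one or more drives offline"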

### MinIO Nagios

Prometheus is fancy and all, but at IPng Networks, I've been doing monitoring for a while now. As a dinosaur, I still have an active [Nagios] install, which autogenerates all of its configuration using the Ansible repository I have. So for the new Ansible group called minio, I will autogenerate the following snippet:

define command {
  command_name   ipng_check_minio
  command_line   $USER1$/check_http -E -H $HOSTALIAS$ -I $ARG1$ -p $ARG2$ -u $ARG3$ -r '$ARG4$'
}

define service {
  hostgroup_name        ipng:minio:ipv6
  service_description   minio6:api
  check_command         ipng_check_minio!$_HOSTADDRESS6$!9000!/minio/health/cluster!
  use                   ipng-service-fast
  notification_interval 0 ; set > 0 if you want to be renotified
}

define service {
  hostgroup_name        ipng:minio:ipv6
  service_description   minio6:prom
  check_command         ipng_check_minio!$_HOSTADDRESS6$!19090!/classic/targets!minio-job
  use                   ipng-service-fast
  notification_interval 0 ; set > 0 if you want to be renotified
}

define service {
  hostgroup_name        ipng:minio:ipv6
  service_description   minio6:console
  check_command         ipng_check_minio!$_HOSTADDRESS6$!9090!/!MinIO Console
  use                   ipng-service-fast
  notification_interval 0 ; set > 0 if you want to be renotified
}

I've shown the snippet for IPv6, but I also have three services defined for legacy IP in the hostgroup ipng:minio:ipv4. The check command here uses -I which takes the IPv4 or IPv6 address to talk to, -p for the port to consult, -u for the URI to hit, and an option -r for a regular expression to expect in the output. For the Nagios aficionados out there: my Ansible groups correspond one to one with autogenerated Nagios hostgroups. This allows me to add arbitrary checks by group-type, like above in the ipng:minio group for IPv4 and IPv6.

In the MinIO [docs] I read up on the Healthcheck API. I choose to monitor the Cluster Write Quorum on my minio deployments. For Prometheus, I decide to hit the targets endpoint and expect the minio-job to be among them. Finally, for the MinIO Console, I expect to see a login screen with the words MinIO Console in the returned page. I guessed right, because Nagios is all green:

{{< image src="/assets/minio/nagios.png" alt="Nagios Dashboard" >}}
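Checking that health endpoint by hand is a nice way to see what Nagios sees; according to the MinIO healthcheck docs, it should return an HTTP 200 with an empty body as long as the cluster write quorum is met:

pim@summer:~$ curl -s -o /dev/null -w '%{http_code}\n' https://s3.chbtl0.ipng.ch/minio/health/cluster
200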

## My First Bucket

The IPng website is a statically generated Hugo site, and whenever I submit a change to my Git repo, a CI/CD runner (called [Drone]) picks up the change. It rebuilds the static website, and copies it to four redundant NGINX servers.

But IPng's website has amassed quite a few extra files (like VM images and VPP packages that I publish), which are copied separately using a simple push script I have in my home directory. This keeps all those big media files from cluttering the Git repository. I decide to move this stuff into S3:

pim@summer:~/src/ipng-web-assets$ echo 'Gruezi World.' > ipng.ch/media/README.md
pim@summer:~/src/ipng-web-assets$ mc mb chbtl0/ipng-web-assets
pim@summer:~/src/ipng-web-assets$ mc mirror . chbtl0/ipng-web-assets/
...ch/media/README.md: 6.50 GiB / 6.50 GiB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 236.38 MiB/s 28s
pim@summer:~/src/ipng-web-assets$ mc anonymous set download chbtl0/ipng-web-assets/

OK, two things immediately jump out at me. This stuff is fast: Summer is connected with a 2.5GbE network card, and she's running hard, copying the 6.5GB of data that make up these web assets essentially at line rate. It doesn't really surprise me, because Summer is running off of Gen4 NVME, while MinIO has 12 spinning disks which can each write about 160MB/s or so sustained [ref], with 24 CPUs to tend to the NIC (2x10G) and disks (2x SSD, 12x LFF). Should be plenty!
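A quick sanity check on those numbers shows that the 2.5GbE NIC on Summer is the bottleneck here, not the disk array:

NIC:   2.5 Gbit/s / 8 ≈ 312 MB/s ≈ 298 MiB/s   (observed: 236 MiB/s)
Disks: 12 x 160 MB/s ≈ 1.9 GB/s sustained, before erasure coding overhead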

The second is that MinIO allows for buckets to be publicly shared in three ways: 1) read-only by setting download; 2) write-only by setting upload, and 3) read-write by setting public. I set download here, which means I should be able to fetch an asset now publicly:

pim@summer:~$ curl https://s3.chbtl0.ipng.ch/ipng-web-assets/ipng.ch/media/README.md
Gruezi World.
pim@summer:~$ curl https://ipng-web-assets.s3.chbtl0.ipng.ch/ipng.ch/media/README.md
Gruezi World.

The first curl here shows the path-based access, while the second one shows an equivalent virtual-host based access. Both retrieve the file I just pushed via the public Internet. Whoot!

## What's Next

I'm going to be moving [Restic] backups from IPng's ZFS storage pool to this S3 service over the next few days; a rough sketch of what that looks like is below. I'll also migrate PeerTube and possibly Mastodon from NVME based storage to replicated S3 buckets. Finally, the IPng website media that I mentioned above should make for a nice follow-up article. Stay tuned!
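As a preview of that Restic move - the bucket name and credentials here are placeholders, not the actual IPng setup - pointing restic at an S3 bucket on one of these MinIO deployments looks roughly like this:

pim@summer:~$ export AWS_ACCESS_KEY_ID=<someuser>
pim@summer:~$ export AWS_SECRET_ACCESS_KEY=<somepass>
pim@summer:~$ restic -r s3:https://s3.chbtl0.ipng.ch/ipng-restic init
pim@summer:~$ restic -r s3:https://s3.chbtl0.ipng.ch/ipng-restic backup /home/pim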