Minio Article #1
content/articles/2025-05-28-minio-1.md
---
date: "2025-05-28T22:07:23Z"
title: 'Case Study: Minio S3 - Part 1'
---

{{< image float="right" src="/assets/minio/minio-logo.png" alt="MinIO Logo" width="6em" >}}

# Introduction

Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading
scalability, data availability, security, and performance. Millions of customers of all sizes and
industries store, manage, analyze, and protect any amount of data for virtually any use case, such
as data lakes, cloud-native applications, and mobile apps. With cost-effective storage classes and
easy-to-use management features, you can optimize costs, organize and analyze data, and configure
fine-tuned access controls to meet specific business and compliance requirements.

Amazon's S3 became the _de facto_ standard object storage system, and there exist several fully open
source implementations of the protocol. One of them is MinIO: designed to allow enterprises to
consolidate all of their data on a single, private cloud namespace. Architected using the same
principles as the hyperscalers, AIStor delivers performance at scale at a fraction of the cost
compared to the public cloud.

IPng Networks is an Internet Service Provider, but I also dabble in self-hosting things, for
example [[PeerTube](https://video.ipng.ch/)], [[Mastodon](https://ublog.tech/)],
[[Immich](https://photos.ipng.ch/)], [[Pixelfed](https://pix.ublog.tech/)] and of course
[[Hugo](https://ipng.ch/)]. These services all have one thing in common: they tend to use lots of
storage when they grow. At IPng Networks, all hypervisors ship with enterprise SAS flash drives,
mostly 1.92TB and 3.84TB. Scaling up each of these services, and backing them up safely, can be
quite the headache.

This article is for the storage buffs. I'll set up a set of distributed MinIO nodes from scratch.

## Physical

{{< image float="right" src="/assets/minio/disks.png" alt="MinIO Disks" width="16em" >}}

I'll start with the basics. I still have a few Dell R720 servers lying around; they are getting a
bit older but still have 24 cores and 64GB of memory. First I need to get me some disks. I order
36pcs of 16TB SATA enterprise disk, a mixture of Seagate EXOS and Toshiba MG series disks. I once
learned (the hard way) that buying a big stack of disks from one production run is a risk - so I'll
mix and match the drives.

Three trays of caddies and a melted credit card later, I have 576TB of SATA disks safely in hand.
Each machine will carry 192TB of raw storage. The nice thing about this chassis is that Dell can
ship them with 12x 3.5" SAS slots in the front, and 2x 2.5" SAS slots in the rear of the chassis.

So I'll install Debian Bookworm on one small 480G SSD in software RAID1.

### Cloning an install

I have three identical machines so in total I'll want six of these SSDs. I temporarily screw the
other five in 3.5" drive caddies and plug them into the first installed Dell, which I've called
`minio-proto`:


```
pim@minio-proto:~$ for i in b c d e f; do
    sudo dd if=/dev/sda of=/dev/sd${i} bs=512 count=1;
    sudo mdadm --manage /dev/md0 --add /dev/sd${i}1
  done
pim@minio-proto:~$ sudo mdadm --grow /dev/md0 --raid-devices=6
pim@minio-proto:~$ watch cat /proc/mdstat
pim@minio-proto:~$ for i in a b c d e f; do
    sudo grub-install /dev/sd$i
  done
```

{{< image float="right" src="/assets/minio/rack.png" alt="MinIO Rack" width="16em" >}}

The first command takes my installed disk, `/dev/sda`, and copies the first sector over to the other
five. This will give them the same partition table. Next, I'll add the first partition of each disk
to the raidset. Then, I'll expand the raidset to have six members, after which the kernel starts a
recovery process that syncs the newly added partitions to `/dev/md0` (by copying from `/dev/sda` to
all other disks at once). Finally, I'll watch this exciting movie and grab a cup of tea. Once the
sync is done, `grub-install` puts a bootloader on all six disks, so that each of them can boot on
its own.

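Rather than watching `/proc/mdstat` by hand, the wait can also be scripted. The following is only a
sketch: it assumes the usual mdstat layout in which a rebuild in progress shows a `recovery =` or
`resync =` progress line, and both the helper name `md_syncing` and the sample text are made up for
illustration.

```shell
#!/bin/sh
# md_syncing: read /proc/mdstat-style text on stdin and succeed (exit 0)
# while a rebuild is in progress. Hypothetical helper, not part of mdadm.
md_syncing() {
    grep -Eq '(recovery|resync) *='
}

# Canned samples of a rebuilding and a healthy RAID1 (illustrative only).
sample_busy='md0 : active raid1 sdb1[1] sda1[0]
      117089280 blocks super 1.2 [6/1] [U_____]
      [=>...................]  recovery =  5.0% (5854464/117089280) finish=9.2min'

sample_idle='md0 : active raid1 sdb1[1] sda1[0]
      117089280 blocks super 1.2 [2/2] [UU]'

if printf '%s\n' "$sample_busy" | md_syncing; then echo "busy: still syncing"; fi
if ! printf '%s\n' "$sample_idle" | md_syncing; then echo "idle: in sync"; fi

# On the real machine one would poll the live file instead, for example:
#   while md_syncing < /proc/mdstat; do sleep 60; done
```
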
Once the disks are fully copied, I'll shut down the machine and distribute the disks to their
respective Dell R720, two each. Once they boot they will all be identical. I'll need to make sure
their hostnames, and machine/host-id are unique, otherwise things like bridges will have overlapping
MAC addresses - ask me how I know:

```
pim@minio-proto:~$ sudo mdadm --grow /dev/md0 --raid-devices=2
pim@minio-proto:~$ sudo rm /etc/ssh/ssh_host*
pim@minio-proto:~$ sudo hostnamectl set-hostname minio0-chbtl0
pim@minio-proto:~$ sudo dpkg-reconfigure openssh-server
pim@minio-proto:~$ sudo dd if=/dev/random of=/etc/hostid bs=4 count=1
pim@minio-proto:~$ /usr/bin/dbus-uuidgen | sudo tee /etc/machine-id
pim@minio-proto:~$ sudo reboot
```

After which I have three beautiful and unique machines:
* `minio0.chbtl0.net.ipng.ch`: which will go into my server rack at the IPng office.
* `minio0.ddln0.net.ipng.ch`: which will go to [[Daedalean]({{< ref
  2022-02-24-colo >}})], doing AI since before it was all about vibe coding.
* `minio0.chrma0.net.ipng.ch`: which will go to [[IP-Max](https://ip-max.net/)], one of the best
  ISPs on the planet. 🥰


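A quick way to confirm that cloned machines really did end up with unique identities is to collect
one ID per host and look for repeats. This is a sketch on canned data; the helper name `dup_ids` and
the IDs are invented, and on real hosts the second column could come from something like
`ssh "$h" cat /etc/machine-id`.

```shell
#!/bin/sh
# dup_ids: read "hostname id" pairs on stdin and print every id that
# appears more than once. Hypothetical helper for illustration.
dup_ids() {
    awk '{ print $2 }' | sort | uniq -d
}

# Made-up inventory in which two clones still share an id.
inventory='minio0.chbtl0 1111aaaa
minio0.ddln0 2222bbbb
minio0.chrma0 1111aaaa'

dups=$(printf '%s\n' "$inventory" | dup_ids)
if [ -n "$dups" ]; then
    echo "duplicate ids found: $dups"
else
    echo "all ids unique"
fi
```
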
## Deploying MinIO

The user guide that MinIO provides
[[ref](https://min.io/docs/minio/linux/operations/installation.html)] is super good, arguably one of
the best documented open source projects I've ever seen. It shows me that I can do three types of
install. A 'Standalone' with one disk, a 'Standalone Multi-Drive', and a 'Distributed' deployment.
I decide to make three independent standalone multi-drive installs. This way, I have less shared
fate, and will be immune to network partitions (as these are going to be in three different
physical locations). I've also read about per-bucket _replication_, which will be an excellent way
to get geographical distribution and active/active instances to work together.

I feel good about the single-machine multi-drive decision. I follow the install guide
[[ref](https://min.io/docs/minio/linux/operations/install-deploy-manage/deploy-minio-single-node-multi-drive.html#minio-snmd)]
for this deployment type.

### IPng Frontends

At IPng I use a private IPv4/IPv6/MPLS network that is not connected to the internet. I call this
network [[IPng Site Local]({{< ref 2023-03-11-mpls-core.md >}})]. But how will users reach my MinIO
install? I have four redundantly and geographically deployed frontends, two in the Netherlands and
two in Switzerland. I've described the frontend setup in a [[previous article]({{< ref
2023-03-17-ipng-frontends >}})] and the certificate management in [[this article]({{< ref
2023-03-24-lego-dns01 >}})].

I've decided to run the service on these three regionalized endpoints:
1. `s3.chbtl0.ipng.ch` which will back into `minio0.chbtl0.net.ipng.ch`
1. `s3.ddln0.ipng.ch` which will back into `minio0.ddln0.net.ipng.ch`
1. `s3.chrma0.ipng.ch` which will back into `minio0.chrma0.net.ipng.ch`

The first thing I take note of is that S3 buckets can be addressed either _by path_, in other words
something like `s3.chbtl0.ipng.ch/my-bucket/README.md`, or by virtual host, like so:
`my-bucket.s3.chbtl0.ipng.ch/README.md`. A subtle difference, but from the docs I understand that
MinIO needs to have control of the whole space under its main domain.

There's a small implication to this requirement -- the Web Console that ships with MinIO (eh, well,
maybe that's going to change, more on that later), will want to have its own domain-name, so I
choose something simple: `cons0-s3.chbtl0.ipng.ch` and so on. This way, somebody might still be able
to have a bucket name called `cons0` :)

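To make the two addressing styles concrete, here is a small sketch that prints both URL forms for
the same object. The function name `s3_url` is hypothetical; the endpoint, bucket and key are the
examples used above.

```shell
#!/bin/sh
# s3_url STYLE ENDPOINT BUCKET KEY
# Print the URL of an object in path style or in virtual-host style.
s3_url() {
    style=$1 endpoint=$2 bucket=$3 key=$4
    case "$style" in
        path)  printf 'https://%s/%s/%s\n' "$endpoint" "$bucket" "$key" ;;
        vhost) printf 'https://%s.%s/%s\n' "$bucket" "$endpoint" "$key" ;;
        *)     echo "unknown style: $style" >&2; return 1 ;;
    esac
}

s3_url path  s3.chbtl0.ipng.ch my-bucket README.md
s3_url vhost s3.chbtl0.ipng.ch my-bucket README.md
```

In the virtual-host form the bucket name becomes a DNS label under the endpoint, which is why the
endpoint needs the whole namespace below it and why the console gets a sibling name rather than a
name underneath `s3.chbtl0.ipng.ch`.
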
#### Let's Encrypt Certificates

Alright, so I will be needing nine domains in this new certificate, which I'll simply call
`s3.ipng.ch`. I configure it in Ansible:

```
certbot:
  certs:
    ...
    s3.ipng.ch:
      groups: [ 'nginx', 'minio' ]
      altnames:
        - 's3.chbtl0.ipng.ch'
        - 'cons0-s3.chbtl0.ipng.ch'
        - '*.s3.chbtl0.ipng.ch'
        - 's3.ddln0.ipng.ch'
        - 'cons0-s3.ddln0.ipng.ch'
        - '*.s3.ddln0.ipng.ch'
        - 's3.chrma0.ipng.ch'
        - 'cons0-s3.chrma0.ipng.ch'
        - '*.s3.chrma0.ipng.ch'
```

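The nine names follow a regular pattern: three names for each of the three sites. As a sketch, they
can be generated instead of typed; the function name `altnames` is made up, and the site labels are
the ones from the configuration above.

```shell
#!/bin/sh
# altnames SITE...: print the API name, the console name and the bucket
# wildcard for every site label given as an argument.
altnames() {
    for site in "$@"; do
        printf '%s\n' "s3.$site.ipng.ch" "cons0-s3.$site.ipng.ch" "*.s3.$site.ipng.ch"
    done
}

altnames chbtl0 ddln0 chrma0
```

The wildcard entry is what covers virtual-host style bucket names such as
`my-bucket.s3.chbtl0.ipng.ch`.
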
I run the `certbot` playbook and it does two things:
1. On the machines from group `nginx` and `minio`, it will ensure there exists a user `lego` with
   an SSH key and write permissions to `/etc/lego/`; this is where the automation will write (and
   update) the certificate keys.
1. On the `lego` machine, it'll create two files. One is the certificate requestor, and the other
   is a certificate distribution script that will copy the cert to the right machine(s) when it
   renews.

On the `lego` machine, I'll run the cert request for the first time:

```
lego@lego:~$ bin/certbot:s3.ipng.ch
lego@lego:~$ RENEWED_LINEAGE=/home/lego/acme-dns/live/s3.ipng.ch bin/certbot-distribute
```

The first script asks me to add the _acme-challenge DNS entries, which I'll do, for example on the
`s3.chbtl0.ipng.ch` instance (and similar for the `ddln0` and `chrma0` ones):

```
$ORIGIN chbtl0.ipng.ch.
_acme-challenge.s3        CNAME 51f16fd0-8eb6-455c-b5cd-96fad12ef8fd.auth.ipng.ch.
_acme-challenge.cons0-s3  CNAME 450477b8-74c9-4b9e-bbeb-de49c3f95379.auth.ipng.ch.
s3                        CNAME nginx0.ipng.ch.
*.s3                      CNAME nginx0.ipng.ch.
cons0-s3                  CNAME nginx0.ipng.ch.
```

I push and reload the `ipng.ch` zonefile with these changes after which the certificate gets
requested and a cronjob added to check for renewals. The second script will copy the newly created
cert to all three `minio` machines, and all four `nginx` machines. From now on, a new cert will be
automatically generated and distributed before the current one reaches its 90-day expiry. Slick!

#### NGINX Configs

With the LE wildcard certs in hand, I can create an NGINX frontend for these MinIO deployments.

First, a simple redirector service that punts people on port 80 to port 443:

```
server {
  listen [::]:80;
  listen 0.0.0.0:80;

  server_name cons0-s3.chbtl0.ipng.ch s3.chbtl0.ipng.ch *.s3.chbtl0.ipng.ch;
  access_log /var/log/nginx/s3.chbtl0.ipng.ch-access.log;
  include /etc/nginx/conf.d/ipng-headers.inc;

  location / {
    # $host keeps the name the client asked for; $server_name would always
    # expand to the first name listed above (the console).
    return 301 https://$host$request_uri;
  }
}
```

Next, the MinIO API service itself which runs on port 9000, with a configuration snippet inspired by
the MinIO [[docs](https://min.io/docs/minio/linux/integrations/setup-nginx-proxy-with-minio.html)]:

```
server {
  listen [::]:443 ssl http2;
  listen 0.0.0.0:443 ssl http2;
  ssl_certificate /etc/certs/s3.ipng.ch/fullchain.pem;
  ssl_certificate_key /etc/certs/s3.ipng.ch/privkey.pem;
  include /etc/nginx/conf.d/options-ssl-nginx.inc;
  ssl_dhparam /etc/nginx/conf.d/ssl-dhparams.inc;

  server_name s3.chbtl0.ipng.ch *.s3.chbtl0.ipng.ch;
  access_log /var/log/nginx/s3.chbtl0.ipng.ch-access.log upstream;
  include /etc/nginx/conf.d/ipng-headers.inc;

  add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

  ignore_invalid_headers off;
  client_max_body_size 0;
  # Disable buffering
  proxy_buffering off;
  proxy_request_buffering off;

  location / {
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    proxy_connect_timeout 300;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    chunked_transfer_encoding off;

    proxy_pass http://minio0.chbtl0.net.ipng.ch:9000;
  }
}
```

Finally, the MinIO Console service which runs on port 9090:

```
include /etc/nginx/conf.d/geo-ipng-trusted.inc;

server {
  listen [::]:443 ssl http2;
  listen 0.0.0.0:443 ssl http2;
  ssl_certificate /etc/certs/s3.ipng.ch/fullchain.pem;
  ssl_certificate_key /etc/certs/s3.ipng.ch/privkey.pem;
  include /etc/nginx/conf.d/options-ssl-nginx.inc;
  ssl_dhparam /etc/nginx/conf.d/ssl-dhparams.inc;

  server_name cons0-s3.chbtl0.ipng.ch;
  access_log /var/log/nginx/cons0-s3.chbtl0.ipng.ch-access.log upstream;
  include /etc/nginx/conf.d/ipng-headers.inc;

  add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

  ignore_invalid_headers off;
  client_max_body_size 0;
  # Disable buffering
  proxy_buffering off;
  proxy_request_buffering off;

  location / {
    if ($geo_ipng_trusted = 0) { rewrite ^ https://ipng.ch/ break; }
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-NginX-Proxy true;

    real_ip_header X-Real-IP;
    proxy_connect_timeout 300;
    chunked_transfer_encoding off;

    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";

    proxy_pass http://minio0.chbtl0.net.ipng.ch:9090;
  }
}
```

This last one has an NGINX trick. It will only allow users in if they are in the map called
`geo_ipng_trusted`, which contains a set of IPv4 and IPv6 prefixes. Visitors who are not in this map
will receive an HTTP redirect back to the [[IPng.ch](https://ipng.ch/)] homepage instead.

I run the Ansible Playbook which contains the NGINX changes to all frontends, but of course nothing
runs yet, because I haven't yet started MinIO backends.

### MinIO Backends

The first thing I need to do is get those disks mounted. MinIO likes using XFS, so I'll install that
and prepare the disks as follows:

```
pim@minio0-chbtl0:~$ sudo apt install xfsprogs
pim@minio0-chbtl0:~$ sudo modprobe xfs
pim@minio0-chbtl0:~$ echo xfs | sudo tee -a /etc/modules
pim@minio0-chbtl0:~$ sudo update-initramfs -k all -u
pim@minio0-chbtl0:~$ for i in a b c d e f g h i j k l; do sudo mkfs.xfs /dev/sd$i; done
pim@minio0-chbtl0:~$ blkid | awk 'BEGIN {i=1} /TYPE="xfs"/ {
    printf "%s /minio/disk%d xfs defaults 0 2\n",$2,i; i++;
  }' | sudo tee -a /etc/fstab
pim@minio0-chbtl0:~$ for i in `seq 1 12`; do sudo mkdir -p /minio/disk$i; done
pim@minio0-chbtl0:~$ sudo mount -t xfs -a
pim@minio0-chbtl0:~$ sudo chown -R minio-user: /minio/
```

From the top: I'll install `xfsprogs` which contains the things I need to manipulate XFS filesystems
in Debian. Then I'll load the `xfs` kernel module, and make sure it gets loaded upon subsequent
startup by adding it to `/etc/modules` and regenerating the initrd for the installed kernels.

Next, I'll format all twelve 16TB disks (which are `/dev/sda` - `/dev/sdl` on these machines), and
add their resulting block device UUIDs to `/etc/fstab` so they get persistently mounted on reboot.

Finally, I'll create their mountpoints, mount all XFS filesystems, and chown them to the user that
MinIO is running as. End result:

```
pim@minio0-chbtl0:~$ df -T
Filesystem     Type       1K-blocks      Used   Available Use% Mounted on
udev           devtmpfs    32950856         0    32950856   0% /dev
tmpfs          tmpfs        6595340      1508     6593832   1% /run
/dev/md0       ext4       114695308   5423976   103398948   5% /
tmpfs          tmpfs       32976680         0    32976680   0% /dev/shm
tmpfs          tmpfs           5120         4        5116   1% /run/lock
/dev/sda       xfs      15623792640 121505936 15502286704   1% /minio/disk1
/dev/sde       xfs      15623792640 121505968 15502286672   1% /minio/disk12
/dev/sdi       xfs      15623792640 121505968 15502286672   1% /minio/disk11
/dev/sdl       xfs      15623792640 121505904 15502286736   1% /minio/disk10
/dev/sdd       xfs      15623792640 121505936 15502286704   1% /minio/disk4
/dev/sdb       xfs      15623792640 121505968 15502286672   1% /minio/disk3
/dev/sdk       xfs      15623792640 121505936 15502286704   1% /minio/disk5
/dev/sdc       xfs      15623792640 121505936 15502286704   1% /minio/disk9
/dev/sdf       xfs      15623792640 121506000 15502286640   1% /minio/disk2
/dev/sdj       xfs      15623792640 121505968 15502286672   1% /minio/disk7
/dev/sdg       xfs      15623792640 121506000 15502286640   1% /minio/disk8
/dev/sdh       xfs      15623792640 121505968 15502286672   1% /minio/disk6
tmpfs          tmpfs        6595336         0     6595336   0% /run/user/0
```

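The `awk` step that writes `/etc/fstab` can be tried safely on canned input first. In this sketch
the function name `fstab_lines` and the UUIDs are invented, and the sample assumes `blkid` prints the
`UUID=` field second, as the command above relies on.

```shell
#!/bin/sh
# fstab_lines: turn blkid-style output on stdin into fstab entries,
# numbering the XFS filesystems in the order in which they appear.
fstab_lines() {
    awk 'BEGIN { i = 1 } /TYPE="xfs"/ {
        printf "%s /minio/disk%d xfs defaults 0 2\n", $2, i; i++
    }'
}

# Canned input with made-up UUIDs; the ext4 line must be skipped.
sample='/dev/sda: UUID="aaaa-0001" BLOCK_SIZE="4096" TYPE="xfs"
/dev/md0: UUID="bbbb-0002" BLOCK_SIZE="4096" TYPE="ext4"
/dev/sdf: UUID="cccc-0003" BLOCK_SIZE="4096" TYPE="xfs"'

printf '%s\n' "$sample" | fstab_lines
```

Because the counter follows the order of the input lines rather than the device names, the disk
numbers need not match the `sda` to `sdl` sequence, which is presumably why `/dev/sdf` ends up on
`/minio/disk2` in the listing above.
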
MinIO likes to be configured using environment variables - and this is likely because it's a popular
thing to run in a containerized environment like Kubernetes. The maintainers ship it also as a
Debian package, which will read its environment from `/etc/default/minio`, and I'll prepare that
file as follows:

```
pim@minio0-chbtl0:~$ cat << EOF | sudo tee /etc/default/minio
MINIO_DOMAIN="s3.chbtl0.ipng.ch,minio0.chbtl0.net.ipng.ch"
MINIO_ROOT_USER="XXX"
MINIO_ROOT_PASSWORD="YYY"
MINIO_VOLUMES="/minio/disk{1...12}"
MINIO_OPTS="--console-address :9001"
EOF
pim@minio0-chbtl0:~$ sudo systemctl enable --now minio
pim@minio0-chbtl0:~$ sudo journalctl -u minio
May 31 10:44:11 minio0-chbtl0 minio[690420]: MinIO Object Storage Server
May 31 10:44:11 minio0-chbtl0 minio[690420]: Copyright: 2015-2025 MinIO, Inc.
May 31 10:44:11 minio0-chbtl0 minio[690420]: License: GNU AGPLv3 - https://www.gnu.org/licenses/agpl-3.0.html
May 31 10:44:11 minio0-chbtl0 minio[690420]: Version: RELEASE.2025-05-24T17-08-30Z (go1.24.3 linux/amd64)
May 31 10:44:11 minio0-chbtl0 minio[690420]: API: http://198.19.4.11:9000 http://127.0.0.1:9000
May 31 10:44:11 minio0-chbtl0 minio[690420]: WebUI: https://cons0-s3.chbtl0.ipng.ch/
May 31 10:44:11 minio0-chbtl0 minio[690420]: Docs: https://docs.min.io

pim@minio0-chbtl0:~$ sudo ipmitool sensor | grep Watts
Pwr Consumption | 154.000 | Watts
```

Incidentally - I am pretty pleased with this 192TB disk tank, sporting 24 cores, 64GB memory and
2x10G network, casually hanging out at 154 Watts of power all up. Slick!

{{< image float="right" src="/assets/minio/minio-ec.svg" alt="MinIO Erasure Coding" width="22em" >}}

MinIO implements _erasure coding_ as a core component in providing availability and resiliency
during drive or node-level failure events. MinIO partitions each object into data and parity shards
and distributes those shards across a single so-called _erasure set_. Under the hood, it uses a
[[Reed-Solomon](https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction)] erasure coding
implementation and partitions the object for distribution. From the MinIO website, I'll borrow a
diagram to show what it looks like on a single node like mine to the right.

Anyway, MinIO detects 12 disks and installs an erasure set with 8 data disks and 4 parity disks,
which it calls `EC:4` encoding, also known in the industry as `RS8.4`.
Just like that, the thing shoots to life. Awesome!

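The cost of `EC:4` in capacity terms is easy to work out. As a back-of-the-envelope sketch, assuming
usable space is simply the data-shard fraction of the raw space and ignoring filesystem and metadata
overhead (the function name `usable_tib` is made up):

```shell
#!/bin/sh
# usable_tib DRIVES SIZE_TB PARITY
# Usable capacity of one erasure set in whole TiB (rounded down), where
# SIZE_TB is the per-drive size in decimal terabytes.
usable_tib() {
    drives=$1 size_tb=$2 parity=$3
    data=$((drives - parity))                    # 12 - 4 = 8 data shards
    bytes=$((data * size_tb * 1000000000000))    # decimal TB to bytes
    echo $((bytes / 1099511627776))              # bytes to TiB (2^40)
}

usable_tib 12 16 4
```

With twelve 16TB drives and four parity shards, that is 8 x 16TB = 128TB, or roughly 116 TiB once
converted to binary units: two thirds of the raw 192TB.
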
### MinIO Client

On Summer, I'll install the MinIO Client called `mc`. This is easy because the maintainers ship a
Linux binary which I can just download. On OpenBSD, they don't do that. Not a problem though, on
Squanchy, Pencilvester and Glootie, I will just `go install` the client. Using the `mc` commandline,
I can call any of the S3 APIs on my new MinIO instance:

```
pim@summer:~$ set +o history
pim@summer:~$ mc alias set chbtl0 https://s3.chbtl0.ipng.ch/ <rootuser> <rootpass>
pim@summer:~$ set -o history
pim@summer:~$ mc admin info chbtl0/
●  s3.chbtl0.ipng.ch
   Uptime: 22 hours
   Version: 2025-05-24T17:08:30Z
   Network: 1/1 OK
   Drives: 12/12 OK
   Pool: 1

┌──────┬───────────────────────┬─────────────────────┬──────────────┐
│ Pool │ Drives Usage          │ Erasure stripe size │ Erasure sets │
│ 1st  │ 0.8% (total: 116 TiB) │ 12                  │ 1            │
└──────┴───────────────────────┴─────────────────────┴──────────────┘

95 GiB Used, 5 Buckets, 5,859 Objects, 318 Versions, 1 Delete Marker
12 drives online, 0 drives offline, EC:4

```

Cool beans. I think I should get rid of this root account though: I've installed those credentials
into the `/etc/default/minio` environment file, but I don't want to keep them out in the open. So
I'll make an account for myself and assign it reasonable privileges, using the policy called
`consoleAdmin` in the default install:

```
pim@summer:~$ set +o history
pim@summer:~$ mc admin user add chbtl0/ <someuser> <somepass>
pim@summer:~$ mc admin policy info chbtl0 consoleAdmin
pim@summer:~$ mc admin policy attach chbtl0 consoleAdmin --user=<someuser>
pim@summer:~$ mc alias set chbtl0 https://s3.chbtl0.ipng.ch/ <someuser> <somepass>
pim@summer:~$ set -o history
```

OK, I feel less gross now that I'm not operating as root on the MinIO deployment. Using my new
user-powers, let me set some metadata on my new MinIO server:

```
pim@summer:~$ mc admin config set chbtl0/ site name=chbtl0 region=switzerland
Successfully applied new settings.
Please restart your server 'mc admin service restart chbtl0/'.
pim@summer:~$ mc admin service restart chbtl0/
Service status: ▰▰▱ [DONE]
Summary:
    ┌───────────────┬─────────────────────────────┐
    │ Servers:      │ 1 online, 0 offline, 0 hung │
    │ Restart Time: │ 61.322886ms                 │
    └───────────────┴─────────────────────────────┘
pim@summer:~$ mc admin config get chbtl0/ site
site name=chbtl0 region=switzerland
```

By the way, what's really cool about these open standards is that not only does the Amazon `aws`
client work with MinIO, but `mc` also works with AWS!

### MinIO Console

Although I'm pretty good with APIs and command line tools, there's some benefit also in using a
Graphical User Interface. MinIO ships with one, but there was a bit of a kerfuffle in the MinIO
community. Unfortunately, these are pretty common -- Redis (an open source key/value storage system)
changed their offering abruptly. Terraform (an open source infrastructure-as-code tool) changed
their licensing at some point. Ansible (an open source machine management tool) changed their
offering also. MinIO developers decided to strip their console of ~all features recently. The gnarly
bits are discussed on
[[reddit](https://www.reddit.com/r/selfhosted/comments/1kva3pw/avoid_minio_developers_introduce_trojan_horse/)],
but suffice to say: the same thing that happened in literally 100% of the other cases, also happened
here. Somebody decided to simply fork the code from before it was changed.

Enter OpenMaxIO. A cringeworthy name, but it gets the job done. Reading up on the
[[GitHub](https://github.com/OpenMaxIO/openmaxio-object-browser/issues/5)], reviving the fully
working console is pretty straightforward -- that is, once somebody spent a few days figuring it
out. Thank you `icesvz` for this excellent pointer. With this, I can create a systemd service for
the console and start it:

```
pim@minio0-chbtl0:~$ cat << EOF | sudo tee -a /etc/default/minio
## NOTE(pim): For openmaxio console service
CONSOLE_MINIO_SERVER="http://localhost:9000"
MINIO_BROWSER_REDIRECT_URL="https://cons0-s3.chbtl0.ipng.ch/"
EOF
pim@minio0-chbtl0:~$ cat << EOF | sudo tee /lib/systemd/system/minio-console.service
[Unit]
Description=OpenMaxIO Console Service
Wants=network-online.target
After=network-online.target
AssertFileIsExecutable=/usr/local/bin/minio-console

[Service]
Type=simple

WorkingDirectory=/usr/local

User=minio-user
Group=minio-user
ProtectProc=invisible

EnvironmentFile=-/etc/default/minio
ExecStart=/usr/local/bin/minio-console server
Restart=always
LimitNOFILE=1048576
MemoryAccounting=no
TasksMax=infinity
TimeoutSec=infinity
OOMScoreAdjust=-1000
SendSIGKILL=no

[Install]
WantedBy=multi-user.target
EOF
pim@minio0-chbtl0:~$ sudo systemctl enable --now minio-console
pim@minio0-chbtl0:~$ sudo systemctl restart minio
```

The first snippet is an update to the MinIO configuration that instructs it to redirect users who
are not trying to use the API to the console endpoint on `cons0-s3.chbtl0.ipng.ch`, and then the
console-server needs to know where to find the API, which from its vantage point is running on
`localhost:9000`. Hello, beautiful fully featured console:

{{< image src="/assets/minio/console-1.png" alt="MinIO Console" >}}

### MinIO Prometheus

MinIO ships with a Prometheus metrics endpoint, and I notice on its console that it has a nice
metrics tab, which is fully greyed out. This is most likely because, well, I don't have a Prometheus
install here yet. I decide to keep the storage nodes self-contained and start a Prometheus server on
the local machine. I can always plumb that to IPng's Grafana instance later.

For now, I'll install Prometheus as follows:

```
pim@minio0-chbtl0:~$ cat << EOF | sudo tee -a /etc/default/minio
## NOTE(pim): Metrics for minio-console
MINIO_PROMETHEUS_AUTH_TYPE="public"
CONSOLE_PROMETHEUS_URL="http://localhost:19090/"
CONSOLE_PROMETHEUS_JOB_ID="minio-job"
EOF

pim@minio0-chbtl0:~$ sudo apt install prometheus
pim@minio0-chbtl0:~$ cat << EOF | sudo tee /etc/default/prometheus
ARGS="--web.listen-address='[::]:19090' --storage.tsdb.retention.size=16GB"
EOF
pim@minio0-chbtl0:~$ cat << EOF | sudo tee /etc/prometheus/prometheus.yml
global:
  scrape_interval: 60s

scrape_configs:
  - job_name: minio-job
    metrics_path: /minio/v2/metrics/cluster
    static_configs:
      - targets: ['localhost:9000']
        labels:
          cluster: minio0-chbtl0

  - job_name: minio-job-node
    metrics_path: /minio/v2/metrics/node
    static_configs:
      - targets: ['localhost:9000']
        labels:
          cluster: minio0-chbtl0

  - job_name: minio-job-bucket
    metrics_path: /minio/v2/metrics/bucket
    static_configs:
      - targets: ['localhost:9000']
        labels:
          cluster: minio0-chbtl0

  - job_name: minio-job-resource
    metrics_path: /minio/v2/metrics/resource
    static_configs:
      - targets: ['localhost:9000']
        labels:
          cluster: minio0-chbtl0

  - job_name: node
    static_configs:
      - targets: ['localhost:9100']
        labels:
          cluster: minio0-chbtl0
EOF
pim@minio0-chbtl0:~$ sudo systemctl restart minio prometheus
```

In the first snippet, I'll tell MinIO where it should find its Prometheus instance. Since the MinIO
|
||||||
|
console service is running on port 9090, and this is also the default port for Prometheus, I will
|
||||||
|
run Promtheus on port 19090 instead. From reading the MinIO docs, I can see that normally MinIO will
|
||||||
|
want prometheus to authenticate to it before it'll allow the endpoints to be scraped. I'll turn that
|
||||||
|
off by making these public. On the IPng Frontends, I can always remove access to /minio/v2 and
|
||||||
|
simply use the IPng Site Local access for local Prometheus scrapers instead.
|
||||||
|
|
||||||
|
After telling Prometheus its runtime arguments (in `/etc/default/prometheus`) and its scraping
|
||||||
|
endpoints (in `/etc/prometheus/prometheus.yml`), I can restart minio and prometheus. A few minutes
|
||||||
|
later, I can see the _Metrics_ tab in the console come to life.
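
To sanity-check what Prometheus will be scraping, here is a minimal sketch that parses the text exposition format served by an endpoint such as `/minio/v2/metrics/cluster`. The helper names (`fetch`, `parse_metrics`) and the metric names in `SAMPLE` are hypothetical and only there for illustration; they are not captured MinIO output.

```python
import re
from urllib.request import urlopen

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def fetch(url, timeout=5):
    """Fetch a metrics page; assumes the endpoint was made public."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Parse exposition text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if not m:
            continue  # ignore anything that is not a sample line
        name, labels, value = m.groups()
        samples.append((name, dict(LABEL_RE.findall(labels or "")), float(value)))
    return samples


# Hypothetical payload, for illustration only.
SAMPLE = '''\
# HELP example_nodes_online Hypothetical gauge, for illustration only.
# TYPE example_nodes_online gauge
example_nodes_online{server="localhost:9000"} 1
example_drives_total 12
'''

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

On the MinIO node itself, something like `parse_metrics(fetch("http://localhost:9000/minio/v2/metrics/cluster"))` should then work, assuming the metrics endpoints are public as configured above.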
|
||||||
|
|
||||||
|
But now that I have this Prometheus instance running on the MinIO node, I can also add it to IPng's Grafana
|
||||||
|
configuration, by adding a new data source on `minio0.chbtl0.net.ipng.ch:19090` and pointing the
|
||||||
|
default Grafana [[Dashboard](https://grafana.com/grafana/dashboards/13502-minio-dashboard/)] at it:
|
||||||
|
|
||||||
|
{{< image src="/assets/minio/console-2.png" alt="Grafana Dashboard" >}}
|
||||||
|
|
||||||
|
A two-for-one: not only can I see metrics directly in the console, but I will also be able
|
||||||
|
to hook up these per-node Prometheus instances into IPng's Alertmanager, and I've read some
|
||||||
|
[[docs](https://min.io/docs/minio/linux/operations/monitoring/collect-minio-metrics-using-prometheus.html)]
|
||||||
|
on the concepts. I'm really liking the experience so far!
|
||||||
|
|
||||||
|
### MinIO Nagios
|
||||||
|
|
||||||
|
Prometheus is fancy and all, but at IPng Networks, I've been doing monitoring for a while now. As a
|
||||||
|
dinosaur, I still have an active [[Nagios](https://www.nagios.org/)] install, which autogenerates
|
||||||
|
all of its configuration using the Ansible repository I have. So for the new Ansible group called
|
||||||
|
`minio`, I will autogenerate the following snippet:
|
||||||
|
|
||||||
|
```
|
||||||
|
define command {
|
||||||
|
command_name ipng_check_minio
|
||||||
|
command_line $USER1$/check_http -E -H $HOSTALIAS$ -I $ARG1$ -p $ARG2$ -u $ARG3$ -r '$ARG4$'
|
||||||
|
}
|
||||||
|
|
||||||
|
define service {
|
||||||
|
hostgroup_name ipng:minio:ipv6
|
||||||
|
service_description minio6:api
|
||||||
|
check_command ipng_check_minio!$_HOSTADDRESS6$!9000!/minio/health/cluster!
|
||||||
|
use ipng-service-fast
|
||||||
|
notification_interval 0 ; set > 0 if you want to be renotified
|
||||||
|
}
|
||||||
|
|
||||||
|
define service {
|
||||||
|
hostgroup_name ipng:minio:ipv6
|
||||||
|
service_description minio6:prom
|
||||||
|
check_command ipng_check_minio!$_HOSTADDRESS6$!19090!/classic/targets!minio-job
|
||||||
|
use ipng-service-fast
|
||||||
|
notification_interval 0 ; set > 0 if you want to be renotified
|
||||||
|
}
|
||||||
|
|
||||||
|
define service {
|
||||||
|
hostgroup_name ipng:minio:ipv6
|
||||||
|
service_description minio6:console
|
||||||
|
check_command ipng_check_minio!$_HOSTADDRESS6$!9090!/!MinIO Console
|
||||||
|
use ipng-service-fast
|
||||||
|
notification_interval 0 ; set > 0 if you want to be renotified
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
I've shown the snippet for IPv6 but I also have three services defined for legacy IP in the
|
||||||
|
hostgroup `ipng:minio:ipv4`. The check command here uses `-I` which has the IPv4 or IPv6 address to
|
||||||
|
talk to, `-p` for the port to connect to, `-u` for the URI to hit and an optional `-r` for a regular
|
||||||
|
expression to expect in the output. For the Nagios aficionados out there: my Ansible `groups`
|
||||||
|
correspond one to one with autogenerated Nagios `hostgroups`. This allows me to add arbitrary checks
|
||||||
|
by group-type, like above in the `ipng:minio` group for IPv4 and IPv6.
|
||||||
|
|
||||||
|
In the MinIO [[docs](https://min.io/docs/minio/linux/operations/monitoring/healthcheck-probe.html)]
|
||||||
|
I read up on the Healthcheck API. I choose to monitor the _Cluster Write Quorum_ on my minio
|
||||||
|
deployments. For Prometheus, I decide to hit the `targets` endpoint and expect the `minio-job` to be
|
||||||
|
among them. Finally, for the MinIO Console, I expect to see a login screen with the words `MinIO
|
||||||
|
Console` in the returned page. I guessed right, because Nagios is all green:
|
||||||
|
|
||||||
|
{{< image src="/assets/minio/nagios.png" alt="Nagios Dashboard" >}}
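
The three checks above boil down to two questions: did the HTTP request succeed, and does the body match the expected regular expression? A minimal sketch of that decision follows. It approximates the behaviour and is not the plugin's actual implementation. The function name `evaluate` is hypothetical, and the mapping of 4xx to WARNING and 5xx to CRITICAL is my understanding of the plugin's defaults.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate(status, body, pattern=""):
    """Return (exit_code, message) for an HTTP status code and response body."""
    if status >= 500:
        return CRITICAL, f"HTTP CRITICAL: status {status}"
    if status >= 400:
        return WARNING, f"HTTP WARNING: status {status}"
    # An empty pattern (as in the minio6:api check) means: status only.
    if pattern and not re.search(pattern, body):
        return CRITICAL, f"HTTP CRITICAL: pattern {pattern!r} not found"
    return OK, f"HTTP OK: status {status}"


# The health check passes no pattern; the other two expect a string in the body.
print(evaluate(200, ""))
print(evaluate(200, "<title>MinIO Console</title>", "MinIO Console"))
print(evaluate(200, "no such job here", "minio-job"))
```

The health endpoint signals a lost write quorum through the HTTP status alone, which is why the `minio6:api` service can leave the regular expression empty.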
|
||||||
|
|
||||||
|
## My First Bucket
|
||||||
|
|
||||||
|
The IPng website is a statically generated Hugo site, and whenever I submit a change to my Git
|
||||||
|
repo, a CI/CD runner (called [[Drone](https://www.drone.io/)]), picks up the change. It re-builds
|
||||||
|
the static website, and copies it to four redundant NGINX servers.
|
||||||
|
|
||||||
|
But IPng's website has amassed quite a few extra files (like VM images and VPP packages that I
|
||||||
|
publish), which are copied separately using a simple push script I have in my home directory. This
|
||||||
|
keeps all those big media files from cluttering the Git repository. I decide to move this stuff
|
||||||
|
into S3:
|
||||||
|
|
||||||
|
```
|
||||||
|
pim@summer:~/src/ipng-web-assets$ echo 'Gruezi World.' > ipng.ch/media/README.md
|
||||||
|
pim@summer:~/src/ipng-web-assets$ mc mb chbtl0/ipng-web-assets
|
||||||
|
pim@summer:~/src/ipng-web-assets$ mc mirror . chbtl0/ipng-web-assets/
|
||||||
|
...ch/media/README.md: 6.50 GiB / 6.50 GiB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 236.38 MiB/s 28s
|
||||||
|
pim@summer:~/src/ipng-web-assets$ mc anonymous set download chbtl0/ipng-web-assets/
|
||||||
|
```
|
||||||
|
|
||||||
|
OK, two things that immediately jump out at me. This stuff is **fast**: Summer is connected with a
|
||||||
|
2.5GbE network card, and she's running hard, copying the 6.5GB of data that are in these web assets
|
||||||
|
essentially at line rate. It doesn't really surprise me because Summer is running off of Gen4 NVME,
|
||||||
|
while MinIO has 12 spinning disks which each can write about 160MB/s or so sustained
|
||||||
|
[[ref](https://www.seagate.com/www-content/datasheets/pdfs/exos-x16-DS2011-1-1904US-en_US.pdf)],
|
||||||
|
with 24 CPUs to tend to the NIC (2x10G) and disks (2x SSD, 12x LFF). Should be plenty!
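
A quick back-of-the-envelope check of those numbers, using only the figures from the `mc mirror` output and the per-disk estimate above. The aggregate disk figure is a raw upper bound under my own simplifying assumption that all twelve disks write in parallel; it ignores erasure-coding overhead.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

size_bytes = 6.5 * GIB       # total mirrored, as reported by mc
rate_bytes = 236.38 * MIB    # average rate, as reported by mc

elapsed_s = size_bytes / rate_bytes   # time the transfer should take
rate_gbps = rate_bytes * 8 / 1e9      # payload rate in Gbit/s
link_share = rate_gbps / 2.5          # fraction of a 2.5GbE link

# Raw sequential write capacity of 12 disks at ~160 MB/s each.
disk_aggregate_mbs = 12 * 160

print(f"elapsed: {elapsed_s:.1f} s")
print(f"payload rate: {rate_gbps:.2f} Gbit/s ({link_share:.0%} of 2.5GbE)")
print(f"raw disk write capacity: {disk_aggregate_mbs} MB/s")
```

The computed transfer time agrees with the 28s that `mc` reported, and the disks have ample headroom over what a single 2.5GbE client can push.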
|
||||||
|
|
||||||
|
The second is that MinIO allows for buckets to be publicly shared in three ways: 1) read-only by
|
||||||
|
setting `download`; 2) write-only by setting `upload`, and 3) read-write by setting `public`.
|
||||||
|
I set `download` here, which means I should be able to fetch an asset now publicly:
|
||||||
|
|
||||||
|
```
|
||||||
|
pim@summer:~$ curl https://s3.chbtl0.ipng.ch/ipng-web-assets/ipng.ch/media/README.md
|
||||||
|
Gruezi World.
|
||||||
|
pim@summer:~$ curl https://ipng-web-assets.s3.chbtl0.ipng.ch/ipng.ch/media/README.md
|
||||||
|
Gruezi World.
|
||||||
|
```
|
||||||
|
|
||||||
|
The first `curl` here shows the path-based access, while the second one shows an equivalent
|
||||||
|
virtual-host based access. Both retrieve the file I just pushed via the public Internet. Whoot!
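
The two addressing styles differ only in where the bucket name goes. A minimal sketch, with hypothetical helper names, that builds both URLs for the object fetched above:

```python
from urllib.parse import quote


def path_style_url(endpoint, bucket, key):
    """Bucket is the first path component: https://endpoint/bucket/key."""
    return f"https://{endpoint}/{bucket}/{quote(key)}"


def virtual_host_url(endpoint, bucket, key):
    """Bucket is a DNS label in front of the endpoint: https://bucket.endpoint/key."""
    return f"https://{bucket}.{endpoint}/{quote(key)}"


endpoint = "s3.chbtl0.ipng.ch"
bucket = "ipng-web-assets"
key = "ipng.ch/media/README.md"

print(path_style_url(endpoint, bucket, key))
print(virtual_host_url(endpoint, bucket, key))
```

Virtual-host style needs a wildcard DNS record (and a matching certificate) for `*.s3.chbtl0.ipng.ch`, whereas path style works with the single endpoint name.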
|
||||||
|
|
||||||
|
# What's Next
|
||||||
|
|
||||||
|
I'm going to be moving [[Restic](https://restic.net/)] backups from IPng's ZFS storage pool to this
|
||||||
|
S3 service over the next few days. I'll also migrate PeerTube and possibly Mastodon from NVMe-based
|
||||||
|
storage to replicated S3 buckets as well. Finally, the IPng website media that I mentioned above
|
||||||
|
should make for a nice followup article. Stay tuned!
|
static/assets/minio/console-1.png (BIN, stored with Git LFS, binary file not shown)
static/assets/minio/console-2.png (BIN, stored with Git LFS, binary file not shown)
static/assets/minio/disks.png (BIN, stored with Git LFS, binary file not shown)
static/assets/minio/minio-ec.svg (1633 lines, file diff suppressed because it is too large, size 90 KiB)
static/assets/minio/minio-logo.png (BIN, stored with Git LFS, binary file not shown)
static/assets/minio/nagios.png (BIN, stored with Git LFS, binary file not shown)
static/assets/minio/rack.png (BIN, stored with Git LFS, binary file not shown)