A few readability edits

2025-08-10 18:50:00 +02:00
parent f4ed332b18
commit 4f0188abeb


@@ -89,24 +89,28 @@ pim@ctlog-test:/etc/sunlight$ openssl req -newkey rsa:2048 -nodes -keyout sunlig
pim@ctlog-test:/etc/sunlight# openssl x509 -req -extfile \
<(printf "subjectAltName=DNS:ctlog-test.lab.ipng.ch,DNS:ctlog-test.lab.ipng.ch") -days 365 \
-in sunlight.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out sunlight.pem
ln -s sunlight.pem skylight.pem
ln -s sunlight-key.pem skylight-key.pem
```
This little snippet yields `sunlight.pem` (the certificate) and `sunlight-key.pem` (the private
key), and symlinks them to `skylight.pem` and `skylight-key.pem` for simplicity. With these in hand,
I can start the rest of the show. First I will prepare the NVME storage with a few datasets in
which Sunlight will store its data:
```
pim@ctlog-test:~$ sudo zfs create ssd-vol0/sunlight-test
pim@ctlog-test:~$ sudo zfs create ssd-vol0/sunlight-test/shared
pim@ctlog-test:~$ sudo zfs create ssd-vol0/sunlight-test/logs
pim@ctlog-test:~$ sudo zfs create ssd-vol0/sunlight-test/logs/sunlight-test
pim@ctlog-test:~$ sudo chown -R pim:pim /ssd-vol0/sunlight-test
```
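Purely as a sanity check (my addition, not part of the original transcript), the resulting dataset
layout can be verified with:
```
pim@ctlog-test:~$ zfs list -r -o name,mountpoint ssd-vol0/sunlight-test
```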
Then I'll create the Sunlight configuration:
```
pim@ctlog-test:/etc/sunlight$ sunlight-keygen -f sunlight-test.seed.bin
Log ID: IPngJcHCHWi+s37vfFqpY9ouk+if78wAY2kl/sh3c8E=
ECDSA public key:
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE6Hg60YncYt/V69kLmg4LlTO9RmHR
@@ -118,8 +122,9 @@ Ed25519 public key:
-----END PUBLIC KEY-----
```
The first block creates key material for the log, and I get a fun surprise: the Log ID starts
precisely with the string IPng... what are the odds that that would happen!? I should tell Antonis
about this, it's dope!

As a safety precaution, Sunlight requires the operator to make the `checkpoints.db` by hand, which
I'll also do:
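The diff elides the actual command here; going from the Sunlight documentation it is a one-liner
along these lines (the database path is my assumption for this lab):
```
# Hedged sketch: initialize the empty checkpoints database by hand, with the
# single-table schema the Sunlight docs describe.
pim@ctlog-test:/etc/sunlight$ sqlite3 /ssd-vol0/sunlight-test/shared/checkpoints.db \
  'CREATE TABLE checkpoints (logID BLOB PRIMARY KEY, body TEXT)'
```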
@@ -136,7 +141,7 @@ When learning about [[Tessera]({{< ref 2025-07-26-ctlog-1 >}})], I already kind
conclusion that, for our case at IPng at least, running the fully cloud-native version with S3
storage and MySQL database gave both poorer performance and more operational complexity. But
I find it interesting to compare behavior and performance, so I'll start by creating a Sunlight log
backed by MinIO SSD storage.

I'll first create the bucket and a user account to access it:
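The diff doesn't show those commands, but with MinIO's `mc` client this would look roughly like the
following (the bucket name, user name, and placeholder secret are my guesses, not the original
values):
```
# Hypothetical reconstruction: bucket, access key, and a readwrite policy.
pim@ctlog-test:~$ mc mb ssd/sunlight-test
pim@ctlog-test:~$ mc admin user add ssd sunlight-user some-secret-key
pim@ctlog-test:~$ mc admin policy attach ssd readwrite --user sunlight-user
```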
@@ -223,10 +228,11 @@ pim@ctlog-test:~$ curl -k https://ctlog-test.lab.ipng.ch:1443/log.v3.json
404 page not found
```
I'm starting to think that using a non-standard listen port won't work, or more precisely, adding
a port in the `monitoringprefix` won't work. I notice that the logname is called
`ctlog-test.lab.ipng.ch:1443`, which I don't think is supposed to have a port number in it. So
instead, I make Sunlight `listen` on port 443 and omit the port in the `submissionprefix`, and give
it and its companion Skylight the needed privileges to bind the privileged port like so:
```
pim@ctlog-test:~$ sudo setcap 'cap_net_bind_service=+ep' /usr/local/bin/sunlight
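# (the diff cuts off here; presumably Skylight gets the same capability)
pim@ctlog-test:~$ sudo setcap 'cap_net_bind_service=+ep' /usr/local/bin/skylight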
@@ -236,7 +242,7 @@ pim@ctlog-test:~$ sunlight -testcert -c /etc/sunlight/sunlight-s3.yaml
{{< image width="60%" src="/assets/ctlog/sunlight-test-s3.png" alt="Sunlight testlog / S3" >}} {{< image width="60%" src="/assets/ctlog/sunlight-test-s3.png" alt="Sunlight testlog / S3" >}}
And with that, Sunlight reports for duty. Hoi! And with that, Sunlight reports for duty and the links work. Hoi!
#### Sunlight: Loadtesting S3 #### Sunlight: Loadtesting S3
@@ -246,7 +252,7 @@ paths, and I've created a snakeoil self-signed cert. CT Hammer does not accept t
so I need to make a tiny change to the Hammer:
```
pim@ctlog-test:~/src/tesseract$ git diff
diff --git a/internal/hammer/hammer.go b/internal/hammer/hammer.go
index 3828fbd..1dfd895 100644
--- a/internal/hammer/hammer.go
@@ -286,10 +292,10 @@ pim@ctlog-test:/etc/sunlight$ T=0; O=0; while :; do \
25196 1 seconds 87 certs
```
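The counting loop at the top of this block is truncated in the diff; my best guess at its shape is
below. It polls the log's checkpoint once a second and prints the tree size plus the delta since
the previous poll (the endpoint URL and the `sed` parsing are assumptions on my part):
```
# Hypothetical reconstruction: the tree size is the second line of the checkpoint.
pim@ctlog-test:/etc/sunlight$ T=1; O=0; while :; do \
  N=$(curl -ksS https://ctlog-test.lab.ipng.ch/checkpoint | sed -n '2p'); \
  [ "$O" -gt 0 ] && echo "$N $T seconds $((N - O)) certs"; \
  O=$N; sleep $T; done
```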
On the first commandline I'll start the loadtest at 100 writes/sec with the standard duplication
probability of 10%, which allows me to test Sunlight's ability to avoid writing duplicates. This
means I should see the tree grow at about 90/s on average. Check. I raise the write-load to 500/s:
```
39421 1 seconds 443 certs
@@ -299,7 +305,7 @@ on average a growth of 90/s. Check. I raise the load to 500/s:
41194 1 seconds 448 certs
```
.. and to 1'000/s:
```
57941 1 seconds 945 certs
58886 1 seconds 970 certs
@@ -314,8 +320,8 @@ W0810 14:55:29.660710 1398779 analysis.go:134] (1 x) failed to create request: f
W0810 14:55:30.496603 1398779 analysis.go:124] (1 x) failed to create request: write leaf was not OK. Status code: 500. Body: "failed to read body: read tcp 127.0.1.1:443->127.0.0.1:44908: i/o timeout\n"
```
I raise the Hammer load to 5'000/sec (which means 4'500/s unique certs and 500 duplicates), and find
the committed writes/sec to max out at around 4'200/s:
```
879637 1 seconds 4213 certs
883850 1 seconds 4207 certs
@@ -332,9 +338,9 @@ W0810 15:00:05.496459 1398779 analysis.go:124] (1 x) failed to create request: f
W0810 15:00:07.187181 1398779 analysis.go:124] (1 x) failed to create request: failed to write leaf: Post "https://ctlog-test.lab.ipng.ch/ct/v1/add-chain": EOF
```
At this load of 4'200/s, MinIO is not very impressed. Remember in the [[other article]({{< ref
2025-07-26-ctlog-1 >}})] I loadtested it to about 7'500 ops/sec and the statistics below are about
50 ops/sec (2'800/min). I conclude that MinIO is, in fact, bored of this whole activity:
```
pim@ctlog-test:/etc/sunlight$ mc admin trace --stats ssd
@@ -348,8 +354,8 @@ s3.PutObject 37602 (70.3%) 1982.2 6.2ms 785µs 86.7ms 6.1ms 86
s3.GetObject 15918 (29.7%) 839.1 996µs 670µs 51.3ms 912µs 51.2ms ↑46B ↓3.0K ↑38K ↓2.4M 0
```
Sunlight still keeps its certificate cache on local disk. At a rate of 4'200/s, the ZFS pool has a
write rate of about 105MB/s with about 877 ZFS writes per second.
```
pim@ctlog-test:/etc/sunlight$ zpool iostat -v ssd-vol0 10
@@ -380,24 +386,24 @@ A few interesting observations:
* The write rate to ZFS is significantly higher with Sunlight than TesseraCT (about 8:1). This is
likely explained because the sqlite3 database lives on ZFS here, while TesseraCT uses MariaDB
running on a different filesystem.
* The MinIO usage is a lot lighter. As I reduce the load to 1'000/s, as was the case in the
TesseraCT test, I can see the ratio of Get:Put was 93:4 in TesseraCT, while it's 70:30 here.
TesseraCT was also consuming more IOPS, running at about 10.5k requests/minute, while Sunlight is
significantly calmer at 2.8k requests/minute (almost 4x less!)
* The burst capacity of Sunlight is a fair bit higher than TesseraCT, likely due to its more
efficient use of S3 backends.

***Conclusion***: Sunlight S3+MinIO can handle 1'000/s reliably, and can spike to 4'200/s with only
a few errors.

#### Sunlight: Loadtesting POSIX
When I took a closer look at TesseraCT a few weeks ago, it struck me that while a cloud-native
setup with S3 storage would allow for a cool way to enable storage scaling and read-path redundancy
by creating synchronously replicated buckets, it does come at a significant operational overhead
and complexity. My main concern is the number of different moving parts, and Sunlight really has
one very appealing property: it can run entirely on one machine without the need for any other
moving parts - even the SQL database is linked in. That's pretty slick.
```
pim@ctlog-test:/etc/sunlight$ cat << EOF > sunlight.yaml
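# (the YAML body is elided by the diff -- below is a minimal sketch of what a
# POSIX-backed config might look like; field names are gleaned from the
# Sunlight docs and this article, and the values are my assumptions for this
# lab, not the original configuration)
listen:
  - ":443"
checkpoints: /ssd-vol0/sunlight-test/shared/checkpoints.db
logs:
  - name: ctlog-test.lab.ipng.ch
    submissionprefix: https://ctlog-test.lab.ipng.ch
    monitoringprefix: https://ctlog-test.lab.ipng.ch
    secret: /etc/sunlight/sunlight-test.seed.bin
    period: 200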
@@ -453,14 +459,16 @@ sunlight_sqlite_update_duration_seconds{quantile="0.99"} 0.014922489
```
I'm seeing here that at a load of 100/s (with 90/s of unique certificates), the 99th percentile
add-chain latency is 207ms, which makes sense because the `period` configuration field is set to
200ms. The filesystem operations (discard, fetch, upload) are _de minimis_ and the sequencing
duration is at 109ms. Excellent!
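For reference, these percentiles come straight from Sunlight's Prometheus metrics; a scrape along
these lines should reproduce them, though the exact metrics path is an assumption on my part:
```
pim@ctlog-test:/etc/sunlight$ curl -ksS https://ctlog-test.lab.ipng.ch/metrics \
  | grep -E 'sunlight_(addchain|sequencing|sqlite)'
```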

But can this thing go really fast? I do remember that the CT Hammer uses more CPU than TesseraCT,
and I've seen above, when running my 5'000/s loadtest, that that is about all the hammer can take
on a single Dell R630. So, as I did with the TesseraCT test, I'll use the MinIO SSD and MinIO Disk
machines to generate the load. I boot them, so that I can hammer, or shall I say jackhammer away:
```
pim@ctlog-test:~/src/tesseract$ go run ./internal/hammer --origin=ctlog-test.lab.ipng.ch \
@@ -479,7 +487,8 @@ pim@minio-disk:~/src/tesseract$ go run ./internal/hammer --origin=ctlog-test.lab
--max_read_ops=0 --num_writers=5000 --max_write_ops=5000 --serial_offset=2000000
```
This will generate 15'000/s of load, which I note does bring Sunlight to its knees, although it does
remain stable (yaay!) with a somewhat more bursty checkpoint interval:
```
5504780 1 seconds 4039 certs
@@ -501,17 +510,17 @@ pim@ctlog-test:/etc/sunlight$ while :; do curl -ksS https://ctlog-test.lab.ipng.
```
This rate boils down to `(6576712-6008831)/120` or 4'700/s of written certs, which at a duplication
ratio of 10% means approximately 5'200/s of total accepted certs. At this rate, Sunlight is
consuming about 10.3 CPUs/s, while Skylight is at 0.1 CPUs/s and the CT Hammer is at 11.1 CPUs/s.
Given the 40 threads on this machine, I am not saturating the CPU, but I'm curious as this rate is
significantly lower than TesseraCT. I briefly turn off the hammer on `ctlog-test` to allow Sunlight
to monopolize the entire machine. The CPU use does reduce to about 9.3 CPUs/s, suggesting that
indeed, the bottleneck is not strictly CPU:
{{< image width="90%" src="/assets/ctlog/btop-sunlight.png" alt="Sunlight btop" >}} {{< image width="90%" src="/assets/ctlog/btop-sunlight.png" alt="Sunlight btop" >}}
When using only two CT Hammers (on `minio-ssd.lab.ipng.ch` and `minio-disk.lab.ipng.ch`), the CPU
use on the `ctlog-test.lab.ipng.ch` machine definitely goes down (CT Hammer is kind of a CPU
hog...), but the resulting throughput doesn't change that much:
```
@@ -607,11 +616,11 @@ this setup.
## Wrapup - Observations

From an operator's point of view, TesseraCT and Sunlight handle quite differently. Both are easily
up to the task of serving the current write-load (which is about 250/s).

* ***S3***: When using the S3 backend, TesseraCT became quite unhappy above 800/s while Sunlight
went all the way up to 4'200/s and sent significantly fewer requests to MinIO (about 4x less),
while showing good telemetry on the use of S3 backends.
* ***POSIX***: When using a normal filesystem, Sunlight seems to peak at 4'800/s while TesseraCT