Typo fixes and grammar improvements, h/t Claude

This commit is contained in:
2026-02-21 16:24:09 +00:00
parent 4d8f3a42e8
commit c645550081
3 changed files with 38 additions and 38 deletions

@@ -60,7 +60,7 @@ with no merge delay.
{{< image width="18em" float="right" src="/assets/ctlog/MPLS Backbone - CTLog.svg" alt="ctlog at ipng" >}}
In the diagram, I've drawn an overview of IPng's network. In {{< boldcolor color="red" >}}red{{<
-/boldcolor >}} a european backbone network is provided by a [[BGP Free Core
+/boldcolor >}} a European backbone network is provided by a [[BGP Free Core
network]({{< ref 2022-12-09-oem-switch-2 >}})]. It operates a private IPv4, IPv6, and MPLS network, called
_IPng Site Local_, which is not connected to the internet. On top of that, IPng offers L2 and L3
services, for example using [[VPP]({{< ref 2021-02-27-network >}})].
@@ -81,7 +81,7 @@ of them will be running one of the _Log_ implementations. IPng provides two larg
for offsite backup, in case a hypervisor decides to check out, and daily backups to an S3 bucket
using Restic.
-Having explained all of this, I am well aware that end to end reliability will be coming from the
+Having explained all of this, I am well aware that end-to-end reliability will be coming from the
fact that there are many independent _Log_ operators, and folks wanting to validate certificates can
simply monitor many. If there is a gap in coverage, say due to any given _Log_'s downtime, this will
not necessarily be problematic. It does mean that I may have to suppress the SRE in me...
@@ -93,8 +93,8 @@ this article, maybe a simpler, more elegant design could be superior, precisely
log reliability is not _as important_ as having many available log _instances_ to choose from.
From operators in the field I understand that the world-wide generation of certificates is roughly
-17M/day, which amounts of some 200-250qps of writes. Antonis explains that certs with a validity
-if 180 days or less will need two CT log entries, while certs with a validity more than 180d will
+17M/day, which amounts to some 200-250qps of writes. Antonis explains that certs with a validity
+of 180 days or less will need two CT log entries, while certs with a validity of more than 180d will
need three CT log entries. So the write rate is roughly 2.2x that, as an upper bound.
My first thought is to see how fast my open source S3 machines can go, really. I'm curious also as
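The write-rate figures in the hunk above can be sanity-checked in a few lines. The 17M/day figure and the 2.2x upper bound come from the text; the exact short/long certificate mix is my assumption, chosen to reproduce that multiplier:

```python
# Sanity-check the expected CT log write rate from the figures in the text.
CERTS_PER_DAY = 17_000_000   # worldwide issuance, per the text
SECONDS_PER_DAY = 86_400

base_qps = CERTS_PER_DAY / SECONDS_PER_DAY
print(f"base write rate: {base_qps:.0f} qps")  # ~197 qps, in the "200-250qps" ballpark

# Certs valid for 180 days or less need two log entries, longer ones three.
# An 80/20 short/long mix (assumption) reproduces the ~2.2x upper bound.
short_frac = 0.8
multiplier = short_frac * 2 + (1 - short_frac) * 3
print(f"entry write rate: ~{base_qps * multiplier:.0f} qps at {multiplier:.1f}x")
```

So a log that comfortably sustains 400-500 writes/s has headroom over today's global issuance, which is the bar the loadtests later in the post are measured against.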
@@ -128,7 +128,7 @@ dedicated to their task of running MinIO.
{{< image width="100%" src="/assets/ctlog/minio_8kb_performance.png" alt="MinIO 8kb disk vs SSD" >}}
The left-hand side graph feels pretty natural to me. With one thread, uploading 8kB objects will
-quickly hit the IOPS rate of the disks, each of which have to participate in the write due to EC:3
+quickly hit the IOPS rate of the disks, each of which has to participate in the write due to EC:3
encoding when using six disks, and it tops out at ~56 PUT/s. The single thread hitting SSDs will not
hit that limit, and has ~371 PUT/s which I found a bit underwhelming. But, when performing the
loadtest with either 8 or 32 write threads, the hard disks become only marginally faster (topping
@@ -170,7 +170,7 @@ large objects:
This makes me draw an interesting conclusion: seeing as CT Logs are read/write heavy (every couple
of seconds, the Merkle tree is recomputed which is reasonably disk-intensive), SeaweedFS might be a
-slight better choice. IPng Networks has three MinIO deployments, but no SeaweedFS deployments. Yet.
+slightly better choice. IPng Networks has three MinIO deployments, but no SeaweedFS deployments. Yet.
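The EC:3 reasoning two hunks up can be put into a back-of-the-envelope model. Because every 8kB PUT fans out to all six drives, throughput is bounded by a single disk's write IOPS; the per-disk IOPS figure and the metadata overhead factor below are my assumptions, picked only to show the order of magnitude:

```python
# Rough model of the EC-induced PUT/s ceiling for small objects on spinning disks.
# With EC:3 across six drives, each object write touches every drive in parallel,
# so the ceiling is roughly one disk's write IOPS, minus bookkeeping overhead.
HDD_WRITE_IOPS = 75       # typical 7200rpm spindle (assumption)
metadata_overhead = 1.3   # extra I/O per PUT for metadata (assumption)

put_ceiling = HDD_WRITE_IOPS / metadata_overhead
print(f"modelled ceiling: ~{put_ceiling:.0f} PUT/s")
```

Under these assumed numbers the model lands in the same range as the ~56 PUT/s measured on the hard disks, which supports reading that benchmark as IOPS-bound rather than bandwidth-bound.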
# Tessera
@@ -181,11 +181,11 @@ predecessor called [[Trillian](https://github.com/google/trillian)]. The impleme
bake-in current best-practices based on the lessons learned over the past decade of building and
operating transparency logs in production environments and at scale.
-Tessera was introduced at the Transparency.Dev summit in October 2024. I first watch Al and Martin
+Tessera was introduced at the Transparency.Dev summit in October 2024. I first watched Al and Martin
[[introduce](https://www.youtube.com/watch?v=9j_8FbQ9qSc)] it at last year's summit. At a high
-level, it wraps what used to be a whole kubernetes cluster full of components, into a single library
+level, it wraps what used to be a whole Kubernetes cluster full of components, into a single library
that can be used with Cloud based services, either like AWS S3 and RDS database, or like GCP's GCS
-storage and Spanner database. However, Google also made is easy to use a regular POSIX filesystem
+storage and Spanner database. However, Google also made it easy to use a regular POSIX filesystem
implementation.
## TesseraCT
@@ -206,12 +206,12 @@ It's time for me to figure out what this TesseraCT thing can do .. are you ready
### TesseraCT: S3 and SQL
-TesseraCT comes with a few so-called _personalities_. Those are an implementation of the underlying
+TesseraCT comes with a few so-called _personalities_. These are implementations of the underlying
storage infrastructure in an opinionated way. The first personality I look at is the `aws` one in
`cmd/tesseract/aws`. I notice that this personality does make hard assumptions about the use of AWS
which is unfortunate as the documentation says '.. or self-hosted S3 and MySQL database'. However,
the `aws` personality assumes the AWS SecretManager in order to fetch its signing key. Before I
-can be successful, I need to detangle that.
+can be successful, I need to untangle that.
#### TesseraCT: AWS and Local Signer
@@ -339,7 +339,7 @@ infrastructure, each POST is expected to come from one of the certificate author
Then, the `--origin` flag designates how my log calls itself. In the resulting `checkpoint` file it
will enumerate a hash of the latest merged and published Merkle tree. In case a server serves
-multiple logs, it uses the `--origin` flag to make the destinction which checksum belongs to which.
+multiple logs, it uses the `--origin` flag to make the distinction which checksum belongs to which.
```
pim@ctlog-test:~/src/tesseract$ curl http://tesseract-test.minio-ssd.lab.ipng.ch:9000/checkpoint
@@ -363,7 +363,7 @@ sets up read and write traffic to a Static CT API log to test correctness and pe
load. The traffic is sent according to the [[Static CT API](https://c2sp.org/static-ct-api)] spec.
Slick!
-The tool start a text-based UI (my favorite! also when using Cisco T-Rex loadtester) in the terminal
+The tool starts a text-based UI (my favorite! also when using Cisco T-Rex loadtester) in the terminal
that shows the current status, logs, and supports increasing/decreasing read and write traffic. This
TUI allows for a level of interactivity when probing a new configuration of a log in order to find
any cliffs where performance degrades. For real load-testing applications, especially headless runs
@@ -408,7 +408,7 @@ TesseraCT accepts them.
I raise the write load by using the '>' key a few times. I notice things are great at 500qps, which
is nice because that's double what we are to expect. But I start seeing a bit more noise at 600qps.
When I raise the write-rate to 1000qps, all hell breaks loose on the logs of the server (and similar
-logs in the `hammer` loadtester:
+logs in the `hammer` loadtester):
```
W0727 15:54:33.419881 348475 handlers.go:168] ctlog-test.lab.ipng.ch/test-ecdsa: AddChain handler error: couldn't store the leaf: failed to fetch entry bundle at index 0: failed to fetch resource: getObject: failed to create reader for object "tile/data/000" in bucket "tesseract-test": operation error S3: GetObject, context deadline exceeded
@@ -689,7 +689,7 @@ But after a little bit of fiddling, I can assert my conclusion:
## What's Next
-I am going to offer such a machine in production together with Antonis Chariton, and Jeroen Massar.
+I am going to offer such a machine in production together with Antonis Chariton and Jeroen Massar.
I plan to do a few additional things:
* Test Sunlight as well on the same hardware. It would be nice to see a comparison between write