Typo fixes and grammar improvements, h/t Claude
All checks were successful
continuous-integration/drone/push Build is passing
@@ -60,7 +60,7 @@ with no merge delay.
 {{< image width="18em" float="right" src="/assets/ctlog/MPLS Backbone - CTLog.svg" alt="ctlog at ipng" >}}

 In the diagram, I've drawn an overview of IPng's network. In {{< boldcolor color="red" >}}red{{<
-/boldcolor >}} a european backbone network is provided by a [[BGP Free Core
+/boldcolor >}} a European backbone network is provided by a [[BGP Free Core
 network]({{< ref 2022-12-09-oem-switch-2 >}})]. It operates a private IPv4, IPv6, and MPLS network, called
 _IPng Site Local_, which is not connected to the internet. On top of that, IPng offers L2 and L3
 services, for example using [[VPP]({{< ref 2021-02-27-network >}})].
@@ -81,7 +81,7 @@ of them will be running one of the _Log_ implementations. IPng provides two larg
 for offsite backup, in case a hypervisor decides to check out, and daily backups to an S3 bucket
 using Restic.

-Having explained all of this, I am well aware that end to end reliability will be coming from the
+Having explained all of this, I am well aware that end-to-end reliability will be coming from the
 fact that there are many independent _Log_ operators, and folks wanting to validate certificates can
 simply monitor many. If there is a gap in coverage, say due to any given _Log_'s downtime, this will
 not necessarily be problematic. It does mean that I may have to suppress the SRE in me...
@@ -93,8 +93,8 @@ this article, maybe a simpler, more elegant design could be superior, precisely
 log reliability is not _as important_ as having many available log _instances_ to choose from.

 From operators in the field I understand that the world-wide generation of certificates is roughly
-17M/day, which amounts of some 200-250qps of writes. Antonis explains that certs with a validity
-if 180 days or less will need two CT log entries, while certs with a validity more than 180d will
+17M/day, which amounts to some 200-250qps of writes. Antonis explains that certs with a validity
+of 180 days or less will need two CT log entries, while certs with a validity of more than 180d will
 need three CT log entries. So the write rate is roughly 2.2x that, as an upper bound.

 My first thought is to see how fast my open source S3 machines can go, really. I'm curious also as
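The write-rate estimate in the hunk above is easy to sanity-check. A tiny Go back-of-the-envelope, using only the figures from the text (17M certs/day, and the 2.2x entry multiplier the article derives):

```go
package main

import "fmt"

func main() {
	const certsPerDay = 17_000_000.0 // world-wide issuance, per the article
	const secondsPerDay = 86_400.0

	qps := certsPerDay / secondsPerDay
	fmt.Printf("certificate issuance: ~%.0f qps\n", qps) // ~197, in the "200-250qps" ballpark

	// Certs valid for 180 days or less need two CT log entries, longer-lived
	// ones need three, so the article takes 2.2x as an upper bound on writes.
	fmt.Printf("entry write upper bound: ~%.0f qps\n", qps*2.2) // ~433
}
```

This also squares with the loadtest later on: things are "great at 500qps", comfortably above the estimated upper bound.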
@@ -128,7 +128,7 @@ dedicated to their task of running MinIO.
 {{< image width="100%" src="/assets/ctlog/minio_8kb_performance.png" alt="MinIO 8kb disk vs SSD" >}}

 The left-hand side graph feels pretty natural to me. With one thread, uploading 8kB objects will
-quickly hit the IOPS rate of the disks, each of which have to participate in the write due to EC:3
+quickly hit the IOPS rate of the disks, each of which has to participate in the write due to EC:3
 encoding when using six disks, and it tops out at ~56 PUT/s. The single thread hitting SSDs will not
 hit that limit, and has ~371 PUT/s which I found a bit underwhelming. But, when performing the
 loadtest with either 8 or 32 write threads, the hard disks become only marginally faster (topping
@@ -170,7 +170,7 @@ large objects:

 This makes me draw an interesting conclusion: seeing as CT Logs are read/write heavy (every couple
 of seconds, the Merkle tree is recomputed which is reasonably disk-intensive), SeaweedFS might be a
-slight better choice. IPng Networks has three MinIO deployments, but no SeaweedFS deployments. Yet.
+slightly better choice. IPng Networks has three MinIO deployments, but no SeaweedFS deployments. Yet.
 # Tessera

@@ -181,11 +181,11 @@ predecessor called [[Trillian](https://github.com/google/trillian)]. The impleme
 bake-in current best-practices based on the lessons learned over the past decade of building and
 operating transparency logs in production environments and at scale.

-Tessera was introduced at the Transparency.Dev summit in October 2024. I first watch Al and Martin
+Tessera was introduced at the Transparency.Dev summit in October 2024. I first watched Al and Martin
 [[introduce](https://www.youtube.com/watch?v=9j_8FbQ9qSc)] it at last year's summit. At a high
-level, it wraps what used to be a whole kubernetes cluster full of components, into a single library
+level, it wraps what used to be a whole Kubernetes cluster full of components, into a single library
 that can be used with Cloud based services, either like AWS S3 and RDS database, or like GCP's GCS
-storage and Spanner database. However, Google also made is easy to use a regular POSIX filesystem
+storage and Spanner database. However, Google also made it easy to use a regular POSIX filesystem
 implementation.

 ## TesseraCT
@@ -206,12 +206,12 @@ It's time for me to figure out what this TesseraCT thing can do .. are you ready

 ### TesseraCT: S3 and SQL

-TesseraCT comes with a few so-called _personalities_. Those are an implementation of the underlying
+TesseraCT comes with a few so-called _personalities_. These are implementations of the underlying
 storage infrastructure in an opinionated way. The first personality I look at is the `aws` one in
 `cmd/tesseract/aws`. I notice that this personality does make hard assumptions about the use of AWS
 which is unfortunate as the documentation says '.. or self-hosted S3 and MySQL database'. However,
 the `aws` personality assumes the AWS SecretManager in order to fetch its signing key. Before I
-can be successful, I need to detangle that.
+can be successful, I need to untangle that.

 #### TesseraCT: AWS and Local Signer

@@ -339,7 +339,7 @@ infrastructure, each POST is expected to come from one of the certificate author

 Then, the `--origin` flag designates how my log calls itself. In the resulting `checkpoint` file it
 will enumerate a hash of the latest merged and published Merkle tree. In case a server serves
-multiple logs, it uses the `--origin` flag to make the destinction which checksum belongs to which.
+multiple logs, it uses the `--origin` flag to make the distinction which checksum belongs to which.

 ```
 pim@ctlog-test:~/src/tesseract$ curl http://tesseract-test.minio-ssd.lab.ipng.ch:9000/checkpoint
@@ -363,7 +363,7 @@ sets up read and write traffic to a Static CT API log to test correctness and pe
 load. The traffic is sent according to the [[Static CT API](https://c2sp.org/static-ct-api)] spec.
 Slick!

-The tool start a text-based UI (my favorite! also when using Cisco T-Rex loadtester) in the terminal
+The tool starts a text-based UI (my favorite! also when using Cisco T-Rex loadtester) in the terminal
 that shows the current status, logs, and supports increasing/decreasing read and write traffic. This
 TUI allows for a level of interactivity when probing a new configuration of a log in order to find
 any cliffs where performance degrades. For real load-testing applications, especially headless runs
@@ -408,7 +408,7 @@ TesseraCT accepts them.
 I raise the write load by using the '>' key a few times. I notice things are great at 500qps, which
 is nice because that's double what we are to expect. But I start seeing a bit more noise at 600qps.
 When I raise the write-rate to 1000qps, all hell breaks loose on the logs of the server (and similar
-logs in the `hammer` loadtester:
+logs in the `hammer` loadtester):

 ```
 W0727 15:54:33.419881 348475 handlers.go:168] ctlog-test.lab.ipng.ch/test-ecdsa: AddChain handler error: couldn't store the leaf: failed to fetch entry bundle at index 0: failed to fetch resource: getObject: failed to create reader for object "tile/data/000" in bucket "tesseract-test": operation error S3: GetObject, context deadline exceeded
@@ -689,7 +689,7 @@ But after a little bit of fiddling, I can assert my conclusion:

 ## What's Next

-I am going to offer such a machine in production together with Antonis Chariton, and Jeroen Massar.
+I am going to offer such a machine in production together with Antonis Chariton and Jeroen Massar.
 I plan to do a few additional things:

 * Test Sunlight as well on the same hardware. It would be nice to see a comparison between write