Add period:100 reporting

2025-08-10 17:35:57 +02:00
parent c32d1779f8
commit c68799703b


@@ -542,10 +542,12 @@ sunlight_sqlite_update_duration_seconds{quantile="0.99"} 0.016859223
Comparing the throughput at 4'400/s with that first test of 100/s, I expect and can confirm a
significant increase in all of these metrics. The 99th percentile addchain is now 1889ms (up from
207ms) and the sequencing duration is now 1111ms (up from 109ms). I fiddle a little bit with
Sunlight's configuration file, notably the `period` and `poolsize`. This does not seem to matter
much. Setting for example `period:2000` and `poolsize:15000` still yields pretty much the same
throughput:
207ms) and the sequencing duration is now 1111ms (up from 109ms).
#### Sunlight: Effect of period
I fiddle a little bit with Sunlight's configuration file, notably the `period` and `poolsize`.
First I set `period:2000` and `poolsize:15000`, which yields pretty much the same throughput:
```
pim@ctlog-test:/etc/sunlight$ while :; do curl -ksS https://ctlog-test.lab.ipng.ch:1443/checkpoint | grep -E '^[0-9]+$'; sleep 60; done
@@ -572,6 +574,34 @@ sunlight_sequencing_duration_seconds{log="sunlight-test",quantile="0.99"} 1.5968
sunlight_sqlite_update_duration_seconds{quantile="0.99"} 0.010847308
```
Then I set `period:100` and `poolsize:15000`, which does improve throughput a bit:
```
pim@ctlog-test:/etc/sunlight$ while :; do curl -ksS https://ctlog-test.lab.ipng.ch:1443/checkpoint | grep -E '^[0-9]+$'; sleep 60; done
560654
950524
1324645
1720362
```
With the same generated load of 10'000/sec with a 10% duplication rate, I am still offering roughly
9'000/sec of unique certificates, and I'm seeing `(1720362 - 560654)/180` or about 6'440/sec come
through, which is a fair bit better, at the expense of more disk activity. At this rate and with
`period:100`, the latency tail looks like this:
```
pim@ctlog-test:/etc/sunlight$ curl -ksS https://ctlog-test.lab.ipng.ch/metrics | egrep 'seconds.*quantile=\"0.99\"'
sunlight_addchain_wait_seconds{log="sunlight-test",quantile="0.99"} 1.616046445
sunlight_cache_get_duration_seconds{log="sunlight-test",quantile="0.99"} 7.5123e-05
sunlight_cache_put_duration_seconds{log="sunlight-test",quantile="0.99"} 0.534935803
sunlight_fs_op_duration_seconds{log="sunlight-test",method="discard",quantile="0.99"} 0.000377273
sunlight_fs_op_duration_seconds{log="sunlight-test",method="fetch",quantile="0.99"} 4.8893e-05
sunlight_fs_op_duration_seconds{log="sunlight-test",method="upload",quantile="0.99"} 0.054685991
sunlight_http_request_duration_seconds{endpoint="add-chain",log="sunlight-test",quantile="0.99"} 1.946445877
sunlight_sequencing_duration_seconds{log="sunlight-test",quantile="0.99"} 0.980602185
sunlight_sqlite_update_duration_seconds{quantile="0.99"} 0.018385831
```
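The throughput estimate above can be reproduced from the four checkpoint samples. This is a minimal sketch and not part of the original test setup: the `rate` helper is a hypothetical name, and it assumes the samples are evenly spaced at the 60 second interval used by the polling loop.

```shell
# Hypothetical helper: average entries/sec between the first and last sample.
# $1: whitespace-separated checkpoint tree sizes, $2: seconds between samples.
rate() {
  echo "$1" | awk -v dt="$2" '{ printf "%d\n", ($NF - $1) / ((NF - 1) * dt) }'
}

# Tree sizes copied from the period:100 run above, sampled every 60 seconds.
rate "560654 950524 1324645 1720362" 60
```

Four samples span three intervals, so the divisor is 180 seconds, which gives the roughly 6'440/sec quoted in the text.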
***Conclusion***: Sunlight on POSIX can reliably handle 4'400/s (with a duplicate rate of 10%) on
this setup.
@@ -589,9 +619,9 @@ current write-load (which is about 250/s).
solutions, taking into account that TesseraCT runs MariaDB (which my setup did not use ZFS
for), while Sunlight uses sqlite3 on the ZFS pool.
***Notable***: Sunlight POSIX and S3 performance is roughly identical (both handle about 5'000/sec),
while TesseraCT POSIX performance (12'000/s) is significantly better than its S3 (800/s). Some
other observations:
***Notable***: Sunlight POSIX and S3 performance is roughly identical (both handle about
5'000/sec), while TesseraCT POSIX performance (12'000/s) is significantly better than its S3
(800/s). Some other observations:
* Sunlight has a very opinionated configuration, and can run multiple logs with one configuration
file and one binary. Its configuration was a bit constraining though, as I could not manage to