# Building vpp-containerlab
This document describes how to build, test and release the `vpp-containerlab` Docker image.

The image is built natively on two machines and combined into a multi-arch manifest:
- `summer`: amd64, Linux (local machine)
- `jessica-orb`: arm64, OrbStack VM on macOS, reachable via `ssh jessica-orb`

The pipeline sideloads locally-built VPP `.deb` packages rather than pulling from packagecloud,
so VPP must be compiled on both machines before building the image.
## Prerequisites
### SSH access to jessica-orb
The arm64 Docker daemon runs inside OrbStack's Linux VM on `jessica`. OrbStack's SSH server
listens on `127.0.0.1:32222` on `jessica` itself, so add an entry to `~/.ssh/config` on
`summer` that uses `jessica` as a jump host:
```
Host jessica-orb
HostName 127.0.0.1
Port 32222
User pim
ProxyCommand ssh jessica -W 127.0.0.1:32222
IdentityFile ~/.ssh/jessica-orb-key
IdentitiesOnly yes
UserKnownHostsFile /dev/null
StrictHostKeyChecking no
```
Copy OrbStack's SSH key from `jessica` to `summer`:
```bash
scp jessica:~/.orbstack/ssh/id_ed25519 ~/.ssh/jessica-orb-key
chmod 600 ~/.ssh/jessica-orb-key
```
Verify the full chain works:
```bash
ssh jessica-orb 'uname -m && docker info | head -3'
# expected: aarch64
```
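The expected `aarch64` is the kernel's name for the architecture that Docker tags as `arm64`. If you script this check, a small mapping helper avoids comparing the two spellings by hand. This is a minimal sketch; the function name `docker_arch` is hypothetical and not part of this repository.

```bash
# Map a `uname -m` machine name to the architecture name Docker uses.
# docker_arch is a hypothetical helper, not part of this repository.
docker_arch() {
  case "$1" in
    x86_64)        echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "unsupported machine type: $1" >&2; return 1 ;;
  esac
}

# Usage: translate what each build host reports.
#   docker_arch "$(uname -m)"                   # on summer
#   docker_arch "$(ssh jessica-orb uname -m)"   # through the jump host
```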
### One-time setup
Install the Robot Framework venv for running tests:
```bash
make venv
```
This only needs to be re-run if `tests/requirements.txt` changes.
### Before every release
Build VPP on both machines by running `make pkg-deb` in the VPP source tree on `summer` and in
the OrbStack VM on `jessica`, then verify that both hold a consistent set of `.deb` packages:
```bash
make preflight
```
This checks that `~/src/vpp/build-root` on each machine contains exactly one version of each
required package and that the version on `summer` matches the version on `jessica-orb`.
Override the path if your build root is elsewhere:
```bash
make preflight VPPDEBS=~/src/vpp/other-build-root
```
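The "exactly one version per package" rule can be sketched in shell as follows. This is an illustration under stated assumptions, not the Makefile's actual implementation: `deb_version` is a hypothetical helper, and it relies on the standard Debian file naming `<name>_<version>_<arch>.deb`.

```bash
# deb_version DIR PKG: print the version of PKG found in DIR.
# Fails unless exactly one PKG_<version>_<arch>.deb is present.
# Hypothetical helper; the real `make preflight` may work differently.
deb_version() {
  local dir=$1 pkg=$2 f
  local -a matches=()
  for f in "$dir/${pkg}"_*.deb; do
    # An unmatched glob stays literal, so keep only paths that exist.
    if [ -e "$f" ]; then matches+=("$f"); fi
  done
  if [ "${#matches[@]}" -ne 1 ]; then
    echo "$pkg: expected 1 .deb in $dir, found ${#matches[@]}" >&2
    return 1
  fi
  f=${matches[0]##*/}   # strip the directory
  f=${f#"${pkg}"_}      # strip "<name>_"
  echo "${f%_*}"        # strip "_<arch>.deb"
}

# Usage sketch: compare one package across both hosts.
# Assumes the remote login shell is bash, so the function can be shipped over ssh.
#   local_v=$(deb_version ~/src/vpp/build-root vpp)
#   remote_v=$(ssh jessica-orb "$(declare -f deb_version); deb_version ~/src/vpp/build-root vpp")
#   [ "$local_v" = "$remote_v" ] || echo "version mismatch: $local_v vs $remote_v" >&2
```

Matching on `<name>_` rather than `<name>*` keeps `vpp` from also matching packages such as `vpp-plugin-core`.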
## Release pipeline
The full pipeline runs in this order:
```
preflight → build → test → push → release
```
Run everything in one shot:
```bash
make all
```
Or step through it manually:

| Step | Command | What it does |
|------|---------|--------------|
| 1 | `make preflight` | Validate VPP debs on summer and jessica-orb |
| 2 | `make build-amd64` | Build image locally for amd64 |
| 3 | `make test-amd64` | Run e2e tests against the amd64 image |
| 4 | `make sync-arm64` | Rsync working tree to jessica-orb |
| 5 | `make build-arm64` | Build image on jessica-orb for arm64 |
| 6 | `make test-arm64` | Run e2e tests on jessica-orb against the arm64 image |
| 7 | `make push-amd64` | Tag and push `:latest-amd64` to the registry |
| 8 | `make push-arm64` | Tag and push `:latest-arm64` to the registry |
| 9 | `make release` | Combine into a single `:latest` multi-arch manifest |

Convenience targets:
```bash
make build # steps 2+4+5 (both platforms)
make test # steps 3+6 (both platforms)
make push # steps 7+8 (both platforms)
```
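For orientation, step 9 (combining the two per-architecture tags into one manifest list) could look roughly like the sketch below. It uses the stock `docker manifest` subcommands; the helper names `run` and `combine_manifest` are hypothetical, and the Makefile's `release` target may well use different commands.

```bash
# run: execute a command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# combine_manifest IMAGE TAG: publish IMAGE:TAG as a manifest list that
# references the already-pushed IMAGE:latest-amd64 and IMAGE:latest-arm64.
# Hypothetical helper, shown only to illustrate the idea.
combine_manifest() {
  local image=$1 tag=$2
  run docker manifest create --amend "$image:$tag" \
    "$image:latest-amd64" "$image:latest-arm64"
  run docker manifest push "$image:$tag"
}
```

Running `DRY_RUN=1 combine_manifest git.ipng.ch/ipng/vpp-containerlab latest` prints the two `docker` commands without contacting the registry, which is a cheap way to review them first.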
### Promoting to :stable
`:stable` is only promoted **after** a successful `make all` — meaning both amd64 and arm64
have been built, tested, pushed and combined into `:latest`. Do not run `make stable` unless
the full pipeline completed without errors.
```bash
make all && make stable
```
`make stable` points `:stable` at the same per-architecture images as the current
`:latest-amd64` and `:latest-arm64`, so it is always in sync with a fully tested release.
## Running a single test suite
Pass `TEST=` to restrict which suite is run:
```bash
make test-amd64 TEST=tests/01-vpp-ospf
make test TEST=tests/02-vpp-frr
```
The default is `tests/` (all suites).
## Debugging test failures
**Read the HTML log** — written after every run regardless of outcome:
```bash
xdg-open tests/out/tests-docker-log.html
```
**Deploy the topology manually** to keep containers running for inspection:
```bash
IMAGE=git.ipng.ch/ipng/vpp-containerlab:latest-amd64-test \
  containerlab deploy -t tests/01-vpp-ospf/e2e-lab/vpp.clab.yml
```
Then inspect live state:
```bash
# OSPF neighbour state
containerlab exec -t tests/01-vpp-ospf/e2e-lab/vpp.clab.yml \
  --label clab-node-name=vpp1 --cmd "birdc show ospf neighbor"

# Manual ping
containerlab exec -t tests/01-vpp-ospf/e2e-lab/vpp.clab.yml \
  --label clab-node-name=client1 --cmd "ping -c 5 10.82.98.82"

# Tear down when done
containerlab destroy -t tests/01-vpp-ospf/e2e-lab/vpp.clab.yml --cleanup
```
**Common cause — OSPF convergence time:** 100% ping loss usually means routing is not up yet.
Tune the `Sleep` duration in the relevant `.robot` file by deploying manually and watching
`birdc show ospf neighbor` (or `vtysh -c "show ip ospf neighbor"` for FRR) until all
neighbours reach state `Full`.
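When measuring convergence by hand, polling is easier than re-running the command and watching the clock. The sketch below is one way to do that. Both helpers are hypothetical, and `all_full` assumes the BIRD neighbour table puts a dotted-quad router ID in the first column and the state (for example `Full/PtP`) in the third; adjust the pattern if your output differs.

```bash
# all_full: read a neighbour table on stdin; succeed only if at least one
# neighbour is listed and every neighbour's state starts with "Full".
# Assumes: column 1 is a dotted-quad router ID, column 3 is the state.
all_full() {
  awk '
    $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
      seen++
      if ($3 !~ /^Full/) bad++
    }
    END { exit !(seen > 0 && bad == 0) }
  '
}

# wait_until TRIES DELAY CMD...: retry CMD until it succeeds or TRIES run out.
wait_until() {
  local tries=$1 delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    if "$@"; then return 0; fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# Usage sketch: show_neighbours is a placeholder for whatever prints the
# neighbour table of the node under test.
#   converged() { show_neighbours | all_full; }
#   time wait_until 30 2 converged
```

The elapsed time reported by `time` gives a starting point for the `Sleep` value; add some margin for slower hosts.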

**Increase robot verbosity:** temporarily add `--loglevel DEBUG` to the `robot` invocation in
`tests/rf-run.sh`.