# IPng Networks Lab environment

## High level overview

There's a disk image on each hypervisor called the `proto` image, which serves as the base
image for all VMs on that machine. Every now and again, the proto image is updated (Debian, FRR
and VPP). From that base image, lab VMs are cloned, and local filesystem overrides are put in
place on each clone. The lab is used, and when we're done with it, we simply destroy all
clones. This way, each time the lab is started, it is in a pristine state.

The `proto` image is shared among the hypervisors. Typically, maintenance is performed on one
of the hypervisors, after which the `proto` image is snapshotted and copied to the other
machines.

### Proto maintenance

The main `vpp-proto` image runs on `hvn0.chbtl0.ipng.ch`, in a VM called `vpp-proto`.
When you want to refresh the image, you can:

```
spongebob:~$ ssh -A root@hvn0.chbtl0.ipng.ch

SNAP=$(date +%Y%m%d) ## 20221012
zfs snapshot ssd-vol0/vpp-proto-disk0@${SNAP}-before
virsh start --console vpp-proto

## Do the upgrades, make changes to vpp-proto's disk image
## You can always roll back to the -before image if you'd like to revert

virsh shutdown vpp-proto
zfs snapshot ssd-vol0/vpp-proto-disk0@${SNAP}-release
zrepl signal wakeup vpp-proto-snapshots
```
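
If the maintenance went sideways, the `-before` snapshot makes reverting straightforward. A
minimal example of rolling the proto image back on the hypervisor (with the `vpp-proto` VM shut
off; `-r` also discards any snapshots taken after the `-before` one):

```
## Discard all changes to the proto image since the -before snapshot
zfs rollback -r ssd-vol0/vpp-proto-disk0@${SNAP}-before
```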

There is a `zrepl` daemon running on this machine, which picks up the new snapshot once it is
manually woken up (the last command above). Each of the hypervisors in the fleet watches this
replication endpoint, and when they see new snapshots arrive, they do an incremental pull of
the data into their own ZFS filesystem as a snapshot. Already running labs are not disrupted,
as they are cloned off of older snapshots.

On the hypervisors, you will find the image as
`ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0`:

```
spongebob:~$ ssh -A root@hvn0.lab.ipng.ch 'zfs list -t snap'
NAME                                                                      USED  AVAIL  REFER  MOUNTPOINT
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0@20221013-release      0B      -  6.04G  -
```
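
To check whether a given hypervisor has picked up the new snapshot, zrepl's built-in status
view gives a live overview of its replication jobs, for example:

```
spongebob:~$ ssh -A root@hvn0.lab.ipng.ch

## Interactive overview of all zrepl jobs on this hypervisor, including the
## incremental pull of the vpp-proto snapshots
zrepl status
```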

## Usage

There are three hypervisor nodes, each running one isolated lab environment:

* hvn0.lab.ipng.ch runs VPP lab0
* hvn1.lab.ipng.ch runs VPP lab1
* hvn2.lab.ipng.ch runs VPP lab2

Now that we have a base image (in the form of `vpp-proto-disk0@$(date)-release`), we can make
point-in-time clones of it and copy over any specifics (like IP addresses, hostname, SSH keys,
Bird/FRR configs, etc). We do this on the lab controller `lab.ipng.ch`, which:

1. Looks on the hypervisor to see if there is a running VM, and if there is, bails
1. Looks on the hypervisor to see if there is an existing cloned image, and if there is, bails
1. Builds a local overlay directory using a generator and Jinja2 (i.e. `build/vpp0-0/`)
1. Creates a new cloned filesystem based off of a base `vpp-proto-disk0` snapshot on the hypervisor
1. Mounts that filesystem
1. Rsync's the built overlay into that filesystem
1. Unmounts the filesystem
1. Starts the VM using the newly built filesystem

Of course, the first two steps are meant to ensure we don't clobber running labs; they can be
overridden with the `--force` flag. And when the lab is finished, it's common practice to shut
down the VMs and destroy the clones.

```
lab:~/src/ipng-lab$ ./destroy --host hvn0.lab.ipng.ch
lab:~/src/ipng-lab$ ./generate --host hvn0.lab.ipng.ch --overlay bird
lab:~/src/ipng-lab$ ./create --host hvn0.lab.ipng.ch --overlay bird
```
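
If you are certain an existing (stale) lab can be overwritten, the `--force` flag mentioned
above skips those safety checks; presumably it is passed to `create`, along the lines of this
hedged example:

```
lab:~/src/ipng-lab$ ./create --host hvn0.lab.ipng.ch --overlay bird --force
```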
### Generate

The generator reads the input YAML files one after another, merging and overriding them as it
goes along, and builds for each node a `node` dictionary alongside the `lab` and other
information from the config files. It then reads the `overlays` dictionary for the given
`--overlay` type, renders all the template files from that overlay directory, and assembles an
output directory holding the per-node overrides, emitting them to the directory specified by
the `--build` flag. It also copies in any per-node files (if they exist) from
`overlays/$(overlay)/blobs/$(node.hostname)/`, giving full control over the filesystem's
contents.

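For example, generating the Bird overlay for lab0 into a local build directory might look like
this (the `--build` destination shown is an assumption; the other flags are as used above):

```
lab:~/src/ipng-lab$ ./generate --host hvn0.lab.ipng.ch --overlay bird --build build/
```

The per-node results end up under the build directory, one subtree per node, ready to be
picked up by `create`.
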
### Create

Based on a generated directory and a lab YAML description, `create` uses SSH to connect to the
hypervisor, creates a clone of the base `vpp-proto` snapshot, mounts it locally in a staging
directory, and then rsyncs over the generated overlay from the generator output
(`build/$(overlay)/$(node.hostname)`), after which the directory is unmounted and the virtual
machine is booted from the clone.

If the VM is already running, or a clone already exists, an error is printed and the process
skips over that node. It's wise to run `destroy` before `create` to ensure the hypervisors are
in a pristine state.

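Per node, the hypervisor-side work boils down to a handful of ZFS, rsync and libvirt calls. A
rough sketch, with purely illustrative dataset, mountpoint and partition names (the real
tooling drives these steps over SSH from `lab.ipng.ch`):

```
NODE=vpp0-0
BASE=ssd-vol0/vpp-proto-disk0@20221013-release
CLONE=ssd-vol0/${NODE}
STAGING=/tmp/${NODE}

zfs clone ${BASE} ${CLONE}                   ## point-in-time clone of the proto image
mkdir -p ${STAGING}
mount /dev/zvol/${CLONE}-part1 ${STAGING}    ## mount the clone's root filesystem (assuming a
                                             ## zvol with a single partition)
rsync -av build/bird/${NODE}/ ${STAGING}/    ## copy the generated per-node overlay into place
umount ${STAGING}
virsh start ${NODE}                          ## boot the VM off the freshly prepared clone
```
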
### Destroy

Ensures both that the VMs are not running (stopping them if they are) and that their filesystem
clones are destroyed. Obviously, this is the most dangerous operation of the bunch.

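Per node this amounts to roughly the following, again with illustrative names and driven over
SSH by the tooling:

```
NODE=vpp0-0
virsh destroy ${NODE} 2>/dev/null || true    ## hard-stop the VM if it is still running
zfs destroy ssd-vol0/${NODE}                 ## remove the clone; the proto snapshots
                                             ## themselves are left intact
```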