# IPng Networks Lab environment

## High level overview

There's a disk image on each hypervisor called the `proto` image, which serves as the base
image for all VMs on it. Every now and again, the `proto` image is updated (Debian, FRR and
VPP), and from that base image, lab VMs are cloned with local filesystem overrides put in
place on each clone. The lab is used, and when we're done with it, we simply destroy all
clones. This way, each time the lab is started, it is in a pristine state.

The `proto` image is shared among the hypervisors. Typically, maintenance is performed on one
of the hypervisors, after which the `proto` image is snapshotted and copied to the other
machines.

### Proto maintenance

The main `vpp-proto` image lives in a VM called `vpp-proto` on `hvn0.chbtl0.ipng.ch`.
When you want to refresh the image, you can:
```
hvn0-chbtl0:~$ virsh start --console vpp-proto

## Do the upgrades, make changes to vpp-proto's disk image
## You can always roll back to the previous snapshot image if you'd like to revert

hvn0-chbtl0:~$ SNAP=$(date +%Y%m%d) ## 20221012
hvn0-chbtl0:~$ virsh shutdown --console vpp-proto
hvn0-chbtl0:~$ sudo zfs snapshot ssd-vol0/vpp-proto-disk0@${SNAP}-release
hvn0-chbtl0:~$ sudo zrepl signal wakeup vpp-proto-snapshots
```

There is a `zrepl` daemon running on this machine, which picks up the snapshot when woken up
manually (see the last command above). Each of the hypervisors in the fleet watches this
replication endpoint, and when they see new snapshots arrive, they do an incremental pull of
the data into their own ZFS filesystem as a snapshot. Old/currently running labs will not be
disrupted, as they are cloned off of older snapshots.
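
What `zrepl` automates here is essentially an incremental ZFS send/receive between the
hypervisors. A rough manual equivalent, purely for illustration (snapshot names are examples):

```
## On a pulling hypervisor: fetch only the delta between the previous and the newest snapshot.
hvn1-lab:~$ ssh root@hvn0.chbtl0.ipng.ch \
    "zfs send -i @20221012-release ssd-vol0/vpp-proto-disk0@20221013-release" \
  | sudo zfs recv ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0
```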

You will find the image as `ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0`:
```
lab:~$ ssh -A root@hvn0.lab.ipng.ch 'zfs list -t snap'
NAME                                                                    USED  AVAIL  REFER  MOUNTPOINT
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0@20221013-release    0B      -  6.04G  -
```

## Install

Make sure that you're logged in to the `lab.ipng.ch` machine with an SSH key agent or key
forwarding, so that you can manipulate the hypervisors. Installing for the first time
requires adding a few pip packages:
```
lab:~/src/lab$ pip3 install hiyapyco
lab:~/src/lab$ pip3 install ipaddress
lab:~/src/lab$ pip3 install jinja2
lab:~/src/lab$ pip3 install jinja2-ansible-filters
```
.. after which the generator should be able to create its artifacts!

## Usage

There are three hypervisor nodes, each running one isolated lab environment:

* hvn0.lab.ipng.ch runs VPP lab0
* hvn1.lab.ipng.ch runs VPP lab1
* hvn2.lab.ipng.ch runs VPP lab2

Now that we have a base image (in the form of `vpp-proto-disk0@$(date)-release`), we can make
point-in-time clones of it, copying over any specifics (like IP addresses, hostname, SSH keys,
Bird/FRR configs, etc). We do this on the lab controller `lab.ipng.ch`, which:

1. Looks on the hypervisor to see if there is a running VM, and if there is, bails
1. Looks on the hypervisor to see if there is an existing cloned image, and if there is, bails
1. Builds a local overlay directory using a generator and Jinja2 (ie. `build/vpp0-0/`)
1. Creates a new cloned filesystem based off of a base `vpp-proto-disk0` snapshot on the hypervisor
1. Mounts that filesystem
1. Rsyncs the built overlay into that filesystem
1. Unmounts the filesystem
1. Starts the VM using the newly built filesystem
1. Commits the `openvswitch` topology configuration (see `overlays/*/ovs-config.sh`)

Of course, the first two steps are meant to ensure we don't clobber running labs; they can be
overridden with the `--force` flag. When the lab is finished, it's common practice to shut
down the VMs and destroy the clones.

```
lab:~/src/lab$ ./generate --host hvn0.lab.ipng.ch --overlay default
lab:~/src/lab$ LAB=0 ./destroy            ## remove VMs and ZFS clones
lab:~/src/lab$ LAB=0 ./create             ## create ZFS 'pristine' snapshot
lab:~/src/lab$ LAB=0 ./virshall start     ## Start the VMs
lab:~/src/lab$ LAB=0 ./virshall shutdown  ## Gracefully stop the VMs
lab:~/src/lab$ LAB=0 ./virshall destroy   ## Hard poweroff the VMs
lab:~/src/lab$ LAB=0 ./pristine           ## return the lab to the latest 'pristine' snapshot
```

### Generate

The generator reads the input YAML files one after another, merging and overriding them as it
goes along, and builds a `node` dictionary for each node alongside the `lab` and other
information from the config files. It then reads the `overlays` dictionary for the given
`--overlay` type, reads all the common files from that overlay directory, and assembles an
output directory holding the per-node overrides, emitting them to the directory specified by
the `--build` flag. It also copies in any per-node files (if they exist) from
`overlays/$(overlay)/hostname/$(node.hostname)/`, giving full control over the filesystem's
ultimate contents.
```
lab:~/src/lab$ ./generate --host hvn0.lab.ipng.ch --overlay default
lab:~/src/lab$ git status build/default/hvn0.lab.ipng.ch/
```
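
This merge-then-template pipeline maps directly onto the pip packages installed above. A
minimal sketch of the idea in Python, with hypothetical file and key names rather than the
repo's real layout:

```python
# Hypothetical illustration of the generator's core: merge YAMLs, then render Jinja2 templates.
import hiyapyco
from jinja2 import Environment, FileSystemLoader
from jinja2_ansible_filters import AnsibleCoreFiltersExtension

# Later files merge over earlier ones, like the generator's input YAMLs (file names assumed).
config = hiyapyco.load("config/common.yaml", "config/hvn0.lab.ipng.ch.yaml",
                       method=hiyapyco.METHOD_MERGE)

# jinja2-ansible-filters plugs Ansible's filters into a stock Jinja2 environment.
env = Environment(loader=FileSystemLoader("overlays/default"),
                  extensions=[AnsibleCoreFiltersExtension])
for node in config["nodes"]:
    print(env.get_template("etc/hostname.j2").render(node=node, lab=config["lab"]))
```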

### Destroy

Ensures that the VMs are not running (stopping them if they are) and that their filesystem
clones are destroyed. Obviously this is the most dangerous operation of the bunch, but the
philosophy of the lab is that the VMs can always be re-created off of a stable base image and
a generated build.
```
lab:~/src/lab$ LAB=0 ./command destroy    ## remove VMs and ZFS clones on hvn0.lab.ipng.ch
```
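
Because each lab disk is a copy-on-write ZFS clone, destroying it only frees the lab's local
changes; the base image is untouched. Per VM, this boils down to something like the following
(hypothetical VM and dataset names):

```
hvn0-lab:~$ virsh destroy vpp0-0                      ## hard poweroff, if still running
hvn0-lab:~$ sudo zfs destroy -r ssd-vol0/vpp0-0-disk0 ## drop the clone and its snapshots
```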

### Create

Based on a generated directory and a lab YAML description, this uses SSH to connect to the
hypervisor, creates a clone of the base `vpp-proto` snapshot, mounts it locally in a staging
directory, then rsyncs the generated overlay files from the generator output
(`build/$(overlay)/$(node.hostname)`) into it, after which the directory is unmounted and a
ZFS snapshot called `pristine` is created. The VMs are booted off of their `pristine` snapshot.

Typically, a destroy/create cycle is only necessary when the build or the base image changes.
Otherwise, the lab can be brought back into a _factory default_ state by rolling back to the
`pristine` snapshot.

```
lab:~/src/lab$ LAB=0 ./create             ## create ZFS clones, copy in the build
```
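
Per node, the create step is roughly the following sequence of ZFS and rsync operations. A
hedged sketch with illustrative dataset, path and VM names (the real logic lives in the lab
scripts):

```
hvn0-lab:~$ sudo zfs clone ssd-vol0/vpp-proto-disk0@20221013-release ssd-vol0/vpp0-0-disk0
hvn0-lab:~$ sudo zfs set mountpoint=/var/lab/vpp0-0 ssd-vol0/vpp0-0-disk0
hvn0-lab:~$ sudo rsync -va build/default/hvn0.lab.ipng.ch/vpp0-0/ /var/lab/vpp0-0/
hvn0-lab:~$ sudo zfs set mountpoint=none ssd-vol0/vpp0-0-disk0
hvn0-lab:~$ sudo zfs snapshot ssd-vol0/vpp0-0-disk0@pristine
hvn0-lab:~$ virsh start vpp0-0
```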

### Pristine

In the process of creating the ZFS clones and their per-node filesystems, a snapshot of each
VM's boot disk is taken; this is called the `pristine` snapshot. After using the lab, it can
be quickly brought back into a default state by rolling back the disks to the `pristine`
snapshot and restarting the virtual machines.
```
lab:~/src/lab$ LAB=0 ./command start      ## Start the VMs
lab:~/src/lab$ LAB=0 ./command shutdown   ## Gracefully stop the VMs
lab:~/src/lab$ LAB=0 ./command pristine   ## return the lab to the latest 'pristine' snapshot
```
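
Under the hood, returning to `pristine` is a rollback of each VM disk to that snapshot, along
the lines of (hypothetical VM and dataset names):

```
hvn0-lab:~$ virsh shutdown vpp0-0                             ## stop the VM first
hvn0-lab:~$ sudo zfs rollback ssd-vol0/vpp0-0-disk0@pristine  ## discard all changes since create
hvn0-lab:~$ virsh start vpp0-0                                ## boot back into factory default
```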