---
date: "2022-10-14T19:52:11Z"
title: VPP Lab - Setup
aliases:
- /s/articles/2022/10/14/lab-1.html
---
{{< image width="200px" float="right" src="/assets/vpp/fdio-color.svg" alt="VPP" >}}
# Introduction
In a previous post ([VPP Linux CP - Virtual Machine Playground]({{< ref "2021-12-23-vpp-playground" >}})), I
wrote a bit about building a QEMU image so that folks can play with the [Vector Packet Processor](https://fd.io)
and the Linux Control Plane code. Judging by our access logs, this image has definitely been downloaded a bunch,
and I myself use it regularly when I want to tinker a little bit, without wanting to impact the production
routers at [AS8298]({{< ref "2021-02-27-network" >}}).
The topology of my tests has become a bit more complicated over time, and often just one router would not be
enough. Yet, repeatability is quite important, and I found myself constantly reinstalling / recheckpointing
the `vpp-proto` virtual machine I was using. I got my hands on some LAB hardware, so it's time for an upgrade!
## IPng Networks LAB - Physical
{{< image width="300px" float="left" src="/assets/lab/physical.png" alt="Physical" >}}
First, I specc'd out a few machines that will serve as hypervisors. From top to bottom in the picture here, two
FS.com S5680-20SQ switches -- I reviewed these earlier [[ref]({{< ref "2021-08-07-fs-switch" >}})], and I really
like these, as they come with 20x10G, 4x25G and 2x40G ports, an OOB management port and serial to configure them.
Under it is its larger brother, with 48x10G and 8x100G ports, the FS.com S5860-48SC. Although it's a bit more
expensive, it's also necessary because I often test VPP at higher bandwidth, and as such being able to make
ethernet topologies by mixing 10, 25, 40, 100G is super useful for me. So, this switch is `fsw0.lab.ipng.ch`
and dedicated to lab experiments.
Connected to the switch are my trusty `Rhino` and `Hippo` machines. If you remember the game _Hungry Hungry Hippos_,
that's where the name comes from. They are both Ryzen 5950X machines on ASUS B550 motherboards, each with 2x1G Intel i350 copper
NICs (pictured here not connected), and 2x100G Intel E810 QSFP network cards (properly slotted in the motherboard's
PCIe v4.0 x16 slot).
Finally, three Dell R720XD machines will serve as the VPP testbed. They each come with 128GB of RAM, 2x500G
SSDs, two Intel 82599ES dual 10G NICs (four ports total), and four Broadcom BCM5720 1G NICs. The first 1G port is
connected to a management switch, and it doubles up as an IPMI speaker, so I can turn on/off the hypervisors
remotely. All four 10G ports are connected with DACs to `fsw0-lab`, as are two 1G copper ports (the blue UTP
cables). Everything can be turned on/off remotely, which is useful for noise, heat and overall the environment 🍀.
## IPng Networks LAB - Logical
{{< image width="200px" float="right" src="/assets/lab/logical.svg" alt="Logical" >}}
I have three of these Dell R720XD machines in the lab, and each one of them will run one complete lab environment,
consisting of four VPP virtual machines, network plumbing, and uplink. That way, I can turn on one hypervisor,
say `hvn0.lab.ipng.ch`, prepare and boot the VMs, mess around with it, and when I'm done, return the VMs to a
pristine state, and turn off the hypervisor. And, because I have three of these machines, I can run three separate
LABs at the same time, or one really big one spanning all the machines. Pictured on the right is a logical sketch
of one of the LABs (LAB id=0), with a bunch of VPP virtual machines, each with four NICs daisychained together, and
a few NICs left for experimenting.
### Headend
At the top of the logical environment, I am going to use one of our production machines (`hvn0.chbtl0.ipng.ch`),
which will host a permanently running LAB _headend_, a Debian VM called `lab.ipng.ch`. This allows me to hermetically
seal the LAB environments, letting me run them entirely in RFC1918 space, and by forcing the LABs to be connected
under this machine, I can ensure that no unwanted traffic enters or exits the network [imagine a loadtest at
100Gbit accidentally leaking; this may or totally may not have once happened to me before ...].
### Disk images
On this production hypervisor (`hvn0.chbtl0.ipng.ch`), I'll also prepare and maintain a prototype `vpp-proto` disk
image, which will serve as a consistent image to boot the LAB virtual machines. This _main_ image will be replicated
over the network into all three `hvn0 - hvn2` hypervisor machines. This way, I can do periodical maintenance on the
_main_ `vpp-proto` image, snapshot it, publish it as a QCOW2 for downloading (see my [[VPP Linux CP - Virtual Machine
Playground]({{< ref "2021-12-23-vpp-playground" >}})] post for details on how it's built and what you can do with it
yourself!). The snapshots will then also be sync'd to all hypervisors, and from there I can use simple ZFS filesystem
_cloning_ and _snapshotting_ to maintain the LAB virtual machines.
### Networking
Each hypervisor will get an install of [Open vSwitch](https://openvswitch.org/), a production quality, multilayer virtual switch designed to
enable massive network automation through programmatic extension, while still supporting standard management interfaces
and protocols. This takes lots of the guesswork and tinkering out of Linux bridges in KVM/QEMU, and it's a perfect fit
due to its tight integration with `libvirt` (the thing most of us use in Debian/Ubuntu hypervisors). If need be, I can
add one or more of the 1G or 10G ports as well to the OVS fabric, to build more complicated topologies. And, because
the OVS infrastructure and libvirt both allow themselves to be configured over the network, I can control all aspects
of the runtime directly from the `lab.ipng.ch` headend, not having to log in to the hypervisor machines at all. Slick!
# Implementation Details
I start with image management. On the production hypervisor, I create a 6GB ZFS dataset that will serve as my `vpp-proto`
machine, and install it using the exact same method as the playground [[ref]({{< ref "2021-12-23-vpp-playground" >}})].
Once I have it the way I like it, I'll power off the VM, and see to this image being replicated to all hypervisors.
## ZFS Replication
Enter [zrepl](https://zrepl.github.io/), a one-stop, integrated solution for ZFS replication. This tool is incredibly
powerful, and can do snapshot management, sourcing / sinking replication, of course using incremental snapshots as they
are native to ZFS. Because this is a LAB article, not a zrepl tutorial, I'll just cut to the chase and show the
configuration I came up with.
```
pim@hvn0-chbtl0:~$ cat << EOF | sudo tee /etc/zrepl/zrepl.yml
global:
  logging:
    # use syslog instead of stdout because it makes journald happy
    - type: syslog
      format: human
      level: warn

jobs:
- name: snap-vpp-proto
  type: snap
  filesystems:
    'ssd-vol0/vpp-proto-disk0<': true
  snapshotting:
    type: manual
  pruning:
    keep:
      - type: last_n
        count: 10

- name: source-vpp-proto
  type: source
  serve:
    type: stdinserver
    client_identities:
      - "hvn0-lab"
      - "hvn1-lab"
      - "hvn2-lab"
  filesystems:
    'ssd-vol0/vpp-proto-disk0<': true # all filesystems
  snapshotting:
    type: manual
EOF
pim@hvn0-chbtl0:~$ cat << EOF | sudo tee -a /root/.ssh/authorized_keys
# ZFS Replication Clients for IPng Networks LAB
command="zrepl stdinserver hvn0-lab",restrict ecdsa-sha2-nistp256 <omitted> root@hvn0.lab.ipng.ch
command="zrepl stdinserver hvn1-lab",restrict ecdsa-sha2-nistp256 <omitted> root@hvn1.lab.ipng.ch
command="zrepl stdinserver hvn2-lab",restrict ecdsa-sha2-nistp256 <omitted> root@hvn2.lab.ipng.ch
EOF
```
To unpack this, there are two jobs configured in **zrepl**:
* `snap-vpp-proto` - the purpose of this job is to track snapshots as they are created. Normally, zrepl is configured
to automatically make snapshots every hour and copy them out, but in my case, I only want to take snapshots when I've changed
and released the `vpp-proto` image, not periodically. So, I set the snapshotting to manual, and let the system keep the last
ten images.
* `source-vpp-proto` - this is a source job that uses a _lazy_ (albeit fine in this lab environment) method to serve the
snapshots to clients: the SSH keys are added to the _authorized_keys_ file, but restricted so that they can execute
only the `zrepl stdinserver` command, and nothing else (ie. these keys cannot log in to the machine). If any given server
presents one of these keys, I can now map it to a **zrepl client** (for example, `hvn0-lab` for the SSH key presented by
hostname `hvn0.lab.ipng.ch`). The source job now knows to serve the listed filesystems (and their dataset children, noted by
the `<` suffix) to those clients.
For the client side, each of the hypervisors gets only one job, called a _pull_ job, which will periodically wake up (every
minute) and ensure that any pending snapshots and their incrementals from the remote _source_ are slurped in and replicated
to a _root_fs_ dataset; in this case I called it `ssd-vol0/hvn0.chbtl0.ipng.ch` so I can track where the datasets come from.
```
pim@hvn0-lab:~$ sudo ssh-keygen -t ecdsa -f /etc/zrepl/ssh/identity -C "root@$(hostname -f)"
pim@hvn0-lab:~$ cat << EOF | sudo tee /etc/zrepl/zrepl.yml
global:
  logging:
    # use syslog instead of stdout because it makes journald happy
    - type: syslog
      format: human
      level: warn

jobs:
- name: vpp-proto
  type: pull
  connect:
    type: ssh+stdinserver
    host: hvn0.chbtl0.ipng.ch
    user: root
    port: 22
    identity_file: /etc/zrepl/ssh/identity
  root_fs: ssd-vol0/hvn0.chbtl0.ipng.ch
  interval: 1m
  pruning:
    keep_sender:
      - type: regex
        regex: '.*'
    keep_receiver:
      - type: last_n
        count: 10
  recv:
    placeholder:
      encryption: off
EOF
```
After restarting zrepl for each of the machines (the _source_ machine and the three _pull_ machines), I can now do the
following cool hat trick:
```
pim@hvn0-chbtl0:~$ virsh start --console vpp-proto
## Do whatever maintenance, and then poweroff the VM
pim@hvn0-chbtl0:~$ sudo zfs snapshot ssd-vol0/vpp-proto-disk0@20221019-release
pim@hvn0-chbtl0:~$ sudo zrepl signal wakeup source-vpp-proto
```
This signals the zrepl daemon to re-read the snapshots, which will pick up the newest one, and then without me doing
much of anything else:
```
pim@hvn0-lab:~$ sudo zfs list -t all | grep vpp-proto
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0 6.60G 367G 6.04G -
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0@20221013-release 499M - 6.04G -
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0@20221018-release 24.1M - 6.04G -
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0@20221019-release 0B - 6.04G -
```
That last image was just pushed automatically to all hypervisors! If they're turned off, no worries, as soon as they
start up, their local **zrepl** will make its next minutely poll, and pull in all snapshots, bringing the machine up
to date. So even when the hypervisors are normally turned off, this is zero-touch and maintenance free.
## VM image maintenance
Now that I have a stable image to work off of, all I have to do is `zfs clone` this image into new per-VM datasets,
after which I can mess around on the VMs all I want, and when I'm done, I can `zfs destroy` the clone and bring it
back to normal. However, I clearly don't want one and the same clone for each of the VMs, as they do have lots of
config files that are specific to that one _instance_. For example, the mgmt IPv4/IPv6 addresses are unique, and
the VPP and Bird/FRR configs are unique as well. But how unique are they, really?
Enter Jinja (known mostly from Ansible). I decide to make some form of per-VM config files that are generated based
on some templates. That way, I can clone the base ZFS dataset, copy in the deltas, and boot that instead. And to
be extra efficient, I can also make a per-VM `zfs snapshot` of the cloned+updated filesystem, before tinkering with
the VMs, which I'll call a `pristine` snapshot. Still with me?
1. First, clone the base dataset into a per-VM dataset, say `ssd-vol0/vpp0-0`
1. Then, generate a bunch of override files, copying them into the per-VM dataset `ssd-vol0/vpp0-0`
1. Finally, create a snapshot of that, called `ssd-vol0/vpp0-0@pristine` and boot off of that.
Now, returning the VM to a pristine state is simply a matter of shutting down the VM, performing a `zfs rollback`
to the `pristine` snapshot, and starting the VM again. Ready? Let's go!
### Generator
So off I go, writing a small Python generator that uses Jinja to read a bunch of YAML files, merging them along
the way, and then traversing a set of directories with template files and per-VM overrides, to assemble a build
output directory with a fully formed set of files that I can copy into the per-VM dataset.
Take a look at this as a minimally viable configuration:
```
pim@lab:~/src/lab$ cat config/common/generic.yaml
overlays:
  default:
    path: overlays/bird/
    build: build/default/
lab:
  mgmt:
    ipv4: 192.168.1.80/24
    ipv6: 2001:678:d78:101::80/64
    gw4: 192.168.1.252
    gw6: 2001:678:d78:101::1
  nameserver:
    search: [ "lab.ipng.ch", "ipng.ch", "rfc1918.ipng.nl", "ipng.nl" ]
  nodes: 4
pim@lab:~/src/lab$ cat config/hvn0.lab.ipng.ch.yaml
lab:
  id: 0
  ipv4: 192.168.10.0/24
  ipv6: 2001:678:d78:200::/60
  nameserver:
    addresses: [ 192.168.10.4, 2001:678:d78:201::ffff ]
  hypervisor: hvn0.lab.ipng.ch
```
Here I define a common config file with fields and attributes which will apply to all LAB environments, things
such as the mgmt network, nameserver search paths, and how many VPP virtual machine nodes I want to build. Then,
for `hvn0.lab.ipng.ch`, I specify an IPv4 and IPv6 prefix assigned to it, some specific nameserver endpoints
that will point at an `unbound` running on `lab.ipng.ch` itself.
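The generator's merge step isn't shown in this post, but the idea can be sketched in a few lines of Python (a hypothetical helper, not the actual generator code): dictionaries merge recursively, and values from the more specific per-hypervisor file win over the common file.

```python
def merge(base, override):
    """Recursively merge two config dicts; override wins on conflicts."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = value
    return out

# A few of the values from config/common/generic.yaml and config/hvn0.lab.ipng.ch.yaml
generic = {"lab": {"mgmt": {"ipv4": "192.168.1.80/24"},
                   "nameserver": {"search": ["lab.ipng.ch", "ipng.ch"]},
                   "nodes": 4}}
hvn0 = {"lab": {"id": 0, "ipv4": "192.168.10.0/24",
                "nameserver": {"addresses": ["192.168.10.4",
                                             "2001:678:d78:201::ffff"]}}}

config = merge(generic, hvn0)
print(config["lab"]["nodes"])       # 4, inherited from the common file
print(config["lab"]["nameserver"])  # has both 'search' and 'addresses' after the merge
```

The resulting dictionary is what the Jinja templates get to see, so a template can reference `lab.nodes` and `lab.nameserver.addresses` regardless of which YAML file supplied them.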
I can now create any file I'd like, which may use variable substitution and other Jinja2 style templating. Take
for example these two files:
```
pim@lab:~/src/lab$ cat overlays/bird/common/etc/netplan/01-netcfg.yaml.j2
network:
  version: 2
  renderer: networkd
  ethernets:
    enp1s0:
      optional: true
      accept-ra: false
      dhcp4: false
      addresses: [ {{node.mgmt.ipv4}}, {{node.mgmt.ipv6}} ]
      gateway4: {{lab.mgmt.gw4}}
      gateway6: {{lab.mgmt.gw6}}
pim@lab:~/src/lab$ cat overlays/bird/common/etc/netns/dataplane/resolv.conf.j2
domain lab.ipng.ch
search{% for domain in lab.nameserver.search %} {{ domain }}{% endfor %}
{% for resolver in lab.nameserver.addresses %}
nameserver {{ resolver }}
{% endfor %}
```
The first file is a [[NetPlan.io](https://netplan.io/)] configuration that substitutes the correct management
IPv4 and IPv6 addresses and gateways. The second one enumerates a set of search domains and nameservers, so that
each LAB can have its own unique resolvers. I point these at the `lab.ipng.ch` uplink interface; in the case
of the LAB `hvn0.lab.ipng.ch`, this will be 192.168.10.4 and 2001:678:d78:201::ffff, but on `hvn1.lab.ipng.ch`
I can override that to become 192.168.11.4 and 2001:678:d78:211::ffff.
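Assuming the values from the YAML files above for LAB 0, the rendered `resolv.conf` comes out as below; this is a quick plain-Python stand-in for the Jinja loops (not the generator itself), just to show the result:

```python
# Values taken from the LAB 0 config shown earlier in this post
search_domains = ["lab.ipng.ch", "ipng.ch", "rfc1918.ipng.nl", "ipng.nl"]
resolvers = ["192.168.10.4", "2001:678:d78:201::ffff"]

# Mirror the template: a 'domain' line, one 'search' line, one 'nameserver' per resolver
lines = ["domain lab.ipng.ch", "search " + " ".join(search_domains)]
lines += [f"nameserver {r}" for r in resolvers]
print("\n".join(lines))
```

which prints exactly the four lines the template would render for `vpp0-*` machines.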
There's one subdirectory for each _overlay_ type (imagine that I want a lab that runs Bird2, but I may also
want one which runs FRR, or another thing still). Within the _overlay_ directory, there's one _common_
tree, with files that apply to every machine in the LAB, and a _hostname_ tree, with files that apply
only to specific nodes (VMs) in the LAB:
```
pim@lab:~/src/lab$ tree overlays/default/
overlays/default/
├── common
│   ├── etc
│   │   ├── bird
│   │   │   ├── bfd.conf.j2
│   │   │   ├── bird.conf.j2
│   │   │   ├── ibgp.conf.j2
│   │   │   ├── ospf.conf.j2
│   │   │   └── static.conf.j2
│   │   ├── hostname.j2
│   │   ├── hosts.j2
│   │   ├── netns
│   │   │   └── dataplane
│   │   │       └── resolv.conf.j2
│   │   ├── netplan
│   │   │   └── 01-netcfg.yaml.j2
│   │   ├── resolv.conf.j2
│   │   └── vpp
│   │       ├── bootstrap.vpp.j2
│   │       └── config
│   │           ├── defaults.vpp
│   │           ├── flowprobe.vpp.j2
│   │           ├── interface.vpp.j2
│   │           ├── lcp.vpp
│   │           ├── loopback.vpp.j2
│   │           └── manual.vpp.j2
│   ├── home
│   │   └── ipng
│   └── root
└── hostname
    ├── vpp0-0
    │   └── etc
    │       └── vpp
    │           └── config
    │               └── interface.vpp
    └── (etc)
```
Now all that's left to do is generate this hierarchy, and of course I can check this in to git and track changes to the
templates and their resulting generated filesystem overrides over time:
```
pim@lab:~/src/lab$ ./generate -q --host hvn0.lab.ipng.ch
pim@lab:~/src/lab$ find build/default/hvn0.lab.ipng.ch/vpp0-0/ -type f
build/default/hvn0.lab.ipng.ch/vpp0-0/home/ipng/.ssh/authorized_keys
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/hosts
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/resolv.conf
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/bird/static.conf
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/bird/bfd.conf
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/bird/bird.conf
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/bird/ibgp.conf
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/bird/ospf.conf
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/vpp/config/loopback.vpp
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/vpp/config/flowprobe.vpp
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/vpp/config/interface.vpp
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/vpp/config/defaults.vpp
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/vpp/config/lcp.vpp
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/vpp/config/manual.vpp
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/vpp/bootstrap.vpp
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/netplan/01-netcfg.yaml
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/netns/dataplane/resolv.conf
build/default/hvn0.lab.ipng.ch/vpp0-0/etc/hostname
build/default/hvn0.lab.ipng.ch/vpp0-0/root/.ssh/authorized_keys
```
## Open vSwitch maintenance
The OVS install on each Debian hypervisor in the lab is the same. I install the required Debian packages, create
a switch fabric, add one physical network port (the one that will serve as the _uplink_, VLAN 10 in the sketch above,
for the LAB), and all the virtio ports from KVM.
```
pim@hvn0-lab:~$ sudo vi /etc/netplan/01-netcfg.yaml
network:
  vlans:
    uplink:
      optional: true
      accept-ra: false
      dhcp4: false
      link: eno1
      id: 200
pim@hvn0-lab:~$ sudo netplan apply
pim@hvn0-lab:~$ sudo apt install openvswitch-switch python3-openvswitch
pim@hvn0-lab:~$ sudo ovs-vsctl add-br vpplan
pim@hvn0-lab:~$ sudo ovs-vsctl add-port vpplan uplink tag=10
```
The `vpplan` switch fabric and its uplink port will persist across reboots. Then I make a small change to the
`libvirt`-defined virtual machines:
```
pim@hvn0-lab:~$ virsh edit vpp0-0
...
<interface type='bridge'>
  <mac address='52:54:00:00:10:00'/>
  <source bridge='vpplan'/>
  <virtualport type='openvswitch'/>
  <target dev='vpp0-0-0'/>
  <model type='virtio'/>
  <mtu size='9216'/>
  <address type='pci' domain='0x0000' bus='0x10' slot='0x00' function='0x0' multifunction='on'/>
</interface>
<interface type='bridge'>
  <mac address='52:54:00:00:10:01'/>
  <source bridge='vpplan'/>
  <virtualport type='openvswitch'/>
  <target dev='vpp0-0-1'/>
  <model type='virtio'/>
  <mtu size='9216'/>
  <address type='pci' domain='0x0000' bus='0x10' slot='0x00' function='0x1'/>
</interface>
... etc
```
The only two things I need to do are to ensure that the _source bridge_ is named the same as
the OVS fabric (in my case `vpplan`), and that the _virtualport_ type is `openvswitch`. That's it!
Once all four `vpp0-*` virtual machines each have all four of their network cards updated, the
hypervisor will add them as new untagged ports in the OVS fabric when they boot.
To then build the topology that I have in mind for the LAB, where each VPP machine is daisychained to
its sibling, all I have to do is program that into the OVS configuration:
```
pim@hvn0-lab:~$ cat << 'EOF' > ovs-config.sh
#!/bin/sh
#
# OVS configuration for the `default` overlay
LAB=${LAB:=0}

for node in 0 1 2 3; do
  for int in 0 1 2 3; do
    ovs-vsctl set port vpp${LAB}-${node}-${int} vlan_mode=native-untagged
  done
done
# Uplink is VLAN 10
ovs-vsctl add port vpp${LAB}-0-0 tag 10
ovs-vsctl add port uplink tag 10
# Link vpp${LAB}-0 <-> vpp${LAB}-1 in VLAN 20
ovs-vsctl add port vpp${LAB}-0-1 tag 20
ovs-vsctl add port vpp${LAB}-1-0 tag 20
# Link vpp${LAB}-1 <-> vpp${LAB}-2 in VLAN 21
ovs-vsctl add port vpp${LAB}-1-1 tag 21
ovs-vsctl add port vpp${LAB}-2-0 tag 21
# Link vpp${LAB}-2 <-> vpp${LAB}-3 in VLAN 22
ovs-vsctl add port vpp${LAB}-2-1 tag 22
ovs-vsctl add port vpp${LAB}-3-0 tag 22
EOF
pim@hvn0-lab:~$ chmod 755 ovs-config.sh
pim@hvn0-lab:~$ sudo ./ovs-config.sh
```
The first block here loops over all the nodes and, for each of their ports, sets the VLAN mode to what
OVS calls 'native-untagged'. In this mode, the `tag` becomes the VLAN in which the port will operate,
but to additionally add dot1q tagged VLANs, we can use the syntax `add port ... trunks 10,20,30`.
To see the configuration, `ovs-vsctl list port vpp0-0-0` will show the switch port configuration, while
`ovs-vsctl list interface vpp0-0-0` will show the virtual machine's NIC configuration (think of the
difference here as the switch port on the one hand, and the NIC (interface) plugged into it on the other).
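The daisychain follows a simple pattern: port 0 of node 0 joins the uplink VLAN, and each adjacent pair of nodes shares a point-to-point VLAN starting at 20. A small Python sketch (a hypothetical helper, mirroring what `ovs-config.sh` does in shell) computes the tag for each port:

```python
def daisychain_tags(lab=0, nodes=4, uplink_vlan=10, first_link_vlan=20):
    """Map each OVS port name to its native VLAN tag, as ovs-config.sh assigns them."""
    tags = {f"vpp{lab}-0-0": uplink_vlan}   # node 0, port 0: the uplink
    for n in range(nodes - 1):
        vlan = first_link_vlan + n          # one VLAN per adjacent node pair
        tags[f"vpp{lab}-{n}-1"] = vlan      # "east" port of node n
        tags[f"vpp{lab}-{n + 1}-0"] = vlan  # "west" port of node n+1
    return tags

for port, tag in sorted(daisychain_tags().items()):
    print(port, tag)
```

For LAB 0 this yields VLAN 10 on `vpp0-0-0`, VLAN 20 linking `vpp0-0-1` to `vpp0-1-0`, VLAN 21 linking `vpp0-1-1` to `vpp0-2-0`, and VLAN 22 linking `vpp0-2-1` to `vpp0-3-0`, exactly as the shell script does.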
### Deployment
There are three main tasks to consider when deploying these lab VMs:
1. Create the VMs and their ZFS datasets
1. Destroy the VMs and their ZFS datasets
1. Bring the VMs into a pristine state
#### Create
If the hypervisor doesn't yet have a LAB running, we need to create it:
```
BASE=${BASE:=ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0@20221019-release}
BUILD=${BUILD:=default}
LAB=${LAB:=0}
## Do not touch below this line
LABDIR=/var/lab
STAGING=$LABDIR/staging
HVN="hvn${LAB}.lab.ipng.ch"
echo "* Cloning base"
ssh root@$HVN "set -x; for node in 0 1 2 3; do VM=vpp${LAB}-\${node}; \
  mkdir -p $STAGING/\$VM; zfs clone $BASE ssd-vol0/\$VM; done"
sleep 1
echo "* Mounting in staging"
ssh root@$HVN "set -x; for node in 0 1 2 3; do VM=vpp${LAB}-\${node}; \
  mount /dev/zvol/ssd-vol0/\$VM-part1 $STAGING/\$VM; done"
echo "* Rsyncing build"
rsync -avugP build/$BUILD/$HVN/ root@hvn${LAB}.lab.ipng.ch:$STAGING
echo "* Setting permissions"
ssh root@$HVN "set -x; for node in 0 1 2 3; do VM=vpp${LAB}-\${node}; \
  chown -R root. $STAGING/\$VM/root; done"
echo "* Unmounting and snapshotting pristine state"
ssh root@$HVN "set -x; for node in 0 1 2 3; do VM=vpp${LAB}-\${node}; \
  umount $STAGING/\$VM; zfs snapshot ssd-vol0/\${VM}@pristine; done"
echo "* Starting VMs"
ssh root@$HVN "set -x; for node in 0 1 2 3; do VM=vpp${LAB}-\${node}; \
  virsh start \$VM; done"
echo "* Committing OVS config"
scp overlays/$BUILD/ovs-config.sh root@$HVN:$LABDIR
ssh root@$HVN "set -x; LAB=$LAB $LABDIR/ovs-config.sh"
```
After running this, the hypervisor will have 4 clones, and 4 snapshots (one for each virtual machine):
```
root@hvn0-lab:~# zfs list -t all
NAME USED AVAIL REFER MOUNTPOINT
ssd-vol0 6.80G 367G 24K /ssd-vol0
ssd-vol0/hvn0.chbtl0.ipng.ch 6.60G 367G 24K none
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0 6.60G 367G 24K none
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0 6.60G 367G 6.04G -
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0@20221013-release 499M - 6.04G -
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0@20221018-release 24.1M - 6.04G -
ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0@20221019-release 0B - 6.04G -
ssd-vol0/vpp0-0 43.6M 367G 6.04G -
ssd-vol0/vpp0-0@pristine 1.13M - 6.04G -
ssd-vol0/vpp0-1 25.0M 367G 6.04G -
ssd-vol0/vpp0-1@pristine 1.14M - 6.04G -
ssd-vol0/vpp0-2 42.2M 367G 6.04G -
ssd-vol0/vpp0-2@pristine 1.13M - 6.04G -
ssd-vol0/vpp0-3 79.1M 367G 6.04G -
ssd-vol0/vpp0-3@pristine 1.13M - 6.04G -
```
The last thing the create script does is commit the OVS configuration, because when the VMs are shut down
or newly created, KVM will add them to the switching fabric as untagged/unconfigured ports.
But would you look at that! The delta between the base image and the `pristine` snapshots is about 1MB of
configuration files, the ones that I generated and rsync'd in above, and then once the machine boots, it
will have a read/write mounted filesystem as per normal, except it's a delta on top of the snapshotted,
cloned dataset.
#### Destroy
I love destroying things! But in this case, I'm removing what are essentially ephemeral disk images, as
I still have the base image to clone from. But, the destroy is conceptually very simple:
```
BASE=${BASE:=ssd-vol0/hvn0.chbtl0.ipng.ch/ssd-vol0/vpp-proto-disk0@20221018-release}
LAB=${LAB:=0}
## Do not touch below this line
HVN="hvn${LAB}.lab.ipng.ch"
echo "* Destroying VMs"
ssh root@$HVN "set -x; for node in 0 1 2 3; do VM=vpp${LAB}-\${node}; \
  virsh destroy \$VM; done"
echo "* Destroying ZFS datasets"
ssh root@$HVN "set -x; for node in 0 1 2 3; do VM=vpp${LAB}-\${node}; \
  zfs destroy -r ssd-vol0/\$VM; done"
```
After running this, the VMs will be shut down and their cloned filesystems (including any snapshots
those may have) are wiped. To get back into a working state, all I have to do is run `./create` again!
#### Pristine
Sometimes though, I don't need to completely destroy the VMs, but rather I want to put them back into
the state they were in just after creating the LAB. Luckily, the create script made a snapshot (called `pristine`)
for each VM before booting it, so bringing the LAB back to _factory default_ settings is really easy:
```
BUILD=${BUILD:=default}
LAB=${LAB:=0}
## Do not touch below this line
LABDIR=/var/lab
STAGING=$LABDIR/staging
HVN="hvn${LAB}.lab.ipng.ch"
## Bring back into pristine state
echo "* Restarting VMs from pristine snapshot"
ssh root@$HVN "set -x; for node in 0 1 2 3; do VM=vpp${LAB}-\${node}; \
  virsh destroy \$VM;
  zfs rollback ssd-vol0/\${VM}@pristine;
  virsh start \$VM; done"
echo "* Committing OVS config"
scp overlays/$BUILD/ovs-config.sh root@$HVN:$LABDIR
ssh root@$HVN "set -x; $LABDIR/ovs-config.sh"
```
## Results
After completing this project, I have a completely hands-off, automated and autogenerated, and very manageable set
of three LABs, each booting up into a running OSPF/OSPFv3 enabled topology for IPv4 and IPv6:
```
pim@lab:~/src/lab$ traceroute -q1 vpp0-3
traceroute to vpp0-3 (192.168.10.3), 30 hops max, 60 byte packets
1 e0.vpp0-0.lab.ipng.ch (192.168.10.5) 1.752 ms
2 e0.vpp0-1.lab.ipng.ch (192.168.10.7) 4.064 ms
3 e0.vpp0-2.lab.ipng.ch (192.168.10.9) 5.178 ms
4 vpp0-3.lab.ipng.ch (192.168.10.3) 7.469 ms
pim@lab:~/src/lab$ ssh ipng@vpp0-3
ipng@vpp0-3:~$ traceroute6 -q1 vpp2-3
traceroute to vpp2-3 (2001:678:d78:220::3), 30 hops max, 80 byte packets
1 e1.vpp0-2.lab.ipng.ch (2001:678:d78:201::3:2) 2.088 ms
2 e1.vpp0-1.lab.ipng.ch (2001:678:d78:201::2:1) 6.958 ms
3 e1.vpp0-0.lab.ipng.ch (2001:678:d78:201::1:0) 8.841 ms
4 lab0.lab.ipng.ch (2001:678:d78:201::ffff) 7.381 ms
5 e0.vpp2-0.lab.ipng.ch (2001:678:d78:221::fffe) 8.304 ms
6 e0.vpp2-1.lab.ipng.ch (2001:678:d78:221::1:21) 11.633 ms
7 e0.vpp2-2.lab.ipng.ch (2001:678:d78:221::2:22) 13.704 ms
8 vpp2-3.lab.ipng.ch (2001:678:d78:220::3) 15.597 ms
```
If you read this far, thanks! Each of these three LABs comes with 4x10Gbit DPDK based packet generators (Cisco T-Rex),
four VPP machines running either Bird2 or FRR, and together they are connected to a 100G capable switch.
**These LABs are for rent, and we offer hands-on training on them.** Please **[contact](/s/contact/)** us for
daily/weekly rates, and custom training sessions.
I checked the generator and deploy scripts into a git repository, which I'm happy to share if there's
interest. But because it contains a few implementation details, doesn't do a lot of fool-proofing, and
most of this can easily be recreated by interested parties from this blogpost, I decided not to publish
the LAB project on GitHub, but on our private git.ipng.ch server instead. Mail us if you'd like to take a
closer look; I'm happy to share the code.