---
date: "2023-12-17T13:37:00Z"
title: Debian on IPng's VPP Routers
aliases:
- /s/articles/2023/12/17/defra0-debian.html
---

{{< image width="200px" float="right" src="/assets/debian-vpp/debian-logo.png" alt="Debian" >}}

# Introduction

When IPng Networks first built out a European network, I was running the Disaggregated Network
Operating System (DANOS) [[ref](https://www.danosproject.org/)], initially based on AT&T’s “dNOS” software
framework. Over time though, the DANOS project slowed down, and the developers with whom I had a
pretty good relationship all left for greener pastures.

In 2019, Pierre Pfister (and several others) built a VPP _router sandbox_ [[ref](https://wiki.fd.io/view/VPP_Sandbox/router)],
which graduated into a feature called the Linux Control Plane plugin
[[ref](https://s3-docs.fd.io/vpp/22.10/developer/plugins/lcp.html)]. Lots of folks put in an effort
for the Linux Control Plane, notably Neale Ranns from Cisco (these days Graphiant), and Matt Smith
and Jon Loeliger from Netgate (who ship this as TNSR [[ref](https://netgate.com/tnsr)], check it out!).
I helped as well, by adding a bunch of Netlink handling and VPP->Linux synchronization code,
which I've written about a bunch on this blog in the 2021 VPP development series [[ref]({{< ref "2021-08-12-vpp-1" >}})].

At the time, Ubuntu and CentOS were the supported platforms, so I installed a bunch of Ubuntu
machines when doing the deploy with my buddy Fred from IP-Max [[ref](https://ip-max.net)]. But as
time went by, I fell back to my old habit of running Debian on hypervisors and VMs for the services
at IPng Networks. After some time automating almost everything with Ansible and Kees, I got tired of
those places where I needed branches like `if Ubuntu then ... elif Debian then ... elif OpenBSD
then ... else panic`.

I took stock of the fleet at the end of 2023, and I found the following:

* ***OpenBSD***: 3 virtual machines, bastion jumphosts connected to Internet and IPng Site Local
* ***Ubuntu***: 4 physical machines, VPP routers (`nlams0`, `defra0`, `chplo0` and `usfmt0`)
* ***Debian***: 22 physical machines and 116 virtual machines, running internal and public services;
  almost all of these machines are entirely in IPng Site Local [[ref]({{< ref "2023-03-11-mpls-core" >}})], not connected to the
  internet at all.

It became clear to me that I could make a small sprint to standardize all physical hardware on
Debian Bookworm, and move away from Ubuntu LTS. In case you're wondering: there's **nothing wrong with
Ubuntu**, although I will admit I'm not a big fan of `snapd` and `cloud-init`, but they are easily
disabled. I guess with the way the situation evolved in AS8298, I ended up running a fair few more Debian
physical (and virtual) machines, so I'll make an executive decision to move to Debian. By the way,
the fun thing about IPng is that being the _Chief of Everything_ (COE), I get to make those calls
unilaterally :)

## Upgrading to Debian

Luckily, I already have a fair number of VPP routers that have been deployed on Debian (mostly
_Bullseye_, but one of them is _Bookworm_), and my LAB environment [[ref]({{< ref "2022-10-14-lab-1" >}})] is running Debian Bookworm as well. Although its native habitat is Ubuntu, I
regularly run VPP in a Debian environment. For example, when Adrian contributed the MPLS code
[[ref]({{< ref "2023-05-21-vpp-mpls-3" >}})], he also recommended Debian 12, because that ships
with a modern libnl which supports a few bits and pieces he needed.

### Preparations

{{< image width="300px" float="right" src="/assets/debian-vpp/defra0.png" alt="Frankfurt" >}}

OK, while my network is not large, it's also not completely devoid of customers, so instead of a
YOLO, I decide to make an action plan that roughly looks like this:

1. Notify customers of upcoming maintenance
1. For each of the routers to-be-upgraded:
    1. Check the borgmatic daily backups
    1. Drain traffic away from the router
    1. Use IPMI to re-install it remotely
    1. Put the VPP, Bird, SSH configs back
    1. Undrain the router
1. Drink my advent calendar tea!

When deploying a datacenter site, I insist on having a consistent and dependable environment. At
each site, specifically those that are a bit further away, I deploy a standard issue PCEngines
APU [[ref](https://pcengines.ch/)] with 802.11ac WiFi, serial, and IPMI access to any machine that may be
there. If you ever visit a datacenter floor where I'm present, look for an SSID like `AS8298 FRA` in the
case of the Frankfurt site. The password is `IPngGuest`, you're welcome to some bits of bandwidth in a
pinch :)

You can find the APU in the picture to the right. All the way at the top, you'll see a
small blue machine with two antennas sticking out. It's connected to my carrier, AS25091's packet
factory Cisco ASR9010, for out of band connectivity. Then, all the way at the bottom, you can see my
Supermicro SYS-5018D-FN8T called `defra0.ipng.ch` paired with a Centec MPLS switch for transport and
breakout ports 😍.

When I installed all of this kit, I did two specific things that will greatly benefit me now:

1. I enabled IPMI KVM and Serial-over-LAN on the Supermicro, so I can reach it over its dedicated
   IPMI port, and see what its VGA does. Also, in case anything weird happens to VPP and/or the
   Centec switches and IPng Site Local becomes unavailable, I can still log in and take a look via serial.
2. I installed Samba on the APU, which allows me to instruct the IPMI to insert a virtual USB 'stick' by
   means of mounting a Samba share. This is incredibly useful in scenarios such as this reinstall!

Although I do trust it, I would hate to reboot the machine to find that IPMI or serial doesn't
work. So let me make sure that the machine is still good to go:

```
pim@summer:~$ ssh -L 8443:defra0-ipmi:443 cons0.defra0
pim@cons0-defra0:~$ ipmitool -I lanplus -H defra0-ipmi -U ${IPMI_USER} -P ${IPMI_PASS} sol activate
[SOL Session operational. Use ~? for help]

defra0 login:
```

Nice going! Checking the Samba configuration, it is super straightforward:
```
pim@cons0-defra0:~$ cat /etc/samba/smb.conf
[global]
workgroup = WINSHARE
server string = Ubuntu Samba %v
netbios name = console
security = user
map to guest = bad user
dns proxy = no
server min protocol = NT1
#============================ Share Definitions ==============================

[share]
path = /var/samba
browsable = yes
writable = no
guest ok = yes
read only = yes

pim@cons0-defra0:/var/samba$ ls -lrt
total 2306000
-rw-r--r-- 1 pim pim 441450496 Feb 10 2021 danos-2012-base-amd64.iso
-rw-r--r-- 1 pim pim 1261371392 Aug 24 2021 ubuntu-20.04.3-live-server-amd64.iso
-rw-r--r-- 1 pim pim 658505728 Dec 17 17:20 debian-12.4.0-amd64-netinst.iso

pim@cons0-defra0:~$ ip -br a
internal UP 172.16.13.1/24 fd25:8c03:9b1c:100d::1/64 fe80::b49b:1cff:feb2:7f2f/64
external UP 46.20.246.50/29 2a02:2528:ff01::2/64 fe80::d8fe:8ff:fe73:8c99/64
wlp4s0 UP 172.16.14.1/24 fd25:8c03:9b1c:100e::1/64 fe80::6f0:21ff:fe9b:562e/64
```

You can see the lifecycle progression on this server. In Feb'21, I installed DANOS 20.12, then
moved to Ubuntu LTS 20.04 around Aug'21, and now it is time to advance once again, this time to
Debian 12.

As a final pre-flight check, while using the port forwarding I set up (`-L` flag above), I will log
in to the IPMI controller remotely, to insert this CD image into the virtual CDROM drive, like so:

{{< image src="/assets/debian-vpp/supermicro-ipmi.png" alt="IPMI" >}}

And indeed, it pops up in the running Ubuntu router:
```
pim@defra0:~$ uname -a
Linux defra0 5.4.0-109-generic #123-Ubuntu SMP Fri Apr 8 09:10:54 UTC 2022 x86_64 GNU/Linux
pim@defra0:~$ uptime
15:51:10 up 600 days, 17:40, 1 user, load average: 3.44, 3.30, 3.31
pim@defra0:~$ dmesg | tail -10
[51852396.194030] usb 2-4.2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[51852396.215804] usb-storage 2-4.2:1.0: USB Mass Storage device detected
[51852396.215993] scsi host6: usb-storage 2-4.2:1.0
[51852396.216107] usbcore: registered new interface driver usb-storage
[51852396.219915] usbcore: registered new interface driver uas
[51852396.232081] scsi 6:0:0:0: CD-ROM ATEN Virtual CDROM YS0J PQ: 0 ANSI: 0 CCS
[51852396.232475] scsi 6:0:0:0: Attached scsi generic sg1 type 5
[51852396.251038] sr 6:0:0:0: [sr0] scsi3-mmc drive: 40x/40x cd/rw xa/form2 cdda tray
[51852396.251047] cdrom: Uniform CD-ROM driver Revision: 3.20
[51852396.267643] sr 6:0:0:0: Attached scsi CD-ROM sr0
```

I just love it when this stuff works. And it's nice to see the happenstance of the machine being up
for 600 days. Good power, great operating system and awesome hosting provider. Thanks for the service
so far, my sweet little Ubuntu router ❤️ !

## Installing

### Drain

Considering there is live traffic on the network, typically what an operator would do is drain the
links to route around the maintenance. To do this in my case, I need to make two changes, notably
draining OSPF and eBGP.

***OSPF***: In AS8298, all backbone connections use OSPF, and typically traffic from Zurich to Amsterdam
will be over Frankfurt because the OSPF cost is slightly lower than the other way around. I've
decided to standardize the OSPF link cost to be in tenths of milliseconds. In other words, if the
latency from `chrma0` to `defra0` is 5.6 ms, the OSPF cost will be 56. One way for me to avoid using
the Frankfurt router is to make the cost of all traffic in and out of the router synthetically
high. I do this by adding 1000 to the OSPF cost.

***BGP***: But there are also a bunch of internet exchanges (such as Kleyrex, DE-CIX and LoCIX), and two IP
transit upstreams (IP-Max and Meerfarbig) connected to this router in Frankfurt. I do not want
them to send IPng any traffic here during the maintenance, so I will drain eBGP as well by setting the
groups to _shutdown_ state in Kees.

```
pim@squanchy:~/src/ipng-kees$ git diff
diff --git a/config/defra0.ipng.ch.yaml b/config/defra0.ipng.ch.yaml
index 869058c..105630c 100644
--- a/config/defra0.ipng.ch.yaml
+++ b/config/defra0.ipng.ch.yaml
@@ -151,12 +151,13 @@ vppcfg:
ospf:
xe1-0.304:
description: chrma0
- cost: 56
+ cost: 1056
xe1-1.302:
description: defra0
- cost: 61
+ cost: 1061

ebgp:
+ shutdown: true
groups:
decix_dus:
local-addresses: [ 185.1.171.43/23, 2001:7f8:9e::206a:0:1/64 ]
```

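The cost convention described above (one cost unit per tenth of a millisecond, plus 1000 while a router is drained) can be sketched in a few lines of Python. This is only an illustration: the function name `ospf_cost` and the constant `DRAIN_PENALTY` are made up for this example and are not part of Kees, and the floor of 1 is my own addition because OSPF does not allow a cost of zero.

```python
# Illustrative sketch only: these names are hypothetical, not taken from Kees.
DRAIN_PENALTY = 1000  # added to every link cost while a router is drained


def ospf_cost(latency_ms: float, drained: bool = False) -> int:
    """Return the OSPF cost for a link: its latency in tenths of a millisecond."""
    cost = max(1, round(latency_ms * 10))  # OSPF cost must be at least 1
    return cost + DRAIN_PENALTY if drained else cost


print(ospf_cost(5.6))                # normal cost for a 5.6 ms link
print(ospf_cost(5.6, drained=True))  # same link while the router is drained
```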
By raising the OSPF cost, the network will route around the machine that I want to play with:

```
pim@squanchy:~/src/ipng-kees$ traceroute nlams0.ipng.ch
traceroute to nlams0.ipng.ch (194.1.163.32), 64 hops max, 40 byte packets
1 chbtl0 (194.1.163.66) 0.492 ms 0.64 ms 0.615 ms
2 chrma0 (194.1.163.17) 1.268 ms 1.196 ms 1.194 ms
3 chplo0 (194.1.163.51) 5.682 ms 5.514 ms 5.603 ms
4 frpar0 (194.1.163.40) 14.481 ms 14.605 ms 14.58 ms
5 frggh0 (194.1.163.30) 19.545 ms 18.61 ms 18.684 ms
6 nlams0 (194.1.163.32) 47.613 ms 47.765 ms 47.584 ms
```

And by setting the sessions to _shutdown_, Kees will regenerate all of the BGP sessions
with an `export none` and a low `bgp_local_pref`, which will make the router itself stop announcing
any prefixes, for example this session in Düsseldorf:

```
@@ -25,11 +25,11 @@ protocol bgp decix_dus_56890_ipv4_1 {
source address 185.1.171.43;
neighbor 185.1.170.252 as 56890;
default bgp_med 0;
- default bgp_local_pref 200;
+ default bgp_local_pref 0; # shutdown
ipv4 {
import keep filtered;
import filter ebgp_decix_dus_56890_import;
- export filter ebgp_decix_dus_56890_export;
+ export none; # shutdown
receive limit 250000 action restart;
next hop self on;
};
```

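To make the effect of the _shutdown_ flag concrete, here is a minimal Python sketch that produces the two Bird statements that differ between a live and a drained session in the diff above. Kees' real templates are not shown in this article, so the function `bgp_policy_lines` and its parameters are purely hypothetical; only the emitted Bird statements are taken from the diff.

```python
# Hypothetical sketch: Kees' actual templating is not shown in this article.
def bgp_policy_lines(export_filter: str, local_pref: int, shutdown: bool) -> list[str]:
    """Return the two Bird statements that differ between live and drained."""
    if shutdown:
        # Drained: lowest preference for learned routes, announce nothing.
        return [
            "default bgp_local_pref 0; # shutdown",
            "export none; # shutdown",
        ]
    # Live: the normal preference and the per-peer export filter.
    return [
        f"default bgp_local_pref {local_pref};",
        f"export filter {export_filter};",
    ]


for line in bgp_policy_lines("ebgp_decix_dus_56890_export", 200, shutdown=True):
    print(line)
```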
{{< image width="80px" float="left" src="/assets/shared/warning.png" alt="Warning" >}}

This is where it's a good idea to grab some tea. Quite a few internet providers have
incredibly slow convergence, so just stopping the announcement of `AS8298:AS-IPNG` prefixes at
this internet exchange doesn't mean things get updated quickly. It makes sense to wait a few
minutes (by default I wait 15min) so that every router that might be a slow-poke (I'm looking at
you, Juniper!) has time to update its RIB and FIB.

VPP itself pretty immediately flips all of its paths to other places, and it converges a full table
of 950K IPv4 and 195K IPv6 routes in about 7 seconds or so, but not everybody has such fast CPUs in
their vendor-silicon-fancypants-router :-)

### Upgrade

The tea in my advent calendar for December 17th is _Whittard's Lemon & Ginger_ infusion, and it is
delicious. What could possibly go wrong?! Now that the router is fully drained, I start a ping to
the loopback, and flip the virtual powerswitch on the IPMI console. A few seconds later, the machine
stops pinging as expected and ... the world doesn't end, my SSH session to a hypervisor in Amsterdam
is still alive, and most importantly, Spotify is still playing music:

```
pim@squanchy:~/src/ipng-kees$ ping defra0.ipng.ch
PING defra0.ipng.ch (194.1.163.7): 56 data bytes
64 bytes from 194.1.163.7: icmp_seq=0 ttl=62 time=6.3 ms
64 bytes from 194.1.163.7: icmp_seq=1 ttl=62 time=6.5 ms
64 bytes from 194.1.163.7: icmp_seq=2 ttl=62 time=6.2 ms
...
```

I open the IPMI KVM console, hit F10 and select the CDROM option, which has my previously inserted
Debian 12 _netinst_ ISO:

{{< image src="/assets/debian-vpp/debian-ipmi.png" alt="Debian on IPMI" >}}

At this point I can't help but smile. I'm sitting here in Brüttisellen, roughly 400km south of
this computer in Frankfurt, and I am looking at the VGA output of a fresh Debian installer. Come on,
you have to admit, that's pretty slick! Installing Debian pretty precisely follows my previous VPP#7
article [[ref]({{< ref "2021-09-21-vpp-7" >}})]. I go through the installer options and a few
minutes later, it's mission accomplished. I give the router its IPv4/IPv6 address in _IPng Site
Local_, so that it has management network connectivity, and just before it wants to reboot, I
quickly edit `/etc/default/grub` to turn on serial output, just like in the article:

```
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8 isolcpus=1,2,3,5,6,7"
GRUB_TERMINAL=serial
GRUB_SERIAL_COMMAND="serial --unit=0 --speed=115200 --stop=1 --parity=no --word=8"
```

As the machine reboots, I eject the CDROM from the IPMI web interface, and attach to the
serial-over-lan interface instead. Booyah, it boots!

### Configure

On my workstation, I mount yesterday's Borg backup for the machine, because instead of doing the
whole router build over from scratch, I'm going to selectively copy a few bits and pieces over, in
the interest of time. Also, it's nice to actually use borgbackup for once, although Fred and I have
made grateful use of it in an emergency when one of IP-Max's hypervisors failed in Geneva.

```
pim@summer:~$ sudo borg mount ssh://${BORG_REPO}/defra0.ipng.ch/ /var/borgbackup/
Enter passphrase for key ssh://${BORG_REPO}/defra0.ipng.ch:

pim@summer:~$ sudo ls -l /var/borgbackup/defra0-2023-12-17T01:45:47.983599
bin boot cdrom etc home lib lib32 lib64 libx32 lost+found media mnt opt
root sbin srv tmp usr var
```

In case you're wondering why I mount the backup as root, it's because that way I can guarantee all
the correct users/permissions etc are present in the restore. I did a practice run of the
upgrade yesterday at `chplo0.ipng.ch`, so by now I think I have a pretty good handle on what needs
to happen. While connected to the freshly installed Debian Bookworm machine via serial-over-lan,
here's what I do:

```
root@defra0:~# apt install sudo rsync net-tools traceroute snmpd snmp iptables ipmitool bird2 \
lm-sensors netplan.io build-essential borgmatic unbound tcpdump \
libnl-3-200 libnl-route-3-200

root@defra0:~# adduser pim sudo
root@defra0:~# adduser pim bird
root@defra0:~# systemctl stop bird; systemctl disable bird; systemctl mask bird
root@defra0:~# sensors-detect --auto
root@defra0:~# export REPO=summer.net.ipng.ch:/var/borgbackup/defra0-2023-12-17T01:45:47.983599

root@defra0:~# mv /etc/network/interfaces /etc/network/interfaces.orig
root@defra0:~# rsync -avugP $REPO/etc/netplan/ /etc/netplan/

root@defra0:~# rm -f /etc/ssh/ssh_host*
root@defra0:~# rsync -avugP $REPO/etc/ssh/ssh_host* /etc/ssh/

root@defra0:~# rsync -avugP $REPO/etc/sysctl.d/80* /etc/sysctl.d/
root@defra0:~# rsync -avugP $REPO/etc/bird/ /etc/bird/
root@defra0:~# rsync -avugP $REPO/etc/vpp/ /etc/vpp/
root@defra0:~# rsync -avugP $REPO/etc/borgmatic/ /etc/borgmatic/
root@defra0:~# rsync -avugP $REPO/etc/rc.local /etc/rc.local
root@defra0:~# rsync -avugP $REPO/lib/systemd/system/*dataplane* /lib/systemd/system
```

I decide to selectively copy only the specific configuration files necessary to boot the dataplane.
This means the systemd services (like snmpd, sshd, and their network namespace), and all the Bird
and VPP config files. Because I prefer not to have to clear the SSH host keys, I also copy the old
SSH host keys over. And considering IPng Networks standardizes on netplan for interface config, I'll
move the Debian-default `interfaces` out of the way.

Finally, I add a few finishing touches and reboot one last time to ensure things are settled:

```
root@defra0:~# cat << EOF | tee -a /etc/modules
coretemp
mpls_router
vfio_pci
EOF
root@defra0:~# update-initramfs -k all -u
root@defra0:~# update-grub

root@defra0:~# mkdir -p /etc/systemd/system/unbound.service.d/
root@defra0:~# mkdir -p /etc/systemd/system/snmpd.service.d/
root@defra0:~# cat << EOF | tee /etc/systemd/system/unbound.service.d/override.conf
[Service]
NetworkNamespacePath=/var/run/netns/dataplane
EOF
root@defra0:~# cp /etc/systemd/system/unbound.service.d/override.conf \
/etc/systemd/system/snmpd.service.d/override.conf
root@defra0:~# reboot
```

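The two systemd drop-ins above are identical, so with more services the same step could be looped. A minimal Python sketch of that idea follows; the service names and the namespace path come from the commands above, while the function `write_netns_overrides` and its `root` parameter are invented for this example. The demo at the bottom writes into a scratch directory rather than the real `/etc`, so it is safe to run anywhere.

```python
from pathlib import Path

# The namespace path is the one used in the commands above; everything else
# in this sketch (function name, `root` parameter) is illustrative.
OVERRIDE = "[Service]\nNetworkNamespacePath=/var/run/netns/dataplane\n"


def write_netns_overrides(services, root="/etc/systemd/system"):
    """Write one override.conf per service and return the paths written."""
    written = []
    for name in services:
        dropin_dir = Path(root) / f"{name}.service.d"
        dropin_dir.mkdir(parents=True, exist_ok=True)
        path = dropin_dir / "override.conf"
        path.write_text(OVERRIDE)
        written.append(path)
    return written


if __name__ == "__main__":
    import tempfile

    # Dry run into a scratch directory instead of the real /etc.
    with tempfile.TemporaryDirectory() as scratch:
        for p in write_netns_overrides(["unbound", "snmpd"], root=scratch):
            print(p.relative_to(scratch))
```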
The machine once again comes up, and now it's loaded the VFIO and MPLS kernel modules, so I'm ready
for the grand finale, which is installing VPP at the same version as the other routers in the fleet:

```
root@defra0:~# mkdir -p /var/log/vpp/
root@defra0:~# wget -m --no-parent https://ipng.ch/media/vpp/bookworm/24.02-rc0~175-g31d4891cf/
root@defra0:~# dpkg -i ipng.ch/media/vpp/bookworm/24.02-rc0~175-g31d4891cf/*.deb
root@defra0:~# adduser pim vpp
root@defra0:~# vppctl show version
vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
```

Out of the corner of my eye, I see one of my xterms move. Hah! It's the ping I left running on
squanchy before, check it out:

```
pim@squanchy:~/src/ipng-kees$ ping defra0.ipng.ch
PING defra0.ipng.ch (194.1.163.7): 56 data bytes
64 bytes from 194.1.163.7: icmp_seq=0 ttl=62 time=6.3 ms
64 bytes from 194.1.163.7: icmp_seq=1 ttl=62 time=6.5 ms
64 bytes from 194.1.163.7: icmp_seq=2 ttl=62 time=6.2 ms
...
64 bytes from 194.1.163.7: icmp_seq=1484 ttl=62 time=6.5 ms
64 bytes from 194.1.163.7: icmp_seq=1485 ttl=62 time=6.6 ms
64 bytes from 194.1.163.7: icmp_seq=1486 ttl=62 time=6.8 ms
```

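The gap in `icmp_seq` gives a rough measure of how long the router was away. The sketch below rests on two assumptions that the transcript does not state outright: that `ping` was sending at its default one packet per second, and that `icmp_seq=2` was the last reply before the reboot (the log elides the lines in between). The helper name `outage_seconds` is made up for the example.

```python
def outage_seconds(last_seq_before: int, first_seq_after: int, interval_s: float = 1.0) -> float:
    """Estimate how long replies were missing from the gap in icmp_seq."""
    return (first_seq_after - last_seq_before) * interval_s


# Sequence numbers taken from the ping transcript above.
gap = outage_seconds(last_seq_before=2, first_seq_after=1484)
print(f"{gap:.0f} seconds, about {gap / 60:.1f} minutes")
```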
One think-o I made is that the Bird configs that I just put back from the backup were those from before I
set the drains (remember, raising the OSPF cost and setting the eBGP sessions to _shutdown_) so they
are now all alive again. But it's all good - the dataplane came up, Bird2 came up and formed OSPF
and OSPFv3 adjacencies a few seconds later, and BGP sessions all shot to life. I take a quick look
at the state of the dataplane to make sure I'm not accidentally introducing a broken router:

```
pim@defra0:~$ birdc show route count
BIRD 2.0.12 ready.
6782372 of 6782372 routes for 958020 networks in table master4
1848350 of 1848350 routes for 198255 networks in table master6
1620753 of 1620753 routes for 405189 networks in table t_roa4
367875 of 367875 routes for 91969 networks in table t_roa6
Total: 10619350 of 10619350 routes for 1653433 networks in 4 tables

pim@defra0:~$ vppctl show ip fib summary | awk '{ TOTAL += $2 } END { print TOTAL }'
958664
pim@defra0:~$ vppctl show ip6 fib summary | awk '{ TOTAL += $2 } END { print TOTAL }'
198322
```

OK, looking at the output I can conclude that my think-o was benign and the router has all routes
accounted for in the RIB, it has slurped in the RPKI tables, and it has successfully transferred all
of this into VPP's FIB. So this entire upgrade took 1482 seconds, which is just under 25 minutes.
***Gnarly!***

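The eyeball comparison between Bird's RIB and VPP's FIB can also be done mechanically. The Python sketch below parses the two `master` table lines from the `birdc` output and compares them with the totals printed by the `awk` one-liners. The "FIB has at least as many entries as the RIB has networks" rule is my own simple heuristic for this example, not an official health check, and the function names are invented.

```python
import re

# Two lines copied from the `birdc show route count` output above.
BIRDC_OUTPUT = """\
6782372 of 6782372 routes for 958020 networks in table master4
1848350 of 1848350 routes for 198255 networks in table master6
"""

# FIB totals as printed by the two awk one-liners above.
FIB_ENTRIES = {"master4": 958664, "master6": 198322}

LINE = re.compile(r"(\d+) of (\d+) routes for (\d+) networks in table (\S+)")


def networks_per_table(text: str) -> dict[str, int]:
    """Map each Bird table name to its number of networks."""
    return {m.group(4): int(m.group(3)) for m in LINE.finditer(text)}


def fib_covers_rib(rib: dict[str, int], fib: dict[str, int]) -> bool:
    """True if every table has at least as many FIB entries as RIB networks."""
    return all(fib.get(table, 0) >= count for table, count in rib.items())


rib = networks_per_table(BIRDC_OUTPUT)
print(rib, fib_covers_rib(rib, FIB_ENTRIES))
```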
### Post Install

The machine is up and running, and there's one last thing for me to do, which is perform an Ansible
run to make sure that the whole machine is configured correctly (for example, the correct access
list for _Unbound_, the correct IPv4/IPv6 firewall for the Linux controlplane, the correct SSH
daemon options, working mailer and NTP daemon, et cetera).

So I fire off a one-shot Ansible playbook run, and it pokes and prods the machine a bit:

{{< image src="/assets/debian-vpp/ansible.png" alt="Ansible" >}}

Now the machine is completely up-to-snuff, its latest VPP SNMP agent, Prometheus exporter, Bird
exporter, and so on are all good. I check LibreNMS and indeed, the machine is back with half an
hour or so of monitoring data missing. I'm still grinning as I write this, as most Juniper and Cisco
_firmware_ upgrades take more than 30min, while for me the whole thing from start to finish was less
than that.

## Results

{{< image src="/assets/debian-vpp/smokeping.png" alt="Smokeping" >}}

This article describes how I managed to upgrade the entire network of routers, remotely, from the
comfort of my home, while sipping tea, and without having a single network outage. The bump in the
graph is the moment at which I drained `defra0` and traffic from the monitoring machine at `nlams0`
had to go via France to my house at `chbtl0`. No packets were lost in the making of this upgrade!

Yesterday I practiced on `chplo0`, and today for this article I did `defra0`, after which I also
did the last remaining router `nlams0`. Every router is now up to date running Debian Bookworm as
well as VPP version 24.02 (including a bunch of desirable fixes for IPFIX/Flowprobe):

```
pim@squanchy:~/src/ipng-kees$ ./doall.sh 'echo -n $(hostname -s):\ ; vppctl show version'
chbtl0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
chbtl1: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
chgtg0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
chplo0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
chrma0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
ddln0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
ddln1: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
defra0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
frggh0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
frpar0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
nlams0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
usfmt0: vpp v24.02-rc0~175-g31d4891cf built by pim on bullseye-builder at 2023-12-09T16:27:33
```

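With a dozen lines this is easy to scan by eye, but on a larger fleet a small script can group the hosts by build environment and make any outlier stand out. The following Python sketch does that for three lines copied from the output above. The function `group_by_builder` is invented for this example, and it assumes the banner keeps the `built by ... on BUILDER at TIMESTAMP` shape seen here.

```python
from collections import defaultdict

# Three lines copied from the doall.sh output above.
SAMPLE = """\
defra0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
nlams0: vpp v24.02-rc0~175-g31d4891cf built by pim on bookworm-builder at 2023-12-09T12:54:52
usfmt0: vpp v24.02-rc0~175-g31d4891cf built by pim on bullseye-builder at 2023-12-09T16:27:33
"""


def group_by_builder(text: str) -> dict[str, list[str]]:
    """Group hostnames by the build host named between ' on ' and ' at '."""
    groups = defaultdict(list)
    for line in text.splitlines():
        host, _, rest = line.partition(": ")
        if " on " not in rest:
            continue  # skip lines that do not look like a version banner
        builder = rest.split(" on ", 1)[1].split(" at ", 1)[0]
        groups[builder].append(host)
    return dict(groups)


print(group_by_builder(SAMPLE))
```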
For the hawk-eyed: yes, `usfmt0` has not been done. I don't have a Supermicro with IPMI there, so the
next time I visit California, I'll make a stop at the local Hurricane Electric datacenter to upgrade
that last one :-)