Merge branch 'main' of git.ipng.ch:ipng/ipng.ch
All checks were successful
continuous-integration/drone/push Build is passing
464
content/articles/2025-05-03-containerlab-1.md
Normal file
@ -0,0 +1,464 @@
|
||||
---
|
||||
date: "2025-05-03T15:07:23Z"
|
||||
title: 'VPP in Containerlab - Part 1'
|
||||
---
|
||||
|
||||
{{< image float="right" src="/assets/containerlab/containerlab.svg" alt="Containerlab Logo" width="12em" >}}
|
||||
|
||||
# Introduction
|
||||
|
||||
From time to time the subject of containerized VPP instances comes up. At IPng, I run the routers in
AS8298 on bare metal (Supermicro and Dell hardware), as it allows me to maximize performance.
However, VPP is quite friendly in virtualization. Notably, it runs really well in virtual machines
under hypervisors like Qemu/KVM or VMware. I can pass through PCI devices directly to the guest, and
use CPU pinning to allow the guest virtual machine access to the underlying physical hardware. In
such a mode, VPP performance is almost the same as on bare metal. But did you know that VPP can also
run in Docker?
|
||||
|
||||
The other day I attended [[ZANOG'25](https://nog.net.za/event1/zanog25/)] in Durban, South Africa.
One of the presenters was Nardus le Roux of Nokia, and he showed off a project called
[[Containerlab](https://containerlab.dev/)], which provides a CLI for orchestrating and managing
container-based networking labs. It starts the containers, builds virtual wiring between them to
create lab topologies of the user's choice, and manages the lab lifecycle.
|
||||
|
||||
Quite regularly I am asked 'when will you add VPP to Containerlab?', and at ZANOG I made a promise
to actually add it. Here I go, on a journey to integrate VPP into Containerlab!
|
||||
|
||||
## Containerized VPP
|
||||
|
||||
The folks at [[Tigera](https://www.tigera.io/project-calico/)] maintain a project called _Calico_,
|
||||
which accelerates Kubernetes CNI (Container Network Interface) by using [[FD.io](https://fd.io)]
|
||||
VPP. Since the origins of Kubernetes are to run containers in a Docker environment, it stands to
|
||||
reason that it should be possible to run a containerized VPP. I start by reading up on how they
|
||||
create their Docker image, and I learn a lot.
|
||||
|
||||
### Docker Build
|
||||
|
||||
Considering IPng runs bare metal Debian (currently Bookworm) machines, my Docker image will be based
|
||||
on `debian:bookworm` as well. The build starts off quite modest:
|
||||
|
||||
```
|
||||
pim@summer:~$ mkdir -p src/vpp-containerlab
|
||||
pim@summer:~/src/vpp-containerlab$ cat << 'EOF' > Dockerfile.bookworm
|
||||
FROM debian:bookworm
|
||||
ARG DEBIAN_FRONTEND=noninteractive
|
||||
ARG VPP_INSTALL_SKIP_SYSCTL=true
|
||||
ARG REPO=release
|
||||
RUN apt-get update && apt-get -y install curl procps && apt-get clean
|
||||
|
||||
# Install VPP
|
||||
RUN curl -s https://packagecloud.io/install/repositories/fdio/${REPO}/script.deb.sh | bash
|
||||
RUN apt-get update && apt-get -y install vpp vpp-plugin-core && apt-get clean
|
||||
|
||||
CMD ["/usr/bin/vpp","-c","/etc/vpp/startup.conf"]
|
||||
EOF
|
||||
pim@summer:~/src/vpp-containerlab$ docker build -f Dockerfile.bookworm . -t pimvanpelt/vpp-containerlab
|
||||
```
|
||||
|
||||
One gotcha - when I install the upstream VPP Debian packages, they generate a `sysctl` file which the
package's post-install script then tries to apply. However, I can't set sysctls in the container, so
the build fails. I take a look at the VPP source code and find `src/pkg/debian/vpp.postinst`, which
helpfully contains a means to skip setting the sysctls, using an environment variable called
`VPP_INSTALL_SKIP_SYSCTL`. That is why the Dockerfile declares it as an `ARG`: build arguments are
visible as environment variables to the `RUN` steps that follow.
|
||||
|
||||
### Running VPP in Docker
|
||||
|
||||
With the Docker image built, I need to tweak the VPP startup configuration a little bit, to allow it
|
||||
to run well in a Docker environment. There are a few things I make note of:
|
||||
1. We may not have huge pages on the host machine, so I'll set all the page sizes to the
|
||||
linux-default 4kB rather than 2MB or 1GB hugepages. This creates a performance regression, but
|
||||
in the case of Containerlab, we're not here to build high performance stuff, but rather users
|
||||
will be doing functional testing.
|
||||
1. DPDK requires either UIO or VFIO kernel drivers, so that it can bind its so-called _poll mode
   driver_ to the network cards. It also requires huge pages. Since my first version will be
   using only virtual ethernet interfaces, I'll disable DPDK and VFIO altogether.
|
||||
1. VPP can run any number of CPU worker threads. In its simplest form, I can also run it with only
|
||||
one thread. Of course, this will not be a high performance setup, but since I'm already not
|
||||
using hugepages, I'll use only 1 thread.
|
||||
|
||||
The VPP `startup.conf` configuration file I came up with:
|
||||
|
||||
```
|
||||
pim@summer:~/src/vpp-containerlab$ cat << 'EOF' > clab-startup.conf
|
||||
unix {
|
||||
interactive
|
||||
log /var/log/vpp/vpp.log
|
||||
full-coredump
|
||||
cli-listen /run/vpp/cli.sock
|
||||
cli-prompt vpp-clab#
|
||||
cli-no-pager
|
||||
poll-sleep-usec 100
|
||||
}
|
||||
|
||||
api-trace {
|
||||
on
|
||||
}
|
||||
|
||||
memory {
|
||||
main-heap-size 512M
|
||||
main-heap-page-size 4k
|
||||
}
|
||||
buffers {
|
||||
buffers-per-numa 16000
|
||||
default data-size 2048
|
||||
page-size 4k
|
||||
}
|
||||
|
||||
statseg {
|
||||
size 64M
|
||||
page-size 4k
|
||||
per-node-counters on
|
||||
}
|
||||
|
||||
plugins {
|
||||
plugin default { enable }
|
||||
plugin dpdk_plugin.so { disable }
|
||||
}
|
||||
EOF
|
||||
```
|
||||
|
||||
Just a couple of notes for those who are running VPP in production. Each of the `*-page-size` config
settings takes the normal Linux page size of 4kB, which effectively prevents VPP from using any
hugepages. Then, I'll explicitly disable the DPDK plugin, although I didn't install it in the
Dockerfile build, as it lives in its own dedicated Debian package called `vpp-plugin-dpdk`. Finally,
I'll make VPP use less CPU by telling it to sleep for 100 microseconds between each poll iteration.
In production environments, VPP will use 100% of the CPUs it's assigned, but in this lab, it will
not be quite as hungry. By the way, even in this sleepy mode, it'll still easily handle a gigabit
of traffic!
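
As a rough sanity check on that last claim, here is a back-of-the-envelope sketch of how many packets
pile up between two polls at 1 Gbit/s. It rests on assumptions that are not from the measurements
above: 20 bytes of preamble and inter-frame gap per Ethernet frame on the wire, a poll loop that
wakes up exactly every 100 microseconds (in practice it will be somewhat slower), and a VPP vector
size of 256 packets. The function names are made up for illustration.

```bash
#!/usr/bin/env bash
# Back-of-the-envelope: packets arriving between two polls at 1 Gbit/s.
# Assumptions: 20 bytes of preamble + inter-frame gap per frame, and an
# idealised poll loop that wakes up exactly every POLL_SLEEP_USEC.

LINK_BPS=1000000000      # 1 Gbit/s
POLL_SLEEP_USEC=100      # poll-sleep-usec from startup.conf

# Packets per second for a given on-wire frame size in bytes.
pps_for_wire_bytes() {
  echo $(( LINK_BPS / ($1 * 8) ))
}

# Upper bound on poll iterations per second (ignores processing time).
polls_per_sec() {
  echo $(( 1000000 / POLL_SLEEP_USEC ))
}

# Average number of packets that accumulate between two polls.
pkts_per_poll() {
  echo $(( $(pps_for_wire_bytes "$1") / $(polls_per_sec) ))
}

# 1518 byte frames occupy 1538 bytes on the wire, 64 byte frames occupy 84.
echo "1518B frames: $(pps_for_wire_bytes 1538) pps, ~$(pkts_per_poll 1538) packets per poll"
echo "  64B frames: $(pps_for_wire_bytes 84) pps, ~$(pkts_per_poll 84) packets per poll"
```

Under these assumptions, both the large-frame and the small-frame case stay below 256 packets per
poll, which is consistent with a sleepy VPP keeping up with a gigabit, provided the interface rings
can absorb what arrives during the sleep.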
|
||||
|
||||
Now, VPP wants to run as root and it needs a few host features, notably tuntap devices and vhost,
and a few capabilities, notably NET_ADMIN, SYS_NICE and SYS_PTRACE. I take a look at the
[[manpage](https://man7.org/linux/man-pages/man7/capabilities.7.html)]:
|
||||
* ***CAP_SYS_NICE***: allows to set real-time scheduling, CPU affinity, I/O scheduling class, and
|
||||
to migrate and move memory pages.
|
||||
* ***CAP_NET_ADMIN***: allows to perform various network-related operations like interface
  configs, routing tables, nested network namespaces, multicast, set promiscuous mode, and so on.
|
||||
* ***CAP_SYS_PTRACE***: allows to trace arbitrary processes using `ptrace(2)`, and a few related
|
||||
kernel system calls.
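
To check whether a process actually holds these three capabilities, one option is to decode the
`CapEff` bitmask from `/proc/<pid>/status`. The sketch below is only an illustration: the helper
names `has_cap` and `report_caps` are made up, while the bit numbers (12, 19 and 23) are the ones
defined in `<linux/capability.h>`.

```bash
#!/usr/bin/env bash
# Decode a hexadecimal capability mask, as found in /proc/<pid>/status.
# Bit numbers come from <linux/capability.h>:
#   CAP_NET_ADMIN=12, CAP_SYS_PTRACE=19, CAP_SYS_NICE=23

# has_cap MASK BIT: succeeds if BIT is set in the hexadecimal MASK.
has_cap() {
  local mask=$1 bit=$2
  (( (0x$mask >> bit) & 1 ))
}

# report_caps MASK: prints yes/no for the three capabilities discussed above.
report_caps() {
  local mask=$1 entry
  for entry in NET_ADMIN:12 SYS_PTRACE:19 SYS_NICE:23; do
    if has_cap "$mask" "${entry#*:}"; then
      echo "${entry%%:*}: yes"
    else
      echo "${entry%%:*}: no"
    fi
  done
}

# Example mask with exactly bits 12, 19 and 23 set.
report_caps 0000000000881000
```

Inside a container, the effective mask of the current shell can be fed in with
`report_caps "$(awk '/^CapEff:/ {print $2}' /proc/self/status)"`.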
|
||||
|
||||
Being a networking dataplane implementation, VPP wants to be able to tinker with network devices.
This is not typically allowed in Docker containers, although the Docker developers did make some
concessions for those containers that need just that little bit more access. They describe it in
their
[[docs](https://docs.docker.com/engine/containers/run/#runtime-privilege-and-linux-capabilities)] as
follows:
|
||||
|
||||
> The --privileged flag gives all capabilities to the container. When the operator executes docker
> run --privileged, Docker enables access to all devices on the host, and reconfigures AppArmor or
> SELinux to allow the container nearly all the same access to the host as processes running outside
> containers on the host. Use this flag with caution. For more information about the --privileged
> flag, see the docker run reference.
|
||||
|
||||
{{< image width="4em" float="left" src="/assets/shared/warning.png" alt="Warning" >}}
|
||||
At this point, I feel I should point out that running a Docker container with the `--privileged` flag
set does give it _a lot_ of privileges. A container with `--privileged` is not a securely sandboxed
process. Containers in this mode can get a root shell on the host and take control over the system.
|
||||
|
||||
With that little fine-print warning out of the way, I am going to Yolo like a boss:
|
||||
|
||||
```
|
||||
pim@summer:~/src/vpp-containerlab$ docker run --name clab-pim \
|
||||
--cap-add=NET_ADMIN --cap-add=SYS_NICE --cap-add=SYS_PTRACE \
|
||||
--device=/dev/net/tun:/dev/net/tun --device=/dev/vhost-net:/dev/vhost-net \
|
||||
--privileged -v $(pwd)/clab-startup.conf:/etc/vpp/startup.conf:ro \
|
||||
docker.io/pimvanpelt/vpp-containerlab
|
||||
clab-pim
|
||||
```
|
||||
|
||||
### Configuring VPP in Docker
|
||||
|
||||
And with that, the Docker container is running! I post a screenshot on
|
||||
[[Mastodon](https://ublog.tech/@IPngNetworks/114392852468494211)] and my buddy John responds with a
|
||||
polite but firm insistence that I explain myself. Here you go, buddy :)
|
||||
|
||||
In another terminal, I can play around with this VPP instance a little bit:
|
||||
```
|
||||
pim@summer:~$ docker exec -it clab-pim bash
|
||||
root@d57c3716eee9:/# ip -br l
|
||||
lo UNKNOWN 00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
|
||||
eth0@if530566 UP 02:42:ac:11:00:02 <BROADCAST,MULTICAST,UP,LOWER_UP>
|
||||
|
||||
root@d57c3716eee9:/# ps auxw
|
||||
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
|
||||
root 1 2.2 0.2 17498852 160300 ? Rs 15:11 0:00 /usr/bin/vpp -c /etc/vpp/startup.conf
|
||||
root 10 0.0 0.0 4192 3388 pts/0 Ss 15:11 0:00 bash
|
||||
root 18 0.0 0.0 8104 4056 pts/0 R+ 15:12 0:00 ps auxw
|
||||
|
||||
root@d57c3716eee9:/# vppctl
|
||||
_______ _ _ _____ ___
|
||||
__/ __/ _ \ (_)__ | | / / _ \/ _ \
|
||||
_/ _// // / / / _ \ | |/ / ___/ ___/
|
||||
/_/ /____(_)_/\___/ |___/_/ /_/
|
||||
|
||||
vpp-clab# show version
|
||||
vpp v25.02-release built by root on d5cd2c304b7f at 2025-02-26T13:58:32
|
||||
vpp-clab# show interfaces
|
||||
Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count
|
||||
local0 0 down 0/0/0/0
|
||||
```
|
||||
|
||||
Slick! I can see that the container has an `eth0` device, which Docker has connected to the main
bridged network. For now, there's only one process running: pid 1 proudly shows VPP (as in Docker,
the `CMD` field will simply replace `init`). Later on, I can imagine running a few more daemons like
SSH and so on, but for now, I'm happy.
|
||||
|
||||
Looking at VPP itself, it has no network interfaces yet, except for the default `local0` interface.
|
||||
|
||||
### Adding Interfaces in Docker
|
||||
|
||||
But if I don't have DPDK, how will I add interfaces? Enter `veth(4)`. From the
|
||||
[[manpage](https://man7.org/linux/man-pages/man4/veth.4.html)], I learn that veth devices are
|
||||
virtual Ethernet devices. They can act as tunnels between network namespaces to create a bridge to
|
||||
a physical network device in another namespace, but can also be used as standalone network devices.
|
||||
veth devices are always created in interconnected pairs.
|
||||
|
||||
Of course, Docker users will recognize this. It's like bread and butter for containers to
|
||||
communicate with one another - and with the host they're running on. I can simply create a Docker
|
||||
network and attach one half of it to a running container, like so:
|
||||
|
||||
```
|
||||
pim@summer:~$ docker network create --driver=bridge clab-network \
|
||||
--subnet 192.0.2.0/24 --ipv6 --subnet 2001:db8::/64
|
||||
5711b95c6c32ac0ed185a54f39e5af4b499677171ff3d00f99497034e09320d2
|
||||
pim@summer:~$ docker network connect clab-network clab-pim --ip '' --ip6 ''
|
||||
```
|
||||
|
||||
The first command here creates a new network called `clab-network` in Docker. As a result, a new
|
||||
bridge called `br-5711b95c6c32` shows up on the host. The bridge name is chosen from the UUID of the
|
||||
Docker object. Seeing as I added an IPv4 and IPv6 subnet to the bridge, it gets configured with the
|
||||
first address in both:
|
||||
|
||||
```
|
||||
pim@summer:~/src/vpp-containerlab$ brctl show br-5711b95c6c32
|
||||
bridge name bridge id STP enabled interfaces
|
||||
br-5711b95c6c32 8000.0242099728c6 no veth021e363
|
||||
|
||||
|
||||
pim@summer:~/src/vpp-containerlab$ ip -br a show dev br-5711b95c6c32
|
||||
br-5711b95c6c32 UP 192.0.2.1/24 2001:db8::1/64 fe80::42:9ff:fe97:28c6/64 fe80::1/64
|
||||
```
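
The naming rule can be sketched in a couple of lines of shell. Judging by the output above, the
bridge name is `br-` followed by the first 12 characters of the network ID. The helper name
`bridge_name` is made up for illustration, and the rule is inferred from this one example rather
than taken from Docker's documentation.

```bash
#!/usr/bin/env bash
# bridge_name NETWORK_ID: derive the host bridge name from a Docker network ID,
# assuming the "br-" + first 12 characters pattern observed above.
bridge_name() {
  local network_id=$1
  echo "br-${network_id:0:12}"
}

# The ID printed by 'docker network create' earlier. On a live system it could
# be looked up with: docker network inspect --format '{{.Id}}' clab-network
bridge_name 5711b95c6c32ac0ed185a54f39e5af4b499677171ff3d00f99497034e09320d2
```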
|
||||
|
||||
The second command creates a `veth` pair, and puts one half of it in the bridge, and this interface
|
||||
is called `veth021e363` above. The other half of it pops up as `eth1` in the Docker container:
|
||||
|
||||
```
|
||||
pim@summer:~/src/vpp-containerlab$ docker exec -it clab-pim bash
|
||||
root@d57c3716eee9:/# ip -br l
|
||||
lo UNKNOWN 00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
|
||||
eth0@if530566 UP 02:42:ac:11:00:02 <BROADCAST,MULTICAST,UP,LOWER_UP>
|
||||
eth1@if530577 UP 02:42:c0:00:02:02 <BROADCAST,MULTICAST,UP,LOWER_UP>
|
||||
```
|
||||
|
||||
One of the many awesome features of VPP is its ability to attach to these `veth` devices by means of
|
||||
its `af-packet` driver, by reusing the same MAC address (in this case `02:42:c0:00:02:02`). I first
|
||||
take a look at the Linux [[manpage](https://man7.org/linux/man-pages/man7/packet.7.html)] for it,
|
||||
and then read up on the VPP
|
||||
[[documentation](https://fd.io/docs/vpp/v2101/gettingstarted/progressivevpp/interface)] on the
|
||||
topic.
|
||||
|
||||
|
||||
However, my attention is drawn to Docker assigning an IPv4 and IPv6 address to the container:
|
||||
```
|
||||
root@d57c3716eee9:/# ip -br a
|
||||
lo UNKNOWN 127.0.0.1/8 ::1/128
|
||||
eth0@if530566 UP 172.17.0.2/16
|
||||
eth1@if530577 UP 192.0.2.2/24 2001:db8::2/64 fe80::42:c0ff:fe00:202/64
|
||||
root@d57c3716eee9:/# ip addr del 192.0.2.2/24 dev eth1
|
||||
root@d57c3716eee9:/# ip addr del 2001:db8::2/64 dev eth1
|
||||
```
|
||||
|
||||
I decide to remove them from here, as in the end, `eth1` will be owned by VPP so _it_ should be
setting the IPv4 and IPv6 addresses. For the life of me, I don't see how I can prevent Docker from
assigning IPv4 and IPv6 addresses to this container ... and the
[[docs](https://docs.docker.com/engine/network/)] seem to be off as well, as they suggest I can pass
a flag `--ipv4=False` but that flag doesn't exist, at least not on my Bookworm Docker variant. I
make a mental note to discuss this with the folks in the Containerlab community.
|
||||
|
||||
|
||||
Anyway, armed with this knowledge I can bind the container-side half of the veth pair, called `eth1`,
to VPP, like so:
|
||||
|
||||
```
|
||||
root@d57c3716eee9:/# vppctl
|
||||
_______ _ _ _____ ___
|
||||
__/ __/ _ \ (_)__ | | / / _ \/ _ \
|
||||
_/ _// // / / / _ \ | |/ / ___/ ___/
|
||||
/_/ /____(_)_/\___/ |___/_/ /_/
|
||||
|
||||
vpp-clab# create host-interface name eth1 hw-addr 02:42:c0:00:02:02
|
||||
vpp-clab# set interface name host-eth1 eth1
|
||||
vpp-clab# set interface mtu 1500 eth1
|
||||
vpp-clab# set interface ip address eth1 192.0.2.2/24
|
||||
vpp-clab# set interface ip address eth1 2001:db8::2/64
|
||||
vpp-clab# set interface state eth1 up
|
||||
vpp-clab# show int addr
|
||||
eth1 (up):
|
||||
L3 192.0.2.2/24
|
||||
L3 2001:db8::2/64
|
||||
local0 (dn):
|
||||
```
|
||||
|
||||
## Results
|
||||
|
||||
After all this work, I've successfully created a Docker image based on Debian Bookworm and VPP 25.02
(the current stable release version), started a container with it, and added a network bridge in
Docker which connects the host `summer` to the container. Proof, as they say, is in the ping-pudding:
|
||||
|
||||
```
|
||||
pim@summer:~/src/vpp-containerlab$ ping -c5 2001:db8::2
|
||||
PING 2001:db8::2(2001:db8::2) 56 data bytes
|
||||
64 bytes from 2001:db8::2: icmp_seq=1 ttl=64 time=0.113 ms
|
||||
64 bytes from 2001:db8::2: icmp_seq=2 ttl=64 time=0.056 ms
|
||||
64 bytes from 2001:db8::2: icmp_seq=3 ttl=64 time=0.202 ms
|
||||
64 bytes from 2001:db8::2: icmp_seq=4 ttl=64 time=0.102 ms
|
||||
64 bytes from 2001:db8::2: icmp_seq=5 ttl=64 time=0.100 ms
|
||||
|
||||
--- 2001:db8::2 ping statistics ---
|
||||
5 packets transmitted, 5 received, 0% packet loss, time 4098ms
|
||||
rtt min/avg/max/mdev = 0.056/0.114/0.202/0.047 ms
|
||||
pim@summer:~/src/vpp-containerlab$ ping -c5 192.0.2.2
|
||||
PING 192.0.2.2 (192.0.2.2) 56(84) bytes of data.
|
||||
64 bytes from 192.0.2.2: icmp_seq=1 ttl=64 time=0.043 ms
|
||||
64 bytes from 192.0.2.2: icmp_seq=2 ttl=64 time=0.032 ms
|
||||
64 bytes from 192.0.2.2: icmp_seq=3 ttl=64 time=0.019 ms
|
||||
64 bytes from 192.0.2.2: icmp_seq=4 ttl=64 time=0.041 ms
|
||||
64 bytes from 192.0.2.2: icmp_seq=5 ttl=64 time=0.027 ms
|
||||
|
||||
--- 192.0.2.2 ping statistics ---
|
||||
5 packets transmitted, 5 received, 0% packet loss, time 4063ms
|
||||
rtt min/avg/max/mdev = 0.019/0.032/0.043/0.008 ms
|
||||
```
|
||||
|
||||
And in case that simple ping-test wasn't enough to get you excited, here's a packet trace from VPP
|
||||
itself, while I'm performing this ping:
|
||||
|
||||
```
|
||||
vpp-clab# trace add af-packet-input 100
|
||||
vpp-clab# wait 3
|
||||
vpp-clab# show trace
|
||||
------------------- Start of thread 0 vpp_main -------------------
|
||||
Packet 1
|
||||
|
||||
00:07:03:979275: af-packet-input
|
||||
af_packet: hw_if_index 1 rx-queue 0 next-index 4
|
||||
block 47:
|
||||
address 0x7fbf23b7d000 version 2 seq_num 48 pkt_num 0
|
||||
tpacket3_hdr:
|
||||
status 0x20000001 len 98 snaplen 98 mac 92 net 106
|
||||
sec 0x68164381 nsec 0x258e7659 vlan 0 vlan_tpid 0
|
||||
vnet-hdr:
|
||||
flags 0x00 gso_type 0x00 hdr_len 0
|
||||
gso_size 0 csum_start 0 csum_offset 0
|
||||
00:07:03:979293: ethernet-input
|
||||
IP4: 02:42:09:97:28:c6 -> 02:42:c0:00:02:02
|
||||
00:07:03:979306: ip4-input
|
||||
ICMP: 192.0.2.1 -> 192.0.2.2
|
||||
tos 0x00, ttl 64, length 84, checksum 0x5e92 dscp CS0 ecn NON_ECN
|
||||
fragment id 0x5813, flags DONT_FRAGMENT
|
||||
ICMP echo_request checksum 0xc16 id 21197
|
||||
00:07:03:979315: ip4-lookup
|
||||
fib 0 dpo-idx 9 flow hash: 0x00000000
|
||||
ICMP: 192.0.2.1 -> 192.0.2.2
|
||||
tos 0x00, ttl 64, length 84, checksum 0x5e92 dscp CS0 ecn NON_ECN
|
||||
fragment id 0x5813, flags DONT_FRAGMENT
|
||||
ICMP echo_request checksum 0xc16 id 21197
|
||||
00:07:03:979322: ip4-receive
|
||||
fib:0 adj:9 flow:0x00000000
|
||||
ICMP: 192.0.2.1 -> 192.0.2.2
|
||||
tos 0x00, ttl 64, length 84, checksum 0x5e92 dscp CS0 ecn NON_ECN
|
||||
fragment id 0x5813, flags DONT_FRAGMENT
|
||||
ICMP echo_request checksum 0xc16 id 21197
|
||||
00:07:03:979323: ip4-icmp-input
|
||||
ICMP: 192.0.2.1 -> 192.0.2.2
|
||||
tos 0x00, ttl 64, length 84, checksum 0x5e92 dscp CS0 ecn NON_ECN
|
||||
fragment id 0x5813, flags DONT_FRAGMENT
|
||||
ICMP echo_request checksum 0xc16 id 21197
|
||||
00:07:03:979323: ip4-icmp-echo-request
|
||||
ICMP: 192.0.2.1 -> 192.0.2.2
|
||||
tos 0x00, ttl 64, length 84, checksum 0x5e92 dscp CS0 ecn NON_ECN
|
||||
fragment id 0x5813, flags DONT_FRAGMENT
|
||||
ICMP echo_request checksum 0xc16 id 21197
|
||||
00:07:03:979326: ip4-load-balance
|
||||
fib 0 dpo-idx 5 flow hash: 0x00000000
|
||||
ICMP: 192.0.2.2 -> 192.0.2.1
|
||||
tos 0x00, ttl 64, length 84, checksum 0x88e1 dscp CS0 ecn NON_ECN
|
||||
fragment id 0x2dc4, flags DONT_FRAGMENT
|
||||
ICMP echo_reply checksum 0x1416 id 21197
|
||||
00:07:03:979325: ip4-rewrite
|
||||
tx_sw_if_index 1 dpo-idx 5 : ipv4 via 192.0.2.1 eth1: mtu:1500 next:3 flags:[] 0242099728c60242c00002020800 flow hash: 0x00000000
|
||||
00000000: 0242099728c60242c00002020800450000542dc44000400188e1c0000202c000
|
||||
00000020: 02010000141652cd00018143166800000000399d0900000000001011
|
||||
00:07:03:979326: eth1-output
|
||||
eth1 flags 0x02180005
|
||||
IP4: 02:42:c0:00:02:02 -> 02:42:09:97:28:c6
|
||||
ICMP: 192.0.2.2 -> 192.0.2.1
|
||||
tos 0x00, ttl 64, length 84, checksum 0x88e1 dscp CS0 ecn NON_ECN
|
||||
fragment id 0x2dc4, flags DONT_FRAGMENT
|
||||
ICMP echo_reply checksum 0x1416 id 21197
|
||||
00:07:03:979327: eth1-tx
|
||||
af_packet: hw_if_index 1 tx-queue 0
|
||||
tpacket3_hdr:
|
||||
status 0x1 len 108 snaplen 108 mac 0 net 0
|
||||
sec 0x0 nsec 0x0 vlan 0 vlan_tpid 0
|
||||
vnet-hdr:
|
||||
flags 0x00 gso_type 0x00 hdr_len 0
|
||||
gso_size 0 csum_start 0 csum_offset 0
|
||||
buffer 0xf97c4:
|
||||
current data 0, length 98, buffer-pool 0, ref-count 1, trace handle 0x0
|
||||
local l2-hdr-offset 0 l3-hdr-offset 14
|
||||
IP4: 02:42:c0:00:02:02 -> 02:42:09:97:28:c6
|
||||
ICMP: 192.0.2.2 -> 192.0.2.1
|
||||
tos 0x00, ttl 64, length 84, checksum 0x88e1 dscp CS0 ecn NON_ECN
|
||||
fragment id 0x2dc4, flags DONT_FRAGMENT
|
||||
ICMP echo_reply checksum 0x1416 id 21197
|
||||
```
|
||||
|
||||
Well, that's a mouthful, isn't it! Here, I get to show you VPP in action. After receiving the
packet on its `af-packet-input` node from 192.0.2.1 (Summer, who is pinging us) to 192.0.2.2 (the
VPP container), the packet traverses the dataplane graph. It goes through `ethernet-input`, then
`ip4-input`, and on to `ip4-lookup`, which sees it's destined to a locally configured IPv4 address,
so the packet is handed to `ip4-receive`. That one sees that the IP protocol is ICMP, so it hands the
packet to `ip4-icmp-input` which notices that the packet is an ICMP echo request, so off to
`ip4-icmp-echo-request` our little packet goes. The ICMP plugin in VPP now answers, and the reply
passes via `ip4-load-balance` to `ip4-rewrite`, which sends the return to 192.0.2.1 at MAC address
`02:42:09:97:28:c6` (this is Summer, the host doing the pinging!), after which the newly created ICMP
echo-reply is handed to `eth1-output` which marshals it back into the kernel's AF_PACKET interface
using `eth1-tx`.
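
When a trace gets this long, it can help to boil it down to just the sequence of graph nodes. A
small sketch, assuming the trace format shown above, where every node line starts at column zero
with a timestamp followed by the node name. The function name `trace_nodes` is made up for
illustration.

```bash
#!/usr/bin/env bash
# trace_nodes: read 'show trace' output on stdin and print only the graph
# node names, one per line, in the order the packet visited them.
trace_nodes() {
  awk '/^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]:[0-9]+: / { print $2 }'
}

# Example with a few lines copied from the trace above.
trace_nodes <<'EOF'
00:07:03:979275: af-packet-input
  af_packet: hw_if_index 1 rx-queue 0 next-index 4
00:07:03:979293: ethernet-input
  IP4: 02:42:09:97:28:c6 -> 02:42:c0:00:02:02
00:07:03:979306: ip4-input
  ICMP: 192.0.2.1 -> 192.0.2.2
EOF
```

Inside the container, the same filter could be applied to a live trace with
`vppctl show trace | trace_nodes`.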
|
||||
|
||||
Boom. I could not be more pleased.
|
||||
|
||||
## What's Next
|
||||
|
||||
This was a nice exercise for me! I'm going in this direction because the
|
||||
[[Containerlab](https://containerlab.dev)] framework will start containers with given NOS images,
|
||||
not too dissimilar from the one I just made, and then attaches `veth` pairs between the containers.
|
||||
I started dabbling with a [[pull-request](https://github.com/srl-labs/containerlab/pull/2569)], but
|
||||
I got stuck with a part of the Containerlab code that pre-deploys config files into the containers.
|
||||
You see, I will need to generate two files:
|
||||
|
||||
1. A `startup.conf` file that is specific to the containerlab Docker container. I'd like them to
|
||||
each set their own hostname so that the CLI has a unique prompt. I can do this by setting `unix
|
||||
{ cli-prompt {{ .ShortName }}# }` in the template renderer.
|
||||
1. Containerlab will know all of the veth pairs that are planned to be created into each VPP
|
||||
container. I'll need it to then write a little snippet of config that does the `create
|
||||
host-interface` spiel, to attach these `veth` pairs to the VPP dataplane.
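
To make the second item a bit more concrete, here is a sketch of what such a generated snippet
could look like. It emits the same VPP CLI lines I typed by hand earlier, for a given interface
name and MAC address. This is not how Containerlab does it; the function name
`vpp_host_interface` and the default MTU of 1500 are assumptions for illustration only.

```bash
#!/usr/bin/env bash
# vpp_host_interface NAME MAC [MTU]
# Print the VPP CLI commands that attach the Linux interface NAME (one half
# of a veth pair) to the dataplane and rename it from host-NAME to NAME.
vpp_host_interface() {
  local name=$1 mac=$2 mtu=${3:-1500}
  cat <<EOF
create host-interface name ${name} hw-addr ${mac}
set interface name host-${name} ${name}
set interface mtu ${mtu} ${name}
set interface state ${name} up
EOF
}

# Inside a container, the MAC could be read from /sys/class/net/eth1/address.
# Here it is the one shown for eth1 earlier.
vpp_host_interface eth1 02:42:c0:00:02:02
```

Calling it once per interface in the topology would yield a config fragment that VPP can replay
when the container starts.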
|
||||
|
||||
I reached out to Roman from Nokia, who is one of the authors and current maintainer of Containerlab.
|
||||
Roman was keen to help out, and seeing as he knows the Containerlab stuff well, and I know the VPP
|
||||
stuff well, this is a reasonable partnership! Soon, he and I plan to have a bare-bones setup that
|
||||
will connect a few VPP containers together with an SR Linux node in a lab. Stand by!
|
||||
|
||||
Once we have that, there's still quite some work for me to do. Notably:
|
||||
* Configuration persistence. `clab` allows you to save the running config. For that, I'll need to
|
||||
introduce [[vppcfg](https://github.com/pimvanpelt/vppcfg.git)] and a means to invoke it when
|
||||
the lab operator wants to save their config, and then reconfigure VPP when the container
|
||||
restarts.
|
||||
* I'll need to have a few files from `clab` shared with the host, notably the `startup.conf` and
|
||||
`vppcfg.yaml`, as well as some manual pre- and post-flight configuration for the more esoteric
|
||||
stuff. Building the plumbing for this is a TODO for now.
|
||||
|
||||
## Acknowledgements
|
||||
|
||||
I wanted to give a shout-out to Nardus le Roux who inspired me to contribute this Containerlab VPP
|
||||
node type, and to Roman Dodin for his help getting the Containerlab parts squared away when I got a
|
||||
little bit stuck.
|
||||
|
||||
First order of business: get it to ping at all ... it'll go faster from there on out :)
|
1
static/assets/containerlab/containerlab.svg
Normal file
File diff suppressed because one or more lines are too long
After Width: | Height: | Size: 21 KiB |