---
date: "2021-08-13T15:33:14Z"
title: VPP Linux CP - Part2
aliases:
- /s/articles/2021/08/13/vpp-2.html
params:
  asciinema: true
---

{{< image width="200px" float="right" src="/assets/vpp/fdio-color.svg" alt="VPP" >}}

# About this series

Ever since I first saw VPP - the Vector Packet Processor - I have been deeply impressed with its
performance and versatility. For those of us who have used Cisco IOS/XR devices, like the classic
_ASR_ (aggregation services router), VPP will look and feel quite familiar as many of the approaches
are shared between the two. One thing notably missing is the higher level control plane, that is
to say: there is no OSPF or ISIS, BGP, LDP and the like. This series of posts details my work on a
VPP _plugin_ which is called the **Linux Control Plane**, or LCP for short, which creates Linux network
devices that mirror their VPP dataplane counterparts. IPv4 and IPv6 traffic, and associated protocols
like ARP and IPv6 Neighbor Discovery can now be handled by Linux, while the heavy lifting of packet
forwarding is done by the VPP dataplane. Or, said another way: this plugin will allow Linux to use
VPP as a software ASIC for fast forwarding, filtering, NAT, and so on, while keeping control of the
interface state (links, addresses and routes) itself. When the plugin is completed, running software
like [FRR](https://frrouting.org/) or [Bird](https://bird.network.cz/) on top of VPP and achieving
forwarding rates of >100Mpps and >100Gbps will be well within reach!

In this second post, let's make the plugin a bit more useful by making it copy state changes
made to interfaces in VPP forward into their Linux CP counterparts.

## My test setup

I'm using the same setup from the [previous post]({{< ref "2021-08-12-vpp-1" >}}). The goal of this
post is to show what code needed to be written and which changes needed to be made to the plugin, in
order to propagate changes to VPP interfaces to the Linux TAP devices.

### Starting point

The `linux-cp` plugin that ships with VPP 21.06, even with my [changes](https://gerrit.fd.io/r/c/vpp/+/33481),
is still _only_ able to create _LIP_ (Linux Interface Pair) devices. It's not very user friendly to
have to apply state changes meticulously on both sides, but it can be done:

```
vppctl lcp create TenGigabitEthernet3/0/0 host-if e0
vppctl set interface state TenGigabitEthernet3/0/0 up
vppctl set interface mtu packet 9000 TenGigabitEthernet3/0/0
vppctl set interface ip address TenGigabitEthernet3/0/0 10.0.1.1/30
vppctl set interface ip address TenGigabitEthernet3/0/0 2001:db8:0:1::1/64
ip link set e0 up
ip link set e0 mtu 9000
ip addr add 10.0.1.1/30 dev e0
ip addr add 2001:db8:0:1::1/64 dev e0
```

In this snippet, we can see that after creating the _LIP_, thus conjuring up the unconfigured
`e0` interface in Linux, I changed the VPP interface in three ways:
1. I set the state of the VPP interface to 'up'
1. I set the MTU of the VPP interface to 9000
1. I added an IPv4 and an IPv6 address to the interface

Because state does not (yet) propagate, I have to make those same changes on the Linux side
with the subsequent `ip` commands.

### Configuration

I can imagine that some operators want more control and prefer to apply the Linux and VPP changes
themselves. This is why I'll start off by adding a variable called `lcp_sync`, along with a
startup configuration keyword and a CLI setter. This allows me to turn the whole sync behavior on
and off, for example in `startup.conf`:

```
linux-cp {
  default netns dataplane
  lcp-sync
}
```

And in the CLI:
```
DBGvpp# show lcp
lcp default netns dataplane
lcp lcp-sync on

DBGvpp# lcp lcp-sync off
DBGvpp# show lcp
lcp default netns dataplane
lcp lcp-sync off
```

The prep work for the rest of the interface syncer starts with this
[[commit](https://git.ipng.ch/ipng/lcpng/commit/2d00de080bd26d80ce69441b1043de37e0326e0a)], and
for the rest of this blog post, the behavior will be in the 'on' position.

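To illustrate how a single keyword can drive such a toggle, here is a minimal Python sketch. It is
not the plugin's C code: the function names `parse_linux_cp_stanza()` and `show_lcp()` are
hypothetical, and the assumption that sync defaults to 'off' when the keyword is absent is mine.

```python
import re


def parse_linux_cp_stanza(conf_text):
    """Extract settings from a `linux-cp { ... }` stanza (toy parser)."""
    # Assumption: sync is off unless the `lcp-sync` keyword is present.
    settings = {"default_netns": None, "lcp_sync": False}
    match = re.search(r"linux-cp\s*\{([^}]*)\}", conf_text)
    if match is None:
        return settings
    for line in match.group(1).splitlines():
        words = line.split()
        if len(words) == 3 and words[:2] == ["default", "netns"]:
            settings["default_netns"] = words[2]
        elif words == ["lcp-sync"]:
            settings["lcp_sync"] = True
    return settings


def show_lcp(settings):
    """Render the settings in the style of the `show lcp` output above."""
    sync = "on" if settings["lcp_sync"] else "off"
    return [
        f"lcp default netns {settings['default_netns']}",
        f"lcp lcp-sync {sync}",
    ]
```

Feeding it the `startup.conf` stanza from above yields the same two lines that `show lcp` prints
with sync enabled.
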
### Change interface: state

Immediately, I find a dissonance between VPP and Linux: when Linux sets a parent interface down,
all children go to state `M-DOWN`. When Linux sets a parent interface up, all of its children
automatically go to state `UP` and `LOWER_UP`. To illustrate:

```
ip link set enp66s0f1 down
ip link add link enp66s0f1 name foo type vlan id 1234
ip link set foo down
## Both interfaces are down, which makes sense because I set them both down
ip link | grep enp66s0f1
9: enp66s0f1: <BROADCAST,MULTICAST> mtu 9000 qdisc mq state DOWN mode DEFAULT group default qlen 1000
61: foo@enp66s0f1: <BROADCAST,MULTICAST,M-DOWN> mtu 9000 qdisc noop state DOWN mode DEFAULT group default qlen 1000

ip link set enp66s0f1 up
ip link | grep enp66s0f1
## Both interfaces are up, which doesn't make sense because I only changed one of them!
9: enp66s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
61: foo@enp66s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
```

VPP does not work this way. In VPP, the admin state of each interface is individually
controllable, so it's possible to bring up the parent while leaving the sub-interface in
the state it was. I did notice that you can't bring up a sub-interface if its parent
is down, which I found counterintuitive, but that's neither here nor there.

All of this is to say that we have to be careful when copying state forward, because as
this [[commit](https://git.ipng.ch/ipng/lcpng/commit/7c15c84f6c4739860a85c599779c199cb9efef03)]
shows, issuing `set int state ... up` on an interface won't touch its sub-interfaces in VPP, but
the subsequent netlink message to bring the _LIP_ for that interface up **will** update the
children, thus desynchronising Linux and VPP: Linux will have the interface **and all its
sub-interfaces** up unconditionally; VPP will have the interface up and its sub-interfaces in
whatever state they were before.

To address this, a second
[[commit](https://git.ipng.ch/ipng/lcpng/commit/a3dc56c01461bdffcac8193ead654ae79225220f)] was
needed. I'm not too sure I want to keep this behavior, but for now, it results in an intuitive
end-state, which is that all interface states are exactly the same between Linux and VPP.

```
DBGvpp# create sub TenGigabitEthernet3/0/0 10
DBGvpp# lcp create TenGigabitEthernet3/0/0 host-if e0
DBGvpp# lcp create TenGigabitEthernet3/0/0.10 host-if e0.10
DBGvpp# set int state TenGigabitEthernet3/0/0 up
## Correct: parent is up, sub-int is not
694: e0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN mode DEFAULT group default qlen 1000
695: e0.10@e0: <BROADCAST,MULTICAST> mtu 9000 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000

DBGvpp# set int state TenGigabitEthernet3/0/0.10 up
## Correct: both interfaces up
694: e0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN mode DEFAULT group default qlen 1000
695: e0.10@e0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000

DBGvpp# set int state TenGigabitEthernet3/0/0 down
DBGvpp# set int state TenGigabitEthernet3/0/0.10 down
DBGvpp# set int state TenGigabitEthernet3/0/0 up
## Correct: only the parent is up
694: e0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN mode DEFAULT group default qlen 1000
695: e0.10@e0: <BROADCAST,MULTICAST> mtu 9000 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
```

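The desynchronisation described above can be made concrete with a toy Python model. This is a
sketch under stated assumptions, not the plugin's C code: the dictionaries and both function names
are hypothetical, Linux is modelled only as far as the behavior observed above (children follow
their parent up), and re-applying each child's VPP state after touching the parent is one
plausible way to reach the end-state shown, not necessarily what the commit does.

```python
def linux_set_admin_state(linux_up, children, name, up):
    """Toy model of Linux: bringing a parent up also brings its children up."""
    linux_up[name] = up
    if up:
        for child in children.get(name, []):
            linux_up[child] = True
    # Setting a parent down leaves the children's own flag alone in this
    # model; the M-DOWN operational state is not represented here.


def sync_admin_state(vpp_up, linux_up, children, name):
    """Copy VPP's state for `name` to Linux, then repair its children."""
    linux_set_admin_state(linux_up, children, name, vpp_up[name])
    # Assumption: walking the children afterwards restores whatever state
    # VPP holds for each of them, undoing the implicit 'up' from Linux.
    for child in children.get(name, []):
        linux_set_admin_state(linux_up, children, child, vpp_up[child])


children = {"e0": ["e0.10"]}
vpp_up = {"e0": True, "e0.10": False}   # VPP: only the parent is up
linux_up = {"e0": False, "e0.10": False}
sync_admin_state(vpp_up, linux_up, children, "e0")
```

After the call, `linux_up` matches `vpp_up`, whereas calling `linux_set_admin_state()` for the
parent alone would have left `e0.10` up in Linux only.
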
### Change interface: MTU

Finally, a straightforward
[[commit](https://git.ipng.ch/ipng/lcpng/commit/39bfa1615fd1cafe5df6d8fc9d34528e8d3906e2)], or
so I thought. When the MTU changes in VPP (with `set interface mtu packet N <int>`), there is
a callback that can be registered which copies this into the _LIP_. I did notice a specific corner
case: In VPP, a sub-interface can have a larger MTU than its parent. In Linux, this cannot happen,
so the following remains problematic:

```
DBGvpp# create sub TenGigabitEthernet3/0/0 10
DBGvpp# set int mtu packet 1500 TenGigabitEthernet3/0/0
DBGvpp# set int mtu packet 9000 TenGigabitEthernet3/0/0.10
## Incorrect: sub-int has larger MTU than parent, valid in VPP, not in Linux
694: e0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UNKNOWN mode DEFAULT group default qlen 1000
695: e0.10@e0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
```

I think the best way to ensure this works is to _clamp_ the sub-int to a maximum MTU of
that of its parent, and revert the user's request to change the VPP sub-int to anything
higher than that, perhaps logging an error explaining why. This means two things:
1. Any change in VPP that raises a child's MTU above that of its parent must be reverted.
1. Any change in VPP of a parent MTU should ensure all children are clamped to at most the new value.

I addressed the issue in this
[[commit](https://git.ipng.ch/ipng/lcpng/commit/79a395b3c9f0dae9a23e6fbf10c5f284b1facb85)].

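The two rules can be sketched in a few lines of Python. This is only an illustration of the
clamping logic, not the plugin's C code: `set_mtu()` and its dictionaries are hypothetical, and
raising an exception stands in for "revert and log an error".

```python
def set_mtu(mtus, parents, name, requested):
    """Apply an MTU change while keeping every child at or below its parent."""
    parent = parents.get(name)
    # Rule 1: a child may not exceed its parent, so refuse the change.
    if parent is not None and requested > mtus[parent]:
        raise ValueError(
            f"{name}: MTU {requested} exceeds parent {parent} MTU {mtus[parent]}"
        )
    mtus[name] = requested
    # Rule 2: lowering a parent clamps all of its children to the new value.
    for child, child_parent in parents.items():
        if child_parent == name and mtus[child] > requested:
            mtus[child] = requested


mtus = {"e0": 9000, "e0.10": 9000}
parents = {"e0.10": "e0"}
set_mtu(mtus, parents, "e0", 1500)  # e0.10 is clamped along with e0
```

A subsequent `set_mtu(mtus, parents, "e0.10", 9000)` raises `ValueError` and leaves the
sub-interface at 1500, which mirrors the Linux constraint.
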
### Change interface: IP Addresses

There are three scenarios in which IP addresses will need to be copied from
VPP into the companion Linux devices:

1. `set interface ip address` adds an IPv4 or IPv6 address. This is handled by
   `lcp_itf_ip[46]_add_del_interface_addr()` which is a callback installed in
   `lcp_itf_pair_init()` at plugin initialization time.
1. `set interface ip address del` removes addresses. This is also handled by
   `lcp_itf_ip[46]_add_del_interface_addr()` but curiously there is no
   upstream `vnet_netlink_del_ip[46]_addr()` so I had to write them inline here.
   I will try to get them upstreamed, as they appear to be obvious companions
   in `vnet/device/netlink.h`.
1. This one is easy to overlook, but upon _LIP_ creation, it could be that there
   are already L3 addresses present on the VPP interface. If so, set them in the
   _LIP_ with `lcp_itf_set_interface_addr()`.

This means that with this
[[commit](https://git.ipng.ch/ipng/lcpng/commit/f7e1bb951d648a63dfa27d04ded0b6261b9e39fe)], at
any time a new _LIP_ is created, the IPv4 and IPv6 addresses on the VPP interface are fully copied
over by the third change, while at runtime, new addresses can be set/removed as well by the first
and second change.

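The three scenarios above can be modelled in a short Python sketch. It is a stand-in for the
logic only: the `LipModel` class and its method are hypothetical and merely echo the shape of the
C callbacks named in the list, and two in-memory sets take the place of the real VPP and netlink
calls.

```python
import ipaddress


class LipModel:
    """Toy model of one LIP: a VPP interface and its Linux companion."""

    def __init__(self, vpp_addrs=()):
        self.vpp_addrs = {ipaddress.ip_interface(a) for a in vpp_addrs}
        # Scenario 3: addresses already present in VPP are copied on creation.
        self.linux_addrs = set(self.vpp_addrs)

    def add_del_interface_addr(self, addr, is_del=False):
        """Scenarios 1 and 2: mirror a runtime add or delete into Linux."""
        addr = ipaddress.ip_interface(addr)
        if is_del:
            self.vpp_addrs.discard(addr)
            self.linux_addrs.discard(addr)
        else:
            self.vpp_addrs.add(addr)
            self.linux_addrs.add(addr)


lip = LipModel(["10.0.1.1/30"])                         # copied at creation
lip.add_del_interface_addr("2001:db8:0:1::1/64")        # runtime add
lip.add_del_interface_addr("10.0.1.1/30", is_del=True)  # runtime delete
```

Using `ipaddress.ip_interface()` keeps the prefix length together with the address, so IPv4 and
IPv6 entries can share one code path in the model.
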
### Further work

I noticed that [Bird](https://bird.network.cz/) periodically scans the Linux
interface list and (re)learns information from it. I have a suspicion that
such a feature might be useful in the VPP plugin as well: I can imagine a
periodic process that walks over the _LIP_ interface list, and compares
what it finds in Linux with what is configured in VPP. What's not entirely
clear to me is which direction should 'trump', that is, should the Linux
state be forced into VPP, or should the VPP state be forced into Linux? I
don't yet have a good feeling for the answer, so I'll punt on that for now.

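Whichever direction ends up winning, such a periodic process would first need to detect the
differences. A minimal Python sketch of that comparison step might look as follows; the
`diff_state()` function, the field names and the dictionary layout are all hypothetical, and the
sketch deliberately stops short of deciding which side to overwrite.

```python
FIELDS = ("up", "mtu", "addrs")


def diff_state(vpp, linux):
    """Return {ifname: {field: (vpp_value, linux_value)}} for each mismatch."""
    mismatches = {}
    for name, vpp_if in vpp.items():
        # An interface missing on the Linux side compares as all-None.
        linux_if = linux.get(name, {})
        delta = {
            field: (vpp_if.get(field), linux_if.get(field))
            for field in FIELDS
            if vpp_if.get(field) != linux_if.get(field)
        }
        if delta:
            mismatches[name] = delta
    return mismatches


vpp = {"e0": {"up": True, "mtu": 9000, "addrs": {"10.0.1.1/30"}}}
linux = {"e0": {"up": True, "mtu": 1500, "addrs": {"10.0.1.1/30"}}}
drift = diff_state(vpp, linux)  # only the MTU differs for e0
```

The result could then be logged, or handed to whichever policy (VPP wins or Linux wins) is
eventually chosen.
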
## Results

After applying the configuration to VPP (in Appendix), here are the results:

```
pim@hippo:~/src/lcpng$ ip ro
default via 194.1.163.65 dev enp6s0 proto static
10.0.1.0/30 dev e0 proto kernel scope link src 10.0.1.1
10.0.2.0/30 dev e0.1234 proto kernel scope link src 10.0.2.1
10.0.3.0/30 dev e0.1235 proto kernel scope link src 10.0.3.1
10.0.4.0/30 dev e0.1236 proto kernel scope link src 10.0.4.1
10.0.5.0/30 dev e0.1237 proto kernel scope link src 10.0.5.1
194.1.163.64/27 dev enp6s0 proto kernel scope link src 194.1.163.88

pim@hippo:~/src/lcpng$ fping 10.0.1.2 10.0.2.2 10.0.3.2 10.0.4.2 10.0.5.2
10.0.1.2 is alive
10.0.2.2 is alive
10.0.3.2 is alive
10.0.4.2 is alive
10.0.5.2 is alive

pim@hippo:~/src/lcpng$ fping6 2001:db8:0:1::2 2001:db8:0:2::2 \
  2001:db8:0:3::2 2001:db8:0:4::2 2001:db8:0:5::2
2001:db8:0:1::2 is alive
2001:db8:0:2::2 is alive
2001:db8:0:3::2 is alive
2001:db8:0:4::2 is alive
2001:db8:0:5::2 is alive
```

In case you were wondering whether my previous post ended in the same huzzah moment: it did.

The difference is that now the VPP configuration is _much shorter_! Comparing
the Appendix from this post with my [first post]({{< ref "2021-08-12-vpp-1" >}}), after
all of this work I no longer have to manually copy the configuration (like link states,
MTU changes, IP addresses) from VPP into Linux. Instead, the plugin does all of this work
for me, and I can configure both sides entirely with `vppctl` commands!

### Bonus screencast!

Humor me as I take the code out for a six minute screencast [[asciinema](/assets/vpp/430411.cast),
[gif](/assets/vpp/430411.gif)] :-)

{{< asciinema src="/assets/vpp/430411.cast" >}}

## Credits

I'd like to make clear that the Linux CP plugin is a great collaboration between several great folks
and that my work stands on their shoulders. I've had a little bit of help along the way from Neale
Ranns, Matthew Smith and Jon Loeliger, and I'd like to thank them for their work!

## Appendix

#### Ubuntu config
```
# Untagged interface
ip addr add 10.0.1.2/30 dev enp66s0f0
ip addr add 2001:db8:0:1::2/64 dev enp66s0f0
ip link set enp66s0f0 up mtu 9000

# Single 802.1q tag 1234
ip link add link enp66s0f0 name enp66s0f0.q type vlan id 1234
ip link set enp66s0f0.q up mtu 9000
ip addr add 10.0.2.2/30 dev enp66s0f0.q
ip addr add 2001:db8:0:2::2/64 dev enp66s0f0.q

# Double 802.1q tag 1234 inner-tag 1000
ip link add link enp66s0f0.q name enp66s0f0.qinq type vlan id 1000
ip link set enp66s0f0.qinq up mtu 9000
ip addr add 10.0.3.2/30 dev enp66s0f0.qinq
ip addr add 2001:db8:0:3::2/64 dev enp66s0f0.qinq

# Single 802.1ad tag 2345
ip link add link enp66s0f0 name enp66s0f0.ad type vlan id 2345 proto 802.1ad
ip link set enp66s0f0.ad up mtu 9000
ip addr add 10.0.4.2/30 dev enp66s0f0.ad
ip addr add 2001:db8:0:4::2/64 dev enp66s0f0.ad

# Double 802.1ad tag 2345 inner-tag 1000
ip link add link enp66s0f0.ad name enp66s0f0.qinad type vlan id 1000 proto 802.1q
ip link set enp66s0f0.qinad up mtu 9000
ip addr add 10.0.5.2/30 dev enp66s0f0.qinad
ip addr add 2001:db8:0:5::2/64 dev enp66s0f0.qinad
```

#### VPP config
```
## Look mom, no `ip` commands!! :-)
vppctl set interface state TenGigabitEthernet3/0/0 up
vppctl lcp create TenGigabitEthernet3/0/0 host-if e0
vppctl set interface mtu packet 9000 TenGigabitEthernet3/0/0
vppctl set interface ip address TenGigabitEthernet3/0/0 10.0.1.1/30
vppctl set interface ip address TenGigabitEthernet3/0/0 2001:db8:0:1::1/64

vppctl create sub TenGigabitEthernet3/0/0 1234
vppctl set interface mtu packet 9000 TenGigabitEthernet3/0/0.1234
vppctl lcp create TenGigabitEthernet3/0/0.1234 host-if e0.1234
vppctl set interface state TenGigabitEthernet3/0/0.1234 up
vppctl set interface ip address TenGigabitEthernet3/0/0.1234 10.0.2.1/30
vppctl set interface ip address TenGigabitEthernet3/0/0.1234 2001:db8:0:2::1/64

vppctl create sub TenGigabitEthernet3/0/0 1235 dot1q 1234 inner-dot1q 1000 exact-match
vppctl set interface state TenGigabitEthernet3/0/0.1235 up
vppctl set interface mtu packet 9000 TenGigabitEthernet3/0/0.1235
vppctl lcp create TenGigabitEthernet3/0/0.1235 host-if e0.1235
vppctl set interface ip address TenGigabitEthernet3/0/0.1235 10.0.3.1/30
vppctl set interface ip address TenGigabitEthernet3/0/0.1235 2001:db8:0:3::1/64

vppctl create sub TenGigabitEthernet3/0/0 1236 dot1ad 2345 exact-match
vppctl set interface state TenGigabitEthernet3/0/0.1236 up
vppctl lcp create TenGigabitEthernet3/0/0.1236 host-if e0.1236
vppctl set interface mtu packet 9000 TenGigabitEthernet3/0/0.1236
vppctl set interface ip address TenGigabitEthernet3/0/0.1236 10.0.4.1/30
vppctl set interface ip address TenGigabitEthernet3/0/0.1236 2001:db8:0:4::1/64

vppctl create sub TenGigabitEthernet3/0/0 1237 dot1ad 2345 inner-dot1q 1000 exact-match
vppctl set interface state TenGigabitEthernet3/0/0.1237 up
vppctl set interface mtu packet 9000 TenGigabitEthernet3/0/0.1237
vppctl set interface ip address TenGigabitEthernet3/0/0.1237 10.0.5.1/30
vppctl set interface ip address TenGigabitEthernet3/0/0.1237 2001:db8:0:5::1/64
vppctl lcp create TenGigabitEthernet3/0/0.1237 host-if e0.1237
```

#### Final note

You may have noticed that the [commit] links are all git commits in my private working copy. I want
to wait until my [previous work](https://gerrit.fd.io/r/c/vpp/+/33481) is reviewed and submitted
before piling on more changes. Feel free to contact vpp-dev@ for more information in the meantime
:-)