Rewrite all images to Hugo format
content/articles/2022-03-27-vppcfg-1.md
---
date: "2022-03-27T14:19:23Z"
title: VPP Configuration - Part1
---

{{< image width="200px" float="right" src="/assets/vpp/fdio-color.svg" alt="VPP" >}}

# About this series

I use VPP - Vector Packet Processing - extensively at IPng Networks. Earlier this year, the VPP community
merged the [Linux Control Plane]({%post_url 2021-08-12-vpp-1 %}) plugin. I wrote about its deployment
on both regular servers, like the [Supermicro]({%post_url 2021-09-21-vpp-7 %}) routers that run on our
[AS8298]({% post_url 2021-02-27-network %}), and virtual machines running in
[KVM/Qemu]({% post_url 2021-12-23-vpp-playground %}).

Now that I've been running VPP in production for about half a year, I can't help but notice one specific
drawback: VPP is a programmable dataplane, and _by design_ it does not include any configuration or
controlplane management stack. It's meant to be integrated into a full stack by operators. For end-users,
this unfortunately means that typing on the CLI won't persist any configuration, and if VPP is restarted,
it will not pick up where it left off. There's one developer convenience in the form of the `exec`
command-line (and startup.conf!) option, which will read a file and apply the contents to the CLI line
by line. However, if any typo is made in the file, processing immediately stops. It's meant as a convenience
for VPP developers, and is not a useful configuration method for anything but the simplest topologies.

Luckily, VPP comes with an extensive set of APIs to allow it to be programmed. So in this series of posts,
I'll detail the work I've done to create a configuration utility that can take a YAML configuration file,
compare it to a running VPP instance, and plan, step by step, the API calls needed to safely apply
the configuration to the dataplane. Welcome to `vppcfg`!

In this first post, let's take a look at table stakes: writing a YAML specification which models the main
configuration elements of VPP, and then ensuring that the YAML file is both syntactically and
semantically correct.

**Note**: Code is on [my Github](https://github.com/pimvanpelt/vppcfg), but it's not quite ready for
prime-time yet. Take a look, and engage with us on GitHub (pull requests preferred over issues themselves)
or reach out by [contacting us](/s/contact/).

## YAML Specification

I decide to use [Yamale](https://github.com/23andMe/Yamale/), which is a schema description language
and validator for [YAML](http://www.yaml.org/spec/1.2/spec.html). YAML is a very simple, human-readable
text format that can be used to store a wide range of data types. An interesting but quick introduction
to the YAML language can be found on CraftIRC's [GitHub](https://github.com/Animosity/CraftIRC/wiki/Complete-idiot's-introduction-to-yaml)
page.

The first order of business for me is to devise a YAML file specification which models the configuration
options of VPP objects in an idiomatic way. It's appealing to immediately build a
higher level abstraction, but I resist the urge and instead look at the types of objects that exist in
VPP, for example the `VNET_DEVICE_CLASS` types:

* ***ethernet_simulated_device_class***: Loopbacks
* ***bvi_device_class***: Bridge Virtual Interfaces
* ***dpdk_device_class***: DPDK Interfaces
* ***rdma_device_class***: RDMA Interfaces
* ***bond_device_class***: BondEthernet Interfaces
* ***vxlan_device_class***: VXLAN Tunnels

There are several others, but I decide to start with these, as I'll be needing each one of these in my
own network. Looking over the device class specification, I learn a lot about how they are configured,
which arguments they need and of which type, and which data structures they are represented as in VPP
internally.

### Syntax Validation

Yamale first reads a _schema_ definition file, and then checks a given YAML file against that definition,
reporting whether or not the file is well-formed. As a practical example, let me start
with the following definition:

```
$ cat << EOF > schema.yaml
sub-interfaces: map(include('sub-interface'),key=int())
---
sub-interface:
  description: str(exclude='\'"',len=64,required=False)
  lcp: str(max=15,matches='[a-z]+[a-z0-9-]*',required=False)
  mtu: int(min=128,max=9216,required=False)
  addresses: list(ip(version=6),required=False)
  encapsulation: include('encapsulation',required=False)
---
encapsulation:
  dot1q: int(min=1,max=4095,required=False)
  dot1ad: int(min=1,max=4095,required=False)
  inner-dot1q: int(min=1,max=4095,required=False)
  exact-match: bool(required=False)
EOF
```

This snippet creates two types, one called `sub-interface` and the other called `encapsulation`. The fields
of the sub-interface, for example the `description` field, must follow the given typing to be valid. In the
case of the description, it must be at most 64 characters long and it must not contain the `'` or `"`
characters. The designation `required=False` notes that this is an optional field and may be omitted.
The `lcp` field is also a string, but it must be at most 15 characters long, match a certain regular
expression, and start with a lowercase letter. The `mtu` field must be an integer between 128 and 9216,
and so on.

One nice feature of Yamale is the ability to reference other object types. I do this here with the `encapsulation`
field, which references an object type of the same name, and again, is optional. This means that when the
`encapsulation` field is encountered in the YAML file Yamale is validating, it'll hold the contents of that
field to the schema below. There, we have `dot1q`, `dot1ad`, `inner-dot1q` and `exact-match` fields, which are
all optional.

Then, at the top of the file, I create the entrypoint schema, which expects YAML files to contain a map
called `sub-interfaces` which is keyed by integers and contains values of type `sub-interface`, tying it all
together.

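To get a feel for what the `lcp` and `mtu` constraints in the schema mean, here is a minimal plain-Python
sketch of the same two checks. This is not how Yamale implements them: the function name `check_sub_interface`
and the message wording are made up for illustration, and I assume the regular expression is only anchored at
the start of the string (as `re.match` does).

```python
import re

# Same pattern as the 'matches' argument of the lcp field in the schema above.
LCP_RE = re.compile(r"[a-z]+[a-z0-9-]*")


def check_sub_interface(sub):
    """Return a list of messages for one sub-interface dict (hypothetical helper)."""
    msgs = []

    lcp = sub.get("lcp")
    if lcp is not None:
        if len(lcp) > 15:
            msgs.append(f"lcp: length of {lcp} is greater than 15")
        # Assumption: anchored at the start only, like re.match().
        if not LCP_RE.match(lcp):
            msgs.append(f"lcp: {lcp} is not a regex match")

    mtu = sub.get("mtu")
    if mtu is not None and not 128 <= mtu <= 9216:
        msgs.append(f"mtu: {mtu} is outside the range 128..9216")

    return msgs


if __name__ == "__main__":
    for msg in check_sub_interface({"lcp": "NotAGoodName-AmIRite", "mtu": 16384}):
        print(msg)
```
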
Yamale comes with a command-line utility to do direct schema validation, which is handy. Let me demonstrate with
the following terrible YAML:
```
$ cat << EOF > bad.yaml
sub-interfaces:
  100:
    description: "Pim's illegal description"
    lcp: "NotAGoodName-AmIRite"
    mtu: 16384
    addresses: 192.0.2.1
    encapsulation: False
EOF

$ yamale -s schema.yaml bad.yaml
Validating /home/pim/bad.yaml...
Validation failed!
Error validating data '/home/pim/bad.yaml' with schema '/home/pim/schema.yaml'
        sub-interfaces.100.description: 'Pim's illegal description' contains excluded character '''
        sub-interfaces.100.lcp: Length of NotAGoodName-AmIRite is greater than 15
        sub-interfaces.100.lcp: NotAGoodName-AmIRite is not a regex match.
        sub-interfaces.100.mtu: 16384 is greater than 9216
        sub-interfaces.100.addresses: '192.0.2.1' is not a list.
        sub-interfaces.100.encapsulation : 'False' is not a map
```

This file trips so many syntax violations, it should be a crime! In fact every single field is invalid. The one that
is closest to being correct is the `addresses` field, but the schema expects a _list_ there (not a scalar), and even
then, the list elements are expected to be IPv6 addresses, not IPv4 ones.

So let me try again:

```
$ cat << EOF > good.yaml
sub-interfaces:
  100:
    description: "Core: switch.example.com Te0/1"
    lcp: "xe3-0-0"
    mtu: 9216
    addresses: [ 2001:db8::1, 2001:db8:1::1 ]
    encapsulation:
      dot1q: 100
      exact-match: True
EOF

$ yamale good.yaml
Validating /home/pim/good.yaml...
Validation success! 👍
```

### Semantic Validation

When using Yamale, I can make a good start in _syntax_ validation, that is to say, if a field is present, it follows
a prescribed type. But that's not the whole story. There are many configuration files I can think of that
would be syntactically correct, but still make no sense in practice. For example, creating an encapsulation which
has both `dot1q` as well as `dot1ad`, or creating a _LIP_ (Linux Interface Pair) for a sub-interface which does not
have `exact-match` set. Or how about having two sub-interfaces with the exact same encapsulation?

Here's where _semantic_ validation comes into play. So I set out to create all sorts of constraints, and after
reading the (Yamale validated, so syntactically correct) YAML file, I can hand it to a set of validators that
check for violations of these constraints. By way of example, let me create a few constraints that might capture
the issues described above:

1. If a sub-interface has encapsulation:
   1. It MUST have `dot1q` OR `dot1ad` set
   1. It MUST NOT have `dot1q` AND `dot1ad` both set
1. If a sub-interface has one or more `addresses`:
   1. Its encapsulation MUST be set to `exact-match`
   1. It MUST have an `lcp` set.
   1. Each individual `address` MUST NOT occur in any other interface

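The constraints in the list above can be sketched in a few lines of Python. This is an illustration only and not
the code in `vppcfg`: the function name `validate_sub_interfaces`, the message texts and the input layout (a map of
interface names to dicts holding a `sub-interfaces` map) are assumptions of mine. I also assume that a missing
`exact-match` counts as not set, and for brevity the address uniqueness check only looks at sub-interfaces.

```python
def validate_sub_interfaces(interfaces):
    """Check the example constraints; returns (ok, [messages]). Hypothetical sketch."""
    msgs = []
    seen = {}  # address -> name of the first sub-interface that used it

    for ifname, iface in interfaces.items():
        for sub_id, sub in iface.get("sub-interfaces", {}).items():
            name = f"{ifname}.{sub_id}"
            encap = sub.get("encapsulation")

            if encap is not None:
                has_dot1q = "dot1q" in encap
                has_dot1ad = "dot1ad" in encap
                if not (has_dot1q or has_dot1ad):
                    msgs.append(f"sub-interface {name} encapsulation needs dot1q or dot1ad")
                if has_dot1q and has_dot1ad:
                    msgs.append(f"sub-interface {name} has both dot1q and dot1ad")

            addresses = sub.get("addresses", [])
            if addresses:
                # Assumption: an absent exact-match key means it is not set.
                if not (encap or {}).get("exact-match", False):
                    msgs.append(f"sub-interface {name} has addresses but is not exact-match")
                if "lcp" not in sub:
                    msgs.append(f"sub-interface {name} has addresses but no lcp")

            for addr in addresses:
                if addr in seen:
                    msgs.append(f"sub-interface {name} address {addr} already used by {seen[addr]}")
                else:
                    seen[addr] = name

    return (not msgs, msgs)
```

Returning a boolean together with the list of messages keeps the caller simple: it only has to look at the
boolean, and can print the messages when something is wrong.
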
## Config Validation

After spending a few weeks thinking about the problem, I came up with 59 semantic constraints, that is to say
things that might appear OK, but will yield impossible to implement or otherwise erratic VPP configurations.
This article would be a bad place to discuss them all, so I will talk about the structure of `vppcfg` instead.

First, a `Validator` class is instantiated with the Yamale schema. Then, a YAML file is read and passed to the
validator's `validate()` method. It first runs Yamale on the YAML file and makes note of any issues that arise.
If there are any, it enumerates them in a list and returns (bool, [list-of-messages]). The validation has failed
if the returned boolean is _false_, in which case the list of messages helps pinpoint which constraint was
violated.

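The flow described above could be structured roughly as follows. This is a sketch under assumptions, not the
actual class: the real `Validator` is instantiated with the Yamale schema, while here I pass in a hypothetical
`schema_check` callable (returning a list of messages) and a list of hypothetical semantic check callables
(each returning `(bool, [messages])`), so that the example runs without any third-party modules.

```python
class Validator:
    """Runs a syntax check first, then semantic checks in order (illustrative sketch)."""

    def __init__(self, schema_check, semantic_checks):
        self.schema_check = schema_check        # callable(cfg) -> list of messages
        self.semantic_checks = semantic_checks  # list of callable(cfg) -> (bool, list)

    def validate(self, cfg):
        # Syntax errors are reported immediately; semantic checks are skipped.
        msgs = list(self.schema_check(cfg))
        if msgs:
            return False, msgs

        # Semantic checks run in order, and their messages are accumulated.
        for check in self.semantic_checks:
            ok, check_msgs = check(cfg)
            if not ok:
                msgs.extend(check_msgs)
        return (not msgs, msgs)


if __name__ == "__main__":
    def no_syntax_errors(cfg):
        return []

    def needs_interfaces(cfg):
        if "interfaces" in cfg:
            return True, []
        return False, ["no interfaces defined"]

    v = Validator(no_syntax_errors, [needs_interfaces])
    print(v.validate({}))
    print(v.validate({"interfaces": {}}))
```
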
The `vppcfg` schema consists of toplevel types, which are validated in order:

* ***validate_bondethernets()***'s job is to ensure that anything configured in the `bondethernets` toplevel map
  is correct. For example, if a _BondEthernet_ device is created there, its members should reference existing
  interfaces, and it itself should make an appearance in the `interfaces` map, and the MTU of each member should
  be equal to the MTU of the _BondEthernet_, and so on. See `config/bondethernet.py` for a complete rundown.
* ***validate_loopbacks()*** is pretty straightforward. It makes a few common assertions, such as that if the
  loopback has addresses, it must also have an LCP, and if it has an LCP, that no other interface has the same
  LCP name, and that all of the addresses configured are unique.
* ***validate_vxlan_tunnels()***: Yamale already asserts that the `local` and `remote` fields are present and are
  IP addresses. The semantic validator ensures that both tunnel endpoints are of the same address family, and that
  the `VNI` used is unique.
* ***validate_bridgedomains()*** fiddles with its _Bridge Virtual Interface_, making sure that its addresses and
  LCP name are unique. Further, it makes sure that a given member interface is in at most one bridge, and that said
  member is in L2 mode, in other words, that it doesn't have an LCP or an address. An L2 interface can be either in
  a bridgedomain, or act as an L2 Cross Connect, but not both. Finally, it asserts that each member has an MTU
  identical to the bridge's MTU value.
* ***validate_interfaces()*** is by far the most complex, but a few common things worth calling out are that each
  sub-interface must have a unique encapsulation, and if a given QinQ or QinAD 2-tagged sub-interface has an LCP,
  that there exists a parent Dot1Q or Dot1AD interface with the correct encapsulation, and that it also has an LCP.
  See `config/interface.py` for an extensive overview.

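To give an idea of what one of these validators might look like, here is a minimal sketch of the two VXLAN checks
mentioned above (same address family, unique VNI). It borrows the name `validate_vxlan_tunnels` from the list, but
the body, the message texts and the return convention are my own assumptions for illustration; the real validator
may well differ. It relies on the syntax validation having already established that `local`, `remote` and `vni` exist.

```python
import ipaddress


def validate_vxlan_tunnels(cfg):
    """Semantic checks for the vxlan_tunnels map; returns (ok, [messages]). Sketch only."""
    msgs = []
    vni_owner = {}  # VNI -> name of the first tunnel that used it

    for name, tunnel in cfg.get("vxlan_tunnels", {}).items():
        local = ipaddress.ip_address(tunnel["local"])
        remote = ipaddress.ip_address(tunnel["remote"])
        if local.version != remote.version:
            msgs.append(f"vxlan_tunnel {name} local and remote are not the same address family")

        vni = tunnel["vni"]
        if vni in vni_owner:
            msgs.append(f"vxlan_tunnel {name} VNI {vni} is not unique, also used by {vni_owner[vni]}")
        else:
            vni_owner[vni] = name

    return (not msgs, msgs)
```
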
## Testing

Of course, in a configuration model as complex as a VPP router, extensive testing helps ensure that
the constraints above are implemented correctly. To help this along, I use _regular_ unittesting as provided by
the Python3 [unittest](https://docs.python.org/3/library/unittest.html) framework, but I extend it to also run
a special kind of test which I call a `YAMLTest`.

### Unit Testing

This is bread and butter, and should be straightforward for software engineers. I follow the model of so-called
test-driven development, where I start off by writing a test, which of course fails because the code hasn't been
implemented yet. Then I implement the code, and run this and all other unittests expecting them to pass.

Let me give an example based on BondEthernets, with a YAML config file as follows:

```
bondethernets:
  BondEthernet0:
    interfaces: [ GigabitEthernet1/0/0, GigabitEthernet1/0/1 ]
interfaces:
  GigabitEthernet1/0/0:
    mtu: 3000
  GigabitEthernet1/0/1:
    mtu: 3000
  GigabitEthernet2/0/0:
    mtu: 3000
    sub-interfaces:
      100:
        mtu: 2000

  BondEthernet0:
    mtu: 3000
    lcp: "be012345678"
    addresses: [ 192.0.2.1/29, 2001:db8::1/64 ]
    sub-interfaces:
      100:
        mtu: 2000
        addresses: [ 192.0.2.9/29, 2001:db8:1::1/64 ]
```

As I mentioned when discussing the semantic constraints, there are a few here that jump out at me. First, the
BondEthernet members `Gi1/0/0` and `Gi1/0/1` must exist. There is one BondEthernet defined in this file (obvious,
I know, but bear with me), and `Gi2/0/0` is not a bond member, and certainly `Gi2/0/0.100` is not a bond member,
because having a sub-interface as an LACP member would be super weird. Taking things like this into account, here
are a few tests that could assert that the behavior of the `bondethernets` map in the YAML config is correct:

```
import unittest
import yaml
# Assumed import path, based on the module living in config/bondethernet.py
import config.bondethernet as bondethernet

class TestBondEthernetMethods(unittest.TestCase):
    def setUp(self):
        # Load the YAML fixture shown above before each test
        with open("unittest/test_bondethernet.yaml", "r") as f:
            self.cfg = yaml.load(f, Loader=yaml.FullLoader)

    def test_get_by_name(self):
        ifname, iface = bondethernet.get_by_name(self.cfg, "BondEthernet0")
        self.assertIsNotNone(iface)
        self.assertEqual("BondEthernet0", ifname)
        self.assertIn("GigabitEthernet1/0/0", iface['interfaces'])
        self.assertNotIn("GigabitEthernet2/0/0", iface['interfaces'])

        ifname, iface = bondethernet.get_by_name(self.cfg, "BondEthernet-notexist")
        self.assertIsNone(iface)
        self.assertIsNone(ifname)

    def test_members(self):
        self.assertTrue(bondethernet.is_bond_member(self.cfg, "GigabitEthernet1/0/0"))
        self.assertTrue(bondethernet.is_bond_member(self.cfg, "GigabitEthernet1/0/1"))
        self.assertFalse(bondethernet.is_bond_member(self.cfg, "GigabitEthernet2/0/0"))
        self.assertFalse(bondethernet.is_bond_member(self.cfg, "GigabitEthernet2/0/0.100"))

    def test_is_bondethernet(self):
        self.assertTrue(bondethernet.is_bondethernet(self.cfg, "BondEthernet0"))
        self.assertFalse(bondethernet.is_bondethernet(self.cfg, "BondEthernet-notexist"))
        self.assertFalse(bondethernet.is_bondethernet(self.cfg, "GigabitEthernet1/0/0"))

    def test_enumerators(self):
        ifs = bondethernet.get_bondethernets(self.cfg)
        self.assertEqual(len(ifs), 1)
        self.assertIn("BondEthernet0", ifs)
        self.assertNotIn("BondEthernet-noexist", ifs)
```

Every single function that is defined in the file `config/bondethernet.py` (there are four) will have
an accompanying unittest to ensure it works as expected. And every validator module will have a suite
of unittests fully covering its functionality. In total, I wrote a few dozen unit tests like this,
in an attempt to be reasonably certain that the config validator functionality works as advertised.

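For context, here is one way the four functions exercised by these tests could be written. The function names come
from the test code, but the bodies are my own guess at a minimal implementation that satisfies those assertions;
the real code in `config/bondethernet.py` may be structured differently.

```python
def get_bondethernets(cfg):
    """Return a list of all BondEthernet names in the config."""
    return list(cfg.get("bondethernets", {}).keys())


def get_by_name(cfg, ifname):
    """Return (name, bondethernet dict), or (None, None) if it does not exist."""
    try:
        return ifname, cfg["bondethernets"][ifname]
    except KeyError:
        return None, None


def is_bondethernet(cfg, ifname):
    """Return True if ifname is a BondEthernet defined in the config."""
    return ifname in get_bondethernets(cfg)


def is_bond_member(cfg, ifname):
    """Return True if ifname is listed as a member of any BondEthernet."""
    for bond in cfg.get("bondethernets", {}).values():
        if ifname in bond.get("interfaces", []):
            return True
    return False
```
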
### YAML Testing

I added one additional class of unittest called a ***YAMLTest***. What happens here is that a certain YAML configuration
file, which may be valid or have errors, is offered to the end-to-end config parser (so both the Yamale schema
validator as well as the semantic validators), and all errors are accounted for. As an example, two sub-interfaces
on the same parent cannot have the same encapsulation, so offering the following file to the config validator
is _expected_ to trip errors:

```
$ cat unittest/yaml/error-subinterface1.yaml
test:
  description: "Two subinterfaces can't have the same encapsulation"
  errors:
    expected:
     - "sub-interface .*.100 does not have unique encapsulation"
     - "sub-interface .*.102 does not have unique encapsulation"
    count: 2
---
interfaces:
  GigabitEthernet1/0/0:
    sub-interfaces:
      100:
        description: "VLAN 100"
      101:
        description: "Another VLAN 100, but without exact-match"
        encapsulation:
          dot1q: 100
      102:
        description: "Another VLAN 100, but with exact-match"
        encapsulation:
          dot1q: 100
          exact-match: True
```

You can see the file here has two YAML documents (separated by `---`); the first one explains to the YAMLTest
class what to expect. There can either be no errors (in which case `test.errors.count=0`), or there can be
specific errors that are expected. In this case, `Gi1/0/0.100` and `Gi1/0/0.102` have the same encapsulation
but `Gi1/0/0.101` is unique (if you're curious, this is because the encap on 100 and 102 has exact-match,
but the one on 101 does _not_ have exact-match).

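The reasoning about sub-interfaces 100, 101 and 102 can be made concrete with a small sketch. Two things here are
assumptions I infer from the example rather than facts stated about `vppcfg`: that a sub-interface without an
explicit `encapsulation` behaves as `dot1q` equal to its own id with exact-match set, and that an omitted
`exact-match` means it is not set. The helper names `effective_encap` and `non_unique_encaps` are hypothetical.

```python
def effective_encap(sub_id, sub):
    """Return a comparable tuple (dot1q, dot1ad, inner-dot1q, exact-match)."""
    encap = sub.get("encapsulation")
    if encap is None:
        # Assumption: no explicit encapsulation means dot1q == sub-interface id, exact-match.
        return (sub_id, 0, 0, True)
    return (
        encap.get("dot1q", 0),
        encap.get("dot1ad", 0),
        encap.get("inner-dot1q", 0),
        encap.get("exact-match", False),  # Assumption: absent means not set.
    )


def non_unique_encaps(ifname, iface):
    """Return a sorted list of messages for sub-interfaces that share an encapsulation."""
    by_encap = {}
    for sub_id, sub in iface.get("sub-interfaces", {}).items():
        by_encap.setdefault(effective_encap(sub_id, sub), []).append(sub_id)

    msgs = []
    for sub_ids in by_encap.values():
        if len(sub_ids) > 1:
            for sub_id in sub_ids:
                msgs.append(f"sub-interface {ifname}.{sub_id} does not have unique encapsulation")
    return sorted(msgs)
```

With the three sub-interfaces from the file above, 100 and 102 end up with the same tuple, while 101 differs only
in its exact-match element, which is why it is not reported.
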
The implementation of this YAMLTest class is in `tests.py`, which in turn runs all YAML tests on the files it
finds in `unittest/yaml/*.yaml` (currently 47 specific cases are tested there, which cover 100% of the
semantic constraints), as well as the regular unittests (currently 42, which is a coincidence, I swear!).

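As an illustration of how the expectations in the first YAML document could be compared against the messages the
validator produces, here is a minimal sketch. It is not the code from `tests.py`: the function name
`check_yamltest` is hypothetical, it works on already-parsed dicts so that it has no dependency on a YAML parser,
and I assume the expected strings are regular expressions that may match anywhere in a message.

```python
import re


def check_yamltest(test_doc, msgs):
    """Return a list of failures; an empty list means all errors are accounted for."""
    errors = test_doc.get("test", {}).get("errors", {})
    expected = errors.get("expected", [])
    count = errors.get("count", 0)
    failures = []

    # Every expected pattern must be seen in at least one message.
    for pattern in expected:
        if not any(re.search(pattern, msg) for msg in msgs):
            failures.append(f"expected error not seen: {pattern}")

    # Every message must be covered by at least one expected pattern.
    for msg in msgs:
        if not any(re.search(pattern, msg) for pattern in expected):
            failures.append(f"unexpected error: {msg}")

    # The total number of messages must match the declared count.
    if len(msgs) != count:
        failures.append(f"expected {count} errors, got {len(msgs)}")

    return failures
```
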
# What's next?

These tests, together, give me a pretty strong assurance that any given YAML file that passes the validator
is indeed a valid configuration for VPP. In my next post, I'll go one step further, and talk about applying
the configuration to a running VPP instance, which is of course the overarching goal. But I would not want
to mess up my (or your!) VPP router by feeding it garbage, so the lion's share of my time so far on this project
has been spent asserting that the YAML file is both syntactically and semantically valid.

In the meantime, you can take a look at my code on [GitHub](https://github.com/pimvanpelt/vppcfg), but to
whet your appetite, here's a hefty configuration that demonstrates all implemented types:

```
bondethernets:
  BondEthernet0:
    interfaces: [ GigabitEthernet3/0/0, GigabitEthernet3/0/1 ]

interfaces:
  GigabitEthernet3/0/0:
    mtu: 9000
    description: "LAG #1"
  GigabitEthernet3/0/1:
    mtu: 9000
    description: "LAG #2"

  HundredGigabitEthernet12/0/0:
    lcp: "ice0"
    mtu: 9000
    addresses: [ 192.0.2.17/30, 2001:db8:3::1/64 ]
    sub-interfaces:
      1234:
        mtu: 1200
        lcp: "ice0.1234"
        encapsulation:
          dot1q: 1234
          exact-match: True
      1235:
        mtu: 1100
        lcp: "ice0.1234.1000"
        encapsulation:
          dot1q: 1234
          inner-dot1q: 1000
          exact-match: True

  HundredGigabitEthernet12/0/1:
    mtu: 2000
    description: "Bridged"

  BondEthernet0:
    mtu: 9000
    lcp: "be0"
    sub-interfaces:
      100:
        mtu: 2500
        l2xc: BondEthernet0.200
        encapsulation:
          dot1q: 100
          exact-match: False
      200:
        mtu: 2500
        l2xc: BondEthernet0.100
        encapsulation:
          dot1q: 200
          exact-match: False
      500:
        mtu: 2000
        encapsulation:
          dot1ad: 500
          exact-match: False
      501:
        mtu: 2000
        encapsulation:
          dot1ad: 501
          exact-match: False
  vxlan_tunnel1:
    mtu: 2000

loopbacks:
  loop0:
    lcp: "lo0"
    addresses: [ 10.0.0.1/32, 2001:db8::1/128 ]
  loop1:
    lcp: "bvi1"
    addresses: [ 10.0.1.1/24, 2001:db8:1::1/64 ]

bridgedomains:
  bd1:
    mtu: 2000
    bvi: loop1
    interfaces: [ BondEthernet0.500, BondEthernet0.501, HundredGigabitEthernet12/0/1, vxlan_tunnel1 ]
  bd11:
    mtu: 1500

vxlan_tunnels:
  vxlan_tunnel1:
    local: 192.0.2.1
    remote: 192.0.2.2
    vni: 101
```

The vision for my VPP Configuration utility is that it can move from any existing VPP configuration to any
other (successfully validated) configuration in a minimal number of steps, and that it will plan its
way declaratively from A to B, ordering the calls to the API safely and quickly. Interested? Good, because
I do expect that a utility like this would be very valuable to serious VPP users!