From cc0be8341f3d045fbaf341d63c8fb7aa980e0f8b Mon Sep 17 00:00:00 2001
From: Pim van Pelt
Date: Tue, 10 Aug 2021 20:49:55 +0200
Subject: [PATCH] Add some rambling notes on VPP and LCP and linux interface naming

---
 README.md | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 101 insertions(+)

diff --git a/README.md b/README.md
index 09204ed..5d4b06e 100644
--- a/README.md
+++ b/README.md
@@ -42,3 +42,104 @@ is created with protocol 802.1q ? VPP only allows sub interfaces to carry one
 Rather than applying heuristics and adding bugs, it is not possible to create
 VPP interfaces via Linux, only the other way around. Create any L3 capable
 interface or subinterface in VPP, and it'll be created in Linux as well.
+
+### Notes
+
+We'll be able to see if VPP changes the interfaces with a bunch of callback
+functions:
+```
+pim@hippo:~/src/vpp$ grep -r VNET.*_FUNCTION src/vnet/interface.h
+#define VNET_HW_INTERFACE_ADD_DEL_FUNCTION(f) \
+#define VNET_HW_INTERFACE_LINK_UP_DOWN_FUNCTION(f) \
+#define VNET_SW_INTERFACE_MTU_CHANGE_FUNCTION(f) \
+#define VNET_SW_INTERFACE_ADD_DEL_FUNCTION(f) \
+#define VNET_SW_INTERFACE_ADMIN_UP_DOWN_FUNCTION(f) \
+```
+
+Super useful for MTU changes, admin up/dn and link up/dn changes (to copy
+these into the TAP and the host interface).
+
+Notably missing here is some form of SW_INTERFACE_L2_L3_CHANGE_FUNCTION(f)
+which would be useful to remove an LCP if the iface goes into L2 mode, and
+(re)create an LCP if the iface goes into L3 mode.
+
+It will also be very useful to create an IP_ADDRESS_ADD_DEL_FUNCTION(f)
+of sorts so that we know when VPP sets an IPv4/IPv6 address (so that we
+can copy this into the host interface).
+
+#### LCP names
+
+The maximum length of an interface name in Linux is 15 characters. I'd love
+to be able to make interfaces like you might find in DANOS:
+dp0p6s0f1 but this is already 9 characters, and the slot and PCI bus can be
+double digits. The Linux idiom is to make a link as a child of another link,
+like:
+    ip link add link eth0 name eth0.1234 type vlan id 1234 proto dot1q
+
+You can also make QinQ interfaces in the same way:
+    ip link add link eth0.1234 name eth0.1234.1000 type vlan id 1000
+
+This last interface will have 5 characters .1000 for the inner, 5 characters
+.1234 for the outer, leaving only 5 characters for the main interface name.
+
+I can see two ways out of this:
+1. Make main interfaces very short
+For example `et0` for DPDK interfaces, `be1` for BondEthernet, `lo4` for
+Loopback interfaces, and possibly `bvi5` for BridgeVirtualInterface (these
+are the L3 interfaces in an L2 bridge-group). In such a world, we can create
+any number of subinterfaces in a Linux _idiomatic way_, like et192.1234.1000
+but BVIs will be limited to 100 interfaces and ethernets to 1000. This seems
+okay, but may paint us into a corner in the future.
+
+1. Strictly follow VPP's naming
+VPP will always have exactly one (integer) numbered subinterface, like
+`TenGigabitEthernet3/0/2.1234567890` and the function of such a subint can take
+multiple forms (dot1q, dot1ad, double-tagged, etc). In this world, we can create
+interface names in Linux that map very cleanly to VPP's subinterfaces, and we
+can also auto-create subinterfaces by reading a netlink link message, provided
+the chosen name follows an `<iface>.<number>` pattern, and maps to a known
+LCP.
+
+Here's how I can see the latter approach working:
+
+Creating a VPP sub-interface directly creates the corresponding LCP device
+name `e0.1234`. Creating a more complex dot1q/dot1q or dot1ad or dot1ad/dot1q
+sub-interface will again create a mirror LCP with device name `e0.1235`,
+reusing the sub-int number from VPP directly. That's good.
+
+Reading a netlink new link message is a bit trickier, but here's what I
+propose:
+* On a physical interface `e0`, one can create a dot1q or dot1ad link with a
+  linux name of `e0.1234`; in such a case, a corresponding sub-int will be
+  made by the plugin in VPP.
+* On a linux interface `e0.1234`, only a dot1q interface can be made, but
+  its name is required to follow the `<iface>.<number>` pattern. If so,
+  the plugin will make a VPP interface corresponding to that number, and
+  because we already know whether the parent is a .1q interface or a
+  .1ad interface, the correct type of VPP interface can be selected.
+  * If an interface name is not valid, the plugin will destroy the link and
+    log an error.
+  * Because we'll want to use one TAP interface per LCP (see rationale
+    above), the plugin will have to destroy the link and recreate it as a
+    new interface with the same name, but not as a child of the originally
+    requested parent interface.
+
+With this, we can:
+
+1. Create a dot1q subinterface on `e0`:
+   `ip link add link e0 name e0.1234 type vlan id 1234`
+1. Create a dot1q outer, dot1q inner subinterface on `e0`:
+   `ip link add link e0.1234 name e0.1235 type vlan id 1000`
+1. Create a dot1ad subinterface on `e0`:
+   `ip link add link e0 name e0.1236 type vlan id 2345 proto 802.1ad`
+1. Create a dot1ad outer, dot1q inner subinterface on `e0`:
+   `ip link add link e0.1236 name e0.1237 type vlan id 2345 proto 802.1q`
+1. Fail to create an interface which has an invalid name:
+   `ip link add link e0.1234 name e0.1234.2345 type vlan id 2345 proto 802.1q`
+   (this interface will be destroyed by the plugin and an error logged)
+
+For each of these interface creations, the user asks for them to be created as
+a child of an existing interface (either `e0` or `e0.1234`), but the plugin
+will destroy that interface again, and recreate it as a top-level interface
+in the namespace, with an accompanying tap interface. So in this plugin, every
+LCP has exactly one TAP, and that TAP interface is never a sub-int.
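
The naming rules in these notes can be illustrated with a small POSIX shell sketch. It is only an illustration under stated assumptions: the helper names `lcp_name` and `lcp_name_valid` are hypothetical and exist neither in VPP nor in the plugin, where the real checks would be written in C. The sketch assumes that a valid LCP sub-interface name looks like `e0.1234` (a dot-free parent name, one dot, and a purely numeric sub-interface number) and that Linux accepts names of at most 15 characters, as stated above.

```shell
#!/bin/sh
# Sketch only: lcp_name and lcp_name_valid are hypothetical helpers,
# not functions provided by VPP, the plugin, or iproute2.

IFNAME_MAX=15   # longest interface name Linux accepts, per the notes

# Print "<parent>.<subid>", or fail if the result would be too long.
lcp_name() {
    name="$1.$2"
    [ "${#name}" -le "$IFNAME_MAX" ] || return 1
    printf '%s\n' "$name"
}

# Succeed only for names of the form "<parent>.<number>".
lcp_name_valid() {
    [ "${#1}" -le "$IFNAME_MAX" ] || return 1
    parent="${1%%.*}"                  # text before the first dot
    subid="${1#*.}"                    # text after the first dot
    [ "$parent" != "$1" ] || return 1  # the name contains no dot at all
    [ -n "$parent" ] || return 1       # the parent part is empty
    case "$subid" in
        ''|*[!0-9]*) return 1 ;;       # empty, or not purely digits
    esac
    return 0
}
```

With these helpers, `lcp_name e0 1234` prints `e0.1234`, while `lcp_name_valid e0.1234.2345` fails because the part after the first dot is not a plain number. That is the same name the last `ip link add` example above expects the plugin to reject.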