All checks were successful
continuous-integration/drone/push Build is passing
444 lines
21 KiB
Markdown
444 lines
21 KiB
Markdown
---
|
||
date: "2024-04-06T10:17:54Z"
|
||
title: VPP with loopback-only OSPFv3 - Part 1
|
||
aliases:
|
||
- /s/articles/2024/04/07/vpp-ospf.html
|
||
params:
|
||
asciinema: true
|
||
---
|
||
|
||
{{< image width="200px" float="right" src="/assets/vpp-ospf/bird-logo.svg" alt="Bird" >}}
|
||
|
||
# Introduction
|
||
|
||
A few weeks ago I took a good look at the [[Babel]({{< ref "2024-03-06-vpp-babel-1" >}})] protocol.
|
||
I found a set of features there that I really appreciated. The first was a latency aware routing
|
||
protocol - this is useful for mesh (wireless) networks but it is also a good fit for IPng's usecase,
|
||
notably because it makes use of carrier ethernet which, if any link in the underlying MPLS network
|
||
fails, will automatically re-route but sometimes with much higher latency. In these cases, Babel can
|
||
reconverge on its own to a topology that has the lowest end to end latency.
|
||
|
||
But a second really cool find, is that Babel can use IPv6 nexthops for IPv4 destinations - which is
|
||
_super_ useful because it will allow me to retire all of the IPv4 /31 point to point networks
|
||
between my routers. AS8298 has about half of a /24 tied up in these otherwise pointless (pun
|
||
intended) transit networks.
|
||
|
||
In the same week, my buddy Benoit asked a question about OSPFv3 on the Bird users mailinglist
|
||
[[ref](https://www.mail-archive.com/bird-users@network.cz/msg07961.html)] which may or may not have
|
||
been because I had been messing around with Babel using only IPv4 loopback interfaces. And just a
|
||
few weeks before that, the incomparable Nico from [[Ungleich](https://ungleich.ch/)] had a very
|
||
similar question [[ref](https://bird.network.cz/pipermail/bird-users/2024-January/017370.html)].
|
||
|
||
These three folks have something in common - we're all trying to conserve IPv4 addresses!
|
||
|
||
# OSPFv3 with IPv4 🙁
|
||
|
||
Nico's thread referenced [[RFC 5838](https://datatracker.ietf.org/doc/html/rfc5838)] which defines
|
||
support for multiple address families in OSPFv3. It does this by mapping a given address family to
|
||
a specific instance of OSPFv3 using the _instance id_ and adding a new option to the _options field_
|
||
that tells neighbors that multiple address families are supported in this instance (and thus, that
|
||
the neighbor should not assume all link state advertisements are IPv6-only).
|
||
|
||
This way, multiple instances can run on the same router, and they will only form adjacencies with
|
||
neighbors that are operating in the same address family. This in itself doesn't change much: rather
|
||
than using IPv4 multicast in the hello's while forming adjacencies, OSPFv3 will use IPv6 link local
|
||
addresses for them.
|
||
|
||
RFC 5838, Section 2.5 says:
|
||
> Although IPv6 link local addresses could be used as next hops for IPv4 address families, it is
|
||
> desirable to have IPv4 next-hop addresses. [ ... ] In order to achieve this, the link's IPv4
|
||
> address will be advertised in the "link local address" field of the IPv4 instance's Link-LSA.
|
||
> This address is placed in the first 32 bits of the "link local address" field and is used for IPv4
|
||
> next-hop calculations. The remaining bits MUST be set to zero.
|
||
|
||
First my hopes are raised by saying IPv6 link local addresses _could_ be used as
|
||
next hops (just like Babel, yaay!), but then it goes on to say the link local address field will be
|
||
overridden with an IPv4 address in the top 32 bits. That's ... gross. I understand why this was
|
||
done, it allows for a minimal deviation of the OSPFv3 protocol, but this unfortunate choice
|
||
precludes the ability for IPv6 nexthops to be used. Crap on a cracker!
|
||
|
||
# OSPFv3 with IPv4 🥰
|
||
|
||
But wait, not all is lost! Remember in my [[VPP Babel]({{< ref "2024-03-06-vpp-babel-1" >}})]
|
||
article I mentioned that VPP has this ability to run _unnumbered_ interfaces? To recap, this is a
|
||
configuration where a primary interface, typically a loopback, will have an IPv4 and IPv6 address,
|
||
say **192.168.10.2/32** and **2001:678:d78:200::2/128** and other interfaces will borrow from that.
|
||
That will allow for the IPv4 address to be present on multiple interfaces, like so:
|
||
|
||
```
|
||
pim@vpp0-2:~$ ip -br a
|
||
lo UNKNOWN 127.0.0.1/8 ::1/128
|
||
loop0 UNKNOWN 192.168.10.2/32 2001:678:d78:200::2/128 fe80::dcad:ff:fe00:0/64
|
||
e0 UP 192.168.10.2/32 2001:678:d78:200::2/128 fe80::5054:ff:fef0:1120/64
|
||
e1 UP 192.168.10.2/32 2001:678:d78:200::2/128 fe80::5054:ff:fef0:1121/64
|
||
```
|
||
|
||
### VPP Changes
|
||
|
||
Historically in VPP, broadcast medium like ethernet will respond to ARP requests only if the
|
||
requestor is in the same subnet. With these point to point interfaces, the remote will _never_ be in
|
||
the same subnet, because we're using /32 addresses here! VPP logs these as invalid ARP requests.
|
||
With a small change though, I can make VPP become tolerant of this scenario, and the consensus
|
||
in the VPP community is that this is OK.
|
||
|
||
{{< image src="/assets/vpp-ospf/vpp-diff1.png" alt="VPP Diff #1" >}}
|
||
|
||
Check out [[40482](https://gerrit.fd.io/r/c/vpp/+/40482)] for the full change, but in a nutshell,
|
||
just before deciding to return an error because the requesting source address is not directly
|
||
connected (called an `attached` route in VPP), I'll change the condition to allow for it, if and
|
||
only if the ARP request comes from an _unnumbered_ interface.
|
||
|
||
I think this is a good direction, if only because most other popular implementations (including
|
||
Linux, FreeBSD, Cisco IOS/XR and Juniper) will answer ARP requests that are onlink but not directly
|
||
connected, in the same way.
|
||
|
||
### Bird2 Changes
|
||
|
||
Meanwhile, in the Bird community, we were thinking about solving this problem in a different way.
|
||
Babel allows a feature to use IPv6 transit networks with IPv4 destinations, by specifying an option
|
||
called `extended next hop`. With this option, Babel will set a nexthop across address families. It
|
||
may sound freaky at first, but it's not too strange when you think about it. Take a look at my
|
||
explanation in the [[Babel]({{< ref "2024-03-06-vpp-babel-1" >}})] article on how IPv6 neighbor
|
||
discovery can take the place of IPv4 ARP resolution to figure out the ethernet next hop.
|
||
|
||
So our initial take was: why don't we do that with OSPFv3 as well? We thought of a trick to
|
||
get that Link LSA hack from RFC5838 removed: what if Bird, upon setting the `extended next hop`
|
||
feature on an interface, would simply put the IPv6 address back like it was, rather than overwriting
|
||
it with the IPv4 address? That way, we'd just learn routes to IPv4 destinations with nexthops on
|
||
IPv6 linklocal addresses. It would break compatibility with other vendors, but seeing as it is an
|
||
optional feature which defaults to off, perhaps it is a reasonable compromise...
|
||
|
||
Ondrej started to work on it, but came back a few days later with a different solution, which is
|
||
quite clever. Any IPv4 router needs at least one IPv4 address anyways, to be able to send ICMP
|
||
messages, so there is no need to put IPv4 addresses on links. Ondrej's theory corroborates my
|
||
previous comments for Babel's IPv4-less routing:
|
||
|
||
> I’ve learned so far that I (a) MAY use IPv6 link-local networks in order to forward IPv4 packets,
|
||
> as I can use IPv6 NDP to find the link-layer next hop; and (b) each router SHOULD be able to
|
||
> originate ICMPv4 packets, therefore it needs at least one IPv4 address.
|
||
>
|
||
> These two claims mean that I need at most one IPv4 address on each router.
|
||
|
||
Ondrej's proposal for Bird2 will, when OSPFv3 is used with IPv4 destinations, keep the RFC5838
|
||
behavior and try to _find_ a working IPv4 address to put in the Link LSA:
|
||
{{< image src="/assets/vpp-ospf/bird-diff1.png" alt="Bird Diff #1" >}}
|
||
|
||
He adds a function `update_loopback_addr()`, which scans all interfaces for an IPv4 address, and if
|
||
there are multiple, prefer host addresses, then addresses from OSPF stub interfaces, and finally
|
||
just any old IPv4 address. Now that IPv4 address can be simply used to put in the Link LSA. Slick!
|
||
|
||
His change also removes next-hop-in-address-range check for OSPFv3 when using IPv4, and
|
||
automatically adds onlink flag to such routes, which newly accepts next hops that are not directly
|
||
connected:
|
||
{{< image src="/assets/vpp-ospf/bird-diff2.png" alt="Bird Diff #2" >}}
|
||
|
||
I realize when reading the code that this change paired with the [[Gerrit](https://gerrit.fd.io/r/c/vpp/+/40482)]
|
||
are perfect partners:
|
||
1. Ondrej's change will make the Link LSA be _onlink_, which is a way to describe that the next hop
|
||
is not directly connected, in other words nexthop `192.168.10.3/32`, while the router itself is
|
||
`192.168.10.2/32`.
|
||
1. My change will make VPP answer for ARP requests in such a scenario where the router with an
|
||
_unnumbered_ interface with `192.168.10.3/32` will respond to a request from the not directly
|
||
connected _onlink_ peer at `192.168.10.2`.
|
||
|
||
## Tying it together
|
||
|
||
With all of that, I am ready to demonstrate two working solutions now. I first compile Bird2 with
|
||
Ondrej's [[commit](https://gitlab.nic.cz/labs/bird/-/commit/280daed57d061eb1ebc89013637c683fe23465e8)].
|
||
Then, I compile VPP with my pending [[gerrit](https://gerrit.fd.io/r/c/vpp/+/40482)]. Finally,
|
||
to demonstrate how `update_loopback_addr()` might work, I compile `lcpng` with my previous
|
||
[[commit](https://git.ipng.ch/ipng/lcpng/commit/a960d64a87849d312b32d9432ffb722672c14878)],
|
||
which allows me to inhibit copying forward addresses from VPP to Linux, when using _unnumbered_
|
||
interfaces.
|
||
|
||
I take an IPng lab instance out for a spin with this updated Bird2 and VPP+lcpng environment:
|
||
|
||
{{< image src="/assets/vpp-mpls/LAB v2.svg" alt="Lab Setup" >}}
|
||
|
||
### Solution 1: Somewhat unnumbered
|
||
|
||
I configure an otherwise empty VPP dataplane as follows:
|
||
|
||
```
|
||
vpp0-3# lcp lcp-sync on
|
||
vpp0-3# lcp lcp-sync-unnumbered on
|
||
|
||
vpp0-3# create loopback interface instance 0
|
||
vpp0-3# set interface state loop0 up
|
||
vpp0-3# set interface ip address loop0 192.168.10.3/32
|
||
vpp0-3# set interface ip address loop0 2001:678:d78:200::3/128
|
||
|
||
vpp0-3# set interface mtu 9000 GigabitEthernet10/0/0
|
||
vpp0-3# set interface mtu packet 9000 GigabitEthernet10/0/0
|
||
vpp0-3# set interface unnumbered GigabitEthernet10/0/0 use loop0
|
||
vpp0-3# set interface state GigabitEthernet10/0/0 up
|
||
|
||
vpp0-3# lcp create loop0 host-if loop0
|
||
vpp0-3# lcp create GigabitEthernet10/0/0 host-if e0
|
||
```
|
||
|
||
Which yields the following configuration:
|
||
```
|
||
pim@vpp0-3:~$ ip -br a
|
||
lo UNKNOWN 127.0.0.1/8 ::1/128
|
||
loop0 UNKNOWN 192.168.10.3/32 2001:678:d78:200::3/128 fe80::dcad:ff:fe00:0/64
|
||
e0 UP 192.168.10.3/32 2001:678:d78:200::3/128 fe80::5054:ff:fef0:1130/64
|
||
pim@vpp0-3:~$ ip route get 182.168.10.2
|
||
RTNETLINK answers: Network is unreachable
|
||
```
|
||
|
||
I can see that VPP copied forward the IPv4/IPv6 addresses to interface `e0`, and because there's no
|
||
routing protocol running yet, the neighbor router `vpp0-2` is unreachable. Let me fix that, next. I
|
||
start bird in the VPP dataplane network namespace, and configure it as follows:
|
||
|
||
```
|
||
router id 192.168.10.3;
|
||
|
||
protocol device { scan time 30; }
|
||
protocol direct { ipv4; ipv6; check link yes; }
|
||
|
||
protocol kernel kernel4 {
|
||
ipv4 { import none; export where source != RTS_DEVICE; };
|
||
learn off; scan time 300;
|
||
}
|
||
|
||
protocol kernel kernel6 {
|
||
ipv6 { import none; export where source != RTS_DEVICE; };
|
||
learn off; scan time 300;
|
||
}
|
||
|
||
protocol bfd bfd1 {
|
||
interface "e*" {
|
||
interval 100 ms;
|
||
multiplier 20;
|
||
};
|
||
}
|
||
|
||
protocol ospf v3 ospf4 {
|
||
ipv4 { export all; import where (net ~ [ 192.168.10.0/24+, 0.0.0.0/0 ]); };
|
||
area 0 {
|
||
interface "loop0" { stub yes; };
|
||
interface "e0" { type pointopoint; cost 5; bfd on; };
|
||
};
|
||
}
|
||
|
||
protocol ospf v3 ospf6 {
|
||
ipv6 { export all; import where (net ~ [ 2001:678:d78:200::/56, ::/0 ]); };
|
||
area 0 {
|
||
interface "loop0" { stub yes; };
|
||
interface "e0" { type pointopoint; cost 5; bfd on; };
|
||
};
|
||
}
|
||
```
|
||
|
||
This minimal Bird2 configuration will configure the main protocols `device`, `direct`,
|
||
and two kernel protocols `kernel4` and `kernel6`, which are instructed to export learned routes from
|
||
the kernel for all but directly connected routes (because the Linux kernel and VPP already have
|
||
these when an interface is brought up, this avoids duplicate connected route entries).
|
||
|
||
If you haven't come across it yet, _Bidirectional Forwarding Detection_ or _BFD_ is a protocol that
|
||
repeatedly sends UDP packets between routers, to be able to detect if the forwarding is interrupted
|
||
even if the interface link stays up. It's described in detail in
|
||
[[RFC5880](https://www.rfc-editor.org/rfc/rfc5880.txt)], and I use it at IPng Networks all over the
|
||
place.
|
||
|
||
{{< image width="100px" float="left" src="/assets/shared/brain.png" alt="Brain" >}}
|
||
|
||
Then I'll configure two OSPF protocols, one for IPv4 called `ospf4` and another for IPv6 called
|
||
`ospf6`. It's easy to overlook, but while usually the IPv4 protocol is OSPFv2 and the IPv6 protocol
|
||
is OSPFv3, here _both_ are using OSPFv3! I'll instruct Bird to erect a _BFD_ session for any
|
||
neighbor it establishes an adjacency with. If at any point the BFD session times out (currently at
|
||
20x100ms or 2.0s), OSPF will tear down the adjacency.
|
||
|
||
The OSPFv3 protocols each define one channel, in which I allow Bird to export anything, but import
|
||
only those routes that are in the LAB IPv4 (192.168.10.0/24) and IPv6 (2001:687:d78:200::/56), and
|
||
I'll also allow a default to be learned over OSPF for both address families. That'll come in handy
|
||
later.
|
||
|
||
I start up Bird on the rightmost two routers in the lab (`vpp0-3` and `vpp0-2`). Looking at
|
||
`vpp0-3`, Bird starts sending IPv6 hello packets on interface `e0`, and pretty quickly finds not one
|
||
but two neighbors:
|
||
|
||
```
|
||
pim@vpp0-3:~$ birdc show ospf neighbors
|
||
BIRD v2.15.1-4-g280daed5-x ready.
|
||
ospf4:
|
||
Router ID Pri State DTime Interface Router IP
|
||
192.168.10.2 1 Full/PtP 30.870 e0 fe80::5054:ff:fef0:1121
|
||
|
||
ospf6:
|
||
Router ID Pri State DTime Interface Router IP
|
||
192.168.10.2 1 Full/PtP 30.870 e0 fe80::5054:ff:fef0:1121
|
||
```
|
||
|
||
Bird is able to sort out which is which on account of the 'normal' IPv6 OSPFv3 having an _instance
|
||
id_ value of 0 (IPv6 Unicast), and the IPv4 OSPFv3 having an _instance id_ of 64 (IPv4 Unicast).
|
||
Further, the IPv4 variant will set the AF-bit in the OSPFv3 options, so the peer will know it
|
||
supports using the Link LSA to model IPv4 nexthops rather than IPv6 nexthops.
|
||
|
||
Indeed, routes are quickly learned:
|
||
|
||
```
|
||
pim@vpp0-3:~$ birdc show route table master4
|
||
BIRD v2.15.1-4-g280daed5-x ready.
|
||
Table master4:
|
||
192.168.10.3/32 unicast [direct1 13:02:56.883] * (240)
|
||
dev loop0
|
||
unicast [direct1 13:02:56.883] (240)
|
||
dev e0
|
||
unicast [ospf4 13:02:56.980] I (150/0) [192.168.10.3]
|
||
dev loop0
|
||
dev e0
|
||
192.168.10.2/32 unicast [ospf4 13:03:04.980] * I (150/5) [192.168.10.2]
|
||
via 192.168.10.2 on e0 onlink
|
||
|
||
```
|
||
|
||
They are quickly propagated both to the Linux kernel, and by means of Netlink into the Linux Control
|
||
Plane plugin in VPP, which programs it into VPP's _FIB_:
|
||
|
||
```
|
||
pim@vpp0-3:~$ ip ro
|
||
192.168.10.2 via 192.168.10.2 dev e0 proto bird metric 32 onlink
|
||
|
||
pim@vpp0-3:~$ vppctl show ip fib 192.168.10.2
|
||
ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto flowlabel ] epoch:0 flags:none locks:[adjacency:1, default-route:1, lcp-rt:1, ]
|
||
192.168.10.2/32 fib:0 index:23 locks:3
|
||
lcp-rt-dynamic refs:1 src-flags:added,contributing,active,
|
||
path-list:[40] locks:2 flags:shared, uPRF-list:22 len:1 itfs:[1, ]
|
||
path:[53] pl-index:40 ip4 weight=1 pref=32 attached-nexthop: oper-flags:resolved,
|
||
192.168.10.2 GigabitEthernet10/0/0
|
||
[@0]: ipv4 via 192.168.10.2 GigabitEthernet10/0/0: mtu:9000 next:6 flags:[] 525400f01121525400f011300800
|
||
|
||
adjacency refs:1 entry-flags:attached, src-flags:added, cover:-1
|
||
path-list:[43] locks:1 uPRF-list:24 len:1 itfs:[1, ]
|
||
path:[56] pl-index:43 ip4 weight=1 pref=0 attached-nexthop: oper-flags:resolved,
|
||
192.168.10.2 GigabitEthernet10/0/0
|
||
[@0]: ipv4 via 192.168.10.2 GigabitEthernet10/0/0: mtu:9000 next:6 flags:[] 525400f01121525400f011300800
|
||
Extensions:
|
||
path:56
|
||
forwarding: unicast-ip4-chain
|
||
[@0]: dpo-load-balance: [proto:ip4 index:28 buckets:1 uRPF:22 to:[0:0]]
|
||
[0] [@5]: ipv4 via 192.168.10.2 GigabitEthernet10/0/0: mtu:9000 next:6 flags:[] 525400f01121525400f011300800
|
||
```
|
||
|
||
The neighbor is reachable, over IPv6 (which is nothing special), but also over IPv4:
|
||
|
||
```
|
||
pim@vpp0-3:~$ ping -c5 2001:678:d78:200::2
|
||
PING 2001:678:d78:200::2(2001:678:d78:200::2) 56 data bytes
|
||
64 bytes from 2001:678:d78:200::2: icmp_seq=1 ttl=64 time=2.16 ms
|
||
64 bytes from 2001:678:d78:200::2: icmp_seq=2 ttl=64 time=3.69 ms
|
||
64 bytes from 2001:678:d78:200::2: icmp_seq=3 ttl=64 time=2.66 ms
|
||
64 bytes from 2001:678:d78:200::2: icmp_seq=4 ttl=64 time=2.30 ms
|
||
64 bytes from 2001:678:d78:200::2: icmp_seq=5 ttl=64 time=2.92 ms
|
||
|
||
--- 2001:678:d78:200::2 ping statistics ---
|
||
5 packets transmitted, 5 received, 0% packet loss, time 4006ms
|
||
rtt min/avg/max/mdev = 2.164/2.747/3.687/0.540 ms
|
||
|
||
pim@vpp0-3:~$ ping -c5 192.168.10.2
|
||
PING 192.168.10.2 (192.168.10.2) 56(84) bytes of data.
|
||
64 bytes from 192.168.10.2: icmp_seq=1 ttl=64 time=3.58 ms
|
||
64 bytes from 192.168.10.2: icmp_seq=2 ttl=64 time=3.40 ms
|
||
64 bytes from 192.168.10.2: icmp_seq=3 ttl=64 time=3.28 ms
|
||
64 bytes from 192.168.10.2: icmp_seq=4 ttl=64 time=3.32 ms
|
||
64 bytes from 192.168.10.2: icmp_seq=5 ttl=64 time=3.29 ms
|
||
|
||
--- 192.168.10.2 ping statistics ---
|
||
5 packets transmitted, 5 received, 0% packet loss, time 4007ms
|
||
rtt min/avg/max/mdev = 3.283/3.374/3.577/0.109 ms
|
||
```
|
||
|
||
**✅ OSPFv3 with IPv4/IPv6 on-link nexthops works!**
|
||
|
||
### Solution 2: Truly unnumbered
|
||
|
||
However, Ondrej's patch does something in addition to this. I repeat the same setup, except now I
|
||
set one additional feature when starting up VPP: `lcp lcp-sync-unnumbered off`
|
||
|
||
What happens next is that VPP's dataplane looks subtly different. It has created an _unnumbered_
|
||
interface keyed off of `loop0`, but it doesn't propagate the addresses to Linux.
|
||
|
||
```
|
||
pim@vpp0-3:~$ ip -br a
|
||
lo UNKNOWN 127.0.0.1/8 ::1/128
|
||
loop0 UNKNOWN 192.168.10.3/32 2001:678:d78:200::3/128 fe80::dcad:ff:fe00:0/64
|
||
e0 UP fe80::5054:ff:fef0:1130/64
|
||
```
|
||
|
||
With `e0` only having a linklocal address, Bird can still form an adjacency with its neighbor
|
||
`vpp0-2`, because adjacencies in OSPFv3 are formed using IPv6 only. However, the clever trick to
|
||
walk the list of interfaces in `update_loopback_addr()` will be able to find a usable IPv4 address,
|
||
and use that to put in the Link LSA using RFC5838. In this case, it finds `192.168.10.3` from
|
||
interface `loop0` so it'll use that to signal the next hop for LSAs that it sends.
|
||
|
||
Now I start the same VPP and Bird configuration on all four VPP routers, but on `vpp0-0` I'll add a
|
||
static route out of the LAB to the internet:
|
||
|
||
```
|
||
protocol static static4 {
|
||
ipv4 { export all; };
|
||
route 0.0.0.0/0 via 192.168.10.4;
|
||
}
|
||
|
||
protocol static static6 {
|
||
ipv6 { export all; };
|
||
route ::/0 via 2001:678:d78:201::ffff;
|
||
}
|
||
```
|
||
|
||
These two default routes from `vpp0-0` quickly propagate through the network, where `vpp0-3`
|
||
ultimately sees this:
|
||
|
||
```
|
||
pim@vpp0-3:~$ ip -br a
|
||
lo UNKNOWN 127.0.0.1/8 ::1/128
|
||
loop0 UNKNOWN 192.168.10.3/32 2001:678:d78:200::3/128 fe80::dcad:ff:fe00:0/64
|
||
e0 UP fe80::5054:ff:fef0:1130/64
|
||
|
||
pim@vpp0-3:~$ ip ro
|
||
default via 192.168.10.2 dev e0 proto bird metric 32 onlink
|
||
192.168.10.0 via 192.168.10.2 dev e0 proto bird metric 32 onlink
|
||
192.168.10.1 via 192.168.10.2 dev e0 proto bird metric 32 onlink
|
||
192.168.10.2 via 192.168.10.2 dev e0 proto bird metric 32 onlink
|
||
192.168.10.4/31 via 192.168.10.2 dev e0 proto bird metric 32 onlink
|
||
|
||
pim@vpp0-3:~$ ip -6 ro
|
||
2001:678:d78:200:: via fe80::5054:ff:fef0:1121 dev e0 proto bird metric 32 pref medium
|
||
2001:678:d78:200::1 via fe80::5054:ff:fef0:1121 dev e0 proto bird metric 32 pref medium
|
||
2001:678:d78:200::2 via fe80::5054:ff:fef0:1121 dev e0 proto bird metric 32 pref medium
|
||
2001:678:d78:200::3 dev loop0 proto kernel metric 256 pref medium
|
||
2001:678:d78:201::/112 via fe80::5054:ff:fef0:1121 dev e0 proto bird metric 32 pref medium
|
||
fe80::/64 dev loop0 proto kernel metric 256 pref medium
|
||
fe80::/64 dev e0 proto kernel metric 256 pref medium
|
||
default via fe80::5054:ff:fef0:1121 dev e0 proto bird metric 32 pref medium
|
||
```
|
||
|
||
**✅ OSPFv3 with loopback-only, _unnumbered_ IPv4/IPv6 interfaces works!**
|
||
|
||
## Results
|
||
|
||
I thought I'd record a little [[asciinema](/assets/vpp-ospf/clean.cast),
|
||
[gif](/assets/vpp-ospf/clean.gif)] that shows the end to end
|
||
configuration, starting from an empty dataplane and bird configuration. I'll show _Solution 2_, that
|
||
is, the solution that doesn't copy the _unnumbered_ interfaces in VPP to Linux.
|
||
|
||
Ready? Here I go!
|
||
|
||
{{< asciinema src="/assets/vpp-ospf/clean.cast" >}}
|
||
|
||
|
||
### To unnumbered or Not To unnumbered
|
||
|
||
I'm torn between _Solution 1_ and _Solution 2_. While on the one hand, setting the _unnumbered_
|
||
interface would be best reflected in Linux, it is not without problems. If the operator subsequently
|
||
tries to remove one of the addresses on `e0` or `e1`, that will yield a desync between Linux and
|
||
VPP (Linux will have removed the address, but VPP will still be _unnumbered_). On the other hand,
|
||
tricking Linux (and the operator) to believe there isn't an IPv4 (and IPv6) address configured on
|
||
the interface, is also not great.
|
||
|
||
Of the two approaches, I think I prefer _Solution 2_ (configuring the Linux CP plugin to not sync
|
||
_unnumbered_ addresses), because it minimizes the chance of operator error. If you're reading this and
|
||
have an Opinion™, would you please let me know?
|