Add article on SR Linux + Arista EVPN
All checks were successful
continuous-integration/drone/push Build is passing
757
content/articles/2025-04-09-frysix-evpn.md
Normal file
@ -0,0 +1,757 @@
---
date: "2025-04-09T07:51:23Z"
title: 'FrysIX eVPN: think different'
---

{{< image float="right" src="/assets/frys-ix/frysix-logo-small.png" alt="FrysIX Logo" width="12em" >}}

# Introduction

Somewhere in the far north of the Netherlands, the country where I was born, a town called Jubbega
is the home of the Frysian Internet Exchange called [[Frys-IX](https://frys-ix.net/)]. Back in 2021,
a buddy of mine, Arend, said that he was planning on renting a rack at the NIKHEF facility, one of
the most densely populated facilities in western Europe. He was looking for a few launching
customers and I was definitely in the market for a presence in Amsterdam. I even wrote about it on
my [[bucketlist]({{< ref 2021-07-26-bucketlist.md >}})]. Arend and his IT company
[[ERITAP](https://www.eritap.com/)], took delivery in May of 2021, and this is when the internet
exchange with _Frysian roots_ was born.

In the years from 2021 until now, Arend and I have been operating the exchange with reasonable
success. It grew from a handful of folks in that first rack, to now some 250 participating ISPs
with about ten switches in six datacenters across the Amsterdam metro area. It's shifting a cool
800Gbit of traffic or so. It's dope, and very rewarding to be a part of this community!

## Frys-IX is growing

We have several members with a 2x100G LAG and even though all inter-datacenter links are either dark
fiber or WDM, we're starting to feel the growing pains as we set our sights on the next step of growth.
You see, when FrysIX did 13.37Gbit of traffic, Arend organized a barbecue. When it did 133.7Gbit of
traffic, Arend organized an even bigger barbecue. Obviously, the next step is 1337Gbit and joining
the infamous [[One TeraBit Club](https://github.com/tking/OneTeraBitClub)]. Thomas: we're coming!

It became clear that we would not be able to keep a dependable peering platform if FrysIX was a
single L2 broadcast domain, and it also became clear that concatenating multiple 100G ports would be
operationally expensive (think of all the dark fiber or WDM waves!), and brittle (think of LACP and
balancing traffic over those ports). We need to modernize in order to stay ahead of the growth
curve.

## Hello Nokia

{{< image float="right" src="/assets/frys-ix/nokia-7220-d4.png" alt="Nokia 7220-D4" width="20em" >}}

The Nokia 7220 Interconnect Router (7220 IXR) for data center fabric provides fixed-configuration,
high-capacity platforms that let you bring unmatched scale, flexibility and operational simplicity
to your data center networks and peering network environments. These devices are built around the
Broadcom _Trident_ chipset; in the case of the lefthand "D4" platform, this is a Trident4 with
28x100G and 8x400G ports.

{{< image float="right" src="/assets/frys-ix/IXR-7220-D3.jpg" alt="Nokia 7220-D3" width="20em" >}}

What I find particularly awesome about the Trident series is their speed (total bandwidth of
12.8Tbps _per router_), low power use (without optics, the IXR-7220-D4 consumes about 150W) and
a plethora of advanced capabilities like L2/L3 filtering, IPv4, IPv6 and MPLS routing, and modern
approaches to scale-out networking such as VXLAN based EVPN. At the FrysIX barbecue in September of
2024, FrysIX was gifted a rather powerful IXR-7220-D3 router, shown in the picture to the right.

ERITAP has bought two (new in box) IXR-7220-D4 (8x400G,28x100G) routers, and has also acquired two
IXR-7220-D2 (48x25G,8x100G) routers. So in total, FrysIX is now the proud owner of five of these
beautiful Nokia devices. If you haven't yet, you should definitely read about these versatile
routers on the [[Nokia](https://onestore.nokia.com/asset/207599)] website, and some details of the
_merchant silicon_ switch chips in use on the
[[Broadcom](https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm56880-series)]
website.

### eVPN: A small rant

{{< image float="right" src="/assets/frys-ix/FrysIX_ Topology (concept).svg" alt="Topology Concept" width="50%" >}}

First, I need to get something off my chest. Consider a topology for an internet exchange platform,
taking into account the available equipment, rackspace, power, and cross connects. Somehow, almost
every design or reference architecture I can find on the Internet assumes folks want to build a
[[Clos network](https://en.wikipedia.org/wiki/Clos_network)], which has a topology consisting of leaf
and spine switches. The _spine_ switches have a different set of features than the _leaf_ ones,
notably they don't have to do provider edge functionality like VXLAN encap and decapsulation.
Almost all of these designs are showing how one might build a leaf-spine network for hyperscale.

**Critique 1**: my 'spine' (IXR-7220-D4 routers) must also be provider edge. Practically speaking,
in the picture above I have these beautiful Nokia IXR-7220-D4 switches, using two 400G ports to
connect between the facilities, and six 100G ports to connect the smaller breakout switches. That
would leave a _massive_ amount of capacity unused: 22x 100G and 6x400G ports, to be exact.

**Critique 2**: all 'leaf' (either IXR-7220-D2 routers or Arista switches) can't realistically
connect to both 'spines'. Our devices are spread out over two (and in practice, more like six)
datacenters, and it's prohibitively expensive to get 100G waves or dark fiber to create a full mesh.
It's much more economical to create a star-topology that minimizes cross-datacenter fiber spans.

**Critique 3**: Most of these 'spine-leaf' reference architectures assume that the interior gateway
protocol is EBGP in what they call the _underlay_, and on top of that, some secondary EBGP that's
called the _overlay_. Frankly, such a design makes my head spin a little bit. These designs assume
hundreds of switches, in which case making use of one AS number per switch could make sense (as iBGP
needs either a 'full mesh', or external route reflectors).

Setting aside eVPN for a second, if I were to build a transport network, much like [[IPng Site
Local]({{< ref 2023-03-11-mpls-core.md >}})], I would use a simpler design:
1. Take a classic IGP like [[OSPF](https://en.wikipedia.org/wiki/Open_Shortest_Path_First)], or
   perhaps [[IS-IS](https://en.wikipedia.org/wiki/IS-IS)]. There is no benefit, to me at least, to use
   BGP as an IGP.
1. I would give each of the links between the switches an IPv4 /31 and enable link-local, and give
   each switch a loopback address with a /32 IPv4 and a /128 IPv6.
1. If I had multiple links between two given switches, I would probably just use ECMP if my devices
   supported it, and fall back to a LACP signaled bundle-ethernet otherwise.
1. If I were to need to use BGP (and for eVPN, this need exists), taking the ISP mindset (as opposed
   to the datacenter fabric mindset), I would simply install iBGP against two or three route
   reflectors, and exchange routing information within the same single AS number.
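
The addressing part of this plan (a /32 loopback per switch, a /31 per link) is easy to sketch with Python's `ipaddress` module. This is only an illustration: the two pools are the lab ranges that show up in the demo further down, while the function names and the idea of numbering routers by index are my own assumptions, not something FrysIX tooling actually does.

```python
import ipaddress
from itertools import islice

# Assumption: the lab ranges used in the demo further down in this article.
LOOPBACK_POOL = ipaddress.ip_network("198.19.16.0/24")
TRANSIT_POOL = ipaddress.ip_network("198.19.17.0/24")


def loopback(index):
    """Return the /32 loopback for router number `index` (hypothetical numbering)."""
    return ipaddress.ip_interface(f"{LOOPBACK_POOL[index]}/32")


def transit_links(count):
    """Return `count` point-to-point links as (a_side, b_side) /31 interfaces."""
    links = []
    for net in islice(TRANSIT_POOL.subnets(new_prefix=31), count):
        a_side = ipaddress.ip_interface(f"{net[0]}/31")
        b_side = ipaddress.ip_interface(f"{net[1]}/31")
        links.append((a_side, b_side))
    return links


if __name__ == "__main__":
    for i in range(4):
        print("loopback", loopback(i))
    for a_side, b_side in transit_links(3):
        print("link", a_side, "<->", b_side)
```

Carving the /31s out of one pool in order keeps both ends of a link adjacent (.0/.1, .2/.3, .4/.5), which makes traceroutes easy to read.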

### eVPN: A demo topology

{{< image float="right" src="/assets/frys-ix/Nokia Arista VXLAN.svg" alt="Demo topology" width="50%" >}}

So, that's exactly how I'm going to approach the FrysIX eVPN design: OSPF for the underlay and iBGP
for the overlay! I have a feeling that some folks will despise me for being contrarian, but you can
leave your comments below, and don't forget to like-and-subscribe :-)

Arend builds this topology for me in Jubbega - also known as FrysIX HQ. He takes the two
400G-capable switches and connects them. Then he takes an Arista DCS-7060CX switch (which is eVPN
capable, with 32x100G ports, based on the Broadcom Tomahawk3 chipset), and a smaller Nokia
IXR-7220-D2 (with 48x25G and 8x100G ports, based on the Trident3 chipset).

#### Underlay: Nokia's SR Linux

We boot up the lab, verify that all the optics and links are up, and connect the management ports to
an OOB network that I can remotely log in to. This is the first time that either of us has worked on
Nokia, but I find it reasonably intuitive once I get a few tips and tricks from Niek.

```
[linuxadmin@nikhef ~]$ sr_cli
--{ running }--[ ]--
A:linuxadmin@nikhef# enter candidate
--{ candidate shared default }--[ ]--
A:linuxadmin@nikhef# set / interface lo0 admin-state enable
A:linuxadmin@nikhef# set / interface lo0 subinterface 0 admin-state enable
A:linuxadmin@nikhef# set / interface lo0 subinterface 0 ipv4 admin-state enable
A:linuxadmin@nikhef# set / interface lo0 subinterface 0 ipv4 address 198.19.16.1/32
A:linuxadmin@nikhef# commit stay
```

There, my first config snippet! This creates a _loopback_ interface, and similar to JunOS, a
_subinterface_ (which Juniper calls a _unit_) which enables IPv4 and gives it a /32 address. In SR
Linux, any interface has to be associated with a _network-instance_, think of those as routing
domains or VRFs. There's a conveniently named _default_ network-instance, to which I'll add this
loopback and the point-to-point interface between the two 400G routers:

```
A:linuxadmin@nikhef# info flat interface ethernet-1/29
set / interface ethernet-1/29 admin-state enable
set / interface ethernet-1/29 subinterface 0 admin-state enable
set / interface ethernet-1/29 subinterface 0 ip-mtu 9190
set / interface ethernet-1/29 subinterface 0 ipv4 admin-state enable
set / interface ethernet-1/29 subinterface 0 ipv4 address 198.19.17.1/31
set / interface ethernet-1/29 subinterface 0 ipv6 admin-state enable

A:linuxadmin@nikhef# set / network-instance default type default
A:linuxadmin@nikhef# set / network-instance default admin-state enable
A:linuxadmin@nikhef# set / network-instance default interface ethernet-1/29.0
A:linuxadmin@nikhef# set / network-instance default interface lo0.0
A:linuxadmin@nikhef# commit stay
```

Cool. Assuming I now also do this on the other IXR-7220-D4 router, called _equinix_ (which gets the
loopback address 198.19.16.0/32 and the point-to-point on the 400G interface of 198.19.17.0/31), I
should be able to do my first ping:

```
A:linuxadmin@equinix# ping network-instance default 198.19.17.1 -s 9162 -M do
Using network instance default
PING 198.19.17.1 (198.19.17.1) 9162(9190) bytes of data.
9170 bytes from 198.19.17.1: icmp_seq=1 ttl=64 time=0.466 ms
9170 bytes from 198.19.17.1: icmp_seq=2 ttl=64 time=0.477 ms
9170 bytes from 198.19.17.1: icmp_seq=3 ttl=64 time=0.547 ms
```
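
The numbers in that ping are not arbitrary: with `-M do` (don't fragment), the largest ICMP payload that fits is the IP MTU minus the IPv4 and ICMP headers. A small sketch of the arithmetic, assuming an IPv4 header without options (the function names are mine, for illustration only):

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request/reply header


def max_ping_payload(ip_mtu):
    """Largest `ping -s` value that fits in one unfragmented IPv4 packet."""
    return ip_mtu - IPV4_HEADER - ICMP_HEADER


def ping_sizes(payload):
    """Return (icmp_bytes, ip_packet_bytes) for a given ping payload size."""
    icmp_bytes = payload + ICMP_HEADER
    return icmp_bytes, icmp_bytes + IPV4_HEADER


if __name__ == "__main__":
    payload = max_ping_payload(9190)
    print(payload, ping_sizes(payload))
```

For the `ip-mtu 9190` configured above this gives a payload of 9162, an ICMP message of 9170 bytes and an IP packet of 9190 bytes, matching the `9162(9190)` and `9170 bytes from` figures in the ping output.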

#### Underlay: SR Linux OSPF

OK, let's get these two Nokia routers to speak OSPF, so that they can reach each other's loopbacks.
It's really easy:

```
A:linuxadmin@nikhef# / network-instance default protocols ospf instance default
--{ candidate shared default }--[ network-instance default protocols ospf instance default ]--
A:linuxadmin@nikhef# set admin-state enable
A:linuxadmin@nikhef# set version ospf-v2
A:linuxadmin@nikhef# set router-id 198.19.16.1
A:linuxadmin@nikhef# set area 0.0.0.0 interface ethernet-1/29.0 interface-type point-to-point
A:linuxadmin@nikhef# set area 0.0.0.0 interface lo0.0 passive true
A:linuxadmin@nikhef# commit stay
```

Similar to JunOS, I can descend into a configuration scope: the first line goes into the
_network-instance_ called `default`, then the _protocols_ called `ospf`, and then the _instance_
called `default`. Subsequent `set` commands operate at this scope. Once I commit this configuration
(on the nikhef router and also the equinix router, with its own router-id), OSPF shoots to life
immediately:

```
A:linuxadmin@nikhef# show network-instance default protocols ospf neighbor
=========================================================================================
Net-Inst default OSPFv2 Instance default Neighbors
=========================================================================================
+---------------------------------------------------------------------------------------+
| Interface-Name              Rtr Id         State     Pri   RetxQ   Time Before Dead   |
+=======================================================================================+
| ethernet-1/29.0             198.19.16.0    full      1     0       36                 |
+---------------------------------------------------------------------------------------+
-----------------------------------------------------------------------------------------
No. of Neighbors: 1
=========================================================================================

A:linuxadmin@nikhef# show network-instance default route-table all | more
IPv4 unicast route table of network instance default
+------------------+-----+------------+--------------+--------+----------+--------+------+-------------+-----------------+
| Prefix           | ID  | Route Type | Route Owner  | Active | Origin   | Metric | Pref | Next-hop    | Next-hop        |
|                  |     |            |              |        | Network  |        |      | (Type)      | Interface       |
|                  |     |            |              |        | Instance |        |      |             |                 |
+==================+=====+============+==============+========+==========+========+======+=============+=================+
| 198.19.16.0/32   | 0   | ospfv2     | ospf_mgr     | True   | default  | 1      | 10   | 198.19.17.0 | ethernet-1/29.0 |
|                  |     |            |              |        |          |        |      | (direct)    |                 |
| 198.19.16.1/32   | 7   | host       | net_inst_mgr | True   | default  | 0      | 0    | None        | None            |
| 198.19.17.0/31   | 6   | local      | net_inst_mgr | True   | default  | 0      | 0    | 198.19.17.1 | ethernet-1/29.0 |
|                  |     |            |              |        |          |        |      | (direct)    |                 |
| 198.19.17.1/32   | 6   | host       | net_inst_mgr | True   | default  | 0      | 0    | None        | None            |
+==================+=====+============+==============+========+==========+========+======+=============+=================+

A:linuxadmin@nikhef# ping network-instance default 198.19.16.0
Using network instance default
PING 198.19.16.0 (198.19.16.0) 56(84) bytes of data.
64 bytes from 198.19.16.0: icmp_seq=1 ttl=64 time=0.484 ms
64 bytes from 198.19.16.0: icmp_seq=2 ttl=64 time=0.663 ms
```

Delicious! OSPF has learned the loopback, and it is now reachable. As with most things, going from 0
to 1 (in this case: understanding how SR Linux works at all) is the most difficult part. Then going
from 1 to 2 is critical (in this case: making two routers interact with OSPF), but from there on,
going from 2 to N is easy. In my case, enabling several other point-to-point /31 transit networks on
the nikhef router, using ethernet-1/1.0 through ethernet-1/4.0 with the correct MTU and turning on
OSPF for these, makes the whole network shoot to life.

#### Underlay: Arista

I'll point out that one of the devices in this topology is an Arista. We have several of these ready
for deployment at FrysIX. They are a lot more affordable, come with 32x100G ports, and are really
good at packet slinging because they're based on the Broadcom _Tomahawk_ chipset. They pack fewer
features than the _Trident_ chipset, but they happen to have all the features we need to run our
internet exchange. So I turn my attention to the Arista in the topology. I am much more comfortable
configuring the whole thing here, as it's not my first time touching these devices:

```
arista-leaf#show run int loop0
interface Loopback0
   ip address 198.19.16.2/32
   ip ospf area 0.0.0.0
arista-leaf#show run int Ethernet32/1
interface Ethernet32/1
   description Core: Connected to nikhef:ethernet-1/2
   load-interval 1
   mtu 9190
   no switchport
   ip address 198.19.17.5/31
   ip ospf cost 1000
   ip ospf network point-to-point
   ip ospf area 0.0.0.0
arista-leaf#show run section router ospf
router ospf 65500
   router-id 198.19.16.2
   redistribute connected
   network 198.19.0.0/16 area 0.0.0.0
   max-lsa 12000
```

I complete the configuration for the other two core ports on this Arista: port Eth31/1 also connects
to the nikhef IXR-7220-D4 and I give it a high cost of 1000, while Eth30/1 connects only 1x100G to
the nokia-leaf IXR-7220-D2 with a cost of 10.
It's nice to see OSPF in action: there are two equal-cost (but high cost) OSPF paths via
router-id 198.19.16.1 (nikhef), and there's one lower cost path via router-id 198.19.16.3
(nokia-leaf). The traceroute nicely shows the scenic route (arista-leaf -> nokia-leaf -> nikhef ->
equinix).
```
arista-leaf#show ip ospf nei
Neighbor ID     Instance VRF      Pri State       Dead Time   Address         Interface
198.19.16.1     65500    default  1   FULL        00:00:36    198.19.17.4     Ethernet32/1
198.19.16.3     65500    default  1   FULL        00:00:31    198.19.17.11    Ethernet30/1
198.19.16.1     65500    default  1   FULL        00:00:35    198.19.17.2     Ethernet31/1

arista-leaf#traceroute 198.19.16.0
traceroute to 198.19.16.0 (198.19.16.0), 30 hops max, 60 byte packets
 1  198.19.17.11 (198.19.17.11)  0.220 ms  0.150 ms  0.206 ms
 2  198.19.17.6 (198.19.17.6)  0.169 ms  0.107 ms  0.099 ms
 3  198.19.16.0 (198.19.16.0)  0.434 ms  0.346 ms  0.303 ms
```
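
To see why the three-hop route wins over the direct 2x100G links, here is a rough shortest-path sketch over the lab topology. The costs of 1000 and 10 on the Arista ports come from the configuration above; the costs on the two Nokia-side links (nokia-leaf to nikhef, nikhef to equinix) are not shown in this article, so the value of 10 for those is purely an assumption, as is treating every link cost as symmetric.

```python
import heapq

# Link costs. Values marked "assumed" are illustrative, not taken from the lab.
LINKS = {
    ("arista-leaf", "nikhef"): 1000,    # Eth31/1 and Eth32/1, both cost 1000
    ("arista-leaf", "nokia-leaf"): 10,  # Eth30/1
    ("nokia-leaf", "nikhef"): 10,       # assumed
    ("nikhef", "equinix"): 10,          # assumed
}


def shortest_path(links, src, dst):
    """Dijkstra over an undirected graph; returns (total_cost, path) or None."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


if __name__ == "__main__":
    print(shortest_path(LINKS, "arista-leaf", "equinix"))
```

Under these assumed costs the path via nokia-leaf totals 30, against 1010 for the direct link to nikhef, which is consistent with the hops seen in the traceroute.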

So far, so good! The _underlay_ is up, every router can reach every other router on its loopback,
and all OSPF adjacencies are formed. I'll leave the 2x100G between _nikhef_ and _arista-leaf_ at
high cost for now.

#### Overlay EVPN: SR Linux

The big idea here is to use iBGP with the same AS number, and because there are two main facilities
(NIKHEF and Equinix), make each of those bigger IXR-7220-D4 routers act as route-reflectors for
others. It means that they will have an iBGP session amongst themselves (198.19.16.0 <->
198.19.16.1) and otherwise accept iBGP sessions from any IP address in the 198.19.16.0/24 subnet.
This way, I don't have to configure any more than strictly necessary on the core routers. Any new
router can just plug in, form an OSPF adjacency, and connect to both core routers. I proceed to
configure the Nokias like this:
```
A:linuxadmin@nikhef# / network-instance default protocols bgp
A:linuxadmin@nikhef# set admin-state enable
A:linuxadmin@nikhef# set autonomous-system 65500
A:linuxadmin@nikhef# set router-id 198.19.16.1
A:linuxadmin@nikhef# set dynamic-neighbors accept match 198.19.16.0/24 peer-group overlay
A:linuxadmin@nikhef# set afi-safi evpn admin-state enable
A:linuxadmin@nikhef# set preference ibgp 170
A:linuxadmin@nikhef# set route-advertisement rapid-withdrawal true
A:linuxadmin@nikhef# set route-advertisement wait-for-fib-install false
A:linuxadmin@nikhef# set group overlay peer-as 65500
A:linuxadmin@nikhef# set group overlay afi-safi evpn admin-state enable
A:linuxadmin@nikhef# set group overlay afi-safi ipv4-unicast admin-state disable
A:linuxadmin@nikhef# set group overlay afi-safi ipv6-unicast admin-state disable
A:linuxadmin@nikhef# set group overlay local-as as-number 65500
A:linuxadmin@nikhef# set group overlay route-reflector client true
A:linuxadmin@nikhef# set group overlay transport local-address 198.19.16.1
A:linuxadmin@nikhef# set neighbor 198.19.16.0 admin-state enable
A:linuxadmin@nikhef# set neighbor 198.19.16.0 peer-group overlay
A:linuxadmin@nikhef# commit stay
```

I can see that iBGP sessions establish between all the devices:

```
A:linuxadmin@nikhef# show network-instance default protocols bgp neighbor
---------------------------------------------------------------------------------------------------------------------------
BGP neighbor summary for network-instance "default"
Flags: S static, D dynamic, L discovered by LLDP, B BFD enabled, - disabled, * slow
---------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------
+-------------+-------------+----------+-------+----------+-------------+---------------+------------+--------------------+
| Net-Inst    | Peer        | Group    | Flags | Peer-AS  | State       | Uptime        | AFI/SAFI   | [Rx/Active/Tx]     |
+=============+=============+==========+=======+==========+=============+===============+============+====================+
| default     | 198.19.16.0 | overlay  | S     | 65500    | established | 0d:0h:2m:32s  | evpn       | [0/0/0]            |
| default     | 198.19.16.2 | overlay  | D     | 65500    | established | 0d:0h:2m:27s  | evpn       | [0/0/0]            |
| default     | 198.19.16.3 | overlay  | D     | 65500    | established | 0d:0h:2m:41s  | evpn       | [0/0/0]            |
+-------------+-------------+----------+-------+----------+-------------+---------------+------------+--------------------+
---------------------------------------------------------------------------------------------------------------------------
Summary:
1 configured neighbors, 1 configured sessions are established, 0 disabled peers
2 dynamic peers
```

A few things to note here: there is one _configured_ neighbor (this is the other IXR-7220-D4 router),
and two _dynamic_ peers, which are the Arista and the smaller IXR-7220-D2 router. The only address
family that they are exchanging information for is the _evpn_ family, and no prefixes have been
learned or sent yet (that's the `[0/0/0]` designation in the last column).
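
The `dynamic-neighbors accept match 198.19.16.0/24` line boils down to a prefix match on the source address of an incoming session. The sketch below only models that match with Python's `ipaddress` module; it is not how SR Linux implements it, and the function name is hypothetical.

```python
import ipaddress

# The prefix from the `dynamic-neighbors accept match` statement above.
ACCEPT_PREFIX = ipaddress.ip_network("198.19.16.0/24")


def accepts_dynamic_peer(source_address):
    """Model of the accept rule: is the peer's source address inside the prefix?"""
    return ipaddress.ip_address(source_address) in ACCEPT_PREFIX


if __name__ == "__main__":
    for peer in ("198.19.16.2", "198.19.16.3", "198.19.17.5"):
        print(peer, accepts_dynamic_peer(peer))
```

This is why a leaf has to source its iBGP session from its loopback in 198.19.16.0/24: a session sourced from a point-to-point address such as 198.19.17.5 would fall outside the accepted prefix.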

#### Overlay EVPN: Arista

The Arista is also remarkably straightforward to configure. Here, I'll simply enable the iBGP
session as follows:

```
arista-leaf#show run section bgp
router bgp 65500
   neighbor evpn peer group
   neighbor evpn remote-as 65500
   neighbor evpn update-source Loopback0
   neighbor evpn ebgp-multihop 3
   neighbor evpn send-community extended
   neighbor evpn maximum-routes 12000 warning-only
   neighbor 198.19.16.0 peer group evpn
   neighbor 198.19.16.1 peer group evpn
   !
   address-family evpn
      neighbor evpn activate

arista-leaf#show bgp summary
BGP summary information for VRF default
Router identifier 198.19.16.2, local AS number 65500
Neighbor    AS          Session State AFI/SAFI                AFI/SAFI State NLRI Rcd   NLRI Acc
----------- ----------- ------------- ----------------------- -------------- ---------- ----------
198.19.16.0 65500       Established   IPv4 Unicast            Advertised              0          0
198.19.16.0 65500       Established   L2VPN EVPN              Negotiated              0          0
198.19.16.1 65500       Established   IPv4 Unicast            Advertised              0          0
198.19.16.1 65500       Established   L2VPN EVPN              Negotiated              0          0
```

On this leaf node, I'll have a redundant iBGP session with the two core nodes. Since those core
nodes are peering amongst themselves, and are configured as route-reflectors, this is all I need. No
matter how many additional Arista (or Nokia) devices I add to the network, all they'll have to do is
enable OSPF (so they can reach 198.19.16.0 and .1) and turn on iBGP sessions with both. Voila!
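
A quick back-of-the-envelope comparison shows what the two route-reflectors buy. This is plain arithmetic rather than anything vendor specific; the function names are mine, and it assumes every client peers with every reflector and the reflectors peer with each other.

```python
def full_mesh_sessions(routers):
    """iBGP sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(routers, reflectors=2):
    """Sessions when clients peer only with the reflectors (which also peer together)."""
    clients = routers - reflectors
    return reflectors * (reflectors - 1) // 2 + clients * reflectors


if __name__ == "__main__":
    for n in (4, 10, 20):
        print(n, full_mesh_sessions(n), route_reflector_sessions(n))
```

With the four routers in this lab the difference is small (6 versus 5 sessions), but at ten switches it is 45 versus 17, and each additional leaf only ever adds two sessions.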

#### VXLAN EVPN: SR Linux
Nokia informs me that it uses a special interface called _system0_ to source its VXLAN traffic from.
So it's a matter of defining that interface and associating a VXLAN interface with it, like so:

```
A:linuxadmin@nikhef# set / interface system0 admin-state enable
A:linuxadmin@nikhef# set / interface system0 subinterface 0 admin-state enable
A:linuxadmin@nikhef# set / interface system0 subinterface 0 ipv4 admin-state enable
A:linuxadmin@nikhef# set / interface system0 subinterface 0 ipv4 address 198.19.18.1/32
A:linuxadmin@nikhef# set / network-instance default interface system0.0
A:linuxadmin@nikhef# set / tunnel-interface vxlan1 vxlan-interface 2604 type bridged
A:linuxadmin@nikhef# set / tunnel-interface vxlan1 vxlan-interface 2604 ingress vni 2604
A:linuxadmin@nikhef# set / tunnel-interface vxlan1 vxlan-interface 2604 egress source-ip use-system-ipv4-address
A:linuxadmin@nikhef# commit stay
```

This creates the plumbing for a VXLAN sub-interface called `vxlan1.2604` which will accept/send
traffic using VNI 2604 (this happens to be the VLAN id we use at FrysIX for our production Peering
LAN), and it'll use the `system0.0` address to source that traffic from.
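
On the wire, that VNI ends up in an 8-byte VXLAN header that sits between the outer UDP header and the encapsulated Ethernet frame. The sketch below packs and parses that header following the layout in RFC 7348 (one flags byte with the "VNI present" bit, 24 reserved bits, a 24-bit VNI, 8 reserved bits). The helper names are hypothetical and this says nothing about how the Trident silicon does it.

```python
import struct

VNI_PRESENT_FLAG = 0x08  # the "I" flag from RFC 7348


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, 24 reserved bits. Word 2: VNI, 8 reserved bits.
    return struct.pack("!II", VNI_PRESENT_FLAG << 24, vni << 8)


def parse_vni(header):
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not flags_word & (VNI_PRESENT_FLAG << 24):
        raise ValueError("VNI-present flag is not set")
    return vni_word >> 8


if __name__ == "__main__":
    header = vxlan_header(2604)
    print(header.hex(), parse_vni(header))
```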

The second part is to create what SR Linux calls a MAC-VRF and put some interface in it:

```
A:linuxadmin@nikhef# set / interface ethernet-1/9 admin-state enable
A:linuxadmin@nikhef# set / interface ethernet-1/9 breakout-mode num-breakout-ports 4
A:linuxadmin@nikhef# set / interface ethernet-1/9 breakout-mode breakout-port-speed 10G
A:linuxadmin@nikhef# set / interface ethernet-1/9/3 admin-state enable
A:linuxadmin@nikhef# set / interface ethernet-1/9/3 vlan-tagging true
A:linuxadmin@nikhef# set / interface ethernet-1/9/3 subinterface 0 type bridged
A:linuxadmin@nikhef# set / interface ethernet-1/9/3 subinterface 0 admin-state enable
A:linuxadmin@nikhef# set / interface ethernet-1/9/3 subinterface 0 vlan encap untagged

A:linuxadmin@nikhef# / network-instance peeringlan
A:linuxadmin@nikhef# set type mac-vrf
A:linuxadmin@nikhef# set admin-state enable
A:linuxadmin@nikhef# set interface ethernet-1/9/3.0
A:linuxadmin@nikhef# set vxlan-interface vxlan1.2604
A:linuxadmin@nikhef# set protocols bgp-evpn bgp-instance 1 admin-state enable
A:linuxadmin@nikhef# set protocols bgp-evpn bgp-instance 1 vxlan-interface vxlan1.2604
A:linuxadmin@nikhef# set protocols bgp-evpn bgp-instance 1 evi 2604
A:linuxadmin@nikhef# set protocols bgp-vpn bgp-instance 1 route-distinguisher rd 65500:2604
A:linuxadmin@nikhef# set protocols bgp-vpn bgp-instance 1 route-target export-rt target:65500:2604
A:linuxadmin@nikhef# set protocols bgp-vpn bgp-instance 1 route-target import-rt target:65500:2604
A:linuxadmin@nikhef# commit stay
```

In the first block here, I take what is a 100G port called `ethernet-1/9` and I split it into 4x25G
ports. I'll force the port speed to 10G because Arend has taken a 40G-4x10G DAC, and it happens that
the third lane is plugged into the Debian machine. So on `ethernet-1/9/3` I'll create a
sub-interface, make it type _bridged_ (which I've also done on `vxlan1.2604`!) and allow any
untagged traffic to enter it.

{{< image width="5em" float="left" src="/assets/shared/brain.png" alt="brain" >}}

If you, like me, are used to either VPP or IOS/XR, this type of sub-interface stuff should feel very
natural to you. I've written about the sub-interface logic in Cisco's IOS/XR and the VPP approach in a
previous [[article]({{< ref 2022-02-14-vpp-vlan-gym.md >}})] which my buddy Fred lovingly calls
_VLAN Gymnastics_ because the ports are just so damn flexible. Worth a read!

The second block creates a new _network-instance_ which I'll name `peeringlan`. It associates
the newly created untagged sub-interface `ethernet-1/9/3.0` with the VXLAN interface, and starts a
protocol for eVPN that instructs traffic in and out of this network-instance to use EVI 2604 on the
VXLAN interface, and signals all learned MAC addresses using the configured route-distinguisher and
import/export route-targets. For simplicity I've just used the same for each: 65500:2604.

I continue to add an interface to the `peeringlan` _network-instance_ on the other two Nokia
routers: `ethernet-1/9/3.0` on the equinix router and `ethernet-1/9.0` on the nokia-leaf router.
Each of these goes to a 10Gbps port on a Debian machine.
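
The route-target is what ties these MAC-VRFs together: a router imports an EVPN route only if one of the route-targets attached to it matches its own import list. A minimal model of that decision is below, together with the 8-byte extended community encoding for a two-octet-AS route-target as I understand it from RFC 4360 (type 0x00, sub-type 0x02). The function names are illustrative and not part of any vendor API.

```python
import struct


def parse_route_target(text):
    """Parse 'target:65500:2604' into an (asn, value) tuple."""
    prefix, asn, value = text.split(":")
    if prefix != "target":
        raise ValueError("expected the form target:<asn>:<value>")
    return int(asn), int(value)


def encode_route_target(asn, value):
    """Encode a two-octet-AS route-target as an 8-byte extended community."""
    return struct.pack("!BBHI", 0x00, 0x02, asn, value)


def should_import(route_targets, import_targets):
    """A route is imported if it carries at least one configured import RT."""
    return bool(set(route_targets) & set(import_targets))


if __name__ == "__main__":
    rt = parse_route_target("target:65500:2604")
    print(rt, encode_route_target(*rt).hex())
    print(should_import([rt], [rt]), should_import([(65500, 100)], [rt]))
```

Because every router in this lab exports and imports the same `target:65500:2604`, each of them accepts the MAC routes of all the others.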

#### VXLAN EVPN: Arista

At this point I'm feeling pretty bullish about the whole project. Arista does not make it very
difficult for me to configure it for L2 EVPN (which is called MAC-VRF here also):

```
arista-leaf#conf t
vlan 2604
   name v-peeringlan
interface Ethernet9/3
   speed forced 10000full
   switchport access vlan 2604

interface Loopback1
   ip address 198.19.18.2/32
interface Vxlan1
   vxlan source-interface Loopback1
   vxlan udp-port 4789
   vxlan vlan 2604 vni 2604
```

After creating VLAN 2604 and making port Eth9/3 an access port in that VLAN, I'll add a VTEP endpoint
called `Loopback1`, and a VXLAN interface that uses that to source its traffic. Here, I'll associate
local VLAN 2604 with the `Vxlan1` and its VNI 2604, to match up with how I configured the Nokias.

Finally, it's a matter of tying these together by announcing the MAC addresses into the EVPN iBGP
sessions:
```
arista-leaf#conf t
router bgp 65500
   vlan 2604
      rd 65500:2604
      route-target both 65500:2604
      redistribute learned
   !
```

### Results

To validate the configurations, I learn a cool trick from my buddy Andy on the SR Linux discord
server. In EOS, I can ask it to check for any obvious mistakes in two places:

```
arista-leaf#show vxlan config-sanity detail
Category                           Result   Detail
---------------------------------- -------- --------------------------------------------------
Local VTEP Configuration Check        OK
  Loopback IP Address                 OK
  VLAN-VNI Map                        OK
  Flood List                          OK
  Routing                             OK
  VNI VRF ACL                         OK
  Decap VRF-VNI Map                   OK
  VRF-VNI Dynamic VLAN                OK
Remote VTEP Configuration Check       OK
  Remote VTEP                         OK
Platform Dependent Check              OK
  VXLAN Bridging                      OK
  VXLAN Routing                       OK    VXLAN Routing not enabled
CVX Configuration Check               OK
  CVX Server                          OK    Not in controller client mode
MLAG Configuration Check              OK    Run 'show mlag config-sanity' to verify MLAG config
  Peer VTEP IP                        OK    MLAG peer is not connected
  MLAG VTEP IP                        OK
  Peer VLAN-VNI                       OK
  Virtual VTEP IP                     OK
  MLAG Inactive State                 OK

arista-leaf#show bgp evpn sanity detail
Category Check                Status Detail
-------- -------------------- ------ ------
General  Send community       OK
General  Multi-agent mode     OK
General  Neighbor established OK
L2       MAC-VRF route-target OK
         import and export
L2       MAC-VRF              OK
         route-distinguisher
L2       MAC-VRF redistribute OK
L2       MAC-VRF overlapping  OK
         VLAN
L2       Suppressed MAC       OK
VXLAN    VLAN to VNI map for  OK
         MAC-VRF
VXLAN    VRF to VNI map for   OK
         IP-VRF
```
|
||||
|
||||
#### Results: Arista view
|
||||
|
||||
Inspecting the MAC addresses learned from all four of the client ports on the Debian machine is
|
||||
easy:
|
||||
|
||||
```
|
||||
arista-leaf#show bgp evpn summary
|
||||
BGP summary information for VRF default
|
||||
Router identifier 198.19.16.2, local AS number 65500
|
||||
Neighbor Status Codes: m - Under maintenance
|
||||
Neighbor V AS MsgRcvd MsgSent InQ OutQ Up/Down State PfxRcd PfxAcc
|
||||
198.19.16.0 4 65500 3311 3867 0 0 18:06:28 Estab 7 7
|
||||
198.19.16.1 4 65500 3308 3873 0 0 18:06:28 Estab 7 7
|
||||
|
||||
arista-leaf#show bgp evpn vni 2604 next-hop 198.19.18.3
|
||||
BGP routing table information for VRF default
|
||||
Router identifier 198.19.16.2, local AS number 65500
|
||||
Route status codes: * - valid, > - active, S - Stale, E - ECMP head, e - ECMP
|
||||
c - Contributing to ECMP, % - Pending BGP convergence
|
||||
Origin codes: i - IGP, e - EGP, ? - incomplete
|
||||
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop
|
||||
|
||||
Network Next Hop Metric LocPref Weight Path
|
||||
* >Ec RD: 65500:2604 mac-ip e43a.6e5f.0c59
|
||||
198.19.18.3 - 100 0 i Or-ID: 198.19.16.3 C-LST: 198.19.16.1
|
||||
* ec RD: 65500:2604 mac-ip e43a.6e5f.0c59
|
||||
198.19.18.3 - 100 0 i Or-ID: 198.19.16.3 C-LST: 198.19.16.0
|
||||
* >Ec RD: 65500:2604 imet 198.19.18.3
|
||||
198.19.18.3 - 100 0 i Or-ID: 198.19.16.3 C-LST: 198.19.16.1
|
||||
* ec RD: 65500:2604 imet 198.19.18.3
|
||||
198.19.18.3 - 100 0 i Or-ID: 198.19.16.3 C-LST: 198.19.16.0
|
||||
```
|
||||
There's a lot to unpack here! The Arista is seeing that from the _route-distinguisher_ I configured
|
||||
on all the sessions, it is learning one MAC address on neighbor 198.19.18.3 (this is the VTEP for
|
||||
the nokia-leaf router) from both iBGP sessions. The MAC address is learned from originator
|
||||
198.19.16.3 (the loopback of the nokia-leaf router), from two cluster members, the _active_ one on
|
||||
iBGP speaker 198.19.16.1 (nikhef) and a backup member on 198.19.16.0 (equinix).
|
||||
|
||||
I can also see that there's a bunch of `imet` route entries, and Andy explained these to me. They are
|
||||
a signal from a VTEP participant that they are interested in seeing multicast traffic (like neighbor
|
||||
discovery or ARP requests) flooded to them. Every router participating in this L2VPN will raise such
|
||||
an `imet` route, which I'll see in duplicates as well (one from each iBGP session). This checks out.
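
A simplified way to think about this is that a router builds its flood list from the set of distinct originators of the `imet` routes it receives. The sketch below models that in Python, using the originator and neighbor pairs that show up in the SR Linux output further down. It is a mental model under stated assumptions, not how either vendor implements it: real implementations also look at the tunnel attributes on the route, and `flood_list` is a made-up name.

```python
from ipaddress import IPv4Address

# (originator VTEP, iBGP neighbor it was learned from); duplicates are expected
# because each route can arrive over more than one iBGP session.
IMET_ROUTES = [
    ("198.19.18.1", "198.19.16.1"),
    ("198.19.18.2", "198.19.16.1"),
    ("198.19.18.2", "198.19.16.2"),
    ("198.19.18.3", "198.19.16.1"),
    ("198.19.18.3", "198.19.16.3"),
]


def flood_list(imet_routes, local_vtep=None):
    """Return the sorted, de-duplicated VTEPs that asked for BUM traffic."""
    vteps = {IPv4Address(originator) for originator, _neighbor in imet_routes}
    if local_vtep is not None:
        vteps.discard(IPv4Address(local_vtep))  # never flood to ourselves
    return [str(vtep) for vtep in sorted(vteps)]


print(flood_list(IMET_ROUTES))
# ['198.19.18.1', '198.19.18.2', '198.19.18.3']
print(flood_list(IMET_ROUTES, local_vtep="198.19.18.2"))
# ['198.19.18.1', '198.19.18.3']
```

The second call shows the view from the Arista, whose own VTEP is `198.19.18.2`: five received routes collapse into two remote VTEPs to replicate to.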
|
||||
|
||||
#### Results: SR Linux view
|
||||
|
||||
The Nokia IXR-7220-D4 router called _equinix_ has also learned a bunch of EVPN routing entries,
|
||||
which I can inspect as follows:
|
||||
|
||||
```
|
||||
A:linuxadmin@equinix# show network-instance default protocols bgp routes evpn route-type summary
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
Show report for the BGP route table of network-instance "default"
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
Status codes: u=used, *=valid, >=best, x=stale, b=backup
|
||||
Origin codes: i=IGP, e=EGP, ?=incomplete
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
BGP Router ID: 198.19.16.0 AS: 65500 Local AS: 65500
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
Type 2 MAC-IP Advertisement Routes
|
||||
+--------+---------------+--------+-------------------+------------+-------------+------+-------------+--------+--------------------------------+------------------+
|
||||
| Status | Route- | Tag-ID | MAC-address | IP-address | neighbor | Path-| Next-Hop | Label | ESI | MAC Mobility |
|
||||
| | distinguisher | | | | | id | | | | |
|
||||
+========+===============+========+===================+============+=============+======+============-+========+================================+==================+
|
||||
| u*> | 65500:2604 | 0 | E4:3A:6E:5F:0C:57 | 0.0.0.0 | 198.19.16.1 | 0 | 198.19.18.1 | 2604 | 00:00:00:00:00:00:00:00:00:00 | - |
|
||||
| * | 65500:2604 | 0 | E4:3A:6E:5F:0C:58 | 0.0.0.0 | 198.19.16.1 | 0 | 198.19.18.2 | 2604 | 00:00:00:00:00:00:00:00:00:00 | - |
|
||||
| u*> | 65500:2604 | 0 | E4:3A:6E:5F:0C:58 | 0.0.0.0 | 198.19.16.2 | 0 | 198.19.18.2 | 2604 | 00:00:00:00:00:00:00:00:00:00 | - |
|
||||
| * | 65500:2604 | 0 | E4:3A:6E:5F:0C:59 | 0.0.0.0 | 198.19.16.1 | 0 | 198.19.18.3 | 2604 | 00:00:00:00:00:00:00:00:00:00 | - |
|
||||
| u*> | 65500:2604 | 0 | E4:3A:6E:5F:0C:59 | 0.0.0.0 | 198.19.16.3 | 0 | 198.19.18.3 | 2604 | 00:00:00:00:00:00:00:00:00:00 | - |
|
||||
+--------+---------------+--------+-------------------+------------+-------------+------+-------------+--------+--------------------------------+------------------+
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
Type 3 Inclusive Multicast Ethernet Tag Routes
|
||||
+--------+-----------------------------+--------+---------------------+-----------------+--------+-----------------------+
|
||||
| Status | Route-distinguisher | Tag-ID | Originator-IP | neighbor | Path- | Next-Hop |
|
||||
| | | | | | id | |
|
||||
+========+=============================+========+=====================+=================+========+=======================+
|
||||
| u*> | 65500:2604 | 0 | 198.19.18.1 | 198.19.16.1 | 0 | 198.19.18.1 |
|
||||
| * | 65500:2604 | 0 | 198.19.18.2 | 198.19.16.1 | 0 | 198.19.18.2 |
|
||||
| u*> | 65500:2604 | 0 | 198.19.18.2 | 198.19.16.2 | 0 | 198.19.18.2 |
|
||||
| * | 65500:2604 | 0 | 198.19.18.3 | 198.19.16.1 | 0 | 198.19.18.3 |
|
||||
| u*> | 65500:2604 | 0 | 198.19.18.3 | 198.19.16.3 | 0 | 198.19.18.3 |
|
||||
+--------+-----------------------------+--------+---------------------+-----------------+--------+-----------------------+
|
||||
--------------------------------------------------------------------------------------------------------------------------
|
||||
0 Ethernet Auto-Discovery routes 0 used, 0 valid
|
||||
5 MAC-IP Advertisement routes 3 used, 5 valid
|
||||
5 Inclusive Multicast Ethernet Tag routes 3 used, 5 valid
|
||||
0 Ethernet Segment routes 0 used, 0 valid
|
||||
0 IP Prefix routes 0 used, 0 valid
|
||||
0 Selective Multicast Ethernet Tag routes 0 used, 0 valid
|
||||
0 Selective Multicast Membership Report Sync routes 0 used, 0 valid
|
||||
0 Selective Multicast Leave Sync routes 0 used, 0 valid
|
||||
--------------------------------------------------------------------------------------------------------------------------
|
||||
```
|
||||
|
||||
I have to say, SR Linux is incredibly chatty! But, I can see all the relevant bits and bobs here.
|
||||
Each MAC-IP entry is accounted for: I can see several nexthops pointing at the nikhef switch, one
|
||||
pointing at the nokia-leaf router and one pointing at the Arista switch. I also see the IMET
|
||||
entries. One thing to note -- the SR Linux implementation leaves the type-2 routes empty with a
|
||||
0.0.0.0 IPv4 address, while the Arista (in my opinion, more correctly) leaves them as NULL
|
||||
(unspecified). But, everything looks great!
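
To double check the "3 used, 5 valid" summary against the table, the status column can be tallied with a few lines of Python. This is only a sketch: the rows below are copied from the Type 2 table above but trimmed after the Next-Hop column, and the `summarize` helper is a hypothetical name, not an SR Linux tool.

```python
# Type 2 rows from the output above, trimmed after the Next-Hop column.
TYPE2_ROWS = """\
| u*> | 65500:2604 | 0 | E4:3A:6E:5F:0C:57 | 0.0.0.0 | 198.19.16.1 | 0 | 198.19.18.1 |
| *   | 65500:2604 | 0 | E4:3A:6E:5F:0C:58 | 0.0.0.0 | 198.19.16.1 | 0 | 198.19.18.2 |
| u*> | 65500:2604 | 0 | E4:3A:6E:5F:0C:58 | 0.0.0.0 | 198.19.16.2 | 0 | 198.19.18.2 |
| *   | 65500:2604 | 0 | E4:3A:6E:5F:0C:59 | 0.0.0.0 | 198.19.16.1 | 0 | 198.19.18.3 |
| u*> | 65500:2604 | 0 | E4:3A:6E:5F:0C:59 | 0.0.0.0 | 198.19.16.3 | 0 | 198.19.18.3 |
"""


def summarize(table: str):
    """Count used ('u') and valid ('*') routes; map each used MAC to its next hop."""
    used = valid = 0
    best = {}
    for line in table.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) < 8:
            continue  # skip borders or anything that is not a data row
        status, mac, next_hop = cells[0], cells[3], cells[7]
        valid += "*" in status
        if "u" in status:
            used += 1
            best[mac] = next_hop
    return used, valid, best


used, valid, best = summarize(TYPE2_ROWS)
print(f"{used} used, {valid} valid")  # 3 used, 5 valid
for mac, next_hop in best.items():
    print(mac, "->", next_hop)
```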
|
||||
|
||||
#### Results: Debian view
|
||||
|
||||
There's one more thing to show, and that's kind of the 'proof is in the pudding' moment. Arend
|
||||
hooked up a Debian machine with an Intel X710-DA4 network card, which sports 4x10G SFP+ connections.
|
||||
This network card is a regular in my AS8298 network, as it has excellent DPDK support and can easily
|
||||
pump 40Mpps with VPP. IPng 🥰 Intel X710!
|
||||
|
||||
```
|
||||
root@debian:~ # ip netns add nikhef
|
||||
root@debian:~ # ip link set enp1s0f0 netns nikhef
|
||||
root@debian:~ # ip netns exec nikhef ip link set enp1s0f0 up mtu 9000
|
||||
root@debian:~ # ip netns exec nikhef ip addr add 192.0.2.10/24 dev enp1s0f0
|
||||
root@debian:~ # ip netns exec nikhef ip addr add 2001:db8::10/64 dev enp1s0f0
|
||||
|
||||
root@debian:~ # ip netns add arista-leaf
|
||||
root@debian:~ # ip link set enp1s0f1 netns arista-leaf
|
||||
root@debian:~ # ip netns exec arista-leaf ip link set enp1s0f1 up mtu 9000
|
||||
root@debian:~ # ip netns exec arista-leaf ip addr add 192.0.2.11/24 dev enp1s0f1
|
||||
root@debian:~ # ip netns exec arista-leaf ip addr add 2001:db8::11/64 dev enp1s0f1
|
||||
|
||||
root@debian:~ # ip netns add nokia-leaf
|
||||
root@debian:~ # ip link set enp1s0f2 netns nokia-leaf
|
||||
root@debian:~ # ip netns exec nokia-leaf ip link set enp1s0f2 up mtu 9000
|
||||
root@debian:~ # ip netns exec nokia-leaf ip addr add 192.0.2.12/24 dev enp1s0f2
|
||||
root@debian:~ # ip netns exec nokia-leaf ip addr add 2001:db8::12/64 dev enp1s0f2
|
||||
|
||||
root@debian:~ # ip netns add equinix
|
||||
root@debian:~ # ip link set enp1s0f3 netns equinix
|
||||
root@debian:~ # ip netns exec equinix ip link set enp1s0f3 up mtu 9000
|
||||
root@debian:~ # ip netns exec equinix ip addr add 192.0.2.13/24 dev enp1s0f3
|
||||
root@debian:~ # ip netns exec equinix ip addr add 2001:db8::13/64 dev enp1s0f3
|
||||
|
||||
root@debian:~# ip netns exec nikhef fping -g 192.0.2.8/29
|
||||
192.0.2.10 is alive
|
||||
192.0.2.11 is alive
|
||||
192.0.2.12 is alive
|
||||
192.0.2.13 is alive
|
||||
|
||||
root@debian:~# ip netns exec arista-leaf fping 2001:db8::10 2001:db8::11 2001:db8::12 2001:db8::13
|
||||
2001:db8::10 is alive
|
||||
2001:db8::11 is alive
|
||||
2001:db8::12 is alive
|
||||
2001:db8::13 is alive
|
||||
|
||||
root@debian:~# ip netns exec equinix ip nei
|
||||
192.0.2.10 dev enp1s0f3 lladdr e4:3a:6e:5f:0c:57 STALE
|
||||
192.0.2.11 dev enp1s0f3 lladdr e4:3a:6e:5f:0c:58 STALE
|
||||
192.0.2.12 dev enp1s0f3 lladdr e4:3a:6e:5f:0c:59 STALE
|
||||
fe80::e63a:6eff:fe5f:c57 dev enp1s0f3 lladdr e4:3a:6e:5f:0c:57 STALE
|
||||
fe80::e63a:6eff:fe5f:c58 dev enp1s0f3 lladdr e4:3a:6e:5f:0c:58 STALE
|
||||
fe80::e63a:6eff:fe5f:c59 dev enp1s0f3 lladdr e4:3a:6e:5f:0c:59 STALE
|
||||
2001:db8::10 dev enp1s0f3 lladdr e4:3a:6e:5f:0c:57 STALE
|
||||
2001:db8::11 dev enp1s0f3 lladdr e4:3a:6e:5f:0c:58 STALE
|
||||
2001:db8::12 dev enp1s0f3 lladdr e4:3a:6e:5f:0c:59 STALE
|
||||
```
|
||||
|
||||
The Debian machine puts each port of the network card into its own network namespace, and gives it both an IPv4
|
||||
and an IPv6 address. I can then enter the `nikhef` network namespace, which has its NIC connected to
|
||||
the IXR-7220-D4 router called _nikhef_, and ping all four endpoints. Similarly, I can enter the
|
||||
`arista-leaf` namespace and ping6 all four endpoints. Finally, I take a look at the IPv6 and IPv4
|
||||
neighbor table on the network card that is connected to the Equinix router. All three remote MAC addresses are
|
||||
seen. This proves end to end connectivity across the EVPN VXLAN, and full interoperability.
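
Comparing the three views by eye is a little awkward, because each platform prints MAC addresses differently: `e43a.6e5f.0c59` on the Arista, `E4:3A:6E:5F:0C:59` on SR Linux and `e4:3a:6e:5f:0c:59` on Linux. Here is a small Python sketch that normalizes them, and also derives the `fe80::` link-local address that shows up in the neighbor table. The helper names are my own, and the link-local derivation assumes the Linux default of modified EUI-64 interface identifiers.

```python
import ipaddress


def normalize_mac(mac: str) -> str:
    """Convert any common MAC notation to lowercase colon-separated form."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Derive the modified EUI-64 link-local address for a MAC address."""
    octets = bytearray.fromhex(normalize_mac(mac).replace(":", ""))
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)


# The same host as seen on the Arista, on SR Linux and on Debian.
views = ["e43a.6e5f.0c59", "E4:3A:6E:5F:0C:59", "e4:3a:6e:5f:0c:59"]
print({normalize_mac(view) for view in views})  # {'e4:3a:6e:5f:0c:59'}
print(link_local_from_mac("e4:3a:6e:5f:0c:57"))  # fe80::e63a:6eff:fe5f:c57
```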
|
||||
|
||||
Performance? We got that!
|
||||
```
|
||||
root@debian:~# ip netns exec equinix iperf3 -c 192.0.2.12
|
||||
Connecting to host 192.0.2.12, port 5201
|
||||
[ 5] local 192.0.2.10 port 34598 connected to 192.0.2.12 port 5201
|
||||
[ ID] Interval Transfer Bitrate Retr Cwnd
|
||||
[ 5] 0.00-1.00 sec 1.15 GBytes 9.91 Gbits/sec 19 1.52 MBytes
|
||||
[ 5] 1.00-2.00 sec 1.15 GBytes 9.90 Gbits/sec 3 1.54 MBytes
|
||||
[ 5] 2.00-3.00 sec 1.15 GBytes 9.90 Gbits/sec 1 1.54 MBytes
|
||||
[ 5] 3.00-4.00 sec 1.15 GBytes 9.90 Gbits/sec 1 1.54 MBytes
|
||||
[ 5] 4.00-5.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.54 MBytes
|
||||
[ 5] 5.00-6.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.54 MBytes
|
||||
[ 5] 6.00-7.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.54 MBytes
|
||||
[ 5] 7.00-8.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.54 MBytes
|
||||
[ 5] 8.00-9.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.54 MBytes
|
||||
[ 5] 9.00-10.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.54 MBytes
|
||||
- - - - - - - - - - - - - - - - - - - - - - - - -
|
||||
[ ID] Interval Transfer Bitrate Retr
|
||||
[ 5] 0.00-10.00 sec 11.5 GBytes 9.90 Gbits/sec 24 sender
|
||||
[ 5] 0.00-10.00 sec 11.5 GBytes 9.90 Gbits/sec receiver
|
||||
|
||||
iperf Done.
|
||||
```
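
That 9.90 Gbits/sec is essentially line rate for a 10G port. A quick back-of-the-envelope calculation in Python shows why, under a few assumptions that I did not verify on the wire: an IPv4 TCP stream with the timestamp option enabled (12 bytes of TCP options), untagged frames on the host port, and the 9000 byte MTU configured above.

```python
LINE_RATE_GBPS = 10.0
MTU = 9000                    # as set with 'ip link set ... mtu 9000'

ETH_OVERHEAD = 8 + 14 + 4 + 12  # preamble+SFD, header, FCS, inter-frame gap
IPV4_HEADER = 20
TCP_HEADER = 20
TCP_OPTIONS = 12              # assumption: timestamps enabled


def tcp_goodput_gbps(mtu: int = MTU, line_rate: float = LINE_RATE_GBPS) -> float:
    """Upper bound on TCP payload throughput for full-sized segments."""
    payload = mtu - IPV4_HEADER - TCP_HEADER - TCP_OPTIONS
    on_wire = mtu + ETH_OVERHEAD
    return line_rate * payload / on_wire


print(f"{tcp_goodput_gbps():.2f} Gbit/s at MTU 9000")      # 9.90
print(f"{tcp_goodput_gbps(mtu=1500):.2f} Gbit/s at MTU 1500")  # 9.41
```

With jumbo frames, only about 1% of the wire time goes to headers and framing, which lines up nicely with what iperf3 reports.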
|
||||
|
||||
## What's Next
|
||||
|
||||
There are a few improvements I can make before deploying this architecture to the internet exchange.
|
||||
Notably:
|
||||
* the functional equivalent of _port security_, that is to say only allowing one or two MAC
|
||||
addresses per member port. FrysIX has a strict one-port-one-member-one-MAC rule, and having port
|
||||
security will greatly improve our resilience.
|
||||
* SR Linux has the ability to suppress ARP, _even on L2 MAC-VRF_! It's relatively well known for
|
||||
IRB based setups, but adding this to transparent bridge-domains is possible in Nokia
|
||||
[[ref](https://documentation.nokia.com/srlinux/22-6/SR_Linux_Book_Files/EVPN-VXLAN_Guide/services-evpn-vxlan-l2.html#configuring_evpn_learning_for_proxy_arp)],
|
||||
using the syntax of `protocols bgp-evpn bgp-instance 1 routes bridge-table mac-ip advertise
|
||||
true`. This will glean the IP addresses based on intercepted ARP requests, and reduce the need for
|
||||
BUM flooding. If DE-CIX can do it, so can FrysIX :)
|
||||
* some automation - although configuring the MAC-VRF across Arista and SR Linux is definitely not
|
||||
as difficult as I thought, having some automation in place will avoid errors and mistakes. It
|
||||
would suck if the IXP collapsed because I botched a link drain or PNI configuration!
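
For the automation item, a first step could be as simple as rendering the Arista side from a handful of parameters. The Python sketch below is a hedged starting point, not a finished tool: it only emits the EOS commands that appear earlier in this article, the function name `render_eos_macvrf` is made up, and the three-space indentation is my assumption of how the running config is laid out.

```python
def render_eos_macvrf(vlan: int, vni: int, asn: int, access_port: str) -> str:
    """Render the EOS stanzas for one L2 MAC-VRF, mirroring the lab config."""
    if not 1 <= vlan <= 4094:
        raise ValueError("VLAN ID out of range")
    if not 0 < vni < 2**24:
        raise ValueError("VNI out of range")
    lines = [
        f"interface {access_port}",
        f"   switchport access vlan {vlan}",
        "!",
        "interface Vxlan1",
        f"   vxlan vlan {vlan} vni {vni}",
        "!",
        f"router bgp {asn}",
        f"   vlan {vlan}",
        f"      rd {asn}:{vlan}",
        f"      route-target both {asn}:{vlan}",
        "      redistribute learned",
        "!",
    ]
    return "\n".join(lines)


print(render_eos_macvrf(vlan=2604, vni=2604, asn=65500, access_port="Ethernet9/3"))
```

Generating the SR Linux equivalent from the same parameters would keep the RD, route-target and VNI consistent across both vendors, which is exactly the kind of mistake I'd like to rule out.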
|
||||
|
||||
|
||||
### Acknowledgements
|
||||
|
||||
I am relatively new to EVPN configurations, and wanted to give a shoutout to Andy Whitaker who
|
||||
jumped in very quickly when I asked a question on the SR Linux Discord. He was gracious with his
|
||||
time and spent a few hours on a video call with me, explaining EVPN in great detail both for Arista
|
||||
as well as SR Linux. In particular, I wanted to give a big "Thank you!" for helping me understand
|
||||
symmetric and asymmetric IRB in the context of multivendor EVPN. Andy is about to start a new job at
|
||||
Nokia, and I wish him all the best. To my friends at Nokia: you caught a good one, Andy is pure
|
||||
gold!
|