Pim van Pelt 16ac42bad9
2024-10-21 19:50:50 +02:00
---
date: "2024-04-27T10:52:11Z"
title: "FreeIX Remote - Part 1"
aliases:
- /s/articles/2024/04/27/freeix-1.html
- /s/articles/2024/04/27/freeix-remote/
---
# Introduction
{{< image width="300px" float="right" src="/assets/freeix/openart-image_REzWzO43_1714219288118_raw.jpg" alt="OpenART" >}}
Tier1 and aspiring Tier2 providers interconnect only in large metropolitan areas, due to commercial incentives and
politics. They won't often peer with smaller providers, because why peer with a potential customer? Due to this,
it's entirely likely that traffic between two parties in Thessaloniki is sent to Frankfurt or Milan and back.
One possible antidote to this is to connect to a local Internet Exchange point. Not all ISPs have access to large
metropolitan datacenters where larger internet exchanges have a point of presence, and it doesn't help that the
datacenter operator is happy to charge a substantial amount of money each month, just for the privilege of having
a passive fiber cross connect to the exchange. Many Internet Exchanges these days ask for per-month port costs *and*
meter the traffic with policers and rate limiters, such that the total cost of peering starts to exceed what one
might pay for transit, especially at low volumes, which further exacerbates the problem. Bah.
This is an unfortunate market effect (the race to the bottom), where transit providers are continuously lowering their
prices to compete. And while transit providers can make up for this to some extent through economies of scale, at some point they
are mostly all of equal size, and thus the only thing that can flex is quality of service.
The benefit of using an Internet Exchange is to reduce the portion of an ISP's (and CDN's) traffic that must be
delivered via their upstream transit providers, thereby reducing the average per-bit delivery cost as well as reducing
the end-to-end latency as seen by their users or customers. Furthermore, the increased number of paths available through
the IXP improves routing efficiency and fault-tolerance, and it avoids traffic going the scenic route to a large hub
like Frankfurt, London, Amsterdam, Paris or Rome, if it could very well remain local.
IPng Networks really believes in an open and affordable Internet, and I would like to do my part in ensuring the
internet stays accessible for smaller parties.
## Smöl IXPs
One notable problem with small exchanges, like for example [[FNC-IX](https://www.fnc-ix.net/)] in the Paris metro, or
[[CHIX-CH](https://ch-ix.ch/)], [[Community IX](https://www.community-ix.ch/)] and [[Free-IX](https://free-ix.ch/)] in
the Zurich metropolitan area, is that they are, well, small. They may be cheaper to connect to, in some cases even free,
but they don't have a sizable membership which means that there is inherently less traffic flowing, which in turn makes
it less appealing for prospective members to connect to.
At IPng, I have partnered with a few super cool ISPs and carriers to offer a Free Internet Exchange platform. Just to
head the main question off at the pass: _Free_ here actually does mean "Free as in beer" or
[[Gratis](https://en.wikipedia.org/wiki/Gratis)], a gift to the community that does not cost money. It also more
philosophically wants to be "Free as in open, and transparent" or
[[Libre](https://en.wikipedia.org/wiki/Free_software)].
Two examples are:
* [[Free IX: Switzerland](https://free-ix.ch/)] with POPs at STACK GEN01 Geneva, NTT Zurich and Bancadati Lugano.
* [[Free IX: Greece](https://free-ix.gr/)] with POPs at TISparkle in Athens and Balkan Gate in Thessaloniki.
... but there are actually quite a few out there once you start looking :)
## Growing Smöl IXPs
Some internet exchanges break through the magical 1Tbps barrier (and get a courtesy callout on Twitter from Dr. King),
but many remain smöl. Perhaps it's time to break the _chicken-and-egg_ problem. What if there was a way to interconnect
these exchanges?
Let's take for example the Free IX in Greece that was announced at GRNOG16 in Athens on April 19th. This exchange
initially targets Athens and Thessaloniki, with 2x100G between the two cities. Members can connect to either site for
the cost of only a cross connect. The 1G/10G/25G ports will be _Gratis_. But I will be connecting one very special
member to Free IX Greece, AS50869:
{{< image src="/assets/freeix/Free IX Remote.svg" alt="FreeIX Remote" >}}
## Free IX: Remote
Here's what I am going to build. The _Free IX Remote_ project offers an outreach infrastructure which connects to
internet exchange points, and allows members to benefit from that in the following way:
1. FreeIX uses AS50869 to peer with any network operator who is available at public internet exchanges or using
private interconnects. It looks like a normal service provider in this regard. It will connect to internet
exchanges, and learn a bunch of routes.
1. FreeIX _members_ can join the program, after which they are granted certain propagation permissions by FreeIX
at the point where they have a BGP session with AS50869. The prefixes learned on these _member_ sessions are marked
as such, and will be allowed to propagate. Members will receive some or all learned prefixes from AS50869.
1. FreeIX _members_ can set fine grained BGP communities to determine which of their prefixes are propagated and at
which locations.
Members at smaller internet exchanges greatly benefit from this type of outreach, by receiving large portions of the
public internet directly at their preferred peering location. Similarly, the _Free IX Remote_ routers will carry
their traffic to these remote internet exchanges.
## Detailed Design
### Peer types
There are two types of BGP neighbor adjacency:
1. ***Members***: these are {ip-address,AS}-tuples which FreeIX has explicitly configured. Learned prefixes are added
to as-set AS50869:AS-MEMBERS. Members receive _some or all_ prefixes from FreeIX, each annotated with BGP **informational**
communities, and members can drive certain behavior with BGP **action** communities.
1. ***Peers***: these are all other entities with whom FreeIX has an adjacency at public internet exchanges or private
network interconnects. Peers receive some (or all) _member prefixes_ from FreeIX and cannot drive any behavior
with communities. With respect to internet exchanges and peers, AS50869 looks like a completely normal ISP,
advertising subsets of the customer AS cone from AS50869:AS-MEMBERS at each exchange point.
BGP sessions with members use strict ingress filtering by means of `bgpq4`, and the prefixes learned on them will be
tagged with a set of informational BGP communities, such as where the prefix was learned, and which propagation
permissions it received (e.g. at which internet exchanges it will be allowed to be announced). Of course, prefixes that are RPKI invalid will be
dropped, while valid and unknown prefixes will be accepted. Members are granted _permissions_ by FreeIX, which determine
where their prefixes will be announced by AS50869. Further, members can perform optional actions by means of BGP communities
at their ingress point, to inhibit announcements to a certain peer or at a given exchange point.
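
To make the member ingress rule concrete, here is a minimal sketch in Python. It is an illustration only, not the FreeIX implementation: the `Route` type, the `accept_from_member` helper and the exact-match prefix list are all assumptions. In practice the allowed list would be generated by `bgpq4` from the member's as-set, and the RPKI state would come from a validator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str      # e.g. "192.0.2.0/24"
    origin_as: int   # origin AS of the announcement
    rpki: str        # "valid", "invalid" or "unknown" (assumed to be computed elsewhere)


def accept_from_member(route, allowed):
    """Return True if a member route passes ingress filtering.

    `allowed` is a hypothetical set of (prefix, origin_as) tuples, standing in
    for the strict prefix list built from the member's as-set.
    """
    if route.rpki == "invalid":
        return False  # RPKI invalid prefixes are always dropped
    # RPKI valid and unknown are both fine, as long as the prefix is expected.
    return (route.prefix, route.origin_as) in allowed


allowed = {("192.0.2.0/24", 64500)}
print(accept_from_member(Route("192.0.2.0/24", 64500, "unknown"), allowed))
print(accept_from_member(Route("192.0.2.0/24", 64500, "invalid"), allowed))
```

The prefix and AS number used here come from the documentation ranges, and are not real FreeIX members.
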
Peers on the other hand are not granted any permissions and all action BGP communities will be stripped on prefixes
learned. Informational communities will still be tagged on learned prefixes. Two things happen here. Firstly, members
will be offered only those prefixes for which they have permission -- in other words, I will create a configuration file
that says member AS8298 may receive prefixes learned from Frys-IX. Secondly, even for those prefixes that are advertised,
the member AS8298 can use the informational communities to further filter what they accept from Free IX Remote AS50869.
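
The first of those two steps could be sketched as follows. This is a rough illustration under assumptions: the `RECEIVE_PERMISSIONS` dictionary is a hypothetical stand-in for the configuration file, and I assume the place a prefix was learned is read back from the `(50869,1030,<ixp>)` informational community described further down.

```python
FREEIX_AS = 50869
INFO_IXP = 1030  # informational community function for "learned at IXP"

# Hypothetical stand-in for the configuration file: member AS -> set of
# PeeringDB IXP ids whose routes the member may receive (1152 is Frys-IX).
RECEIVE_PERMISSIONS = {8298: {1152}}


def learned_ixps(communities):
    """Extract the IXP ids from the informational large communities."""
    return {value for (asn, function, value) in communities
            if asn == FREEIX_AS and function == INFO_IXP}


def offer_to_member(member_as, communities):
    """Return True if a prefix with these communities may be sent to the member."""
    allowed = RECEIVE_PERMISSIONS.get(member_as, set())
    return bool(learned_ixps(communities) & allowed)


print(offer_to_member(8298, [(50869, 1030, 1152)]))  # learned at Frys-IX
print(offer_to_member(8298, [(50869, 1030, 60)]))    # learned elsewhere
```
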
# BGP Classic Communities
Members are allowed to set the following legacy action BGP communities for coarse grained distribution of their prefixes
through the FreeIX network.
* `(50869,0)` or `(50869,3000)` do not announce anywhere
* `(50869,666)` or `(65535,666)` blackhole everywhere (can be on any more specific from the member's AS-SET)
* `(50869,3100)` prepend once everywhere
* `(50869,3200)` prepend twice everywhere
* `(50869,3300)` prepend three times everywhere
Peers, on the other hand, are not allowed to set _any_ communities, so all classic BGP communities from them are stripped
on ingress.
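
As a small sketch of how these classic communities might be interpreted, consider the Python below. The function name and the returned dictionary are my own invention for illustration, and taking the largest prepend count when several prepend communities are present is an assumption, not something the policy above specifies.

```python
NO_ANNOUNCE = {(50869, 0), (50869, 3000)}
BLACKHOLE = {(50869, 666), (65535, 666)}
PREPEND = {(50869, 3100): 1, (50869, 3200): 2, (50869, 3300): 3}


def classic_actions(communities, is_member):
    """Translate classic BGP communities into a coarse propagation decision."""
    if not is_member:
        # Peers cannot steer anything: their communities are stripped on ingress.
        return {"announce": True, "blackhole": False, "prepend": 0}
    comms = set(communities)
    return {
        "announce": not (comms & NO_ANNOUNCE),
        "blackhole": bool(comms & BLACKHOLE),
        # Assumption: with several prepend communities, the largest one wins.
        "prepend": max((PREPEND[c] for c in comms if c in PREPEND), default=0),
    }


print(classic_actions([(50869, 3200)], is_member=True))
print(classic_actions([(50869, 0)], is_member=False))
```
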
# BGP Large Communities
Free IX Remote will use three types of BGP Large Communities, which each serve a distinct purpose:
1. ***Informational***: These communities are set by the FreeIX router when learning a prefix. They cannot be set by
peers or members, and will be stripped on ingress. They will be sent to both members and peers, allowing operators to
choose which prefixes to learn based on their origin details, like which country or internet exchange they were
learned at.
1. ***Permission***: These communities are also set by FreeIX operators when learning a prefix (e.g. on the ingress
router). They cannot be set by peers or members, and will be stripped on ingress. The permission communities
determine where FreeIX will allow the prefix to propagate. They will be stripped on egress.
1. ***Action***: Based on the permissions, members can further steer announcements by sending certain action communities
to FreeIX. These actions cannot be sent by peers, but in certain cases they can be set by FreeIX operators on ingress.
Similarly to the _permission_ communities, all _action_ communities will be stripped on egress.
Regular peers of AS50869 at exchange points and private network interconnects will not be able to set any communities,
so all large BGP communities from them are stripped on ingress.
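
The ingress and egress scrubbing rules above can be sketched in a few lines of Python. Treat this as a hedged illustration: the numeric ranges are my reading of the 10XX, 20XX and 30XX-33XX function codes used in the sections below, and keeping a member's large communities from other ASNs is an assumption.

```python
FREEIX_AS = 50869


def kind(community):
    """Classify a large community (asn, function, value) by its function code."""
    asn, function, _ = community
    if asn != FREEIX_AS:
        return "foreign"
    if 1000 <= function <= 1099:
        return "informational"
    if 2000 <= function <= 2099:
        return "permission"
    if 3000 <= function <= 3399:   # 30XX inhibit, 3100/3200/3300 prepend
        return "action"
    return "other"


def scrub_ingress(communities, is_member):
    """Peers lose everything; members may only keep action communities."""
    if not is_member:
        return []
    # Assumption: communities in other ASNs' namespaces pass through for members.
    return [c for c in communities if kind(c) in ("action", "foreign")]


def scrub_egress(communities):
    """Permission and action communities never leave the FreeIX network."""
    return [c for c in communities if kind(c) not in ("permission", "action")]


sent = [(50869, 1010, 1), (50869, 2010, 0), (50869, 3020, 1)]
print(scrub_ingress(sent, is_member=True))
print(scrub_egress(sent))
```

After scrubbing on ingress, the router would add its own informational and permission communities, which is why a member trying to grant itself `(50869,2010,0)` has no effect.
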
### Informational Communities
When FreeIX routers learn prefixes, they will annotate them with certain communities. For example, the router at
Amsterdam NIKHEF (which is router #1, country #2), when learning a prefix at FrysIX (which is ixp #1152), will set the
following BGP large communities:
* `(50869,1010,1)`: Informational (10XX), Router (1010), vpp0.nlams0.free-ix.net (1)
* `(50869,1020,2)`: Informational (10XX), Country (1020), Netherlands (2)
* `(50869,1030,1152)`: Informational (10XX), IXP (1030), PeeringDB IXP for FrysIX (1152)
When propagating these prefixes to neighbors (both members and peers), these informational communities can be used to
determine local policy, for example by setting a different localpref or dropping prefixes from a certain location.
Informational communities can be read, but they can't be _set_ by peers or members -- they are always cleared by FreeIX
routers when learning prefixes, and as such the only routers which will set them are the FreeIX ones.
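
Both sides of this can be sketched in Python: the FreeIX router annotating a prefix, and a member using the result for local policy. The helper names, the local-pref values and the policy knobs are hypothetical, chosen only to illustrate the idea.

```python
FREEIX_AS = 50869


def annotate(communities, router_id, country_id, ixp_id):
    """Clear any pre-existing informational communities, then set fresh ones."""
    kept = [c for c in communities
            if not (c[0] == FREEIX_AS and 1000 <= c[1] <= 1099)]
    return kept + [
        (FREEIX_AS, 1010, router_id),
        (FREEIX_AS, 1020, country_id),
        (FREEIX_AS, 1030, ixp_id),
    ]


def member_policy(communities, drop_countries=frozenset(), preferred_ixps=frozenset()):
    """Hypothetical member-side policy: return None to drop, else a local-pref."""
    for asn, function, value in communities:
        if asn == FREEIX_AS and function == 1020 and value in drop_countries:
            return None
    for asn, function, value in communities:
        if asn == FREEIX_AS and function == 1030 and value in preferred_ixps:
            return 200   # illustrative value only
    return 100           # illustrative default


# Router #1 in country #2 (Netherlands), learning a prefix at FrysIX (#1152).
tagged = annotate([], router_id=1, country_id=2, ixp_id=1152)
print(tagged)
print(member_policy(tagged, preferred_ixps={1152}))
```
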
### Permission Communities
FreeIX maintains a list of permissions per member. When members announce their prefixes to FreeIX routers, these
permission communities are set. They determine what the member is allowed to do with FreeIX propagation - notably which
routers, countries, and internet exchanges the member will be allowed to propagate to.
Usually, member prefixes are allowed to propagate everywhere, so the following communities might be set by the FreeIX
router on ingress:
* `(50869,2010,0)`: Permission (20XX), Router (2010), everywhere (0)
* `(50869,2020,0)`: Permission (20XX), Country (2020), everywhere (0)
* `(50869,2030,0)`: Permission (20XX), IXP (2030), everywhere (0)
If the member prefixes are allowed to propagate only to certain places, the 'everywhere' communities will not be set,
and instead lists of communities with finer grained permissions can be used, for example:
* `(50869,2010,2)`: Permission (20XX), Router (2010), vpp0.grskg0.free-ix.net (2)
* `(50869,2020,3)`: Permission (20XX), Country (2020), Greece (3)
* `(50869,2030,60)`: Permission (20XX), IXP (2030), PeeringDB IXP for SwissIX (60)
Permission communities can't be set by peers, nor by members -- they are always cleared by FreeIX routers when learning
prefixes, and are configured explicitly by FreeIX operators.
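
A permission check for a single dimension might look like the sketch below. It only answers "is this router, country or IXP permitted?" for one dimension at a time. How the three dimensions combine is not spelled out here, so I leave that to the caller rather than guess. The `permitted` helper and its argument names are hypothetical.

```python
FREEIX_AS = 50869
PERMISSION_FUNCTION = {"router": 2010, "country": 2020, "ixp": 2030}


def permitted(communities, dimension, value):
    """Return True if `value` is permitted for the given dimension.

    A value of 0 in the permission community means 'everywhere'.
    """
    function = PERMISSION_FUNCTION[dimension]
    granted = {v for (asn, f, v) in communities
               if asn == FREEIX_AS and f == function}
    return 0 in granted or value in granted


everywhere = [(50869, 2010, 0), (50869, 2020, 0), (50869, 2030, 0)]
restricted = [(50869, 2010, 2), (50869, 2020, 3), (50869, 2030, 60)]

print(permitted(everywhere, "country", 1))   # Switzerland, allowed by 'everywhere'
print(permitted(restricted, "country", 3))   # Greece, explicitly allowed
print(permitted(restricted, "country", 1))   # Switzerland, not granted
```
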
### Action Communities
Based on the permission communities, zero or more egress routers, countries and internet exchanges are eligible to
propagate member prefixes by AS50869 to its peers. Members can define very fine grained action communities to further
tweak which prefixes propagate on which routers, in which countries and towards which internet exchanges and private
network interconnects:
* `(50869,3010,3)`: Inhibit Action (30XX), Router (3010), vpp0.gratt0.free-ix.net (3)
* `(50869,3020,1)`: Inhibit Action (30XX), Country (3020), Switzerland (1)
* `(50869,3030,1308)`: Inhibit Action (30XX), IXP (3030), PeeringDB IXP for LS-IX (1308)
Four actions can be placed on a per-remote-asn basis:
* `(50869,3040,13030)`: Inhibit Action (30XX), AS (3040), Init7 (AS13030)
* `(50869,3100,6939)`: Prepend Once Action (3100), Hurricane Electric (AS6939)
* `(50869,3200,12859)`: Prepend Twice Action (3200), BIT BV (AS12859)
* `(50869,3300,8283)`: Prepend Thrice Action (3300), Coloclue (AS8283)
Peers cannot set these actions, as all action communities will be stripped on ingress. Members can set these action
communities on their sessions with FreeIX routers, however in some cases they may also be set by FreeIX operators when
learning prefixes.
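
To illustrate how these action communities could be evaluated at egress, here is a minimal Python sketch. It assumes the permission check has already passed, and it assumes that the largest matching prepend wins when several apply. The `egress_decision` function is hypothetical and not taken from the actual implementation.

```python
FREEIX_AS = 50869
PREPEND = {3100: 1, 3200: 2, 3300: 3}


def egress_decision(communities, router, country, ixp, peer_as):
    """Return None if the announcement is inhibited, else the prepend count."""
    inhibit = {3010: router, 3020: country, 3030: ixp, 3040: peer_as}
    prepends = 0
    for asn, function, value in communities:
        if asn != FREEIX_AS:
            continue
        if function in inhibit and value == inhibit[function]:
            return None  # inhibited on this router, country, IXP or peer AS
        if function in PREPEND and value == peer_as:
            # Assumption: the largest matching prepend count is used.
            prepends = max(prepends, PREPEND[function])
    return prepends


actions = [(50869, 3020, 1), (50869, 3100, 6939), (50869, 3040, 13030)]
# Router #1 in the Netherlands (#2) at FrysIX (#1152), towards Hurricane Electric.
print(egress_decision(actions, router=1, country=2, ixp=1152, peer_as=6939))
# The same prefix towards Init7 is inhibited by (50869,3040,13030).
print(egress_decision(actions, router=1, country=2, ixp=1152, peer_as=13030))
```
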
## What's next
{{< image width="200px" float="right" src="/assets/freeix/bird-logo.svg" alt="Bird" >}}
Perhaps this interaction between _informational_, _permission_ and _action_ BGP communities gives you an idea of how
such a network may operate. It's somewhat different to a classic Transit provider, in that AS50869 will not carry a
full table. It'll _merely_ provide a form of partial transit from member A at IXP #1, to and from all peers that
can be found at IXPs #2-#N. Makes the mind boggle? Don't worry, we'll figure it out together :)
In an upcoming article I'll detail the programming work that goes into implementing this complex peering policy in Bird2,
driving VPP routers (duh), with an IGP that is IPv4-less, because at this point, I [[may as well]({{< ref "2024-04-06-vpp-ospf" >}})] put my money where my mouth is.
If you're interested in this kind of stuff, take a look at the IPng Networks AS8298 [[Routing Policy]({{< ref "2021-11-14-routing-policy" >}})]. Similar to that one, this one will use a combination of functional programming, templates,
and clever expansions to make a customized per-member and per-peer configuration based on a YAML input file which
dictates which member and which prefix is allowed to go where.
{{< image width="200px" float="right" src="/assets/vpp/fdio-color.svg" alt="VPP" >}}
First, I need to get a replacement router for the Thessaloniki router, which will run VPP of course. My buddy Antonis
noticed that there are CPU and/or DDR errors on that chassis, so it may need to be RMAd. But once it's operational, I will
start by deploying one instance in Amsterdam NIKHEF, and another in Thessaloniki Balkan Gate, with a 100G connection
between them, graciously provided by [[LANCOM](https://www.lancom.gr/en/)]. Just look at that FD.io hound runnnnn!!1