---
date: "2024-04-27T10:52:11Z"
title: "FreeIX Remote - Part 1"
aliases:
- /s/articles/2024/04/27/freeix-1.html
- /s/articles/2024/04/27/freeix-remote/
---

# Introduction

{{< image width="300px" float="right" src="/assets/freeix/openart-image_REzWzO43_1714219288118_raw.jpg" alt="OpenART" >}}

Tier1 and aspiring Tier2 providers interconnect only in large metropolitan areas, due to commercial incentives and
politics. They won't often peer with smaller providers, because why peer with a potential customer? Due to this,
it’s entirely likely that traffic between two parties in Thessaloniki is sent to Frankfurt or Milan and back.

One possible antidote to this is to connect to a local Internet Exchange point. Not all ISPs have access to large
metropolitan datacenters where larger internet exchanges have a point of presence, and it doesn't help that the
datacenter operator is happy to charge a substantial amount of money each month, just for the privilege of having
a passive fiber cross connect to the exchange. Many Internet Exchanges these days ask for per-month port costs *and*
meter the traffic with policers and rate limiters, such that the total cost of peering starts to exceed what one
might pay for transit, especially at low volumes, which further exacerbates the problem. Bah.

This is an unfortunate market effect (the race to the bottom), where transit providers are continuously lowering their
prices to compete. And while transit providers can make up for this to some extent through economies of scale, at some
point they are mostly all of equal size, and thus the only thing that can flex is quality of service.

The benefit of using an Internet Exchange is to reduce the portion of an ISP’s (and CDN’s) traffic that must be
delivered via their upstream transit providers, thereby reducing the average per-bit delivery cost as well as
the end-to-end latency as seen by their users or customers. Furthermore, the increased number of paths available through
the IXP improves routing efficiency and fault-tolerance, and it avoids traffic going the scenic route to a large hub
like Frankfurt, London, Amsterdam, Paris or Rome, if it could very well remain local.

IPng Networks really believes in an open and affordable Internet, and I would like to do my part in ensuring the
internet stays accessible for smaller parties.

## Smöl IXPs

One notable problem with small exchanges, like for example [[FNC-IX](https://www.fnc-ix.net/)] in the Paris metro, or
[[CHIX-CH](https://ch-ix.ch/)], [[Community IX](https://www.community-ix.ch/)] and [[Free-IX](https://free-ix.ch/)] in
the Zurich metropolitan area, is that they are, well, small. They may be cheaper to connect to, in some cases even free,
but they don't have a sizable membership, which means that there is inherently less traffic flowing, which in turn makes
them less appealing for prospective members to connect to.

At IPng, I have partnered with a few super cool ISPs and carriers to offer a Free Internet Exchange platform. Just to
head the main question off at the pass: _Free_ here actually does mean "Free as in beer" or
[[Gratis](https://en.wikipedia.org/wiki/Gratis)], a gift to the community that does not cost money. It also more
philosophically wants to be "Free as in open, and transparent" or
[[Libre](https://en.wikipedia.org/wiki/Free_software)].

Two examples are:
* [[Free IX: Switzerland](https://free-ix.ch/)] with POPs at STACK GEN01 Geneva, NTT Zurich and Bancadati Lugano.
* [[Free IX: Greece](https://free-ix.gr/)] with POPs at TISparkle in Athens and Balkan Gate in Thessaloniki.

.. but there are actually quite a few out there once you start looking :)

## Growing Smöl IXPs

Some internet exchanges break through the magical 1Tbps barrier (and get a courtesy callout on Twitter from Dr. King),
but many remain smöl. Perhaps it's time to break the _chicken-and-egg_ problem. What if there was a way to interconnect
these exchanges?

Let's take for example the Free IX in Greece that was announced at GRNOG16 in Athens on April 19th. This exchange
initially targets Athens and Thessaloniki, with 2x100G between the two cities. Members can connect to either site for
the cost of only a cross connect. The 1G/10G/25G ports will be _Gratis_. But I will be connecting one very special
member to Free IX Greece, AS50869:

{{< image src="/assets/freeix/Free IX Remote.svg" alt="FreeIX Remote" >}}

## Free IX: Remote

Here's what I am going to build. The _Free IX Remote_ project offers an outreach infrastructure which connects to
internet exchange points, and allows members to benefit from that in the following way:

1. FreeIX uses AS50869 to peer with any network operator who is available at public internet exchanges or using
   private interconnects. It looks like a normal service provider in this regard. It will connect to internet
   exchanges, and learn a bunch of routes.
1. FreeIX _members_ can join the program, after which they are granted certain propagation permissions by FreeIX
   at the point where they have a BGP session with AS50869. The prefixes learned on these _member_ sessions are marked
   as such, and will be allowed to propagate. Members will receive some or all learned prefixes from AS50869.
1. FreeIX _members_ can set fine grained BGP communities to determine which of their prefixes are propagated and at
   which locations.

Members at smaller internet exchanges greatly benefit from this type of outreach, by receiving large portions of the
public internet directly at their preferred peering location. Similarly, the _Free IX Remote_ routers will carry
their traffic to these remote internet exchanges.

## Detailed Design

### Peer types

There are two types of BGP neighbor adjacency:

1. ***Members***: these are {ip-address,AS}-tuples which FreeIX has explicitly configured. Learned prefixes are added
   to as-set AS50869:AS-MEMBERS. Members receive _some or all_ prefixes from FreeIX, each annotated with BGP **informational**
   communities, and members can drive certain behavior with BGP **action** communities.

1. ***Peers***: these are all other entities with whom FreeIX has an adjacency at public internet exchanges or private
   network interconnects. Peers receive some (or all) _member prefixes_ from FreeIX and cannot drive any behavior
   with communities. With respect to internet exchanges and peers, AS50869 looks like a completely normal ISP,
   advertising subsets of the customer AS cone from AS50869:AS-MEMBERS at each exchange point.

BGP sessions with members use strict ingress filtering by means of `bgpq4`, and prefixes learned on them will be tagged
with a set of informational BGP communities, such as where the prefix was learned, and which propagation permissions it
received (eg. at which internet exchanges it will be allowed to be announced). Of course, prefixes that are RPKI invalid
will be dropped, while valid and unknown prefixes will be accepted. Members are granted _permissions_ by FreeIX, which
determine where their prefixes will be announced by AS50869. Further, members can perform optional actions by means of
BGP communities at their ingress point, to inhibit announcements to a certain peer or at a given exchange point.

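The ingress rule for member sessions can be sketched roughly as follows. This is only an illustration in Python, not the actual router configuration: the function name `accept_member_prefix`, the `Rpki` enum, and the choice of exact-match lookup against the generated prefix list are all assumptions made for the example.

```python
from enum import Enum
from ipaddress import ip_network


class Rpki(Enum):
    """RPKI origin validation states."""
    VALID = "valid"
    UNKNOWN = "unknown"
    INVALID = "invalid"


def accept_member_prefix(prefix, allowed_prefixes, rpki_state):
    """Return True if a member announcement passes ingress filtering.

    allowed_prefixes stands in for the prefix list that a tool such as
    bgpq4 would generate from the member's as-set (hypothetical input).
    """
    net = ip_network(prefix)
    # Assumption: exact-match lookup against the generated prefix list.
    in_filter = net in {ip_network(p) for p in allowed_prefixes}
    # RPKI invalid is dropped; valid and unknown are both accepted.
    return in_filter and rpki_state is not Rpki.INVALID
```

With this sketch, a prefix that is in the member's list and RPKI unknown is accepted, while the same prefix marked RPKI invalid is dropped.
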
Peers, on the other hand, are not granted any permissions, and all action BGP communities will be stripped from the
prefixes learned from them. Informational communities will still be tagged on learned prefixes. Two things happen when
these prefixes are offered to members. Firstly, members
will be offered only those prefixes for which they have permission -- in other words, I will create a configuration file
that says member AS8298 may receive prefixes learned from Frys-IX. Secondly, even for those prefixes that are advertised,
the member AS8298 can use the informational communities to further filter what they accept from Free IX Remote AS50869.

# BGP Classic Communities

Members are allowed to set the following legacy action BGP communities for coarse grained distribution of their prefixes
through the FreeIX network.

* `(50869,0)` or `(50869,3000)` do not announce anywhere
* `(50869,666)` or `(65535,666)` blackhole everywhere (can be on any more specific from the member's AS-SET)
* `(50869,3100)` prepend once everywhere
* `(50869,3200)` prepend twice everywhere
* `(50869,3300)` prepend three times everywhere

Peers, on the other hand, are not allowed to set _any_ communities, so all classic BGP communities from them are stripped
on ingress.

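As a rough illustration of how these classic communities could be interpreted, here is a small Python sketch. The function name `classic_action` and the returned strings are made up for the example, and the precedence (blackhole first, then do-not-announce, then the largest prepend) is an assumption, since the text above does not say what happens when several are set at once.

```python
FREEIX_ASN = 50869


def classic_action(communities):
    """Map a member's classic (asn, value) communities to one action.

    Precedence between conflicting communities is an assumption of
    this sketch, not something the design above specifies.
    """
    comms = set(communities)
    if (FREEIX_ASN, 666) in comms or (65535, 666) in comms:
        return "blackhole"
    if (FREEIX_ASN, 0) in comms or (FREEIX_ASN, 3000) in comms:
        return "no-announce"
    # Check the largest prepend first so it wins if several are present.
    for value, count in ((3300, 3), (3200, 2), (3100, 1)):
        if (FREEIX_ASN, value) in comms:
            return f"prepend-{count}"
    return "announce"
```

On a peer session the input would simply be an empty list, because all classic communities are stripped on ingress.
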
# BGP Large Communities

Free IX Remote will use three types of BGP Large Communities, which each serve a distinct purpose:

1. ***Informational***: These communities are set by the FreeIX router when learning a prefix. They cannot be set by
   peers or members, and will be stripped on ingress. They will be sent to both members and peers, allowing operators to
   choose which prefixes to learn based on their origin details, like which country or internet exchange they were
   learned at.

1. ***Permission***: These communities are also set by FreeIX operators when learning a prefix (eg. on the ingress
   router). They cannot be set by peers or members, and will be stripped on ingress. The permission communities
   determine where FreeIX will allow the prefix to propagate. They will be stripped on egress.

1. ***Action***: Based on the permissions, members can further steer announcements by sending certain action communities
   to FreeIX. These actions cannot be sent by peers, but in certain cases they can be set by FreeIX operators on ingress.
   Similarly to the _permission_ communities, all _action_ communities will be stripped on egress.

Regular peers of AS50869 at exchange points and private network interconnects will not be able to set any communities,
so all large BGP communities from them are stripped on ingress.

### Informational Communities

When FreeIX routers learn prefixes, they will annotate them with certain communities. For example, the router at
Amsterdam NIKHEF (which is router #1, country #2), when learning a prefix at FrysIX (which is ixp #1152), will set the
following BGP large communities:

* `(50869,1010,1)`: Informational (10XX), Router (1010), vpp0.nlams0.free-ix.net (1)
* `(50869,1020,2)`: Informational (10XX), Country (1020), Netherlands (2)
* `(50869,1030,1152)`: Informational (10XX), IXP (1030), PeeringDB IXP for FrysIX (1152)

When propagating these prefixes to neighbors (both members and peers), these informational communities can be used to
determine local policy, for example by setting a different localpref or dropping prefixes from a certain location.
Informational communities can be read, but they can't be _set_ by peers or members -- they are always cleared by FreeIX
routers when learning prefixes, and as such the only routers which will set them are the FreeIX ones.

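The tagging step and a possible receiver-side policy can be sketched in Python as below. Both function names and the localpref values are hypothetical, and the sketch only covers the informational (10XX) range; on peer sessions all other large communities would be stripped as well.

```python
FREEIX_ASN = 50869


def tag_on_ingress(received, router, country, ixp):
    """Clear any informational (10XX) communities sent by the neighbor,
    then add the ones describing where this prefix was learned."""
    kept = [c for c in received
            if not (c[0] == FREEIX_ASN and 1000 <= c[1] <= 1099)]
    return kept + [
        (FREEIX_ASN, 1010, router),
        (FREEIX_ASN, 1020, country),
        (FREEIX_ASN, 1030, ixp),
    ]


def member_local_pref(communities, preferred_ixps, default=100, preferred=200):
    """Hypothetical receiver policy: raise localpref for prefixes that
    were learned at one of the IXPs the operator prefers."""
    for asn, function, value in communities:
        if asn == FREEIX_ASN and function == 1030 and value in preferred_ixps:
            return preferred
    return default
```

For the NIKHEF example above, `tag_on_ingress([], 1, 2, 1152)` yields exactly the three communities listed, and an operator who prefers FrysIX could call `member_local_pref(tagged, {1152})` to raise the localpref of those routes.
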
### Permission Communities

FreeIX maintains a list of permissions per member. When members announce their prefixes to FreeIX routers, these
permission communities are set. They determine what the member is allowed to do with FreeIX propagation - notably which
routers, countries, and internet exchanges the member will be allowed to propagate to.

Usually, member prefixes are allowed to propagate everywhere, so the following communities might be set by the FreeIX
router on ingress:

* `(50869,2010,0)`: Permission (20XX), Router (2010), everywhere (0)
* `(50869,2020,0)`: Permission (20XX), Country (2020), everywhere (0)
* `(50869,2030,0)`: Permission (20XX), IXP (2030), everywhere (0)

If the member prefixes are allowed to propagate only to certain places, the 'everywhere' communities will not be set,
and instead lists of communities with finer grained permissions can be used, for example:

* `(50869,2010,2)`: Permission (20XX), Router (2010), vpp0.grskg0.free-ix.net (2)
* `(50869,2020,3)`: Permission (20XX), Country (2020), Greece (3)
* `(50869,2030,60)`: Permission (20XX), IXP (2030), PeeringDB IXP for SwissIX (60)

Permission communities can't be set by peers, nor by members -- they are always cleared by FreeIX routers when learning
prefixes, and are configured explicitly by FreeIX operators.

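One way to picture the egress check is the Python sketch below. The `Egress` type and the `permitted` function are invented for illustration, and the matching rule is an assumption: the sketch treats the permissions as a union, so a prefix is eligible at an egress point as soon as any router, country or IXP permission matches it (or is 'everywhere'). The real policy may combine them differently.

```python
from dataclasses import dataclass

FREEIX_ASN = 50869


@dataclass(frozen=True)
class Egress:
    """Where an announcement would leave the FreeIX network."""
    router: int
    country: int
    ixp: int


def permitted(large_communities, egress):
    """Return True if any permission community allows this egress point.

    Assumption: permissions are a union; value 0 means 'everywhere'.
    """
    wanted = {2010: egress.router, 2020: egress.country, 2030: egress.ixp}
    for asn, function, value in large_communities:
        if asn != FREEIX_ASN or function not in wanted:
            continue
        if value == 0 or value == wanted[function]:
            return True
    return False
```

Under that assumption, a prefix carrying only `(50869,2020,3)` would be eligible on any router in Greece, but not at the Amsterdam router (router 1, country 2) peering at FrysIX (1152).
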
### Action Communities

Based on the permission communities, zero or more egress routers, countries and internet exchanges are eligible to
propagate member prefixes by AS50869 to its peers. Members can define very fine grained action communities to further
tweak which prefixes propagate on which routers, in which countries and towards which internet exchanges and private
network interconnects:

* `(50869,3010,3)`: Inhibit Action (30XX), Router (3010), vpp0.gratt0.free-ix.net (3)
* `(50869,3020,1)`: Inhibit Action (30XX), Country (3020), Switzerland (1)
* `(50869,3030,1308)`: Inhibit Action (30XX), IXP (3030), PeeringDB IXP for LS-IX (1308)

Four actions can be placed on a per-remote-asn basis:

* `(50869,3040,13030)`: Inhibit Action (30XX), AS (3040), Init7 (AS13030)
* `(50869,3100,6939)`: Prepend Once Action (3100), Hurricane Electric (AS6939)
* `(50869,3200,12859)`: Prepend Twice Action (3200), BIT BV (AS12859)
* `(50869,3300,8283)`: Prepend Thrice Action (3300), Coloclue (AS8283)

Peers cannot set these actions, as all action communities will be stripped on ingress. Members can set these action
communities on their sessions with FreeIX routers, however in some cases they may also be set by FreeIX operators when
learning prefixes.

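To make the action communities a bit more concrete, here is a hedged Python sketch of the per-session egress decision. The function `egress_action` and its return convention (`None` for 'do not announce', otherwise the number of prepends) are made up for this example, and taking the largest prepend when several match is an assumption.

```python
FREEIX_ASN = 50869


def egress_action(large_communities, router, country, ixp, peer_asn):
    """Decide what to do with a member prefix on one egress session.

    Returns None if an inhibit action matches this session, otherwise
    the number of times AS50869 should be prepended (0 for none).
    """
    inhibit = {3010: router, 3020: country, 3030: ixp, 3040: peer_asn}
    prepends = {3100: 1, 3200: 2, 3300: 3}
    count = 0
    for asn, function, value in large_communities:
        if asn != FREEIX_ASN:
            continue
        if function in inhibit and value == inhibit[function]:
            return None
        if function in prepends and value == peer_asn:
            # Assumption: the largest matching prepend wins.
            count = max(count, prepends[function])
    return count
```

For example, a prefix tagged `(50869,3040,13030)` would not be announced to Init7 at all, while one tagged `(50869,3100,6939)` would be announced to Hurricane Electric with a single prepend and to everybody else unchanged.
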
## What's next

{{< image width="200px" float="right" src="/assets/freeix/bird-logo.svg" alt="Bird" >}}

Perhaps this interaction between _informational_, _permission_ and _action_ BGP communities gives you an idea of how
such a network may operate. It's somewhat different to a classic Transit provider, in that AS50869 will not carry a
full table. It'll _merely_ provide a form of partial transit from member A at IXP #1, to and from all peers that
can be found at IXPs #2-#N. Makes the mind boggle? Don't worry, we'll figure it out together :)

In an upcoming article I'll detail the programming work that goes into implementing this complex peering policy in Bird2,
driving VPP routers (duh), with an IGP that is IPv4-less, because at this point, I [[may as well]({{< ref "2024-04-06-vpp-ospf" >}})] put my money where my mouth is.

If you're interested in this kind of stuff, take a look at the IPng Networks AS8298 [[Routing Policy]({{< ref "2021-11-14-routing-policy" >}})]. Similar to that one, this one will use a combination of functional programming, templates,
and clever expansions to make a customized per-member and per-peer configuration based on a YAML input file which
dictates which member and which prefix is allowed to go where.

{{< image width="200px" float="right" src="/assets/vpp/fdio-color.svg" alt="VPP" >}}

First, I need to get a replacement router for the Thessaloniki router, which will run VPP of course. My buddy Antonis
noticed that there are CPU and/or DDR errors on that chassis, so it may need to be RMAd. But once it's operational, I will
start by deploying one instance in Amsterdam NIKHEF, and another in Thessaloniki Balkan Gate, with a 100G connection
between them, graciously provided by [[LANCOM](https://www.lancom.gr/en/)]. Just look at that FD.io hound runnnnn!!1