---
date: "2021-11-14T22:49:09Z"
title: Case Study - BGP Routing Policy
aliases:
- /s/articles/2021/11/14/routing-policy.html
---

# Introduction

BGP Routing policy is a very interesting topic. I get asked about it formally
and informally all the time. I have to admit, there are lots of ways to organize
an autonomous system. Vendors have unique features and templating / procedural
functions, but in the end, BGP routing policy all boils down to two+two things:

1.   Not accepting the prefixes you don't want (inbound)
     *   For those prefixes accepted, ensure they have correct attributes.
1.   Not announcing prefixes to folks who shouldn't see them (outbound)
     *   For those prefixes announced, ensure they have correct attributes.

At IPng Networks, I've cycled through a few iterations and landed on a specific
setup that works well for me. It provides sufficient information to enable our
downstream (customers) to make good decisions on what they should accept from
us, as well as enough expressivity for them to determine which prefixes we
should propagate for them, where, and how.

This article describes one approach to a relatively feature rich routing policy
which is in use at IPng Networks (AS8298). It uses the [Bird2](https://bird.network.cz/) configuration
language, although the concepts would be implementable in ~any modern routing
suite (e.g. FRR, Cisco, Juniper, Arista, Extreme, et cetera).

Interested in one operator's opinion? Read on!

## 1. Concepts

There are three basic pieces of routing filtering, which I'll describe briefly.

### Prefix Lists

A prefix list (also sometimes referred to as an access-list in older software)
is a list of IPv4 or IPv6 prefixes, often with a prefixlen boundary, that
determines if a given prefix is "in" or "out".

An example could be: `2001:db8::/32{32,48}` which describes any prefix in the
supernet `2001:db8::/32` that has a prefix length of anywhere between /32 and
/48, inclusive.
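
As a minimal sketch of how such patterns look in Bird2 (the names `EXAMPLE_IPV6` and `example_prefix_match()` are made up for illustration and are not part of the IPng configuration), a prefix set can combine explicit length ranges with the `+` shorthand:

```
# Hypothetical prefix set, for illustration only
define EXAMPLE_IPV6 = [
  2001:db8::/32{32,48},     # 2001:db8::/32 and its more-specifics up to /48
  2001:db8:1::/48{56,64},   # only the /56 to /64 subnets inside 2001:db8:1::/48
  2001:db8:2::/48+          # shorthand for {48,128}: the /48 and all of its more-specifics
];

function example_prefix_match() {
  # True if the route being filtered falls within any of the patterns above
  return net ~ EXAMPLE_IPV6;
}
```

With this set, `2001:db8:1234::/48` matches the first pattern, while `2001:db8:1234::/56` matches none of them.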

### AS Paths

In BGP, each prefix learned comes with an AS path describing how to reach it. If my router
learns a prefix from a peer with AS number `65520`, the AS path of every prefix that peer
sends will start with 65520. In an AS path, the very first AS number
is the one the router directly learned the prefix from, and the very
last one is the origin of the prefix. Oftentimes the path is shown as a regular
expression, starting with `^` and ending with `$`, and to help readability,
spaces are often written as `_`.

Examples: `^25091_1299_3301$` and `^58299_174_1299_3301$`
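
In Bird2 the same idea is expressed with path masks rather than regular expressions. A minimal sketch, reusing the example paths above (the function name `example_path_match()` is hypothetical):

```
function example_path_match() {
  # Exactly the path 25091 1299 3301, comparable to ^25091_1299_3301$
  if bgp_path ~ [= 25091 1299 3301 =] then return true;

  # Learned from AS58299 and originated by AS3301, with anything in between
  if bgp_path ~ [= 58299 * 3301 =] then return true;

  # A similar check written with the first and last accessors
  if (bgp_path.first = 25091 && bgp_path.last = 3301) then return true;

  return false;
}
```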

### BGP Communities

When learning (or originating) a prefix in BGP, zero or more so called `communities`
can be added to it along the way. The _Routing Information Base_ or _RIB_ carries
these communities and can share them between peering sessions. Communities can be
added, removed and modified. Some communities have special meaning (which is
agreed upon by everyone), and some have local meaning (agreed upon by only
one or a small set of operators).

There are three types of communities: _normal_ communities are a pair of 16-bit
integers; _extended_ communities are 8 bytes, split into one 16-bit integer
and an additional 48-bit value; and finally _large_ communities consist of
a triplet of 32-bit values.

Examples: `(8298, 1234)` (normal), or `(8298, 3, 212323)` (large)
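
To show how these are handled in a Bird2 filter, here is a small sketch that attaches the two example communities and later matches on them. Both function names are hypothetical and only serve as illustration:

```
# Attach the two example communities from above to a route
function example_tag() {
  bgp_community.add((8298, 1234));             # normal: pair of 16-bit integers
  bgp_large_community.add((8298, 3, 212323));  # large: triplet of 32-bit integers
  return true;
}

# Match on them later, using wildcards to cover a whole range
function example_is_tagged() {
  if bgp_large_community ~ [(8298, 3, *)] then return true;
  return bgp_community ~ [(8298, *)];
}
```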

# Routing Policy

Now that I've explained a little bit about the ingredients we have to work with,
let me share an observation that took me a few decades to make: BGP sessions are
really all the same. As such, every single one of the BGP sessions at IPng Networks
is generated with one template. What makes the difference between 'Transit', 'Customer',
'Peer' and 'Private Interconnect' really all boils down to what types of filtering
are applied on in- and outbound updates. I will demonstrate this by means of two main
functions in Bird: `ebgp_import()`, discussed first in the section ***Inbound: Learning
Routes***, and `ebgp_export()` in the section ***Outbound: Announcing Routes***.

## 2. Inbound: Learning Routes

Let's consider this function:

```
function ebgp_import(int remote_as) {
  if aspath_bogon() then return false;
  if (net.type = NET_IP4 && ipv4_bogon()) then return false;
  if (net.type = NET_IP6 && ipv6_bogon()) then return false;

  if (net.type = NET_IP4 && ipv4_rpki_invalid()) then return false;
  if (net.type = NET_IP6 && ipv6_rpki_invalid()) then return false;

  # Demote certain AS nexthops to lower pref
  if (bgp_path.first ~ AS_LOCALPREF50 && bgp_path.len > 1) then bgp_local_pref = 50;
  if (bgp_path.first ~ AS_LOCALPREF30 && bgp_path.len > 1) then bgp_local_pref = 30;
  if (bgp_path.first ~ AS_LOCALPREF10 && bgp_path.len > 1) then bgp_local_pref = 10;

  # Graceful Shutdown (RFC8326)
  if (65535, 0) ~ bgp_community then bgp_local_pref = 0;

  # Scrub BLACKHOLE community
  bgp_community.delete((65535, 666));

  return true;
}
```

The function works by process of elimination -- for each prefix that is offered on the
session, it will either be rejected (by means of returning `false`), or modified (by means
of setting attributes like `bgp_local_pref`) and then accepted (by means of returning
`true`).

***AS-Path Bogon*** filtering is a way to remove prefixes that have an invalid AS
number in their path. The main examples of this are private, documentation and reserved
AS numbers (64496-131071) and their 32 bit equivalents (4200000000-4294967295). In case
you haven't come across this yet, AS number 23456 is also magic, see
[RFC4893](https://datatracker.ietf.org/doc/html/rfc4893) for details:
```
function aspath_bogon() {
  return bgp_path ~ [0, 23456, 64496..131071, 4200000000..4294967295];
}
```

***Prefix Bogon*** filtering comes next, as certain prefixes are not publicly routable (you
know, such as [RFC1918](https://datatracker.ietf.org/doc/html/rfc1918), but there are
many others). They look different for IPv4 and IPv6:
```
function ipv4_bogon() {
  return net ~ [
    0.0.0.0/0,              # Default
    0.0.0.0/32-,            # RFC 5735 Special Use IPv4 Addresses
    0.0.0.0/0{0,7},         # RFC 1122 Requirements for Internet Hosts -- Communication Layers 3.2.1.3
    10.0.0.0/8+,            # RFC 1918 Address Allocation for Private Internets
    100.64.0.0/10+,         # RFC 6598 IANA-Reserved IPv4 Prefix for Shared Address Space
    127.0.0.0/8+,           # RFC 1122 Requirements for Internet Hosts -- Communication Layers 3.2.1.3
    169.254.0.0/16+,        # RFC 3927 Dynamic Configuration of IPv4 Link-Local Addresses
    172.16.0.0/12+,         # RFC 1918 Address Allocation for Private Internets
    192.0.0.0/24+,          # RFC 6890 Special-Purpose Address Registries
    192.0.2.0/24+,          # RFC 5737 IPv4 Address Blocks Reserved for Documentation
    192.168.0.0/16+,        # RFC 1918 Address Allocation for Private Internets
    198.18.0.0/15+,         # RFC 2544 Benchmarking Methodology for Network Interconnect Devices
    198.51.100.0/24+,       # RFC 5737 IPv4 Address Blocks Reserved for Documentation
    203.0.113.0/24+,        # RFC 5737 IPv4 Address Blocks Reserved for Documentation
    224.0.0.0/4+,           # RFC 1112 Host Extensions for IP Multicasting
    240.0.0.0/4+            # RFC 6890 Special-Purpose Address Registries
  ];
}

function ipv6_bogon() {
 return net ~ [
    ::/0,                   # Default
    ::/96,                  # IPv4-compatible IPv6 address - deprecated by RFC4291
    ::/128,                 # Unspecified address
    ::1/128,                # Local host loopback address
    ::ffff:0.0.0.0/96+,     # IPv4-mapped addresses
    ::224.0.0.0/100+,       # Compatible address (IPv4 format)
    ::127.0.0.0/104+,       # Compatible address (IPv4 format)
    ::0.0.0.0/104+,         # Compatible address (IPv4 format)
    ::255.0.0.0/104+,       # Compatible address (IPv4 format)
    0000::/8+,              # Pool used for unspecified, loopback and embedded IPv4 addresses
    0100::/8+,              # RFC 6666 - reserved for Discard-Only Address Block
    0200::/7+,              # OSI NSAP-mapped prefix set (RFC4548) - deprecated by RFC4048
    0400::/6+,              # RFC 4291 - Reserved by IETF
    0800::/5+,              # RFC 4291 - Reserved by IETF
    1000::/4+,              # RFC 4291 - Reserved by IETF
    2001:10::/28+,          # RFC 4843 - Deprecated (previously ORCHID)
    2001:20::/28+,          # RFC 7343 - ORCHIDv2
    2001:db8::/32+,         # Reserved by IANA for special purposes and documentation
    2002:e000::/20+,        # Invalid 6to4 packets (IPv4 multicast)
    2002:7f00::/24+,        # Invalid 6to4 packets (IPv4 loopback)
    2002:0000::/24+,        # Invalid 6to4 packets (IPv4 default)
    2002:ff00::/24+,        # Invalid 6to4 packets
    2002:0a00::/24+,        # Invalid 6to4 packets (IPv4 private 10.0.0.0/8 network)
    2002:ac10::/28+,        # Invalid 6to4 packets (IPv4 private 172.16.0.0/12 network)
    2002:c0a8::/32+,        # Invalid 6to4 packets (IPv4 private 192.168.0.0/16 network)
    3ffe::/16+,             # Former 6bone, now decommissioned
    4000::/3+,              # RFC 4291 - Reserved by IETF
    5f00::/8+,              # RFC 5156 - used for the 6bone but was returned
    6000::/3+,              # RFC 4291 - Reserved by IETF
    8000::/3+,              # RFC 4291 - Reserved by IETF
    a000::/3+,              # RFC 4291 - Reserved by IETF
    c000::/3+,              # RFC 4291 - Reserved by IETF
    e000::/4+,              # RFC 4291 - Reserved by IETF
    f000::/5+,              # RFC 4291 - Reserved by IETF
    f800::/6+,              # RFC 4291 - Reserved by IETF
    fc00::/7+,              # Unicast Unique Local Addresses (ULA) - RFC 4193
    fe80::/10+,             # Link-local Unicast
    fec0::/10+,             # Site-local Unicast - deprecated by RFC 3879 (replaced by ULA)
    ff00::/8+               # Multicast
  ];
}
```

That's a long list!! But operators on the _DFZ_ should really never be accepting any
of these, and we should all collectively yell at those who propagate them.

***RPKI Filtering*** is a fantastic routing security feature, described in [RFC6810](https://datatracker.ietf.org/doc/html/rfc6810)
and relatively straightforward to implement. For each _originating_ AS number, we can
check in a table of known `<origin,prefix>` mappings whether it is the correct ISP to
originate the prefix. The lookup can either match (which makes the prefix RPKI valid),
fail because the prefix is missing (which makes the prefix RPKI unknown),
or specifically mismatch (which makes the prefix RPKI invalid). Operators are
encouraged to flag and drop _invalid_ prefixes:

```
function ipv4_rpki_invalid() {
  return roa_check(t_roa4, net, bgp_path.last) = ROA_INVALID;
}

function ipv6_rpki_invalid() {
  return roa_check(t_roa6, net, bgp_path.last) = ROA_INVALID;
}
```

***NOTE***: In NLNOG my post sparked a bit of debate on the use of `bgp_path.last_nonaggregated`
versus simply `bgp_path.last`. Job Snijders did some spelunking and offered [this post](https://bird.network.cz/pipermail/bird-users/2019-September/013805.html) and a reference to [RFC6907](https://datatracker.ietf.org/doc/html/rfc6907) for details, and
Tijn confirmed that Coloclue (on which many of my approaches have been modeled) indeed uses
`bgp_path.last`. I've updated my configs, with many thanks for the discussion.
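
The `t_roa4` and `t_roa6` tables used above have to be filled from somewhere. A minimal sketch, assuming an RTR-capable validator reachable at the hypothetical host `rpki.example.net` on port 3323 (neither the hostname, the port, nor the protocol name `rpki_example` comes from the IPng configuration):

```
# ROA tables referenced by ipv4_rpki_invalid() and ipv6_rpki_invalid()
roa4 table t_roa4;
roa6 table t_roa6;

# Hypothetical RTR session that feeds both tables
protocol rpki rpki_example {
  roa4 { table t_roa4; };
  roa6 { table t_roa6; };
  remote "rpki.example.net" port 3323;
}
```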

Alright, now that I've determined the as-path and prefix are kosher, and that it
is not known to be hijacked (ie. is either `ROA_VALID` or `ROA_UNKNOWN`), I'm ready
to set a few attributes, notably:

*   ***AS_LOCALPREF*** If the peer I learned this prefix from is in the given list, set
    the BGP local preference to either 50, 30 or 10 respectively (a lower localpref means
    the prefix is less likely to be selected). Some internet providers send lots of
    prefixes, but have poor network connectivity to the place I learned the routes from.
    A few examples of this: 6939 is often oversubscribed in Amsterdam, 39533 was
    for a while connected via a tunnel (!) to Zurich, and several hobby/amateur IXPs are
    on a VXLAN bridged domain rather than a physical switch. The check on
    `bgp_path.len > 1` means that prefixes originated by the peer itself keep their
    regular localpref.

*   ***Graceful Shutdown*** described in [RFC8326](https://datatracker.ietf.org/doc/html/rfc8326),
    shows a way to allow operators to pre-announce their downtime by setting a special
    BGP community that informs their peers to deselect that path by setting the local
    preference to the lowest possible value. This oneliner matching on `(65535,0)`
    implements that behavior.

*   ***Blackhole Community*** described in [RFC7999](https://datatracker.ietf.org/doc/html/rfc7999),
    is another special BGP community of `(65535,666)` which signals the need to stop sending
    traffic to the prefix at hand. I haven't yet implemented the blackhole routing (this has
    to do with an intricacy of the VPP Linux-CP code that I wrote), so for now I'll just remove
    the community.
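
The `AS_LOCALPREF50`, `AS_LOCALPREF30` and `AS_LOCALPREF10` sets referenced in `ebgp_import()` are not shown in this article. A sketch of what they could look like follows; which AS number lands in which bucket is purely an assumption for illustration, and `64500` is a documentation AS number used as a placeholder:

```
# Hypothetical contents, the real lists are not part of this article
define AS_LOCALPREF50 = [ 6939 ];
define AS_LOCALPREF30 = [ 39533 ];
define AS_LOCALPREF10 = [ 64500 ];
```

Because the three checks run in order and each one overwrites `bgp_local_pref`, an AS number listed in more than one set ends up with the lowest of the values.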

Alright, based on this one template, I'm now ready to implement all three types of
BGP session: ***Peer***, ***Upstream***, and ***Downstream***.

### Peers

```
function ebgp_import_peer(int remote_as) {
  # Scrub BGP Communities (RFC 7454 Section 11)
  bgp_community.delete([(8298, *)]);
  bgp_large_community.delete([(8298, *, *)]);

  return ebgp_import(remote_as);
}
```

It's dangerous to accept communities for my own AS8298 from peers. This is because
several of them can actively change the behavior of route propagation (these types
of communities are commonly called _action_ communities). So with peering
relationships, I'll just toss them all.

Now, working my way up to the actual BGP peering session, taking for example a
peer that I'm connecting to at LSIX (the routeserver, in fact) in Amsterdam:

```
filter ebgp_lsix_49917_import {
  if ! ebgp_import_peer(49917) then reject;

  # Add IXP Communities
  bgp_community.add((8298,1036));
  bgp_large_community.add((8298,1,1036));

  accept;
}

protocol bgp lsix_49917_ipv4_1 {
  description "LSIX IX Route Servers (LSIX)";
  local as 8298;
  source address 185.1.32.74;
  neighbor 185.1.32.254 as 49917;
  default bgp_med 0;
  default bgp_local_pref 200;
  ipv4 {
    import keep filtered;
    import filter ebgp_lsix_49917_import;
    export filter ebgp_lsix_49917_export;
    receive limit 100000 action restart;
    next hop self on;
  };
};
```

Parsing this through: the ipv4 import filter is called `ebgp_lsix_49917_import` and its
job is to run the whole kit and caboodle of filtering I described above, and then if the
`ebgp_import_peer()` function returns false, to simply drop the prefix. But if it is
accepted, I'll tag it with a few communities. As I'll show later, any other peer will receive
these communities if I decide to propagate the prefix to them. This is specifically
useful for downstream (customers), who can decide to accept/deny the prefix based on a
well-known set of communities we tag.

***IXP Community***: If the prefix is learned at an IXP, I'll add a large community
`(8298,1,*)` and backwards compat normal community `(8298,10XX)`.

One last thing I'll note, and this is a matter of taste: peering routes
picked up at internet exchanges (like LSIX) are typically much cheaper per megabit than
the transit routes, so I will set a default `bgp_local_pref` of 200 (higher localpref
is more likely to be selected as the active route).

### Upstream

An interesting observation: from Peers and from Upstreams I typically am happy to take
all the prefixes I can get (but see the epilog below for an important note on this). For a
Peer, this is mostly "their own prefixes" and for a Transit, this is mostly "all prefixes",
but there are things in the middle, say partial transit of "all prefixes learned at IXP A, B
and C". Really, all inbound sessions are very similar:

```
function ebgp_import_upstream(int remote_as) {
  # Scrub BGP Communities (RFC 7454 Section 11)
  bgp_community.delete([(8298, *)]);
  bgp_large_community.delete([(8298, *, *)]);

  return ebgp_import(remote_as);
}
```

... is in fact identical to the `ebgp_import_peer()` function above, so I'll not discuss
it further. But for the sessions to upstream (==transit) providers, it can make sense
to use slightly different BGP community tags and a lower localpref:

```
filter ebgp_ipmax_25091_import {
  if ! ebgp_import_upstream(25091) then reject;

  # Add BGP Large Communities
  bgp_large_community.add((8298,2,25091));

  # Add BGP Communities
  bgp_community.add((8298,2000));

  accept;
}

protocol bgp ipmax_25091_ipv4_1 {
  description "IP-Max Transit";
  local as 8298;
  source address 46.20.242.210;
  neighbor 46.20.242.209 as 25091;
  default bgp_med 0;
  default bgp_local_pref 50;
  ipv4 {
    import keep filtered;
    import filter ebgp_ipmax_25091_import;
    export filter ebgp_ipmax_25091_export;
    next hop self on;
  };
};
```

Again, a very similar pattern; the only material difference is that the inbound prefixes
are tagged with an ***Upstream Community*** which is of the form `(8298,2,*)` and backwards
compatible `(8298,20XX)`. Downstream customers can use this, if they wish, to select or
reject routes (maybe they don't like routes coming from AS25091, although they should know
better because IP-Max rocks!).

The other slight change here is that the `bgp_local_pref` is set to 50, which implies that it will
be used only if there are no alternatives in the _RIB_ with a higher localpref, or with the
same localpref but a shorter as-path, or many other scenarios which I won't get into here,
because BGP selection criteria 101 is a whole blogpost of its own.
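
To make the customer's side of the tagging concrete, here is a sketch of an import filter a downstream could run on its own session towards AS8298. The filter name and the localpref value of 150 are hypothetical; only the community values come from the tagging described above:

```
# Hypothetical filter on a downstream's router, not part of the IPng configuration
filter example_from_as8298_import {
  # Drop routes that AS8298 learned from its upstream AS25091
  if (8298, 2, 25091) ~ bgp_large_community then reject;

  # Prefer routes that AS8298 learned at an internet exchange
  if bgp_large_community ~ [(8298, 1, *)] then bgp_local_pref = 150;

  accept;
}
```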

### Downstream

That brings us to the third type of BGP sessions -- commonly referred to as customers except
that not everybody pays :) so I just call them _downstreams_:

```
function ebgp_import_downstream(int remote_as) {
  # We do not scrub BGP Communities (RFC 7454 Section 11) for customers
  return ebgp_import(remote_as);
}
```

Here, I have a special relationship with the `remote_as`, and I do not scrub the communities,
letting the downstream operator set whichever they like. As I'll demonstrate in the next
chapter, they can use these communities to drive certain types of behavior.

Here's how I use this `ebgp_import_downstream()` function in the full filter for a downstream:

```
# bgpq4 -Ab4 -R 24 -m 24 -l 'define AS201723_IPV4' AS201723
define AS201723_IPV4 = [
    185.54.95.0/24
];

# bgpq4 -Ab6 -R 48 -m 48 -l 'define AS201723_IPV6' AS201723
define AS201723_IPV6 = [
    2001:678:3d4::/48,
    2001:67c:6bc::/48
];

filter ebgp_raymon_201723_import {
  if (net.type = NET_IP4 && ! (net ~ AS201723_IPV4)) then reject;
  if (net.type = NET_IP6 && ! (net ~ AS201723_IPV6)) then reject;
  if ! ebgp_import_downstream(201723) then reject;

  # Add BGP Large Communities
  bgp_large_community.add((8298,3,201723));

  # Add BGP Communities
  bgp_community.add((8298,3500));

  accept;
}

protocol bgp raymon_201723_ipv4_1 {
  local as 8298;
  source address 185.54.95.250;
  neighbor 185.54.95.251 as 201723;
  default bgp_med 0;
  default bgp_local_pref 400;
  ipv4 {
    import keep filtered;
    import filter ebgp_raymon_201723_import;
    export filter ebgp_raymon_201723_export;
    receive limit 94 action restart;
    next hop self on;
  };
};
```

OK, so this is a mouthful, but the one thing that I really need to do with customers is
ensure that I only accept prefixes from them that they're supposed to send me. I do this
with a `prefix-list` for IPv4 and IPv6, and in the importer, I simply reject any prefixes
that are not in the list. From then on, it looks very much like a peer, with identical
filtering and tagging, except now I'm using yet another ***Customer Community*** which
starts with `(8298,3,*)` and a vanilla `(8298,3500)` community. Anybody who wishes to,
can act on the presence of these communities to know that it's a downstream of IPng Networks
AS8298.

***A note on Peers and Downstreams***:

Some ISPs will not peer with their customers (as in: once you become a transit customer
they will terminate all BGP sessions at public internet exchanges), and I find that silly.
However, for me the situation becomes a little bit more complex when I have AS201723
both as a Downstream (as shown here) as well as a Peer (which in fact, I do, at multiple Amsterdam
based internet exchanges). Note how the `bgp_local_pref` is 400 on this session, and it
will always be lower on other types of sessions. The implication is that the route in the _RIB_
which carries `(8298,3,201723)` will be selected, while the ones I learn from LSIX, which
carry `(8298,1,*)`, and the ones I learn from A2B (a transit provider), which carry `(8298,2,51088)`,
will not be selected because they have a lower localpref. As I'll demonstrate below,
I can make smart use of these communities when announcing prefixes to my own peers and upstreams,
... read on :)

## 3. Outbound: Announcing Routes

Alright, the _RIB_ is now filled with lots of prefixes that have the right localpref and
communities, for example from having been learned at an IXP, from an Upstream, or from a
Downstream. Now let's consider the following generic exporter:

```
function ebgp_export(int remote_as) {
  # Remove private ASNs
  bgp_path.delete([64512..65535, 4200000000..4294967295]);

  # Well known BGP Large Communities
  if (8298, 0, remote_as) ~ bgp_large_community then return false;
  if (8298, 0, 0) ~ bgp_large_community then return false;

  # Well known BGP Communities
  if (0, 8298) ~ bgp_community then return false;
  if (remote_as < 65536 && (0, remote_as) ~ bgp_community) then return false;

  # AS path prepending
  if ((8298, 103, remote_as) ~ bgp_large_community ||
      (8298, 103, 0) ~ bgp_large_community) then {
    bgp_path.prepend( bgp_path.first );
    bgp_path.prepend( bgp_path.first );
    bgp_path.prepend( bgp_path.first );
  } else if ((8298, 102, remote_as) ~ bgp_large_community ||
             (8298, 102, 0) ~ bgp_large_community) then {
    bgp_path.prepend( bgp_path.first );
    bgp_path.prepend( bgp_path.first );
  } else if ((8298, 101, remote_as) ~ bgp_large_community ||
             (8298, 101, 0) ~ bgp_large_community) then {
    bgp_path.prepend( bgp_path.first );
  }

  return true;
}
```

Oh, wow! There's some really cool stuff to unpack here. As a belt-and-braces type safety,
I will remove any private AS numbers from the as-path - this keeps my own announcements
from tripping any as-path bogon filtering. But then, there are a few well-known communities
that help determine if the announcement is made or not, and there are three-and-a-half
ways of doing this:
1.   `(8298,0,remote_as)`
1.   `(8298,0,0)`
1.   `(0,8298)`
1.   `(0,remote_as)` but only if the remote_as is 16 bits.

All four of these methods will tell the router to refuse announcing the prefix on this
session. Note that downstreams are allowed to set `(8298,*,*)` and `(8298,*)` communities
(and they're the only ones who are allowed to do so). So here is where some of the cool
magic starts to happen.

Then, to drive prepending of the prefix on this session, I'll again match certain
communities: `(8298, 103, *)` will prepend the customer's AS number three times, using
`102` will prepend twice, and `101` will prepend once. If the third number is `0`, then
any session with this filter will prepend. If the third number is the AS number, then
only sessions to this AS number will be prepended.

Using these types of communities gives downstream (customers) incredibly fine grained
propagation actions, at the per-IPng-session level. Not many ISPs offer this functionality!
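
As a sketch of how a downstream might drive these actions from its own router, consider the following export filter towards AS8298. The filter name, the `EXAMPLE_MY_IPV4` define and the documentation prefix `192.0.2.0/24` are placeholders for illustration; the community values follow the `ebgp_export()` function above:

```
# Placeholder prefix from the documentation range, not a real announcement
define EXAMPLE_MY_IPV4 = [ 192.0.2.0/24 ];

# Hypothetical filter on a downstream's router, not part of the IPng configuration
filter example_to_as8298_export {
  if ! (net ~ EXAMPLE_MY_IPV4) then reject;

  # Ask AS8298 not to announce this prefix to AS25091
  bgp_large_community.add((8298, 0, 25091));

  # Ask AS8298 to prepend twice on its sessions to AS58299
  bgp_large_community.add((8298, 102, 58299));

  accept;
}
```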

### Peers

Exporting to peers, I really need to make sure that I don't send too many prefixes. Most
of us have at some point gone through the embarrassing motions of being told by a fellow
operator "hey you're sending a full table". It is paramount to good peering hygiene
that I do not leak. So I'll define a healthy set of _defense in depth_ principles here:

```
# bgpq4 -A4b -R 24 -m 24 -l 'define AS8298_IPV4' AS8298
define AS8298_IPV4 = [ 92.119.38.0/24, 194.1.163.0/24, 194.126.235.0/24 ];

# bgpq4 -A6bR 48 -m 48 -l 'define AS8298_IPV6' AS8298
define AS8298_IPV6 = [ 2001:678:d78::/48, 2a0b:dd80::/29{29,48} ];

# bgpq4 -A4b -R 24 -m 24 -l 'define AS_IPNG_IPV4' AS-IPNG
define AS_IPNG_IPV4 = [ ... ## Removed for brevity ];

# bgpq4 -A6bR 48 -m 48 -l 'define AS_IPNG_IPV6' AS-IPNG
define AS_IPNG_IPV6 = [ .. ## Removed for brevity ];

# bgpq4 -t4b -l 'define AS_IPNG' AS-IPNG
define AS_IPNG = [112, 8298, 50869, 57777, 60557, 201723, 212323, 212855];

function aspath_first_valid() {
  return (bgp_path.len = 0 || bgp_path.first ~ AS_IPNG);
}

# A list of well-known tier1 transit providers
function aspath_contains_tier1() {
  return bgp_path ~ [
     174,                  # Cogent
     209,                  # Qwest (HE carries this on IXPs IPv6 (Jul 12 2018))
     701,                  # UUNET
     702,                  # UUNET
     1239,                 # Sprint
     1299,                 # Telia
     2914,                 # NTT Communications
     3257,                 # GTT Backbone
     3320,                 # Deutsche Telekom AG (DTAG)
     3356,                 # Level3
     3549,                 # Level3
     3561,                 # Savvis / CenturyLink
     4134,                 # Chinanet
     5511,                 # Orange opentransit
     6453,                 # Tata Communications
     6762,                 # Seabone / Telecom Italia
     7018 ];               # AT&T
}

# The list of our own uplink (transit) providers
# Note: This list is autogenerated by our automation.
function aspath_contains_upstream() {
  return bgp_path ~ [ 8283,25091,34549,51088,58299 ];
}

function ipv4_prefix_valid() {
  # Our (locally sourced) prefixes
  if (net ~ AS8298_IPV4) then return true;

  # Customer prefixes in AS-IPNG must be tagged with customer community
  if (net ~ AS_IPNG_IPV4 &&
       (bgp_large_community ~ [(8298, 3, *)] || bgp_community ~ [(8298, 3500)])
     ) then return true;

  return false;
}
function ipv6_prefix_valid() {
  # Our (locally sourced) prefixes
  if (net ~ AS8298_IPV6) then return true;

  # Customer prefixes in AS-IPNG must be tagged with customer community
  if (net ~ AS_IPNG_IPV6 &&
       (bgp_large_community ~ [(8298, 3, *)] || bgp_community ~ [(8298, 3500)])
     ) then return true;

  return false;
}
function prefix_valid() {
  # as-path based filtering
  if !aspath_first_valid() then return false;
  if aspath_contains_tier1() then return false;
  if aspath_contains_upstream() then return false;

  # prefix (and BGP community) based filtering
  if (net.type = NET_IP4 && !ipv4_prefix_valid()) then return false;
  if (net.type = NET_IP6 && !ipv6_prefix_valid()) then return false;
  return true;
}

function ebgp_export_peer(int remote_as) {
  if !prefix_valid() then return false;
  return ebgp_export(remote_as);
}
```

Wow, alrighty then!! All I'm doing here is checking if the call to `prefix_valid()`
returns true. That function isn't very complex. It takes a look at three as-path based
filters and then a prefix-list based filter. Let's go over them in turn:

***aspath_first_valid()*** takes a look at the first hop in the as-path. I need to
make sure that I've received this prefix from an actual downstream, and those are
collected in a RIPE `as-set` called `AS-IPNG`. So if the first BGP hop in the path is
not one of these, I'll refuse to announce the prefix. An empty as-path is also fine,
as that is what my own locally sourced prefixes look like.

***aspath_contains_tier1()*** is a belt-and-braces style check. How on earth would
I provide transit for any prefix for which there's already a global _Tier1_ provider
in the path? I mean, in no universe would AS174 or AS1299 need me to reach any of
their customers, or indeed, any place in the world. So this filter helps me never
announce the prefix, if it has one of these ISPs in the path.

***aspath_contains_upstream()*** similarly, if I am receiving a full table from an
upstream provider, I should not be passing this prefix along - I would for similar
reasons never be a transit provider for A2B or IP-Max or Meerfarbig. My buddy Erik
kindly pointed out a bug in my configuration related to this, so hat-tip
to him for the intelligence.

***ipv[46]_prefix_valid()*** is the main thrust of prefix-based filtering. At this
point we've already established that the as-path is clean, but it could be that
the downstream is sending prefixes they should not (possibly leaking a full table)
so let's take a look at a good way to avoid this.
*   First, we look at locally sourced routes from `AS8298`, that is the ones that I
    myself originate at IPng Networks. These are always OK. The list is carefully
    curated.
*   Alternatively, the prefix needs to be from the as-set `AS-IPNG` (which contains
    both my prefixes and all `route` and `route6` objects belonging to any AS number
    that I consider a downstream).
*   Finally, if the prefix is from `AS-IPNG`, I'll still add one additional check to
    ensure that there is a so-called _customer community_ attached. Remember that I
    discussed this specifically up in the ***Inbound - Downstream*** section.

So before I announce anything on such a session, all _four_ of as-path,
inbound prefix-list, outbound prefix-list and bgp-community are checked. This
makes it incredibly unlikely that AS8298 ever leaks prefixes -- knock on wood!
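
The `protocol bgp` blocks earlier refer to export filters such as `ebgp_lsix_49917_export`, which are not shown in this article. Assuming they mirror the structure of the import filters, a minimal sketch would be:

```
# Assumed shape of the export filter, mirroring ebgp_lsix_49917_import
filter ebgp_lsix_49917_export {
  if ! ebgp_export_peer(49917) then reject;
  accept;
}
```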

### Upstream

Interestingly and, if you think about it, unsurprisingly, an upstream configuration
is identical to that of a peer:

```
function ebgp_export_upstream(int remote_as) {
  if !prefix_valid() then return false;
  return ebgp_export(remote_as);
}
```

Alright, nothing to see here, moving on ...

### Downstream

Now the difference between a Peer and an Upstream on the one hand, and a Downstream
on the other, is that the former two will only see a very limited set of prefixes,
heavily guarded by all of that filtering I described. But a downstream typically
has the luxury of getting to learn every prefix I've learned:

```
function ipv4_acceptable_size() {
  if net.len < 8 then return false;
  if net.len > 24 then return false;
  return true;
}
function ipv6_acceptable_size() {
  if net.len < 12 then return false;
  if net.len > 48 then return false;
  return true;
}
function ebgp_export_downstream(int remote_as) {
  if (source != RTS_BGP && source != RTS_STATIC) then return false;
  if (net.type = NET_IP4 && ! ipv4_acceptable_size()) then return false;
  if (net.type = NET_IP6 && ! ipv6_acceptable_size()) then return false;

  return ebgp_export(remote_as);
}
```

So here I'll assert that the prefix has to be either from the `RTS_BGP` source, or
from the `RTS_STATIC` source. This latter source is what Bird uses for locally
generated routes (ie. the ones in AS8298 itself). Locally generated routes are not
known from BGP, but known instead because they are blackholed / null-routed on the
router itself. And from these routes, I further deselect those prefixes that are
too short or too long; the acceptable lengths differ slightly by address family (IPv4
is anywhere between /8 and /24, IPv6 anywhere between /12 and /48).
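
As a sketch of what such a locally generated route might look like (the protocol name `static4` is an assumption, the prefix is one of those listed in `AS8298_IPV4` above):

```
# Hypothetical static protocol; routes defined here carry source RTS_STATIC
protocol static static4 {
  ipv4;
  # Null-route the aggregate on the router itself so that it can be exported
  route 194.1.163.0/24 blackhole;
}
```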

Now, I will note that I've seen many operators who inject OSPF or connected or
static routes into BGP, and all of those folks will have to maintain elaborate egress
"bogon" route filters, for example for those IXP prefixes that they picked up due to
them being directly connected. If those operators would simply not propagate directly
connected routes, their life would be so much simpler .. but I digress and it's time
for me to wrap up.

## Epilog

I hope this little dissertation proves useful for other Bird enthusiasts out there.
I myself had to fiddle a bit over the years with the idiosyncrasies (and bugs) of
Bird and Bird2. I wanted to make a few comments:

1.  Thanks to the crew at [Coloclue](https://coloclue.net/) for having a really phenomenal
    routing setup, with a lot of thoughtful documentation, action communities, and strict
    ingress and egress filtering. It's also fully automated and I've derived, although
    completely rewritten, my own automation based off of [Kees](https://github.com/coloclue/kees).
1.  I understand that the main distinction between inbound Peer and Upstream is that for Peers
    many folks will want to do strict filtering. I've considered this for a long time and
    ultimately decided against it, because a combination of max prefix, tier1 as-path filtering
    and RPKI filtering would take care of the most egregious mistakes and otherwise, I'm actually
    happy to get more prefixes via IXPs rather than fewer.