---
date: "2024-05-25T12:23:54Z"
title: 'Case Study: NAT64'
---

# Introduction

{{< image width="400px" float="right" src="/assets/oem-switch/s5648x-front-opencase.png" alt="Front" >}}

IPng's network is built up in two main layers: (1) an MPLS transport layer, which is disconnected
from the Internet, and (2) a VPP overlay, which carries the Internet. I created a BGP Free core
transport network using MPLS switches from a company called Centec. These switches offer IPv4,
IPv6, VxLAN, GENEVE and GRE all in silicon, are very cheap on power, and are relatively affordable
per port.

Centec switches allow for a modest but not huge number of routes in the hardware forwarding tables.
I loadtested them in [[a previous article]({% post_url 2022-12-05-oem-switch-1 %})] at line rate
(well, at least 8x10G at 64b packets and around 110Mpps), and they forward IPv4, IPv6 and MPLS
traffic effortlessly, at 45 watts.

I wrote more about the Centec switches in [[my review]({% post_url 2023-03-11-mpls-core %})] of them
back in 2022.

### IPng Site Local

{{< image width="400px" float="right" src="/assets/nat64/MPLS IPng Site Local v2.svg" alt="IPng SL" >}}

I leverage this internal transport network for more than _just_ MPLS. The transport switches are
perfectly capable of line rate (at 100G+) IPv4 and IPv6 forwarding as well. When designing IPng Site
Local, I created a number plan that assigns ***IPv4*** from the **198.19.0.0/16** prefix, and ***IPv6***
from the **2001:678:d78:500::/56** prefix. Within these, I allocate blocks for _Loopback_ addresses,
_PointToPoint_ subnets, and hypervisor networks for VMs and internal traffic.

Take a look at the diagram to the right. Each site has one or more Centec switches (in red), and
there are three redundant gateways that connect the IPng Site Local network to the Internet (in
orange). I run lots of services in this red portion of the network: site-to-site backups
[[Borgbackup](https://www.borgbackup.org/)], ZFS replication [[ZRepl](https://zrepl.github.io/)], a
message bus using [[Nats](https://nats.io)], and of course monitoring with SNMP and Prometheus all
make use of this network. But it's not only internal services like management traffic: I also
actively use this private network to expose _public_ services!

For example, I operate a bunch of [[NGINX Frontends]({% post_url 2023-03-17-ipng-frontends %})] that
have a public IPv4/IPv6 address and reverse proxy for webservices (like
[[ublog.tech](https://ublog.tech)] or [[Rallly](https://rallly.ipng.ch/)]) which run on VMs and
Docker hosts that don't have public IP addresses. Another example, which I wrote about [[last
week]({% post_url 2024-05-17-smtp %})], is a bunch of mail services that run on VMs without public
access, but are each carefully exposed via reverse proxies (like Postfix, Dovecot, or
[[Roundcube](https://webmail.ipng.ch)]). It's an incredibly versatile network design!

### Border Gateways

Seeing as IPng Site Local uses native IPv6, it's rather straightforward to give each hypervisor and
VM an IPv6 address, and to configure IPv4 only on the externally facing NGINX Frontends. As a reverse
proxy, NGINX will create a new TCP session to the internal server, and that's a fine solution.
However, I also want my internal hypervisors and servers to have full Internet connectivity. For
IPv6, this feels pretty straightforward, as I can just route the **2001:678:d78:500::/56** through
a firewall that blocks incoming traffic, and call it a day. For IPv4, I can similarly use classic
NAT, just like one would in a residential network.

**But what if I wanted to go IPv6-only?** This poses a small challenge, because while IPng is fully
IPv6 capable, and has been since the early 2000s, the rest of the internet is not quite there yet.
For example, the quite popular [[GitHub](https://github.com/pimvanpelt/)] hosting site still has
only an IPv4 address. Come on, folks, what's taking you so long?! It is for this purpose that NAT64
was invented. Described in [[RFC6146](https://datatracker.ietf.org/doc/html/rfc6146)]:

> Stateful NAT64 translation allows IPv6-only clients to contact IPv4 servers using unicast
> UDP, TCP, or ICMP. One or more public IPv4 addresses assigned to a NAT64 translator are shared
> among several IPv6-only clients. When stateful NAT64 is used in conjunction with DNS64, no
> changes are usually required in the IPv6 client or the IPv4 server.

The rest of this article describes version 2 of the IPng SL border gateways, which opens the path
for IPng to go IPv6-only. By the way, I thought it would be super complicated, but in hindsight: I
should have done this years ago!

#### Gateway Design

{{< image width="400px" float="right" src="/assets/nat64/IPng NAT64.svg" alt="IPng Border Gateway" >}}

Let me take a closer look at the orange boxes that I drew in the network diagram above. I call these
machines _Border Gateways_. Their job is to sit between IPng Site Local and the Internet. They'll
each have one network interface connected to the Centec switch, and another connected to
the VPP routers at AS8298. They will provide two main functions: firewalling, so that no unwanted
traffic enters IPng Site Local, and NAT translation, so that:

1. IPv4 users from **198.19.0.0/16** can reach external IPv4 addresses,
1. IPv6 users from **2001:678:d78:500::/56** can reach external IPv6,
1. _IPv6-only_ users can reach external **IPv4** addresses, a neat trick.

#### IPv4 and IPv6 NAT

Let me start off with the basic table stakes. You'll likely be familiar with _masquerading_, a
NAT technique in Linux that uses the public IPv4 address assigned by your provider, allowing
many internal clients, often using [[RFC1918](https://datatracker.ietf.org/doc/html/rfc1918)] addresses,
to access the internet via that shared IPv4 address. You may not have come across IPv6 _masquerading_,
but it's equally possible to take an internal (private, non-routable)
IPv6 network and access the internet via a shared IPv6 address.
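
For reference, classic single-address _masquerading_ looks something like the sketch below. This is
**not** what I ended up deploying (the pool-based SNAT follows in the next section), but it shows the
baseline most home routers use, here with IPng's prefixes and the external interface `enp1s0f0`
filled in:

```
# Minimal masquerading sketch (for contrast only): rewrite all internal traffic
# leaving enp1s0f0 to whatever address is configured on that interface.
iptables  -t nat -A POSTROUTING -s 198.19.0.0/16         -o enp1s0f0 -j MASQUERADE
ip6tables -t nat -A POSTROUTING -s 2001:678:d78:500::/56 -o enp1s0f0 -j MASQUERADE
```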

I will assign a pool of four public IPv4 addresses and eight IPv6 addresses to each border gateway:

| **Machine** | **IPv4 pool** | **IPv6 pool** |
|-------------|---------------|---------------|
| border0.chbtl0.net.ipng.ch | <span style='color:green;'>194.126.235.0/30</span> | <span style='color:blue;'>2001:678:d78::3:0:0/125</span> |
| border0.chrma0.net.ipng.ch | <span style='color:green;'>194.126.235.4/30</span> | <span style='color:blue;'>2001:678:d78::3:1:0/125</span> |
| border0.chplo0.net.ipng.ch | <span style='color:green;'>194.126.235.8/30</span> | <span style='color:blue;'>2001:678:d78::3:2:0/125</span> |
| border0.nlams0.net.ipng.ch | <span style='color:green;'>194.126.235.12/30</span> | <span style='color:blue;'>2001:678:d78::3:3:0/125</span> |

Linux iptables _masquerading_ will only work with the IP addresses assigned to the external
interface, so I will need to use a slightly different approach to be able to use these _pools_. In
case you're wondering -- IPng's internal network has grown to the point where I cannot expose it
all behind a single IPv4 address; there would not be enough TCP/UDP ports. Luckily, NATing via a pool
is pretty easy using the _SNAT_ module:

```
pim@border0-chrma0:~$ cat << EOF | sudo tee /etc/rc.firewall.ipng-sl
# IPng Site Local: Enable stateful firewalling on IPv4/IPv6 forwarding
iptables -P FORWARD DROP
ip6tables -P FORWARD DROP
iptables -I FORWARD -i enp1s0f1 -m state --state NEW -s 198.19.0.0/16 -j ACCEPT
ip6tables -I FORWARD -i enp1s0f1 -m state --state NEW -s 2001:678:d78:500::/56 -j ACCEPT
iptables -I FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT
ip6tables -I FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT

# IPng Site Local: Enable NAT on external interface using NAT pools
iptables -t nat -I POSTROUTING -s 198.19.0.0/16 -o enp1s0f0 \
  -j SNAT --to 194.126.235.4-194.126.235.7
ip6tables -t nat -I POSTROUTING -s 2001:678:d78:500::/56 -o enp1s0f0 \
  -j SNAT --to 2001:678:d78::3:1:0-2001:678:d78::3:1:7
EOF
```

From the top -- I'll first make it the default for the kernel to refuse to _FORWARD_ any traffic that
is not explicitly accepted. I will only allow traffic that comes in via `enp1s0f1` (the internal
interface) if it comes from the assigned IPv4 and IPv6 site local prefixes. On the way back,
I'll allow traffic that matches states created on the way out. This is the _firewalling_ portion of
the setup.
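
One implicit assumption in the above: the kernel must actually be willing to forward packets between
the two interfaces, which Debian does not do by default. The firewall script doesn't show that part,
so here's a minimal sketch of the sysctls such a gateway needs (the filename is just my illustration):

```
pim@border0-chrma0:~$ cat << EOF | sudo tee /etc/sysctl.d/70-forwarding.conf
net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1
EOF
pim@border0-chrma0:~$ sudo sysctl --system
```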

Then, two _POSTROUTING_ rules turn on network address translation. If the source address is any of
the site local prefixes, I'll rewrite it to come from the IPv4 or IPv6 pool addresses, respectively.
This is the _NAT44_ and _NAT66_ portion of the setup.
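
To see these translations in action, the connection tracker can be inspected directly. A quick
sketch, assuming the `conntrack` tool from conntrack-tools is installed:

```
pim@border0-chrma0:~$ sudo apt install conntrack
pim@border0-chrma0:~$ sudo conntrack -L -f ipv4 | grep 198.19. | head -3
pim@border0-chrma0:~$ sudo conntrack -L -f ipv6 | head -3
```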

#### NAT64: Jool

{{< image width="400px" float="right" src="/assets/nat64/jool.png" alt="Jool" >}}

So far, so good. But this article is about NAT64 :-) Here's where I grossly overestimated how
difficult it might be -- and if there's one takeaway from my story here, it should be that NAT64 is
as straightforward as the others! Enter [[Jool](https://jool.mx)], an Open Source SIIT and NAT64
implementation for Linux. It's available in Debian as a DKMS kernel module and userspace tool, and it
integrates cleanly with both _iptables_ and _netfilter_.

Jool is a network address and port translating implementation, referred to as _NAPT_, just like
regular IPv4 NAT. When internal IPv6 clients
try to reach an external endpoint, Jool will make note of the internal src6:port, then select an
external IPv4 address:port, rewrite the packet, and on the way back, correlate the src4:port with
the internal src6:port, and rewrite the packet. If this sounds an awful lot like NAT, then you're
not wrong! The only difference is, Jool will also translate the *address family*: it will rewrite
the internal IPv6 addresses to external IPv4 addresses.
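
Once the instance is up and running (installation follows below), this translation state can be
peeked at with the userspace tool -- if I read the jool(8) manual right, the address-family bindings
live in the BIB and the per-flow state in the session table:

```
pim@border0-chrma0:~$ sudo jool bib display --tcp --numeric | head -5
pim@border0-chrma0:~$ sudo jool session display --tcp --numeric | head -5
```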

Installing Jool is as simple as this:

```
pim@border0-chrma0:~$ sudo apt install jool-dkms jool-tools
pim@border0-chrma0:~$ sudo mkdir /etc/jool
pim@border0-chrma0:~$ cat << EOF | sudo tee /etc/jool/jool.conf
{
  "comment": {
    "description": "Full NAT64 configuration for border0.chrma0.net.ipng.ch",
    "last update": "2024-05-21"
  },
  "instance": "default",
  "framework": "netfilter",
  "global": { "pool6": "2001:678:d78:564::/96", "lowest-ipv6-mtu": 1280, "logging-debug": false },
  "pool4": [
    { "protocol": "TCP",  "prefix": "194.126.235.4/30", "port range": "1024-65535" },
    { "protocol": "UDP",  "prefix": "194.126.235.4/30", "port range": "1024-65535" },
    { "protocol": "ICMP", "prefix": "194.126.235.4/30" }
  ]
}
EOF
pim@border0-chrma0:~$ sudo systemctl start jool
```

.. and that, as they say, is all there is to it! There are two things I make note of here:

1. I have assigned **2001:678:d78:564::/96** as NAT64 `pool6`, which means that if this machine
   sees any traffic _destined_ to that prefix, it'll activate Jool, select an available IPv4
   address:port from the `pool4`, and send the packet to the IPv4 destination address which it
   takes from the last 32 bits of the original IPv6 destination address.
1. Cool trick: I am **reusing** the same IPv4 pool as for regular NAT. The Jool kernel module
   happily coexists with the _iptables_ implementation! A quick sanity check of the running
   instance follows below.
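
To double-check that the kernel module picked up the configuration, the `jool` userspace tool can
(as far as I can tell) dump the running instance, its globals and its IPv4 pool back out:

```
pim@border0-chrma0:~$ sudo jool instance display
pim@border0-chrma0:~$ sudo jool global display | grep pool6
pim@border0-chrma0:~$ sudo jool pool4 display --tcp | head -5
```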

#### DNS64: Unbound

There's one vital piece of information missing, and it took me a little while to appreciate it. If
I take an IPv6-only host, like Summer, and I try to connect to an IPv4-only host, how does that even
work?

```
pim@summer:~$ ip -br a
lo      UNKNOWN   127.0.0.1/8 ::1/128
eno1    UP        2001:678:d78:50b::f/64 fe80::7e4d:8fff:fe03:3c00/64
pim@summer:~$ ip -6 ro
2001:678:d78:50b::/64 dev eno1 proto kernel metric 256 pref medium
fe80::/64 dev eno1 proto kernel metric 256 pref medium
default via 2001:678:d78:50b::1 dev eno1 proto static metric 1024 pref medium

pim@summer:~$ host github.com
github.com has address 140.82.121.4
pim@summer:~$ ping github.com
ping: connect: Network is unreachable
```

Now comes the really clever reveal -- NAT64 works by assigning an IPv6 prefix that snugly fits the
entire IPv4 address space, typically **64:ff9b::/96**, but operators can choose any prefix they'd like.
For IPng's site local network, I decided to assign **2001:678:d78:564::/96** for this purpose
(this is the `global.pool6` attribute in Jool's config file I described above). A resolver can then
tweak DNS lookups for IPv6-only hosts to return addresses from that IPv6 range. This tweaking is
called DNS64, described in [[RFC6147](https://datatracker.ietf.org/doc/html/rfc6147)]:

> DNS64 is a mechanism for synthesizing AAAA records from A records. DNS64 is used with an
> IPv6/IPv4 translator to enable client-server communication between an IPv6-only client and an
> IPv4-only server, without requiring any changes to either the IPv6 or the IPv4 node, for the
> class of applications that work through NATs.

I run the popular [[Unbound](https://www.nlnetlabs.nl/projects/unbound/about/)] resolver at IPng,
deployed as a set of anycasted instances across the network. With only two lines of configuration, I
can turn on this feature:

```
pim@border0-chrma0:~$ cat << EOF | sudo tee /etc/unbound/unbound.conf.d/dns64.conf
server:
  module-config: "dns64 iterator"
  dns64-prefix: 2001:678:d78:564::/96
EOF
pim@border0-chrma0:~$ sudo systemctl restart unbound
```

The behavior of the resolver now changes in a very subtle but cool way:

```
pim@summer:~$ host github.com
github.com has address 140.82.121.3
github.com has IPv6 address 2001:678:d78:564::8c52:7903
pim@summer:~$ host 2001:678:d78:564::8c52:7903
3.0.9.7.2.5.c.8.0.0.0.0.0.0.0.0.4.6.5.0.8.7.d.0.8.7.6.0.1.0.0.2.ip6.arpa
domain name pointer lb-140-82-121-3-fra.github.com.
```

Before, [[github.com](https://github.com/pimvanpelt/)] did not return an AAAA record, so there was
no way for Summer to connect to it. But now, not only does it return an AAAA record, it also
rewrites the PTR request: knowing that I'm asking for something in the DNS64 range of
**2001:678:d78:564::/96**, Unbound will strip off the last 32 bits (`8c52:7903`, which is the
hex encoding of the original IPv4 address), and return the answer for a PTR lookup of
`3.121.82.140.in-addr.arpa` instead. Game changer!
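
That hex encoding is easy to verify by hand: the last 32 bits of the synthesized AAAA are simply the
four octets of the IPv4 address. For 140.82.121.3:

```
pim@summer:~$ printf '2001:678:d78:564::%02x%02x:%02x%02x\n' 140 82 121 3
2001:678:d78:564::8c52:7903
```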

{{< image width="400px" float="right" src="/assets/nat64/IPng NAT64.svg" alt="IPng Border Gateway" >}}

#### DNS64 + NAT64

What I learned from this is that the _combination_ of these two tools provides the magic:

1. When an IPv6-only client asks for AAAA for an IPv4-only hostname, Unbound will synthesize an AAAA
   from the IPv4 address, casting it into the last 32 bits of its NAT64 prefix **2001:678:d78:564::/96**.
1. When an IPv6-only client tries to send traffic to **2001:678:d78:564::/96**, Jool will do the
   address family (and address/port) translation. This is represented by the red (ipv6) flow in the
   diagram to the right turning into a green (ipv4) flow to the left.

What's left for me to do is to ensure that (a) the NAT64 prefix is routed from IPng Site Local to
the gateways and (b) the IPv4 and IPv6 NAT address pools are routed from the Internet to the
gateways.

#### Internal: OSPF

I use Bird2 to accomplish the dynamic routing -- and considering the Centec switch network is by
design _BGP Free_, I will use OSPF and OSPFv3 for these announcements. Using OSPF has an important
benefit: I can selectively turn on and off the Bird announcements to the Centec IPng Site Local
network. Seeing as there will be multiple redundant gateways, if one of them goes down (either due
to failure or because of maintenance), the network will quickly reconverge on another replica. Neat!

Here's how I configure the OSPF import and export filters:

```
filter ospf_import {
  if (net.type = NET_IP4 && net ~ [ 198.19.0.0/16 ]) then accept;
  if (net.type = NET_IP6 && net ~ [ 2001:678:d78:500::/56 ]) then accept;
  reject;
}

filter ospf_export {
  if (net.type=NET_IP4 && !(net~[198.19.0.255/32,0.0.0.0/0])) then reject;
  if (net.type=NET_IP6 && !(net~[2001:678:d78:564::/96,2001:678:d78:500::1:0/128,::/0])) then reject;

  ospf_metric1 = 200; unset(ospf_metric2);
  accept;
}
```

When learning prefixes _from_ the Centec switch, I will only accept precisely the IPng Site Local
IPv4 (198.19.0.0/16) and IPv6 (2001:678:d78:500::/56) supernets. When sending prefixes _to_ the Centec
switches, I will announce:

* ***198.19.0.255/32*** and ***2001:678:d78:500::1:0/128***: These are the anycast addresses of the Unbound resolver.
* ***0.0.0.0/0*** and ***::/0***: These are default routes for IPv4 and IPv6, respectively.
* ***2001:678:d78:564::/96***: This is the NAT64 prefix, which will attract the IPv6-only traffic
  towards DNS64-rewritten destinations, for example 2001:678:d78:564::8c52:7903 as the DNS64 representation
  of github.com, which is reachable only at legacy address 140.82.121.3.

{{< image width="100px" float="left" src="/assets/nat64/brain.png" alt="Brain" >}}

I have to be careful with the announcements into OSPF. The cost of E1 routes is the cost of the
external metric **in addition to** the internal cost within OSPF to reach that network. The cost
of E2 routes will always be the external metric; the metric takes no notice of the internal
cost to reach that router. Therefore, I emit these prefixes without Bird's `ospf_metric2` set, so
that the closest border gateway is always used.
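
For completeness, here is a minimal sketch of how these filters might be hooked into Bird2's OSPF and
OSPFv3 protocol instances. The protocol names, interface options and cost are my own illustration,
not IPng's actual configuration:

```
protocol ospf v2 ospf4 {
  ipv4 { import filter ospf_import; export filter ospf_export; };
  area 0 { interface "enp1s0f1" { cost 5; }; };
}

protocol ospf v3 ospf6 {
  ipv6 { import filter ospf_import; export filter ospf_export; };
  area 0 { interface "enp1s0f1" { cost 5; }; };
}
```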

With that, I can see the following:

```
pim@summer:~$ traceroute6 github.com
traceroute to github.com (2001:678:d78:564::8c52:7903), 30 hops max, 80 byte packets
 1  msw0.chbtl0.net.ipng.ch (2001:678:d78:50b::1)  4.134 ms  4.640 ms  4.796 ms
 2  border0.chbtl0.net.ipng.ch (2001:678:d78:503::13)  0.751 ms  0.818 ms  0.688 ms
 3  * * *
 4  * * * ^C
```

I'm not quite there yet; I have one more step to go. What's happening at the Border Gateway? Let me
take a look at this, while I ping6 github.com:

```
pim@summer:~$ ping6 github.com
PING github.com(lb-140-82-121-4-fra.github.com (2001:678:d78:564::8c52:7904)) 56 data bytes
... (nothing)

pim@border0-chbtl0:~$ sudo tcpdump -ni any src host 2001:678:d78:50b::f or dst host 140.82.121.4
11:25:19.225509 enp1s0f1 In  IP6 2001:678:d78:50b::f > 2001:678:d78:564::8c52:7904:
    ICMP6, echo request, id 3904, seq 7, length 64
11:25:19.225603 enp1s0f0 Out IP  194.126.235.3 > 140.82.121.4:
    ICMP echo request, id 61668, seq 7, length 64
```

Unbound and Jool are doing great work. Unbound saw my DNS request for IPv4-only github.com, and
synthesized a DNS64 response for me. Jool then saw the inbound packet on enp1s0f1, the internal
interface pointed at IPng Site Local. This is because the **2001:678:d78:564::/96** prefix is
announced in OSPFv3, so every host knows to route traffic for that prefix to this border gateway.
But then, I see the NAT64 in action on the outbound interface enp1s0f0. Here, one of the IPv4 pool
addresses is selected as the source address. But there is no return packet, because there is no route
back from the Internet, yet.

#### External: BGP

The final step for me is to allow return traffic from the Internet to the IPv4 and IPv6 pools to
reach this Border Gateway instance. For this, I configure BGP with the following Bird2
configuration snippet:

```
filter bgp_import {
  if (net.type = NET_IP4 && !(net = 0.0.0.0/0)) then reject;
  if (net.type = NET_IP6 && !(net = ::/0)) then reject;
  accept;
}

filter bgp_export {
  if (net.type = NET_IP4 && !(net ~ [ 194.126.235.4/30 ])) then reject;
  if (net.type = NET_IP6 && !(net ~ [ 2001:678:d78::3:1:0/125 ])) then reject;

  # Add BGP Wellknown community no-export (FFFF:FF01)
  bgp_community.add((65535,65281));
  accept;
}
```

I then establish an eBGP session from private AS64513 to two of IPng Networks' core routers at
AS8298. I add the well-known BGP no-export community (`FFFF:FF01`) so that these prefixes are learned
in AS8298, but never propagated. It's not strictly necessary, because AS8298 won't announce more
specifics like these anyway, but it's a nice way to really assert that these are meant to stay
local. Because AS8298 is already announcing the **194.126.235.0/24** and **2001:678:d78::/48**
supernets, return traffic will already be able to reach IPng's routers upstream. With these more
specific announcements of the /30 and /125 pools, the upstream VPP routers will be able to route the
return traffic to this specific server.
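
As a sketch, wiring the filters above into Bird2 could look something like this -- the protocol names
and neighbor addresses here are purely illustrative, not the actual addresses of AS8298's routers:

```
protocol bgp upstream4 {
  local as 64513;
  neighbor 194.126.235.1 as 8298;     # hypothetical VPP core router
  ipv4 { import filter bgp_import; export filter bgp_export; };
}

protocol bgp upstream6 {
  local as 64513;
  neighbor 2001:678:d78::1 as 8298;   # hypothetical VPP core router
  ipv6 { import filter bgp_import; export filter bgp_export; };
}
```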

And with that, the ping to Unbound's DNS64-provided IPv6 address for github.com shoots to life.
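
A quick way to confirm the whole DNS64 + NAT64 chain end-to-end from the IPv6-only host, assuming
curl is installed on Summer:

```
pim@summer:~$ ping6 -c 3 github.com
pim@summer:~$ curl --ipv6 -sI https://github.com/ | head -1
```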

### Results

I deployed four of these Border Gateways using Ansible: one at my office in Brüttisellen, one
in Zurich, one in Geneva and one in Amsterdam. They do all three types of NAT:

* Announcing the IPv4 default **0.0.0.0/0** allows them to serve as NAT44 gateways for
  **198.19.0.0/16**
* Announcing the IPv6 default **::/0** allows them to serve as NAT66 gateways for
  **2001:678:d78:500::/56**
* Announcing the IPv6 NAT64 prefix **2001:678:d78:564::/96** allows them to serve as NAT64 gateways
* Announcing the IPv4 and IPv6 anycast addresses for `nscache.net.ipng.ch` allows them to serve DNS64

Each individual service can be turned on or off. For example, if I stop announcing the IPv4 default
into the Centec network, that replica will no longer attract NAT44 traffic. Similarly, if I stop
announcing the NAT64 prefix, that replica will no longer attract NAT64 traffic. OSPF in the
IPng Site Local network will automatically select an alternative replica in such cases. Shutting
down Bird2 altogether will immediately drain the machine of all traffic, which is immediately
rerouted via one of the other replicas.
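
Operationally this is just a matter of poking Bird's CLI. A sketch, assuming protocol names like the
ones from the OSPF example earlier:

```
pim@border0-chrma0:~$ sudo birdc show protocols
pim@border0-chrma0:~$ sudo birdc disable ospf4    # stop attracting IPv4 (NAT44) traffic
pim@border0-chrma0:~$ sudo birdc enable ospf4     # and bring it back
pim@border0-chrma0:~$ sudo birdc down             # drain this replica entirely
```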

If you're curious, here's a few minutes of me playing with failover, while watching YouTube videos
concurrently.

{{< image src="/assets/nat64/nat64.gif" alt="Asciinema" >}}

### What's Next

I've added an Ansible module in which I can configure the individual instances' IPv4 and IPv6 NAT
pools, and turn on/off the three NAT types by means of steering the OSPF announcements. I can also
turn on/off the Anycast Unbound announcements, in much the same way.

If you're a regular reader of my stories, you'll maybe be asking: Why didn't you use VPP? And that
would be an excellent question. I need to noodle a little bit more with respect to having all three
NAT types concurrently working alongside Linux CP for the Bird and Unbound stuff, but I think in the
future you might see a followup article on how to do all of this in VPP. Stay tuned!