Linux Router Setup — NixOS & “v6 plus” edition
Background
I recently moved to Tokyo. The apartment I ended up in has a fiber optic port (conventionally labeled 光, meaning “light”).
When I signed up for internet, I was warned that I needed a router that
supported “v6 plus”. I have a PCEngines APU2 running Linux, which I’ve used as
a router in the States for years.
I had no clue what “v6 plus” was, but I was confident that Linux would support any acronym soup.
In my head, I was expecting to just write the usual router config (you know, run a DHCP server, flip on ip_forwarding, NAT traffic out the WAN interface, call it a day), and then maybe flip on a “v6 plus” option somewhere in my NixOS config, or at worst set an option on the router’s DHCP client or something.
It turns out things are more complicated than I thought. It took a couple of days of fiddling around to actually get things set up. What’s more, most of the resources about “v6 plus” are in Japanese.
I figure it’s worth writing up my understanding of this and sharing my router config in case it helps someone out there. Unfortunately, I don’t have a full understanding of all the details, so my explanations will be lacking in places!
With that, let’s talk about the basics of “v6 plus”.
What is v6 plus?
You know you’re in for some good tech when the Wikipedia page for it is only in Japanese (at the time of writing anyway)!
As best as I can tell, “v6 plus” is a marketing term for two technologies, and specifically for the deployment of them as commonly used by Japanese ISPs. The technologies are a lightly idiosyncratic IPv6 deployment, and then a MAP-E based IPv6 tunnel for IPv4 traffic.
I believe “v6 plus” is primarily meant to help deal with IPv4 exhaustion, though it also seems to be bundled with using IPoE instead of PPPoE, which gives performance improvements as well. “v6 plus” seems to imply IPoE, but I don’t know whether there’s a technical reason for that, and it won’t come up again since it doesn’t end up mattering for the router.
So, I mentioned IPv4 exhaustion and MAP-E above. What’s MAP-E?
MAP-E
MAP-E is described in RFC 7597. It is a way to subdivide public IPv4 addresses among many end-user devices (routers). The primary alternative (which is what gets used to deal with IPv4 exhaustion in the US, for example) is CGNAT. While CGNAT and MAP-E accomplish similar goals (multiple users of one public IPv4 address), they do so in notably different ways.
Most notably, CGNAT is transparent to the customer’s router. The router simply gets an IPv4 address, and then uses it without any further thought. On the ISP’s side, they of course have to do additional work to add a second layer of NAT to that IP, and also typically to track connection state.
MAP-E is not transparent to the customer’s router. Instead, the router is told what IPv4 address it has (which is a real public address!), and what ports on that IPv4 address it is allowed to use, and then it is expected to apply its own layer of NAT to ensure all outgoing traffic has a source IP of that address, and a source port of one of the allowed ports.
MAP-E also specifies one additional thing. It describes how to set up a tunnel to the IPv6 address of a server which will accept encapsulated IPv4 traffic (encapsulated using an IPv4 over IPv6 tunnel), and then route that traffic to the public internet.
The majority of MAP-E’s spec isn’t about the tunnel though. It’s largely about how the router can derive all the necessary configuration for that tunnel from just the IPv6 address it was assigned. The “necessary information” in this case is the public IPv4 address to use, the port ranges to use, the specific IPv6 address to use for its end of the tunnel, and so on.
I, personally, do not know why this information isn’t simply sent down by the DHCPv6 server (RFC 7598 does define DHCPv6 options for MAP, so the options themselves exist). All this information about how to set up an IPv4 tunnel seems like it doesn’t need to be derived in such a roundabout way, and instead could be given directly to the router using DHCP, or even just a plain old API since, well, you already have IPv6 connectivity, the network is your oyster.
I’ll also note that I’ve seen claims that the MAP-E deployment in Japan isn’t fully spec compliant, but I haven’t verified these. It seems quite plausible though. I didn’t actually refer to the spec for any of my router configuration, only to people describing their configuration, and describing the reality of what Japan actually does.
MAP-E Example
Let’s look at a sample of what the MAP-E calculation looks like using this Rust v6-plus calculator I hacked up.
First, let’s assume that the ISP assigns you, via DHCPv6, the IPv6 address 240b:12:3456:7800:a:100:2000:3000/64. Note that for the calculated values, only the first 64 bits matter (i.e. 240b:12:3456:7800:: in this example).
The MAP-E calculation would give you the following values:
$ v6plus-tun calculate 240b:12:3456:7800:a:100:2000:3000
IPv4 Addr (CE IPv4 Address): 14.8.52.86
CE IPv6 Addr: 240b:12:3456:7800:e:834:5600:7800
Port Ranges: 6016-6031, 10112-10127, 14208-14223, 18304-18319, 22400-22415,
26496-26511, 30592-30607, 34688-34703, 38784-38799, 42880-42895,
46976-46991, 51072-51087, 55168-55183, 59264-59279, 63360-63375
PSID: 120
Border Relay Address (BR Address): 2404:9200:225:100::64
So, what does that mean? It means “Create an IP tunnel from 240b:12:3456:7800:e:834:5600:7800 (an address in your /64) to 2404:9200:225:100::64 (the ISP’s server) and send all IPv4 traffic over it. SNAT your IPv4 traffic to a source address of 14.8.52.86, and to one of the 240 ports in the specified ranges.”
If you want to understand where those values came from, I recommend either reading the Rust code, or this excellent blog post.
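As a rough illustration, here is a minimal sketch that reproduces the PSID, port ranges, and CE IPv6 address from the sample output above. The bit layouts are assumptions inferred from that one sample (PSID in the top byte of the fourth hextet, ports laid out as a 4-bit non-zero block, then the 8-bit PSID, then a 4-bit offset), not taken from the RFC or from the real calculator. The function names are hypothetical, and the IPv4 address is passed in rather than derived, since that step needs the ISP's rule table.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Assumption: the PSID is the top byte of the fourth hextet of the assigned /64.
fn psid_from_prefix(addr: Ipv6Addr) -> u8 {
    (addr.segments()[3] >> 8) as u8
}

/// Assumption: each port is laid out as
/// [4-bit block, 1..=15][8-bit PSID][4-bit offset], giving 15 ranges of 16 ports.
fn port_ranges(psid: u8) -> Vec<(u16, u16)> {
    (1u16..=15)
        .map(|block| {
            let start = (block << 12) | (u16::from(psid) << 4);
            (start, start + 15)
        })
        .collect()
}

/// Assumption: the CE address keeps the /64 and packs the IPv4 octets and PSID
/// into the lower 64 bits as 00aa:bbcc:dd00:pp00 (inferred from the sample output).
fn ce_address(prefix: Ipv6Addr, ipv4: Ipv4Addr, psid: u8) -> Ipv6Addr {
    let p = prefix.segments();
    let [a, b, c, d] = ipv4.octets();
    Ipv6Addr::new(
        p[0],
        p[1],
        p[2],
        p[3],
        u16::from(a),
        (u16::from(b) << 8) | u16::from(c),
        u16::from(d) << 8,
        u16::from(psid) << 8,
    )
}

fn main() {
    let assigned: Ipv6Addr = "240b:12:3456:7800:a:100:2000:3000".parse().unwrap();
    let psid = psid_from_prefix(assigned);
    let ranges = port_ranges(psid);
    let ce = ce_address(assigned, Ipv4Addr::new(14, 8, 52, 86), psid);

    println!("PSID: {psid}");
    for (start, end) in &ranges {
        println!("ports {start}-{end}");
    }
    println!("CE IPv6 Addr: {ce}");
}
```

The results only agree with the sample because the layout was read off the sample; other ISPs or prefixes may use different bit positions.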
Benefits of MAP-E
So, what benefits does MAP-E have?
As best I can tell, the main benefit is that the ISP doesn’t have to run a CGNAT server. They instead run a server, called the “Border Relay server”, which is the other end of the customer’s IPv4 over IPv6 tunnel. This server can be relatively simple since it only has to keep track of static information, like “This tunnel, from IPv6 address a:b:c::d can use IPv4 address 1.2.3.4’s port 1234”. Since the consumer router does the NATing itself, the border relay server doesn’t need to track connection state itself.
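To make the "static information only" point a bit more concrete, here is a small sketch of the kind of check a border relay could do without any connection table. It assumes the port layout that the sample output earlier appears to follow (a 4-bit non-zero block, then the 8-bit PSID, then a 4-bit offset). The function name is hypothetical, and this is not how any real border relay is known to be implemented.

```rust
/// Returns true if `port` is one of the ports owned by `psid`, assuming ports are
/// laid out as [4-bit block, 1..=15][8-bit PSID][4-bit offset].
/// No per-connection state is consulted: the answer depends only on the two numbers.
fn port_belongs_to(psid: u8, port: u16) -> bool {
    let block = port >> 12; // top 4 bits; block 0 is never handed out in the sample
    let port_psid = ((port >> 4) & 0xff) as u8; // middle 8 bits
    block != 0 && port_psid == psid
}

fn main() {
    // PSID 120 is the value from the sample output.
    for port in [6016u16, 6031, 6032, 1920, 63375] {
        println!("port {port} allowed for PSID 120: {}", port_belongs_to(120, port));
    }
}
```

Because the check is a pure function of the address-derived PSID and the port, the relay does not need to remember anything about individual flows, which is the contrast with CGNAT.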
It also doesn’t require adding any new DHCP options (as mentioned above), which I guess is some sorta benefit?
It’s likely there are other benefits which I don’t understand.
Downsides of MAP-E
One downside, which I’ve experienced firsthand, is that routers become more complicated. Having to perform this layer of NAT, set up a tunnel, and perform the encapsulation… it’s all a bit fiddly! Also, I think the IPv4-in-IPv6 tunnels being used can’t easily be offloaded to hardware, so it’s plausible that consumer routers need beefier CPUs to keep up with line speed under this scheme.
The usual downsides of CGNAT apply, such as IP bans, rate limiting, and so on typically bucketing only by IPv4 address. This means that some other customer of your ISP, who happens to map to the same IP address, could get you banned or rate-limited through their poor behavior.
I’m sure there are other downsides, but so far I haven’t found them. Everything I’ve tried to use has worked fine.
v6 Plus — The v6 part
What about IPv6 connectivity? Well, for the router it’s easy, DHCP gives it an address, all works as normal for it. Unfortunately, because you are only given a “/64”, your router can’t subdivide that further, so getting an IPv6 address on your other devices is also complicated. In practice, it seems like people typically use ndppd to proxy through neighbor discovery requests to the upstream IPv6 gateway the ISP runs. I’m probably using some terms wrong here since my understanding of this part is a little shaky.
At the end of the day though, all your devices get routable public IPv6 addresses, and, well, you probably want to do some firewalling!
Other resources
There are a lot of good resources on this stuff.
I would not have been able to configure my own router without consulting the following posts, resources, etc:
- https://vector.hateblo.jp/entry/2021/02/17/142458 — This is the primary resource I consulted for my entire router setup. Amazingly useful blogpost.
- https://intaa.net/archives/13173 — This post, and the following two parts, were invaluable for understanding MAP-E better without having to read the RFC :D
- http://ipv4.web.fc2.com/map-e.html — This webpage is referenced all over the place. I ended up reimplementing its functionality, but it was a vital reference for doing so.
Router configuration
In this section, I’ll link to my router configuration, and talk about a few details of it.
At a high level, I’m using NixOS on a PCEngines APU2 router. I took inspiration from this post and made a live-USB / “run from memory” NixOS setup.
The configuration I ultimately ended up with is here.
I’ll walk through some important sections of it to describe what’s going on.
v6 plus tunnel
This section of the config, specified here, sets up the IPv4 in IPv6 encapsulation:
systemd.services.v4-plus = {
  enable = true;
  description = "setup v4 tunnel";
  path = with pkgs; [ iptables iproute2 ];
  serviceConfig = {
    # Type= is a [Service] setting, so it lives here rather than in unitConfig
    Type = "simple";
    # TODO: stop hardcoding the WAN interface and address
    ExecStart = "${pkgs.v6plus-tun}/bin/v6plus-tun setup-linux --wan enp1s0 ${inputs.secrets.ipv6_addr}";
  };
  wantedBy = [ "multi-user.target" ];
  # setup ipv4 after ipv6 is up
  after = [ "network-online.target" ];
  requires = [ "network-online.target" ];
};
I didn’t mention it before, but I checked the IPv6 address I was assigned, and it hasn’t changed yet, so I’ve just hardcoded it. Ideally, this would dynamically find it on the WAN interface, or check the DHCPv6 lease file or similar, and use that value, but in practice it was much easier to hardcode it. There are plenty of reports online that this rarely (never?) changes, so I feel pretty safe in doing so.
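For what it's worth, one way the dynamic lookup could be done is by reading /proc/net/if_inet6, which lists each IPv6 address as 32 hex digits followed by the interface index, prefix length, scope, flags, and device name. The sketch below is not what v6plus-tun does. The function name is hypothetical, the sample contents are made up to match the example address from earlier, and it simply takes the first global-scope address without considering temporary or deprecated addresses.

```rust
use std::net::Ipv6Addr;

/// Return the first global-scope IPv6 address on `dev`, given the contents of
/// /proc/net/if_inet6. Each line has six whitespace-separated fields:
/// address (32 hex digits), ifindex, prefix length, scope, flags, device name.
fn global_addr_on(if_inet6: &str, dev: &str) -> Option<Ipv6Addr> {
    if_inet6.lines().find_map(|line| {
        let fields: Vec<&str> = line.split_whitespace().collect();
        if fields.len() != 6 || fields[5] != dev {
            return None;
        }
        // Scope 0x00 is global; 0x20 is link-local and 0x10 is loopback.
        let scope = u8::from_str_radix(fields[3], 16).ok()?;
        if scope != 0x00 {
            return None;
        }
        u128::from_str_radix(fields[0], 16).ok().map(Ipv6Addr::from)
    })
}

fn main() {
    // On a real router this would come from
    // std::fs::read_to_string("/proc/net/if_inet6"); a made-up sample is used
    // here so the example runs anywhere.
    let sample = "\
fe80000000000000020db9fffe000001 02 40 20 80   enp1s0
240b001234567800000a010020003000 02 40 00 00   enp1s0
00000000000000000000000000000001 01 80 10 80       lo
";
    match global_addr_on(sample, "enp1s0") {
        Some(addr) => println!("WAN address: {addr}"),
        None => println!("no global address found"),
    }
}
```

The service would still need to wait until the address has actually been assigned, so this only replaces the hardcoded value, not the ordering after network-online.target.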
Anyway, this shells out to the “v6plus-tun” Rust code I wrote here, specifically main.rs#L143-L184. It’s just a bash script in Rust btw.
I’ll comment on a couple small sections:
-
  run_cmd!(ip -6 tunnel add $tun_dev \
      mode ip4ip6 remote $br_addr local $edge_addr dev $wan_dev encaplimit none)?;

  This creates the actual IPv4 over IPv6 tunnel using the addresses derived from MAP-E. Linux calls these ip4ip6 tunnels.
-
  let mark_base = 0x10;
  run_cmd!(iptables -t mangle -I PREROUTING -j HMARK \
      --hmark-tuple sport \
      --hmark-mod $num_ranges \
      --hmark-offset $mark_base \
      --hmark-rnd 4)?;
  for (i, (start, end)) in data.port_ranges.iter().enumerate() {
      let mark = mark_base + i; // arbitrary
      for proto in ["icmp", "tcp", "udp"] {
          run_cmd!(iptables -t nat -A POSTROUTING -p $proto -o $tun_dev \
              -m mark --mark $mark -j SNAT --to $ipv4_addr:$start-$end)?;
      }
  }

  This marks traffic based on the hash of the source port. In practice, there are always 15 sets of 16 contiguous ports, so there are always 15 buckets. We then NAT that traffic to allowed source ports for each bucket and protocol.
The use of HMARK there is my contribution to Linux “v6 plus” setups. Every other part of my router setup, I saw someone else doing online, but I didn’t see anyone else using HMARK. Most of them use “-m statistic” to match every Nth packet, but HMARK ends up taking significantly fewer iptables rules, and seems like it could be nicer for a few cases as well, since it gives a consistent source port in more cases.
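To show why hashing gives more consistent results than picking every Nth packet, here is a small simulation of the marking step. It assumes HMARK computes the mark as (hash % modulus) + offset, matching the --hmark-mod and --hmark-offset options used above. The hash itself is a stand-in (Rust's DefaultHasher), not the kernel's hash function, and the function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Same base mark as in the snippet above.
const MARK_BASE: u32 = 0x10;

/// Stand-in for HMARK: mark = (hash(sport, seed) % modulus) + offset.
/// The kernel uses its own hash; DefaultHasher is used here only because it is
/// deterministic within a program and available in std.
fn hmark(sport: u16, seed: u32, modulus: u32, offset: u32) -> u32 {
    let mut hasher = DefaultHasher::new();
    (sport, seed).hash(&mut hasher);
    (hasher.finish() % u64::from(modulus)) as u32 + offset
}

/// The SNAT range that the matching `-m mark --mark N` rule would apply.
fn snat_range(mark: u32, ranges: &[(u16, u16)]) -> (u16, u16) {
    ranges[(mark - MARK_BASE) as usize]
}

fn main() {
    // The 15 ranges from the sample output (PSID 120).
    let ranges: Vec<(u16, u16)> = (1u16..=15)
        .map(|block| {
            let start = (block << 12) | (120 << 4);
            (start, start + 15)
        })
        .collect();
    let num_ranges = ranges.len() as u32;

    // The same source port always lands in the same bucket.
    for sport in [40000u16, 40001, 40000] {
        let mark = hmark(sport, 4, num_ranges, MARK_BASE);
        let (start, end) = snat_range(mark, &ranges);
        println!("sport {sport} -> mark {mark:#x} -> SNAT to ports {start}-{end}");
    }
}
```

With a per-packet counter, two packets with the same source port can be steered to different port ranges. With a hash of the source port, they cannot, and one HMARK rule replaces the chain of statistic matches.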
For now, I’ll leave the rest uncommented (as an exercise for the reader, you might say).
Issues
Currently, I have an issue with my internet setup which I don’t fully understand yet: all my devices except the router see a small amount of IPv6 packet loss (and no IPv4 packet loss).
I intend to figure that out and report back.
Edit (From me, 10 months later)
Future me reporting back: I fixed this by doing the following two things:
- Switching to the 1.0-devel branch of ndppd
- Adding the ip command to its path
This ended up being confusing to debug because the path it had under systemd differed from the path it had when I ran it by hand in my shell to debug, so I’d often think “yes, this works” (when running it by hand over ssh), and then commit the config only to find out “oh no, it doesn’t” (with systemd controlling the binary).
This of course was not helped by it being a small amount of packet loss every once in a while, not constantly.
Anyway. It’s done, I’m not writing a followup post for something that small, so I’m just editing it in here.
At 10 months, it’s been running fine, no complaints.
Building and running
The way I’ve been building and running this router configuration is pretty simple:
- Install nix
- From the repo, run make
- Plug in a USB drive (/dev/sdb) and run cp ./iso/*/*.iso /dev/sdb
- Plug the drive into my router, power cycle it.
That’s it, configuration applied. I leave the USB stick in the router so that if there’s a power outage, it’ll still come up with a correct config.
Note well, there’s a secrets input, which contains my assigned IPv6 address, which is required for this to build. I have not checked that input in.
Conclusion
Overall, I’m moderately happy with this setup. I actually kinda like MAP-E in a weird way after getting it set up. I don’t like that I only get a “/64”, and honestly that part is way more of a pain than the IPv4 stuff.
I have considered ditching the IPv4 tunnel in favor of a wireguard tunnel that has an IPv4 address (which is another easy way to tunnel IPv4 over IPv6), but so far it’s working well enough I doubt I actually will.
I am quite frustrated by my remaining IPv6 packet loss. My internet is very usable, but I do notice it occasionally eating packets, and I really want to know what I did wrong. I’ll write a followup once I figure it out.
Thanks for reading!
If any of this was helpful for your own internet setup, I’d love to hear about it!
Finally, here’s a picture of my router next to the NTT ONU. Please ignore the mess of cables, I’ll hide it all under some shelves soon!