ADR-0009: Happy Eyeballs — IPv4/IPv6 Connection Racing Within Endpoint Categories
Date: 2026-03-21
Status: accepted
Tags: networking, ipv6, dual-stack, connectivity
Supersedes: (none)
Context
Sprint 18 completes the IPv6 connectivity track (epic #1295, phases 21-22). With
EndpointCandidate / EndpointKind now available (PR #1358, ADR-0003), nodes can
store and exchange multiple addresses per kind (local, public, relay).
In dual-stack environments, a peer may have both an IPv4 and an IPv6 local address. Naively preferring IPv6 breaks nodes where the IPv6 path is degraded (e.g. NAT64 tunnels, misconfigured 6to4, IPv6-only interfaces with no default route). Naively preferring IPv4 leaves IPv6-native networks underserved and slows connection setup for nodes where IPv6 is faster.
The existing dial_parallel() in nat_dial.rs already races endpoint categories
(local vs public). This ADR extends that to racing IP versions within a single
category, following RFC 8305.
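To make the two levels concrete, here is a minimal sketch of how candidates might be grouped by kind before any IP-version race takes place. The `EndpointKind` and `EndpointCandidate` definitions below are simplified stand-ins invented for illustration; the real types from PR #1358 may carry different fields.

```rust
use std::collections::BTreeMap;
use std::net::SocketAddr;

// Hypothetical stand-ins for the real types; field names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum EndpointKind {
    Local,
    Public,
    Relay,
}

#[derive(Debug, Clone, Copy)]
struct EndpointCandidate {
    kind: EndpointKind,
    addr: SocketAddr,
}

/// Group candidate addresses by endpoint kind. Each resulting group may hold
/// both IPv4 and IPv6 addresses and is the input to one IP-version race.
fn group_by_kind(candidates: &[EndpointCandidate]) -> BTreeMap<EndpointKind, Vec<SocketAddr>> {
    let mut groups: BTreeMap<EndpointKind, Vec<SocketAddr>> = BTreeMap::new();
    for candidate in candidates {
        groups.entry(candidate.kind).or_default().push(candidate.addr);
    }
    groups
}
```

Under this model, the existing cross-kind race would operate on the map's entries, while the racing proposed here operates on the address list inside each entry.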
Decision
Add `dial_happy_eyeballs(addrs: Vec<SocketAddr>, ...)` to `nat_dial.rs`:
- Sort order: IPv6 addresses first, IPv4 second (future-proof for IPv6-only networks).
- Stagger: Spawn the IPv6 dial immediately. If it has not succeeded within `happy_eyeballs_delay_ms` (default 250ms), spawn the IPv4 dial. 250ms is the default Connection Attempt Delay recommended in RFC 8305 §5 and is comparable to the fallback delays used by major browsers.
- Winner takes all: Return the first successful connection and cancel the other task.
- Scope: Races within a single `EndpointKind` (Local, Public, or Relay). The existing `dial_parallel()` continues to race across kinds (local vs public). Happy Eyeballs adds a second level of racing within a kind.
- Config: Add `happy_eyeballs_delay_ms: u64` to `NatDialConfig` with `#[serde(default)]` so existing config files remain valid. A field-level `#[serde(default)]` on a `u64` deserializes a missing key as 0, so the 250ms default has to come from a container-level default (`impl Default for NatDialConfig`) or from `#[serde(default = "path")]` naming a function that returns 250.
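The sort, stagger, and winner-takes-all rules above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the planned `dial_happy_eyeballs()`: it uses blocking threads instead of async tasks, the names `sort_ipv6_first` and `race_staggered` and the `dial` callback are hypothetical, and it starts the next attempt early when the current one fails before the delay elapses, which the decision text does not specify.

```rust
use std::net::SocketAddr;
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// IPv6 candidates first, IPv4 second. `sort_by_key` is stable, so the
/// original order within each address family is preserved.
fn sort_ipv6_first(mut addrs: Vec<SocketAddr>) -> Vec<SocketAddr> {
    addrs.sort_by_key(|addr| addr.is_ipv4());
    addrs
}

/// Race the addresses of one endpoint kind with a staggered start.
/// `dial` is a hypothetical blocking connect function that returns
/// `Some(connection)` on success and `None` on failure.
fn race_staggered<T, F>(
    addrs: Vec<SocketAddr>,
    delay: Duration,
    dial: F,
) -> Option<(SocketAddr, T)>
where
    T: Send + 'static,
    F: Fn(SocketAddr) -> Option<T> + Clone + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    for addr in sort_ipv6_first(addrs) {
        let (tx, dial) = (tx.clone(), dial.clone());
        thread::spawn(move || {
            // The receiver may already be gone if another attempt won.
            let _ = tx.send((addr, dial(addr)));
        });
        // Head start: wait up to `delay` for a success before the next attempt.
        if let Ok((winner, Some(conn))) = rx.recv_timeout(delay) {
            return Some((winner, conn));
        }
        // Timeout or a failed attempt: fall through and start the next address.
    }
    // Only worker threads hold senders now, so `recv` ends once all have finished.
    drop(tx);
    while let Ok((addr, outcome)) = rx.recv() {
        if let Some(conn) = outcome {
            return Some((addr, conn));
        }
    }
    None
}
```

A real dial callback could wrap `TcpStream::connect_timeout(&addr, timeout).ok()`. Threads cannot be cancelled, so the losing attempt here simply runs to completion and its result is discarded; an async implementation would abort the losing task instead, as the decision requires.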
Consequences
Easier:
- Dual-stack nodes connect via whichever path (IPv4 or IPv6) responds first.
- IPv6-native nodes get preference without harming dual-stack nodes with slow IPv6.
- The 250ms stagger limits overhead: worst case is one extra dial attempt per connection.
Harder or riskier:
- Slightly more concurrent dials during connection setup (bounded to 2 per category).
- `dial_parallel()` already handles the winner-takes-all pattern; extending it adds complexity that must be tested carefully.
Out of scope (deferred):
- Cross-category Happy Eyeballs (local-IPv6 vs public-IPv4) — not needed now.
- Adaptive delay based on observed RTT — overkill for Sprint 18.
Alternatives Considered
| Alternative | Why rejected |
|---|---|
| Static IPv6 preference | Breaks dual-stack nodes with degraded IPv6 paths; no recovery mechanism |
| Static IPv4 preference | Leaves IPv6-native networks underserved; wrong direction for the cooperative internet's long-term trajectory |
| No racing, try IPv6 then IPv4 sequentially | Adds up to the full public_dial_timeout_ms (10s) of latency when the IPv6 attempt times out instead of failing fast; unacceptable UX |
| Race all addresses simultaneously with no stagger | More concurrent connections, wastes bandwidth, and doesn't give IPv6 the preferential treatment that RFC 8305 recommends |
| Per-peer IP version learning / memory | Correct long-term, but premature for Sprint 18 scope; deferred to a future ADR |
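
The latency argument against sequential fallback can be checked with a small model. The sketch below is illustrative only: the `Attempt` type and both functions are hypothetical, the 10s timeout is the `public_dial_timeout_ms` figure from the table, and the 40ms IPv4 connect time is an assumed example value. As a further assumption, the staggered model starts IPv4 as soon as IPv6 fails if that happens before the delay elapses.

```rust
use std::time::Duration;

/// Modeled outcome of one dial attempt: how long it runs and whether it connects.
#[derive(Clone, Copy)]
struct Attempt {
    takes: Duration,
    succeeds: bool,
}

/// Sequential fallback: IPv4 starts only after the IPv6 attempt has finished.
fn sequential(v6: Attempt, v4: Attempt) -> Option<Duration> {
    if v6.succeeds {
        Some(v6.takes)
    } else if v4.succeeds {
        Some(v6.takes + v4.takes)
    } else {
        None
    }
}

/// Staggered race: IPv4 starts once `delay` elapses without an IPv6 success,
/// or as soon as IPv6 fails if that happens earlier.
fn staggered(v6: Attempt, v4: Attempt, delay: Duration) -> Option<Duration> {
    if v6.succeeds && v6.takes <= delay {
        return Some(v6.takes); // IPv4 is never started.
    }
    let v4_start = delay.min(v6.takes);
    let v6_done = v6.succeeds.then_some(v6.takes);
    let v4_done = v4.succeeds.then_some(v4_start + v4.takes);
    match (v6_done, v4_done) {
        (Some(a), Some(b)) => Some(a.min(b)),
        (a, b) => a.or(b),
    }
}
```

With a blackholed IPv6 path (10s, fails) and a 40ms IPv4 path, the sequential model connects after 10.04s while the staggered model with a 250ms delay connects after 290ms. With a healthy IPv6 path that connects within the delay, both models return the IPv6 connect time and the IPv4 attempt is never made.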