The CUBIC Conundrum: 10 Key Insights from a Linux Kernel Bug That Broke QUIC

Imagine a bug so sneaky that it only strikes after the network has already been beaten into submission. That's precisely what happened when a seemingly harmless Linux kernel optimization to CUBIC—the default congestion controller in Linux—unleashed a permanent traffic jam in Cloudflare's QUIC implementation. In this listicle, we break down the twisted tale of how a targeted fix for TCP went awry when ported to QUIC, locking the congestion window at its floor and never recovering. From the basics of congestion control to the one-line solution that saved the day, here are 10 things you need to know about this bizarre kernel bug.

1. CUBIC: The Unsung King of Internet Traffic

First specified in RFC 8312 and most recently updated in RFC 9438, CUBIC is the default congestion control algorithm in the Linux kernel. As a result, it governs the vast majority of TCP and QUIC connections across the public internet. At Cloudflare, our open-source QUIC implementation, quiche, relies on CUBIC as its default controller. This means that any bug in CUBIC directly impacts the performance of a significant slice of the web traffic we handle daily. Understanding CUBIC is crucial because it dictates how senders probe for available bandwidth, back off during loss, and recover: essentially the heartbeat of network data flow.
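
The shape of that probing is RFC 9438's cubic curve. The sketch below uses the RFC's default constants (C = 0.4, β = 0.7); the function and constant names are ours for illustration, not quiche's or the kernel's:

```rust
// Hedged sketch of CUBIC's window growth curve from RFC 9438, using the
// RFC's default constants. Units (segments, seconds) are simplified.
const C_CUBIC: f64 = 0.4; // aggressiveness constant from RFC 9438
const BETA_CUBIC: f64 = 0.7; // multiplicative decrease factor

/// Target window t seconds after a loss event that cut the window from
/// w_max: w(t) = C * (t - K)^3 + w_max, where K is the time the curve
/// needs to climb back to the pre-loss level w_max.
fn cubic_window(t: f64, w_max: f64) -> f64 {
    let k = (w_max * (1.0 - BETA_CUBIC) / C_CUBIC).cbrt();
    C_CUBIC * (t - k).powi(3) + w_max
}
```

Right after the loss (t = 0) the curve sits at β · w_max; it flattens as it approaches w_max, then accelerates again to probe for new capacity beyond it.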

(Image source: blog.cloudflare.com)

2. The Congestion Window: A Simple but Critical Knob

At the core of any congestion control algorithm is the congestion window (cwnd)—a sender-side cap on how many bytes can be in flight (sent but not yet acknowledged) at any moment. A larger cwnd lets the sender push more data per round trip, while a smaller cwnd throttles it. Every loss-based algorithm, including CUBIC, defines a policy for growing cwnd when the network is healthy and shrinking it when loss is detected. Think of cwnd as the accelerator pedal: too much and you overrun the network's capacity; too little and you waste expensive bandwidth.
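
In code, the knob reduces to simple arithmetic. A minimal sketch (our names, not quiche's API): the cwnd caps bytes in flight, so what the sender may transmit right now is the headroom left under the cap.

```rust
// Minimal sketch of the cwnd as a sender-side cap on bytes in flight.
struct Sender {
    cwnd: usize,            // congestion window, in bytes
    bytes_in_flight: usize, // sent but not yet acknowledged
}

impl Sender {
    /// Bytes the congestion controller still allows onto the wire.
    fn send_budget(&self) -> usize {
        self.cwnd.saturating_sub(self.bytes_in_flight)
    }
}
```

When the budget hits zero, the sender must wait for ACKs to drain bytes_in_flight before it can transmit again; that is the throttle in action.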

3. The Fundamental Assumption of Loss-Based Algorithms

Loss-based congestion controllers like CUBIC operate on a simple two-pronged premise: (1) if there is no packet loss, increase the sending rate to utilize more bandwidth; (2) if loss occurs, assume the network is overloaded and decrease the sending rate. This logic works well in stable environments but relies on the assumption that loss equals congestion. Over the years, that assumption has been revisited, particularly in modern networks where loss can stem from non-congestion causes such as radio interference on wireless links, and where deep buffers (bufferbloat) can delay the very loss signal that congestion is supposed to produce. For the purposes of this bug, however, the classic model held true, until it didn't.
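
The two prongs can be caricatured as a pair of update rules. The 7/10 decrease is CUBIC's β = 0.7 from RFC 9438; the per-ACK growth and the two-segment floor are simplified for illustration (real CUBIC grows along the cubic curve):

```rust
/// Prong 1: no loss observed, so grow the window (slow-start-style
/// byte counting here; CUBIC's actual growth follows its cubic curve).
fn on_ack(cwnd: usize, newly_acked: usize) -> usize {
    cwnd + newly_acked
}

/// Prong 2: loss observed, so assume congestion and back off
/// multiplicatively (beta = 0.7, done in integer math as 7/10),
/// but never below the minimum window.
fn on_loss(cwnd: usize, min_cwnd: usize) -> usize {
    (cwnd * 7 / 10).max(min_cwnd)
}
```

Note the floor in on_loss: it is exactly this minimum that the buggy controller later gets pinned to.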

4. The First Red Flag: A Test That Failed 61% of the Time

The investigation began when Cloudflare's ingress proxy integration test pipeline started throwing unexpected failures. The erratic behavior emerged in a specific scenario: heavy packet loss early in a connection when CUBIC was in use. These failures were not rare flukes—they appeared in 61% of test runs. This was a clear signal that something was fundamentally wrong with how CUBIC recovered from a congestion collapse. Most congestion control tests focus on steady-state and growth phases, but this corner case—where the cwnd hits rock bottom—was rarely probed, making the bug invisible in standard throughput tests.

5. Congestion Collapse: When the Cwnd Gets Stuck at Minimum

The core symptom of the bug was that, after a severe loss event, CUBIC's cwnd became permanently pinned at its minimum value (typically 2 MSS, where MSS is the maximum segment size). Normally, after a congestion collapse, the controller should eventually probe for more bandwidth and grow the cwnd again. In this bug, however, the cwnd never recovered, effectively strangling the connection forever. This is exactly the regime a congestion controller exists to handle, yet a fix for a different problem inadvertently destroyed recovery in this case.

6. The Innocent Change: The App-Limited Exclusion from RFC 9438 §4.2

The story starts with a Linux kernel commit that aimed to align CUBIC with the app-limited exclusion rule described in RFC 9438 Section 4.2, bullet 12. This rule is a real improvement for TCP: it prevents the controller from inflating cwnd when the sender is application-limited (i.e., not pushing data fast enough). In TCP, this fix correctly distinguishes between network-limited and app-limited flows, avoiding unnecessary growth. When the same logic was ported to quiche (Cloudflare's QUIC implementation), it seemed like a straightforward copy. But QUIC's different architecture turned this fix into a poison pill.
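
The intuition behind the rule can be sketched in a few lines. This is our simplification of the §4.2 rationale, not the kernel's or quiche's actual detection logic:

```rust
/// A flow is application-limited when it runs out of data to send
/// while cwnd budget is still unused (illustrative simplification).
fn is_app_limited(bytes_in_flight: usize, cwnd: usize, has_more_data: bool) -> bool {
    !has_more_data && bytes_in_flight < cwnd
}

/// The exclusion rule: an app-limited flow's ACKs say nothing about
/// network capacity, so they must not inflate the cwnd.
fn cwnd_after_ack(cwnd: usize, newly_acked: usize, app_limited: bool) -> usize {
    if app_limited { cwnd } else { cwnd + newly_acked }
}
```

For TCP this is sound: a flow that never fills its window has not probed the path, so its ACKs carry no evidence that a larger window is safe.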


7. The QUIC Twist: How Porting Broke the Recovery Loop

QUIC numbers packets in its own packet-number spaces and uses a different acknowledgment model than TCP's cumulative, byte-stream ACKs. The app-limited exclusion logic, designed for TCP's byte-stream paradigm, interfered with QUIC's loss recovery state machine. In quiche, after a congestion collapse, the code would incorrectly classify the flow as app-limited because of how acknowledgments are batched in QUIC. This prevented CUBIC from ever leaving the recovery state and re-entering a growth phase. The cwnd remained at its minimum, the connection stalled, and tests failed repeatedly.

8. The Bug Mechanism: A Self-Perpetuating Deadlock

Let's dive deeper into the mechanics. After heavy loss, CUBIC enters a recovery state with cwnd reduced to 2 MSS. In this state, the sender awaits acknowledgments to validate forward progress. However, with the app-limited exclusion, if the sender is not pushing data at full speed (which it can't, due to tiny cwnd), the algorithm considers the flow app-limited and refuses to update cwnd based on loss recovery events. The result? Cwnd stuck at minimum, no probing, no growth. It became a catch-22: the flow couldn't send enough data to generate the ACKs needed to prove it was network-limited, so it stayed forever app-limited.
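
The loop above can be modeled in a few lines. This is a deliberately crude sketch of the pathology described in the text, not quiche's code: the classification treats a flow at the minimum window as app-limited, and that classification is the only thing keeping it at the minimum window. The 1,200-byte segment size is an assumed value for illustration.

```rust
const MIN_CWND: usize = 2 * 1_200; // two segments of an assumed 1,200-byte MSS

/// Buggy update (sketch): growth is skipped whenever the flow looks
/// app-limited, and at the minimum window the flow can never send
/// enough data to look network-limited.
fn buggy_on_ack(cwnd: usize, newly_acked: usize) -> usize {
    let looks_app_limited = cwnd <= MIN_CWND; // the flawed classification
    if looks_app_limited { cwnd } else { cwnd + newly_acked }
}

/// Run the loop: every round trip the receiver ACKs a full window,
/// yet the cwnd never moves. This is the self-perpetuating deadlock.
fn simulate(rounds: usize) -> usize {
    let mut cwnd = MIN_CWND;
    for _ in 0..rounds {
        cwnd = buggy_on_ack(cwnd, cwnd);
    }
    cwnd
}
```

No matter how many rounds elapse, simulate returns MIN_CWND: the flow is permanently stuck.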

9. The Fix: A Near-One-Line Change That Broke the Cycle

After tracing the deadlock, the Cloudflare team proposed an elegant solution: remove or adjust the app-limited exclusion for the recovery state. Specifically, they changed the condition so that when the connection is in recovery (after loss), the app-limited check is bypassed. This allowed the cwnd to be updated based on acknowledgments even if the flow appeared app-limited. The change was nearly a one-line fix: simple on the surface but profound in effect. After deployment, the test suite passed reliably, and CUBIC recovered normally from congestion collapse in QUIC.
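
The shape of the change can be sketched like this (illustrative only, not quiche's actual diff): the app-limited exclusion no longer applies while the connection is in recovery, so ACKs arriving after a collapse can grow the cwnd again.

```rust
/// Fixed update (sketch): the app-limited exclusion is bypassed during
/// recovery, breaking the deadlock while still protecting steady-state
/// flows that are genuinely idle.
fn fixed_on_ack(cwnd: usize, newly_acked: usize, app_limited: bool, in_recovery: bool) -> usize {
    if app_limited && !in_recovery {
        cwnd // exclusion still applies to genuinely app-limited flows
    } else {
        cwnd + newly_acked
    }
}
```

With this condition, a flow recovering from collapse grows on every ACK regardless of how it is classified, while a healthy but idle flow still cannot inflate its window.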

10. Lessons Learned: The Perils of Porting Kernel Optimizations

This bug serves as a cautionary tale about porting kernel optimizations across protocols. A change that is perfect for TCP can break QUIC's different state machine. It also highlights the importance of testing edge cases—specifically, the rare but critical regime of congestion collapse. Most developers focus on steady-state performance, but the true test of a congestion controller is how it recovers from the worst. Finally, the near one-line fix underscores that sometimes the simplest solutions are the hardest to find. For network engineers, this story is a reminder that even well-established algorithms can hide surprises when moved to new contexts.

Conclusion: The CUBIC conundrum ended happily with a tiny patch that restored sanity to QUIC connections. But it also revealed a vulnerability in how we think about congestion control: what works for TCP may not work for QUIC, and edge cases matter far more than we often assume. By sharing this story, Cloudflare hopes to encourage more thorough testing and cross-protocol awareness. After all, the internet's reliability depends on algorithms that can recover from any crash—including their own bugs.
