TCP (Transmission Control Protocol) is much more than a single protocol: it is a common transport header format. All packets use this header format and interpret its fields uniformly, but how flow and congestion control are implemented is left to the end system. This has led to many TCP variants, each trying to maximize channel utilization while staying fair to other flows on the network.

Consider a simple scenario. Suppose you are driving along a highway for the first time. After traveling distance 'X' you encounter police and are fined for speeding. The next time, you still want to drive fast but don't want another fine, so you speed up until you approach 'X' and then slow down. This time, however, the police are not at 'X' but at some other place 'Y'. Having passed 'X' without trouble, you cautiously increase your speed again, until you are fined again.

This is the same approach TCP variants have been trying to use: optimize link utilization without losing packets.

TCP congestion control maintains per-connection variables in the TCB (TCP Control Block): cwnd (congestion window), ssthresh (slow start threshold), and others that record the connection state and feedback parameters.


AIMD

Additive Increase: grow the congestion window by roughly 1 MSS per RTT while no loss is seen. Multiplicative Decrease: on loss, cut the congestion window by a factor of 2 (ssthresh is set to the halved value).
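The two AIMD rules can be sketched in a few lines of Python. This is a minimal illustration, not any stack's actual implementation: the function names, the 1460-byte MSS, and the per-ACK increment of MSS * MSS / cwnd (a common way to spread one MSS of growth across a round trip) are assumptions made for the example.

```python
MSS = 1460  # bytes; a typical Ethernet-sized segment, assumed for illustration


def on_ack(cwnd: float) -> float:
    """Additive increase: grow by about one MSS per round trip.

    A window of cwnd bytes produces roughly cwnd / MSS ACKs per RTT, so
    adding MSS * MSS / cwnd on each ACK sums to about one MSS per RTT.
    """
    return cwnd + MSS * MSS / cwnd


def on_loss(cwnd: float) -> float:
    """Multiplicative decrease: halve the window, never below one MSS."""
    return max(cwnd / 2, MSS)


cwnd = 10 * MSS
for _ in range(10):        # one RTT's worth of ACKs for a 10-segment window
    cwnd = on_ack(cwnd)
grown = cwnd               # a little under 11 segments
cwnd = on_loss(cwnd)       # back to roughly 5.5 segments
```

After ten ACKs the window has grown by just under one segment, and a single loss event takes away half of everything gained so far.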


TCP Reno

Reno is an ACK-paced flow control protocol. TCP makes use of "ACK clocking": new data is released as acknowledgements arrive, so the ACK stream sets the sending pace and there is no explicit timer or clock. In its steady state (Congestion Avoidance), Reno probes the network by steadily increasing the congestion window by 1 MSS every RTT.

When buffers overflow at a congested router on the path, packets are dropped. The receiver sees a gap in the sequence and sends a duplicate ACK for each out-of-order segment that follows. The sender interprets these signals as congestion. There are two types of loss events:

  • 3 Dup ACK — signals mild congestion
  • Timeout — interpreted as severe congestion

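On 3 duplicate ACKs, TCP Reno cuts its congestion window in half; on a timeout, it falls back to a window of 1 MSS and restarts slow start. It then resumes probing, and this cycle of growth and reduction keeps repeating: the sawtooth pattern.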

Additive increase of 1 MSS per RTT doesn't scale to high-speed networks. On a fast, long-distance path it can take hours to reach the full flow rate, and only if no packet is dropped in the meantime, which is a strict constraint.
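A rough back-of-the-envelope calculation shows the scale of the problem. The sketch below assumes a hypothetical 10 Gbps path with a 100 ms RTT and a 1460-byte MSS, and assumes the window grows by exactly one segment per RTT with no loss; the function name and parameters are made up for the example.

```python
def additive_ramp_seconds(target_bps: float, rtt_s: float,
                          mss_bytes: int = 1460,
                          start_fraction: float = 0.5) -> float:
    """Time for +1 segment/RTT growth to reach the window that fills the path.

    start_fraction is the share of the full window the sender starts from:
    0.5 models recovery after one halving, 0.0 models starting from scratch.
    """
    full_window = target_bps * rtt_s / (8 * mss_bytes)   # segments in flight
    rtts_needed = full_window * (1 - start_fraction)     # one segment per RTT
    return rtts_needed * rtt_s


after_halving = additive_ramp_seconds(10e9, 0.1)                   # seconds
from_scratch = additive_ramp_seconds(10e9, 0.1, start_fraction=0.0)
print(f"{after_halving / 60:.0f} minutes after a single halving")
print(f"{from_scratch / 3600:.1f} hours from an empty window")
```

Under these assumptions the full window is about 85,000 segments, so recovering from one halving takes over an hour, and a single drop during that time restarts the climb.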


BIC

Binary Increase Congestion Control: when BIC performs a window reduction in response to a packet drop, it remembers the previous maximum window size and the current (reduced) window setting. BIC then increases the window non-linearly: each RTT it inflates the window by half of the difference between the current window size and the previous maximum, in effect a binary search toward that maximum.
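The binary-search step can be sketched as follows. This is a simplified illustration: it models only the climb toward the previous maximum, not what BIC does once it passes that point, and the `s_max` cap on a single step (with its value of 32 segments) is an assumption for the example.

```python
def bic_step(cwnd: float, w_max: float, s_max: float = 32.0) -> float:
    """One RTT of binary-search increase toward the previous maximum w_max.

    The window jumps halfway to w_max, capped at s_max segments per step so
    that a very large gap does not produce one enormous burst.
    """
    gap = w_max - cwnd
    if gap <= 0:
        return cwnd            # probing beyond w_max is not modeled here
    return cwnd + min(gap / 2, s_max)


# Window in segments: previous maximum 100, reduced to 80 after a drop.
window = 80.0
trace = []
for _ in range(4):
    window = bic_step(window, w_max=100.0)
    trace.append(window)
print(trace)   # [90.0, 95.0, 97.5, 98.75]
```

Growth is fast while the window is far from the old maximum and slows as it approaches the point where loss was last seen.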

The challenge: BIC can be too aggressive in low-RTT networks and on slower links.


CUBIC

An improvement over BIC. CUBIC uses a higher-order (cubic) polynomial function to control window growth. Key difference: CUBIC's window is a function of the time elapsed since the last window reduction, rather than of BIC's RTT counter. This makes TCP fairer among concurrent TCP sessions that have different RTTs.
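The shape of the cubic curve can be sketched with the window function W(t) = C(t - K)^3 + W_max, where t is the time since the last reduction and K is the time at which the curve returns to W_max. The constants C = 0.4 and BETA = 0.7 below are the commonly cited defaults and are used here as assumptions; real implementations add further rules that this sketch leaves out.

```python
C = 0.4      # scaling constant (assumed, commonly cited default)
BETA = 0.7   # fraction of w_max kept after a reduction (assumed)


def cubic_k(w_max: float) -> float:
    """Seconds after a reduction at which the curve climbs back to w_max."""
    return (w_max * (1 - BETA) / C) ** (1 / 3)


def cubic_window(t: float, w_max: float) -> float:
    """Window in segments, t seconds after the last window reduction."""
    return C * (t - cubic_k(w_max)) ** 3 + w_max


w_max = 100.0
k = cubic_k(w_max)
for t in (0.0, k / 2, k, k + 2):
    print(f"t={t:5.2f}s  window={cubic_window(t, w_max):7.2f}")
```

The window starts at BETA * W_max, flattens out as it nears W_max, then accelerates again to probe for new capacity. Note that no RTT appears anywhere in the function, which is why flows with different RTTs follow the same curve.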

CUBIC is more efficient in high-speed flows.

But if CUBIC is efficient, why do we need another TCP variant? To understand that, we need to understand how queues work.

Consider a pipe with a fixed diameter that can store some volume of water as it passes along. Water passes through with the least delay when none is standing in the pipe, and once the pipe is full, water will overflow. Three queue states:

  • No Queue — pipe is empty
  • Queue Formation — water is accumulating
  • Queue Saturation — pipe is full, overflow starts

The optimal point is the onset of queue formation, not the onset of overflow.

Reno tries to maintain the Queue Formation state while CUBIC places the flow at the onset of Queue Saturation — leading to increased delays due to queue occupancy.
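One way to make the three states concrete is to compare the data a sender has in flight with what the path itself can hold. The sketch below assumes that the path's capacity without any queue is its bandwidth-delay product (BDP) and that the bottleneck has a fixed-size buffer; the function name and the example numbers are hypothetical.

```python
def queue_state(inflight_bytes: float, bottleneck_bps: float,
                base_rtt_s: float, buffer_bytes: float) -> str:
    """Classify the bottleneck queue from the amount of data in flight."""
    bdp = bottleneck_bps / 8 * base_rtt_s      # bytes the path holds unqueued
    if inflight_bytes <= bdp:
        return "no queue"
    if inflight_bytes <= bdp + buffer_bytes:
        return "queue formation"
    return "queue saturation"


# Hypothetical path: 100 Mbps, 40 ms base RTT (BDP of about 500 kB),
# with a 250 kB buffer at the bottleneck.
for inflight in (400_000, 600_000, 800_000):
    print(inflight, queue_state(inflight, 100e6, 0.040, 250_000))
```

In this picture the optimal operating point is an in-flight volume equal to the BDP: the link is fully used and no packet waits in a queue.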


TCP Vegas

TCP Vegas detects congestion from rising RTT rather than waiting for packet loss. It reduces the sending rate when the measured RTT climbs above the path's baseline RTT.

Limitation: when other flows in the network don't use TCP Vegas, Vegas loses out on its share of utilization, because it backs off as delay rises while the other flows keep sending until packets are dropped. Also, if a longer RTT is caused by a route change rather than by queuing, the sending rate is reduced erroneously.
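The delay-based decision can be sketched as a comparison between the throughput the sender expects at the baseline RTT and the throughput it actually gets. This follows the usual textbook description of Vegas; the function name and the `alpha` and `beta` thresholds (in segments) are illustrative assumptions.

```python
def vegas_adjust(cwnd: float, base_rtt: float, rtt: float,
                 alpha: float = 2.0, beta: float = 4.0) -> float:
    """Return the next window (in segments) after one RTT.

    base_rtt is the smallest RTT seen so far, taken as the queue-free delay.
    """
    expected = cwnd / base_rtt                # rate if nothing were queued
    actual = cwnd / rtt                       # rate actually achieved
    queued = (expected - actual) * base_rtt   # estimated segments in queues
    if queued < alpha:
        return cwnd + 1                       # path looks idle, speed up
    if queued > beta:
        return cwnd - 1                       # queue is building, slow down
    return cwnd                               # inside the target band


print(vegas_adjust(20, base_rtt=0.100, rtt=0.100))   # 21
print(vegas_adjust(20, base_rtt=0.100, rtt=0.115))   # 20
print(vegas_adjust(20, base_rtt=0.100, rtt=0.200))   # 19
```

The route-change problem is visible here: if the path gets longer, `rtt` rises while `base_rtt` keeps its old value, so the sketch sees a queue that does not exist and shrinks the window.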


BBR

Bottleneck Bandwidth and Round-trip propagation time: Google's delay-controlled TCP congestion control algorithm.

Bandwidth and RTT are influenced by several factors beyond the data the flow itself transmits. Once BBR determines its sustainable capacity for the flow, it attempts to actively defend that capacity from being crowded out by conventional AIMD protocols.

The bottleneck capacity is estimated as the maximum data delivery rate, as measured from the ACK stream, over a sliding window of the most recent 6–10 RTT intervals. The intended operational mode is that the sender passes packets into the network at a rate anticipated not to encounter queuing anywhere along the path. This is a significant contrast to protocols like Reno, which rely on network queues filling and dropping packets to perform rate adaptation.
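The sliding-window maximum can be sketched with a bounded queue of delivery-rate samples. This is a simplified illustration that assumes one sample per RTT; the class name, the method name, and the way a sample is formed (bytes acknowledged divided by the interval they were delivered in) are assumptions, not BBR's actual data structures.

```python
from collections import deque


class MaxBandwidthFilter:
    """Keep the largest delivery-rate sample from the last few RTTs."""

    def __init__(self, window_rtts: int = 10):
        # deque(maxlen=n) silently discards the oldest sample when full.
        self.samples = deque(maxlen=window_rtts)

    def update(self, delivered_bytes: float, interval_s: float) -> float:
        """Add one per-RTT sample and return the current estimate (bytes/s)."""
        self.samples.append(delivered_bytes / interval_s)
        return max(self.samples)


bw = MaxBandwidthFilter(window_rtts=3)
estimates = [bw.update(delivered, 1.0)
             for delivered in (1000, 3000, 2000, 1500, 1200)]
print(estimates)   # [1000.0, 3000.0, 3000.0, 3000.0, 2000.0]
```

Taking the maximum rather than the average means a momentary dip in delivery rate does not lower the estimate, while an old peak ages out once it leaves the window.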

For every received ACK, BBR checks whether the originally sent data was application-limited. If it was not, BBR incorporates the new path RTT and path bandwidth samples into its current flow estimates. BBR's probing mechanism periodically sends faster than its current estimate, by a fixed gain factor, for one RTT interval, and keeps increasing the rate by the same gain factor until the estimated bottleneck bandwidth no longer changes. This gives BBR exponential adaptation to increased bandwidth, rather than the linear adaptation used by TCP Vegas.
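The effect of multiplying by a gain factor can be sketched with a toy probing loop. The model is deliberately crude: each probe round delivers either the probed rate or the true bottleneck rate, whichever is lower, and probing stops once the estimate no longer grows. The function name, the 1.25 gain, and the 1% tolerance are assumptions for the example.

```python
def probe_until_stable(true_bottleneck: float, estimate: float,
                       gain: float = 1.25, tolerance: float = 0.01):
    """Raise the estimate by `gain` per probe round until it stops growing.

    Returns (final_estimate, number_of_rounds_that_raised_the_estimate).
    """
    rounds = 0
    while True:
        # The path cannot deliver more than the bottleneck allows.
        delivered = min(estimate * gain, true_bottleneck)
        if delivered <= estimate * (1 + tolerance):
            return estimate, rounds        # no meaningful increase seen
        estimate = delivered
        rounds += 1


final, rounds = probe_until_stable(true_bottleneck=100.0, estimate=10.0)
print(final, rounds)   # 100.0 11
```

In this toy model a tenfold jump in available bandwidth is found in 11 probe rounds, where adding one unit per round would need 90.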

BBR achieves fair share due to its periodic probing at an elevated sending rate.


BBR and the Token Bucket Problem

To understand BBR's challenges, we must first understand QoS traffic policing.

ISPs impose an upper bound on flow rate based on the SLA: the CIR (Committed Information Rate). The link is capable of higher rates, but ISPs limit traffic to what you're paying for (the "Traffic Contract"). This is done through policing or shaping:

  • Policing — drops exceeding traffic
  • Shaping — buffers it

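The token bucket constrains traffic bursts. The basic idea: tokens are added to the bucket at the police rate, up to the bucket's capacity, and each packet that passes consumes tokens equal to its size. A packet passes only if enough tokens are in the bucket after it has been replenished for the time since the previous packet arrived; otherwise the packet is dropped.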
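A single-rate token bucket policer can be sketched in a few lines. This is a minimal model, not any vendor's implementation: the class and method names are made up, tokens are counted in bytes, and timestamps are passed in by the caller so the example stays deterministic.

```python
class TokenBucketPolicer:
    """Drop packets that arrive faster than rate_bps, allowing short bursts."""

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8        # refill rate in bytes per second
        self.capacity = burst_bytes     # bucket depth = largest burst allowed
        self.tokens = burst_bytes       # the bucket starts full
        self.last = 0.0                 # arrival time of the previous packet

    def allow(self, now: float, size_bytes: int) -> bool:
        # Replenish for the time since the last arrival, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True                 # conforming packet passes
        return False                    # excess packet is dropped


policer = TokenBucketPolicer(rate_bps=8000, burst_bytes=3000)   # 1000 bytes/s
results = [policer.allow(t, 1500) for t in (0.0, 0.0, 0.0, 1.0, 1.5)]
print(results)   # [True, True, False, False, True]
```

The first two packets ride on the full bucket even though they arrive back to back, which is far above the police rate. A sender that measures bandwidth during such a burst sees a rate the policer will not sustain.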

Where BBR suffers: BBR may probe bandwidth at a time when the bucket is full and set its sending rate based on that burst. Once the bucket empties, packets are discarded because they are sent at a higher rate than the policer accepts. This wastes upstream bandwidth and increases latency, since the lost packets must be retransmitted.

Google has added policer detection and an explicit policer model to BBR to counter this. The feedback that a token-bucket policer impresses onto BBR is that an increase in data in flight generates packet drops, commensurate with the increased volume of data in flight, rather than an increase in RTT.

