Devices in a home network connect to a router, which connects to an Internet service provider (ISP) to access the Internet. The ISP might connect to a larger ISP before connecting to a fiber-optic “backbone” serving an entire nation or region. To access a Web server—for example, to read a news story on the BBC website—data would travel from a device on the home network across numerous other devices to reach its destination: the BBC webserver.
Key Insights
- Bufferbloat occurs when too many data packets are queued in a router’s buffer waiting to be sent. While buffering is needed to reduce data packet loss, overly large buffers lead to greater delays and poor performance. It is not well understood how large a buffer should be, nor how size affects network performance.
- Many consumer-grade router manufacturers do not use firmware that can prevent bufferbloat. Reducing bufferbloat by another means seems to be required.
- Using data-stream shaping and smart queue management (SQM) to mitigate bufferbloat in home broadband routers results in low delay, which does not significantly vary under load. User experience can be improved.
Multiple devices on the home network can send and receive data while our device accesses the BBC webserver. These independent data traffic streams intersect at the home broadband router. The router has several input ports that allow multiple devices to send data outside of the home network to the Internet. Data packets are transmitted from output ports on a router. Typically, a home broadband router has one output port that connects a line to an ISP.
Transmission Control Protocol (TCP) and User Datagram Protocol (UDP), part of the Internet Protocol (IP) suite, send data across the Internet. TCP ensures delivery of data packets but is slower than UDP, which is “best effort.” A receiver acknowledges receipt of the TCP data packets. If a sender does not receive an acknowledgement for a data packet, the sender will retransmit it.
Our device accessing the BBC webserver does not know the capacity of the links along the path. It uses TCP slow start to avoid creating congestion, increasing the number of packets sent from a small initial value until maximum capacity is detected. TCP also uses flow control to ensure that the webserver (receiver) is not overwhelmed by our device’s (sender) data packets. This is accomplished by the receiver “advertising” its receive window size. Although the sender sends no more data than the advertised receive window allows, congestion can still occur at any device along the path. Figure 1 shows that there are many devices along the path between the home network device (the sender) and the BBC webserver (the receiver). Thus, speed can be affected by sender performance, receiver performance, and the devices in the path between them.
Figure 1. Theoretical devices along the path between a home computer and a Web server.
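The slow-start behavior described above can be illustrated with a minimal sketch. It assumes a fixed, hypothetical path capacity measured in packets and ignores real-world details such as the slow-start threshold and loss detection. Despite its name, standard slow start doubles the congestion window every round trip; it is "slow" only because it begins from a very small window.

```python
def slow_start(capacity_packets, initial_cwnd=1):
    """Return the congestion window used in each round trip.

    The window doubles every round trip until doubling again would
    exceed the path capacity (a hypothetical fixed number of packets).
    """
    cwnd = initial_cwnd
    history = [cwnd]
    while cwnd * 2 <= capacity_packets:
        cwnd *= 2
        history.append(cwnd)
    return history


# Assumed capacity of 100 packets, chosen only for illustration.
print(slow_start(capacity_packets=100))  # [1, 2, 4, 8, 16, 32, 64]
```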
Data packets traveling the path can be queued or dropped due to congestion, which occurs when the load (amount of data packets) is greater than the line’s capacity. The TCP congestion control mechanism aims to avoid congestion by using the flow of sent data packets and acknowledgements to determine a send rate. If a TCP sender detects little or no congestion along the path between it and the receiver, it increases the transmission rate. If a sender sends too many data packets, a device along the path, or the receiver itself, drops data packets. Data packets that are dropped are not acknowledged. Likewise, if a data packet is not acknowledged quickly enough, it is assumed to have been dropped. Dropped data packets signal to the sender that the transmission rate is higher than capacity, so the sender reduces the rate. Thus, TCP congestion control is a form of “feedback control.”
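The feedback loop described above is often modeled as additive increase, multiplicative decrease (AIMD). The following is a simplified sketch, not the behavior of any particular TCP implementation: the window grows by one packet per round trip while the path keeps up, and is halved when a drop is detected. The capacity and starting window are assumed values.

```python
def aimd(capacity, rounds, start=1):
    """Simulate a congestion window under AIMD for a number of round trips.

    A window larger than `capacity` is treated as causing a drop,
    which halves the window for the next round trip.
    """
    window = start
    history = []
    for _ in range(rounds):
        history.append(window)
        if window > capacity:
            window = max(1, window // 2)  # drop detected: back off sharply
        else:
            window += 1                   # no drop: probe for more capacity
    return history


# Assumed capacity of 8 packets; the output rises slowly and falls sharply.
print(aimd(capacity=8, rounds=12, start=6))
# [6, 7, 8, 9, 4, 5, 6, 7, 8, 9, 4, 5]
```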
Latency is the combined transmission delay, processing delay, and queuing delay. The queuing delay is dependent on the queue size. Thus, bufferbloat can result from excessively large buffers. Routers use buffers to temporarily store data packets that are awaiting transmission when congestion occurs (see Figure 2). These data packets (both TCP and UDP) are queued so they are not lost and can be sent when bandwidth becomes available on the line. These buffers are required to maintain the flow of data packets at the maximum transmission rate. A problem arises if a device along the path has a large buffer: Data packets can wait a long time before they are sent. As data packets are not dropped, the sender does not receive information that the line’s capacity has been exceeded. A timeout will eventually signal that there is congestion. This delay in feedback affects performance; if the delay is very large, the feedback provides an inaccurate state of the network. Thus, the mechanisms do not respond as they should—that is, reducing the data packet send rate—when congestion occurs.
Figure 2. Buffers on a router.
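To see why queuing delay depends on queue size, consider how long a full buffer takes to drain: the queued data, in bits, divided by the link rate. The sketch below uses assumed buffer sizes and link rates purely for illustration; they are not measurements from this article.

```python
def queuing_delay_seconds(queued_bytes, link_rate_bps):
    """Time a newly arrived packet waits behind `queued_bytes` of data."""
    return queued_bytes * 8 / link_rate_bps


# Hypothetical 1 MB of queued data on a 1 Mbit/s uplink.
print(queuing_delay_seconds(1_000_000, 1_000_000))    # 8.0 seconds
# The same queue on a 100 Mbit/s link drains far sooner.
print(queuing_delay_seconds(1_000_000, 100_000_000))  # 0.08 seconds
```

The same buffer that is harmless on a fast link adds seconds of delay on a slow one, which is why a fixed, large buffer can be a poor fit for a home uplink.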
Bufferbloat affects other protocols as well as TCP, partly because the same buffers are used. The router’s buffer can fill with data before packets begin to drop, and TCP packets can be queued ahead of packets from interactive applications. Bufferbloat occurs when too many data packets are queued, which increases delays7 and degrades application performance. The effect can be seen in Figure 3. For users on a video call, the delay could mean they are viewing a camera image from five or more seconds prior.
Figure 3. Throughput and delay when there are large buffers.
Delay-sensitive applications that are commonly affected by bufferbloat are:
- VoIP calls
- Videoconferencing
- Online gaming
- Video streaming
- Music streaming
Compared to today’s routers, older routers had small buffers that filled more quickly, and packets would be dropped quickly after a line was saturated. Bufferbloat “has led Internet delays to occasionally exceed the light propagation delay from the Earth to the Moon.”2 Newer routers have larger buffers which can often hold the equivalent of approximately 10 seconds of data. Thus, 10 seconds’ worth of data can be sent without feedback of packets being dropped.8 This results in the TCP sawtooth shown in Figure 4. TCP slowly increases the data rate as capacity on the line is available (seen as the line rising slowly). When congestion occurs, it quickly decreases the rate (seen as the line decreasing sharply).
Buffers are larger today as the cost of memory has fallen. It is not well understood how large a buffer should be, nor how size affects network performance. Router buffers are typically sized based on a rule attributed to Villamizar and Song’s 1994 paper, “High-Performance TCP in ANSNET.”1 Larger buffers can prevent network-congestion avoidance algorithms from functioning correctly. On most home broadband routers, the user cannot change buffer size. Quality of Service (QoS) settings on a router do not effectively reduce bufferbloat. When some data traffic is prioritized, the rest is queued behind it, so all data traffic in the buffer still needs to be sent. A high-speed connection does not prevent bufferbloat either, as the delay can still be high. Many applications are more sensitive to delay than they are to low bandwidth.
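The sizing rule attributed to Villamizar and Song is commonly stated as the bandwidth-delay product: buffer size equals link rate multiplied by round-trip time. A minimal sketch of that arithmetic follows; the link rate and RTT are assumed values, not figures from the article.

```python
def bdp_buffer_bytes(link_rate_bps, rtt_ms):
    """Buffer size in bytes under the bandwidth-delay-product rule of thumb."""
    return link_rate_bps * rtt_ms / 1000 / 8


# Hypothetical 50 Mbit/s link with a 100 ms round-trip time.
print(bdp_buffer_bytes(50_000_000, 100))  # 625000.0 bytes
```

Because the rule scales with an assumed RTT, a buffer sized for a long path can be far larger than a short, slow home link needs.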
Bufferbloat can be mitigated by Smart Queue Management (SQM) algorithms. SQM performs per-packet/per-flow network scheduling, active queue length management (AQM), traffic shaping/rate limiting, and QoS. SQM puts traffic from a single IP address or port into its own queue. This means, unlike when using QoS, queues do not become too long as data packets from flows that have a small queue or no queue are prioritized. If a queue becomes too large, a certain percentage of data packets are dropped to allow congestion avoidance to take effect.
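The per-flow queuing idea can be sketched as follows. This is a deliberately simplified illustration using plain round-robin scheduling and a fixed per-flow limit with tail drop; it is not the algorithm used by cake or FQ_CoDel. The class name, flow identifiers, and limit are hypothetical, and the limit is assumed to be at least 1.

```python
from collections import OrderedDict, deque


class FlowQueues:
    """One queue per flow, served round-robin, with a per-flow length limit."""

    def __init__(self, per_flow_limit=3):
        self.per_flow_limit = per_flow_limit
        self.queues = OrderedDict()  # flow id -> deque of packets
        self.dropped = 0

    def enqueue(self, flow_id, packet):
        queue = self.queues.setdefault(flow_id, deque())
        if len(queue) >= self.per_flow_limit:
            self.dropped += 1  # queue too long: drop so the sender backs off
            return False
        queue.append(packet)
        return True

    def dequeue(self):
        if not self.queues:
            return None
        # Serve the flow at the front, then move it to the back of the line.
        flow_id, queue = self.queues.popitem(last=False)
        packet = queue.popleft()
        if queue:
            self.queues[flow_id] = queue
        return packet


sqm = FlowQueues(per_flow_limit=3)
for i in range(5):
    sqm.enqueue("bulk", f"bulk-{i}")  # the last two exceed the limit
sqm.enqueue("voip", "voip-0")
order = [sqm.dequeue() for _ in range(4)]
print(order)        # ['bulk-0', 'voip-0', 'bulk-1', 'bulk-2']
print(sqm.dropped)  # 2
```

The single VoIP packet is sent second rather than waiting behind the whole bulk backlog, which is the property that keeps delay low for sparse, interactive flows.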
Many consumer-grade router manufacturers do not use firmware that can prevent bufferbloat.9 As the design of routers has not changed for some years and appears not to be changing soon, reducing bufferbloat by another means seems necessary. The experiment discussed below was conducted to demonstrate how to mitigate the effects of bufferbloat.
Measuring Bufferbloat
The aim of the experiment was to validate the approach of using data-stream shaping and SQM to prevent bufferbloat, thus improving the user experience. Numerous tests were conducted with different router configurations, and latency was measured under varying loads. If bufferbloat is not an issue, there should be little or no variation in latency between unloaded and loaded tests.
During the experiment, various loads were created by video streaming and file downloading. Different loads were created using one or two devices to stream video at different resolutions. Video resolution was changed to increase the bandwidth requirement. Tests included streaming 720p, 1080p (high definition), and 2160p (4K) video. Additional tests included downloading a file on one device while simultaneously streaming video. To saturate the bandwidth, a speed test was run at the same time, with only the download portion performed.
To enable SQM on the router, the SQM package for the OpenWRT firmware, luci-app-sqm, was installed. The queue discipline used was cake, and the script was piece_of_cake.qos. The SQM link-layer adaptation settings used were ATM and 44-byte overhead. The router configuration was changed to enable and disable data-stream shaping and SQM. When data-stream shaping was used during the experiment, upstream and downstream were shaped to use 95% of the line’s capacity.
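Two pieces of arithmetic sit behind these settings, sketched below. The first computes a shaped rate at 95% of a measured line rate. The second shows why link-layer adaptation matters on an ATM line: each packet, plus its per-packet overhead, is carried in 53-byte cells that each hold 48 bytes of payload. The line rate used here is an assumed value, since the article does not state the line speed, and the function names are illustrative.

```python
ATM_CELL_BYTES = 53     # size of one ATM cell on the wire
ATM_PAYLOAD_BYTES = 48  # payload bytes carried by each cell


def shaped_rate_kbps(line_rate_kbps, percent=95):
    """Rate to configure in the shaper, as a percentage of the line rate."""
    return line_rate_kbps * percent // 100


def atm_wire_bytes(packet_bytes, overhead_bytes=44):
    """Bytes actually sent on an ATM link for one packet."""
    total = packet_bytes + overhead_bytes
    cells = (total + ATM_PAYLOAD_BYTES - 1) // ATM_PAYLOAD_BYTES  # round up
    return cells * ATM_CELL_BYTES


# Hypothetical 20,000 kbit/s downstream line rate.
print(shaped_rate_kbps(20_000))  # 19000
# A 1,500-byte packet occupies 33 cells on the wire.
print(atm_wire_bytes(1500))      # 1749
```

Shaping slightly below the line rate keeps the queue in the router that runs SQM rather than in an upstream device, and accounting for cell framing stops the shaper from underestimating how much line capacity each packet consumes.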
For each test, a ping test was used to record minimum, maximum, and average latencies as well as data packets lost. If latency increased significantly with load, this indicated bufferbloat. Additionally, if the video being streamed buffered, froze, or pixelated during a test, it was recorded. For tests where the file download was performed, the time taken to download the file was recorded.
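The values recorded for each ping test can be derived from the individual round-trip times. The sketch below assumes the samples are already available as a list in which a lost packet is represented by None; the sample values are invented for illustration.

```python
def summarize_ping(rtts_ms):
    """Return min, max, and average RTT plus the count of lost packets."""
    received = [rtt for rtt in rtts_ms if rtt is not None]
    lost = len(rtts_ms) - len(received)
    if not received:
        return {"min": None, "max": None, "avg": None, "lost": lost}
    return {
        "min": min(received),
        "max": max(received),
        "avg": sum(received) / len(received),
        "lost": lost,
    }


# Made-up samples: three replies and one lost packet.
print(summarize_ping([20.0, 22.0, None, 30.0]))
# {'min': 20.0, 'max': 30.0, 'avg': 24.0, 'lost': 1}
```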
Some factors were outside the researcher’s control: speed of the YouTube servers (for video streaming), Microsoft servers (for file downloading), Google Domain Name System (DNS) servers (for measuring latency), and Speedof.me servers (for saturating the bandwidth). As a result, each test was repeated three times to detect anomalies.
The Effect of Bufferbloat
The experiment validated the approach of using SQM and data-stream shaping to mitigate bufferbloat and improve user experience. When SQM and data-stream shaping were enabled, latency was low and did not vary significantly under load. Thus, bufferbloat was mitigated. Streamed video did not buffer, freeze, or pixelate when viewed. For each load, the average Round-Trip Time (RTT) reported by the ping test was averaged across the three test runs and rounded to the nearest whole number. A high average RTT meant bufferbloat was being experienced, whereas a consistently low average RTT meant it was not.
The graphs in Figure 5 show plots for each load’s average RTT with a trend-line for each configuration. Latency was lowest when both data-stream shaping and SQM were enabled.
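The comparison between unloaded and loaded latency can be sketched as below. The RTT values are invented solely to show the calculation and are not the experiment's results; the function names are illustrative.

```python
def average_rtt_ms(run_averages):
    """Average the per-run RTTs and round to the nearest whole millisecond."""
    mean = sum(run_averages) / len(run_averages)
    return int(mean + 0.5)  # round half up for non-negative values


def added_latency_ms(unloaded_runs, loaded_runs):
    """Extra latency under load; a large value suggests bufferbloat."""
    return average_rtt_ms(loaded_runs) - average_rtt_ms(unloaded_runs)


# Hypothetical averages from three repeated runs of each test.
unloaded = [18.2, 19.1, 18.7]
loaded = [240.5, 251.0, 262.3]
print(average_rtt_ms(unloaded))            # 19
print(average_rtt_ms(loaded))              # 251
print(added_latency_ms(unloaded, loaded))  # 232
```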
Figure 5. Latency for each router configuration.
Preventing Bufferbloat
The experiment validated the use of data-stream shaping and SQM to mitigate bufferbloat. The variation in latency across loads was lowest when SQM and data-stream shaping were used, which indicates that bufferbloat was not experienced. User experience can be improved: when these mechanisms are used, streaming video does not buffer, freeze, or pixelate. As latency is consistently low, delay-sensitive applications are not affected. People can simultaneously stream video, play online games, video-conference, and conduct VoIP calls without experiencing delay or application instability. File download times increased when SQM and data-stream shaping were enabled, as video traffic was prioritized. Data-stream shaping did result in a small loss of bandwidth. However, it is necessary for SQM to work most effectively.
Conclusion
Bufferbloat was uncovered in 2011 by Gettys. Since that time, a lot of Internet traffic has moved toward small bursts dependent on RTTs or using rate-limited streaming to avoid bufferbloat.5 Rate limiting controls the amount of incoming and outgoing traffic to or from a network. However, the issue of bufferbloat still exists.
Router capacity has doubled roughly every 24 months, in line with Moore’s Law, which was first formulated in 1965. Moore’s Law describes how processing speed, memory capacity, sensors, and so on improve at exponential rates while their prices fall at a similar rate. Routers have more memory, but router speed has been limited by memory access speed, which has not improved since 2011. Routers typically use Dynamic Random Access Memory (DRAM), which is slower to access than Static Random Access Memory (SRAM) but cheaper. Thus, router memory is typically designed for large storage and not for speed.6
The experiment validated the use of data-stream shaping and smart queue management to mitigate bufferbloat.
The approach demonstrated in this article requires some configuration to attain the best router performance. As ISPs typically provide routers for free, people are unlikely to pay more for a router with SQM. The OpenWRT firmware Web interface is likely to be beyond the understanding of most home broadband users. Additionally, a user would need to perform firmware upgrades themselves, and support for the router would not be provided. Consumer-grade and ISP-provided routers are intended to be easy to set up and operate. There are reports that some ISPs are adding SQM to the routers they provide.3 However, as demonstrated in this experiment, SQM in combination with data-stream shaping provides the best performance. As ISPs seek to reduce customer service costs, and data-stream shaping would require technical skills beyond most home users’ abilities, ISPs are unlikely to see promoting data-stream shaping as economically viable. Bufferbloat can be reduced, but doing so requires buy-in from ISPs to use routers with SQM and from router manufacturers to change their designs.
Further Research
This experiment demonstrates the use of SQM and data-stream shaping on a single home broadband router to improve performance. The SQM algorithm Fair Queueing Controlled Delay (FQ_CoDel) appears to perform well. Further research to model the effect of the widespread deployment of FQ_CoDel is recommended to ascertain whether there would be any negative effects. As resources (link capacity) are finite, would many routers using SQM result in resources not being shared fairly?