Networks and Buffering
Introduction
The predominant transport protocol on the Internet is TCP, running on
top of IP. TCP guarantees delivery of streams of bytes. It does so by
keeping buffers at the sending and receiving sides so that it can
retransmit lost data. The buffers are refilled with new data once both
ends agree that the previous data has been received and delivered to
the application. Another problem that TCP tries to solve is
congestion. If all computers connected to a network try to send and
receive at full speed at the same time (collective events are
notorious, such as everybody wanting to watch the same football game
at the same time), congestion is unavoidable. It is not unlike traffic
jams on the highways during rush hour. TCP tries to avoid an Internet
meltdown by monitoring dropped packets and reducing the rate at which
it transmits when certain patterns of lost packets occur. TCP also
tries to increase the rate when no packets have been lost for some
time, under the assumption that there is more headroom. For efficient
operation at the highest throughput, the buffers mentioned above need
to be as big as the amount of data that can be transported during one
round trip time (RTT) from sender to receiver and back. TCP typically
implements congestion avoidance by increasing and decreasing these
buffers (the TCP window). This implicitly rate-limits TCP: when the
protocol shrinks the buffers because of lost packets, the sender
cannot send new data once it has transmitted all the data in the
buffer. It has to wait for confirmation from the other side that
everything arrived before it can move on or, in case of lost packets,
start retransmitting.
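As a rough illustration of the buffer sizing rule above, the sketch
below computes how much data is in flight during one round trip (the
bandwidth-delay product). The function name and the example link
figures are assumptions chosen for illustration; they do not come from
any measurement.

```python
def bandwidth_delay_product(bandwidth_bps: float, rtt_s: float) -> float:
    """Return the number of bytes that fit in one round trip at the given rate."""
    return bandwidth_bps * rtt_s / 8.0


# Hypothetical examples: a home line and a long-distance research link.
home = bandwidth_delay_product(20e6, 0.020)      # 20 Mbit/s, 20 ms RTT
research = bandwidth_delay_product(10e9, 0.200)  # 10 Gbit/s, 200 ms RTT

print(f"home link:     {home / 1e3:.0f} kB")
print(f"research link: {research / 1e6:.0f} MB")
```

The several orders of magnitude between the two results show why
buffer settings tuned for home usage fall short on long, fast links.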
This approach has worked well over the last 20 years, although in
several corner cases protocol adjustments and parameter tuning have
been necessary. Typical problems arise with single high-bandwidth,
high-latency transfers of scientific or cloud data, where the
necessary buffers are far bigger than those for typical home usage
such as mail, browsing and Twitter. The scientific community has
investigated this problem with the emergence of big eScience projects
such as the Large Hadron Collider, astronomy and earth observation.
Several alternative, tuned or modified TCP-like stacks exist, and the
TCP standard is slowly evolving to scale better.
The problem that now arises is this: suppose we have a chain of
network devices with bottlenecks or congestion at different points,
but at every bottleneck there is an infinite amount of memory, so that
no packet is ever dropped. We also assume that the sender is not the
bottleneck, but can send at a higher rate than one or more downstream
devices can forward. The current TCP algorithms then assume that there
is even more headroom, increase their buffers to the maximum and keep
transmitting, if possible at the line rate of their interface. A
packet inserted in the stream encounters increasingly filled queues
along the path and takes longer and longer to reach its destination,
and only then can the acknowledgement be sent back to the sender. The
net effect is a steadily increasing RTT. Typically, at some point TCP
will time out and consider the connection broken.
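The effect described above can be sketched with a very simple fluid
model: a sender that never sees a drop keeps transmitting above the
bottleneck rate, and the surplus piles up in an unbounded queue. The
function, its parameters and the rates used are hypothetical, and the
model ignores TCP's window dynamics altogether.

```python
def simulate_unbounded_queue(send_rate_bps, bottleneck_bps, base_rtt_s,
                             duration_s, step_s=1.0):
    """Return (time, rtt) samples for a sender that outpaces a bottleneck
    whose queue never drops a packet."""
    queue_bits = 0.0
    samples = []
    steps = int(round(duration_s / step_s))
    for i in range(steps + 1):
        t = i * step_s
        # Queueing delay is the time needed to drain what is already queued.
        rtt = base_rtt_s + queue_bits / bottleneck_bps
        samples.append((t, rtt))
        # Nothing is dropped, so the surplus simply accumulates.
        queue_bits += max(0.0, send_rate_bps - bottleneck_bps) * step_s
    return samples


# Assumed figures: 25 Mbit/s sender, 20 Mbit/s bottleneck, 20 ms base RTT.
samples = simulate_unbounded_queue(25e6, 20e6, 0.020, duration_s=4.0)
for t, rtt in samples:
    print(f"t = {t:3.0f} s   RTT = {rtt * 1000:7.1f} ms")
```

In this model the RTT grows linearly for as long as the sender stays
above the bottleneck rate, which is the steadily increasing latency
described in the text.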
Surprisingly, this situation is now starting to occur, as memory is
becoming very cheap and home computers and their interfaces very
fast, so the typical bottlenecks may in fact be the modems and routers
at home and the Ethernet switches at the ISPs' locations. Many ISPs
nowadays proudly claim interfaces with many gigabytes of buffer
memory. For years I have countered those remarks with the question:
"Is that a good or a bad message?", after which they typically look
as if they see water burning.
Proposed solutions
If we consider one end-to-end link with a bottleneck somewhere in
the middle (it does not actually matter where, or how long the stretch
of lower bandwidth behind the bottleneck is), then:
- if we assume the incoming bandwidth is B1 and the outgoing
bandwidth is B2
- and bursts of packets of size S need to be handled
- then the required memory M will be:
  M = S * (1 - B2/B1)
- if we assume the biggest burst to be coming from a single TCP
flow
- with round trip time between sender and receiver = RTT
- and that TCP flow should average to B2
- then the TCP window will be:
  W = RTT * B2
- we assume that TCP can be triggered to send a burst consisting
of one window W
- then:
  S = W
  M = RTT * B2 * (1 - B2/B1)
- All of the above holds for a single-stream system without
traffic shaping at the sender, or for a system where traffic gets
mixed in the middle and creates congestion, causing B2 to be on
average smaller than B1. One solution is to make TCP stacks
always shape the outgoing traffic so that no bursts are formed.
The sending rate should then approach the average speed:
  B2 = W/RTT
- In that case no switch in the middle has to buffer, and M can
approach 0.
- if one goes for the burst model
- and one wants to support the extreme case of one flow going
halfway around the planet (RTT = 200 ms)
- and there is no other statistical averaging, then:
  M = 0.200 * B2 * (1 - B2/B1)
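To put a number on the formula above, here is a small sketch that
evaluates M = RTT * B2 * (1 - B2/B1). The text does not fix units, so
this sketch assumes bandwidths in bit/s and RTT in seconds, which gives
M in bits; the 10 Gbit/s and 1 Gbit/s link speeds are hypothetical.

```python
def required_buffer_bits(rtt_s: float, b1_bps: float, b2_bps: float) -> float:
    """Evaluate M = RTT * B2 * (1 - B2/B1); the result is in bits."""
    if b2_bps > b1_bps:
        raise ValueError("expected B2 <= B1 (the outgoing side is the bottleneck)")
    return rtt_s * b2_bps * (1.0 - b2_bps / b1_bps)


# Assumed example: 10 Gbit/s in, 1 Gbit/s out, RTT = 200 ms.
m_bits = required_buffer_bits(0.200, 10e9, 1e9)
print(f"M = {m_bits / 8 / 1e6:.1f} MB")
```

With these assumed speeds the burst model asks for roughly 22.5 MB of
buffer at the bottleneck for a single flow, and M drops to zero when
B2 equals B1.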
The best solution, in my opinion, would be a combination of two
actions:
- TCP implementations should send only shaped traffic, with
packets spaced in time to approach the average data rate, thereby
eliminating bursts.
- TCP should tune its window size to approach the minimum RTT.
If one eliminated all memory in the network, the RTT would always be
the minimum, because a packet would either travel at light speed or be
dropped. Therefore, dropped packets should also play a role in
deciding whether to decrease or increase the window. Note that
multiple (n) TCP streams, each using a small part (1/n) of the
bandwidth, often give enough statistical multiplexing for the
throughput to saturate the link even when too little memory is
present.
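The two actions can be sketched as follows. The first function spaces
packets evenly so that one window is spread over one RTT. The second
is a hypothetical window update rule that reacts both to loss and to
an RTT above the observed minimum. The function names, the 1500-byte
packet size, the 10% tolerance and the step sizes are all assumptions
for illustration, not a worked-out congestion control algorithm.

```python
def pacing_interval_s(window_bytes: float, rtt_s: float,
                      packet_bytes: int = 1500) -> float:
    """Spacing between packets so that one window is spread over one RTT."""
    packets_per_rtt = window_bytes / packet_bytes
    return rtt_s / packets_per_rtt


def adjust_window(window_bytes: float, rtt_s: float, min_rtt_s: float,
                  packet_lost: bool, packet_bytes: int = 1500,
                  tolerance: float = 1.1) -> float:
    """Hypothetical rule: shrink on loss or on queueing delay, else grow."""
    if packet_lost:
        # Multiplicative decrease, never below one packet.
        return max(packet_bytes, window_bytes / 2)
    if rtt_s > tolerance * min_rtt_s:
        # RTT above the minimum means packets are waiting in a queue.
        return max(packet_bytes, window_bytes - packet_bytes)
    # No loss and no queueing delay: probe for more bandwidth.
    return window_bytes + packet_bytes


# Assumed example: 250 kB window over a 100 ms path.
interval = pacing_interval_s(250_000, 0.100)
print(f"send one packet every {interval * 1e3:.2f} ms")
```

Backing off on rising RTT rather than only on loss is similar in
spirit to delay-based schemes such as TCP Vegas; it keeps the queues
at the bottleneck short instead of filling them until packets drop.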
Links
- Stuart Cheshire: It's the Latency, Stupid
- Jim Gettys' Ramblings: The criminal mastermind: bufferbloat!
- Robert Cringely: Bufferbloat will become a huge problem this
year.
- http://www.bufferbloat.net/
Related projects
In February-March 2011 this problem was posed to the students of the
Grid Master at the University of Amsterdam. This resulted in the
following two reports:
- Bert Gijsbers, Deepthi Devaki Akkoorath, "Performance
simulation of buffer bloat in routers", Technical report,
Faculty of Science, Universiteit van Amsterdam, March 2011.
- Mark Santcroos, Sytse van Genderen, "Bufferbloat: A
simulation.", Technical report, Faculty of Science, Universiteit
van Amsterdam, March 2011.
And in June 2011 two students of the SNE master looked into detection:
- Harald Kleppe and Danny Groenewegen: "BufferBloat detection.",
Technical report, Faculty of Science, Universiteit van
Amsterdam, June 2011.
Related publications & talks
- Antony Antony, Johan Blom, Cees de Laat, Jason Lee, Wim Sjouw,
"Microscopic Examination of TCP flows over transatlantic Links",
Future Generation Computer Systems, Volume 19, Issue 6, August
2003, Pages 1017-1029. Link to publication: http://ext.delaat.net/pubs/2003-j-3.pdf
- 17 April 2002: Nordunet conference, Copenhagen, 15-17 April,
Keynote: "The road to optical networking": http://ext.delaat.net/talks/cdl-2002-04-17.pdf