Packet loss is a common occurrence that is often misunderstood. Many applications can run successfully with little or no attention paid to packet loss. However, latency-sensitive applications can often perform better when packet loss is better understood. In our work at 29West deploying our LBM product, we have encountered many myths surrounding packet loss. The following paragraphs address these myths and provide links to additional helpful information.
Reality--Our experience and anecdotal evidence from others indicate that buffer overflow is the most common cause of packet loss. These buffers may be in network hardware (e.g. switches and routers) or in operating systems. See Section 7 for background information.
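As a rough illustration of how a fixed-size buffer turns a traffic burst into loss, the following Python sketch models a single tail-drop queue. The function name, the tick-based timing, and the example numbers are all hypothetical; real switches and operating systems are considerably more complex.

```python
def simulate_tail_drop(arrivals_per_tick, service_per_tick, capacity):
    """Model a fixed-capacity FIFO buffer that drops arrivals when full.

    Hypothetical toy model: time advances in ticks; in each tick a batch
    of packets arrives, then up to service_per_tick packets are drained.
    Returns (delivered, dropped).
    """
    depth = delivered = dropped = 0
    for arrivals in arrivals_per_tick:
        # Accept only as many packets as there is free buffer space.
        accepted = min(arrivals, capacity - depth)
        dropped += arrivals - accepted
        depth += accepted
        # Drain the queue at the fixed service rate.
        served = min(service_per_tick, depth)
        depth -= served
        delivered += served
    return delivered, dropped


# Steady load of 2 packets per tick fits easily; a burst of 10 does not.
print(simulate_tail_drop([2, 2, 10, 2, 2], service_per_tick=3, capacity=4))
```

In this toy run the average arrival rate is close to the service rate, yet the single burst still overflows the buffer. That is the sense in which loss can occur on a network that does not look heavily loaded on average.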
Reality--The normal operation of TCP congestion control may cause loss due to queue overflow. See this report for more information. In that report, loss rates of several percent were common under heavy congestion.
Reality--The flow control mechanism of TCP should prevent packet loss due to host buffer overflows. However, UDP contains no flow-control mechanism, leaving the possibility that UDP receiver buffers will overflow. Hosts receiving high-volume UDP traffic often experience internal packet loss due to UDP buffer overflow. See Section 7.6 for more on the contrast between TCP buffering and UDP buffering. See Section 8.9 for advice on detecting UDP buffer overflow in a host.
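One common mitigation for UDP receiver overflow is to request a larger socket receive buffer. The sketch below uses the standard SO_RCVBUF socket option from Python; the function name and the 1 MiB request are assumptions chosen for illustration, and the operating system may grant a different size than requested.

```python
import socket


def open_udp_receiver(requested_bytes=1024 * 1024):
    """Create a UDP socket and ask the OS for a larger receive buffer.

    Hypothetical helper: returns (sock, granted_bytes). The kernel may
    clamp the request to a system-wide limit, so the value is read back.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    # Read back what the OS actually configured. The reported number is
    # platform-dependent (Linux, for example, reports a doubled value).
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted


if __name__ == "__main__":
    sock, granted = open_udp_receiver()
    print("receive buffer reported by OS:", granted, "bytes")
    sock.close()
```

A larger buffer only absorbs bursts. If the receiver is persistently slower than the sender, the buffer will still overflow eventually.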
Reality--Packet loss plays at least two important, beneficial roles:
Implicit signaling of congestion: TCP uses loss to discover contention with other TCP streams. Each TCP stream passing through a congestion point must dynamically discover the existence of other streams sharing that congestion point in order to share the available bandwidth fairly. Streams do this by monitoring 1) the SRTT (Smoothed Round-Trip Time) as an indication of queue depth, and 2) packet loss as an indication of queue overflow. Hence TCP relies upon packet loss as an implicit signal of network congestion. See Section 2.3 for discussion of the impact this can have on latency.
Efficient discarding of stale, latency-sensitive data: See Section 8.1 for more information.
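The two signals that TCP monitors can be sketched in a few lines of Python. This is a simplified model, not a TCP implementation: the smoothing gain of 1/8 follows the standard retransmission-timer calculation in RFC 6298, while the additive-increase/multiplicative-decrease window rule, the function names, and all parameter values are illustrative assumptions.

```python
def smoothed_rtt(samples, alpha=0.125):
    """Return the running SRTT after each round-trip sample.

    The first sample initializes SRTT; later samples are blended in
    with an exponentially weighted moving average (gain alpha = 1/8).
    """
    srtt = None
    history = []
    for r in samples:
        srtt = r if srtt is None else (1 - alpha) * srtt + alpha * r
        history.append(srtt)
    return history


def aimd_window(loss_events, initial=10.0, increase=1.0, decrease=0.5, floor=1.0):
    """Return the congestion window after each round trip.

    Simplified assumption: the window grows by `increase` on a loss-free
    round trip and is multiplied by `decrease` when loss is detected.
    """
    cwnd = initial
    trace = []
    for lost in loss_events:
        cwnd = max(floor, cwnd * decrease) if lost else cwnd + increase
        trace.append(cwnd)
    return trace


# A rising RTT sample nudges SRTT upward (queue building) ...
print(smoothed_rtt([100, 100, 180]))
# ... and a loss event (queue overflow) cuts the sending window.
print(aimd_window([False, False, True, False]))
```

In this model the sender keeps increasing its window until a loss occurs, so some loss is a normal part of how streams probe for their share of bandwidth.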
Copyright 2004 - 2010 29West, Inc.