There are many questions surrounding UDP buffer sizing. What is the optimal size? What are the consequences of an improperly sized UDP buffer? What are the equations needed to compute an appropriate size for a UDP buffer? What default limit will the OS kernel place on UDP buffer size and how can I change it? How can I tell if I'm having UDP loss problems due to buffers that are too small? Answers to these questions and more are given in the following sections.
UDP buffer sizes should be large enough to allow an application to endure the normal variance in CPU scheduling latency without suffering packet loss. They should also be small enough to prevent the application from having to read through excessively old data following an unusual spike in CPU scheduling latency.
Too little UDP buffer space causes the operating system kernel to discard UDP packets. The resulting packet loss has consequences described below.
The kernel often keeps counts of UDP packets received and lost. See Section 8.9 for information on detecting UDP loss due to UDP buffer space overflow. A common myth is that all UDP loss is bad (see Myth: All Packet Loss is Bad).
In most cases, it's the secondary effects of UDP loss that matter most. That is, it's the reaction to the loss that has material consequences more so than the loss itself. Note that the consequences discussed here are independent of the cause of the loss. Inadequate UDP receive buffering is just one of the more common causes we've encountered deploying LBM.
Consider these areas when assessing the consequences of UDP loss:
Latency--The time that passes between the initial transmission of a UDP packet and the eventual successful reception of a retransmission is latency that could have been avoided were it not for the intervening loss.
Bandwidth--UDP loss usually results in requests for retransmission, unless more up-to-date information is expected soon (e.g. in the case of stock quote updates). Bandwidth used for retransmissions may become significant, especially in cases where there is a large amount of loss or a large number of receivers experiencing loss. See Section 15 for more information on multicast retransmissions.
CPU Time--UDP loss causes the receiver to use CPU time to detect the loss, request one or more retransmissions, and perform the repair. Note that efficiently dealing with loss among a group of receivers requires the use of many timers, often of short duration. Scheduling and processing such timers generally requires CPU time in both the operating system kernel ("system time") and in the application receiving UDP ("user time"). Additional CPU time is required to switch between kernel and user modes.
On the sender, CPU time is used to process retransmission requests and to send retransmissions as appropriate. As on the receiver, many timers are required for efficient retransmission processing, thus requiring many switches between kernel and user modes.
Memory--UDP receivers that can only process data in the order that it was initially sent must allocate memory while waiting for retransmissions to arrive. UDP loss causes such receivers to receive data in an order different from that used by the sender. Memory is needed to hold the out-of-order data until the original order can be restored.
Even UDP receivers that can process UDP packets in the order they arrive may not be able to tolerate duplication of packets. Such receivers must allocate memory to track which packets have been successfully processed and which have not.
UDP senders interested in reliable reception by their receivers must allocate memory to retain UDP packets after their initial transmission. Retained packets are used to fill retransmission requests.
Even though too little UDP buffer space is definitely bad and more is generally better, it is still possible to have too much of a good thing. Perhaps the two most significant consequences of too much UDP buffer space are slower recovery from loss and physical memory usage. Each of these is discussed in turn below.
Slower Recovery--To best understand the consequences of too much UDP buffer space, consider a stream of packets that regularly updates the current value of a rapidly-changing variable in every tenth packet. Why buffer more than ten packets? Doing so would only increase the number of stale packets that must be discarded at the application layer. Given a data stream like this, it's generally better to configure a ten-packet buffer in the kernel so that no more than ten stale packets have to be read by the application before a return to fresh ones from the stream.
It's often counter-intuitive, but excessive UDP buffering can actually increase the recovery time following a large packet loss event. UDP receive buffers should be sized to match the latency budget allocated for CPU scheduling latency with knowledge of expected data rates. See Section 16 for more information on latency budgets. See Section 8.6 for a UDP buffer sizing equation.
Physical Memory Usage--It is possible to exhaust available physical memory with UDP buffer space. For example, requesting a UDP receive buffer of 32 MB and then running ten receiver applications can consume up to 320 MB of physical memory. See Section 7.4 for more information.
Assuming that an average rate is known for a UDP data stream, the amount of latency that would be added by a full UDP receive buffer can be computed as:
Max Latency = Buffer Size / Average Rate
Note: Take care to watch for different units in buffer size and average rate (e.g. kilobytes vs. megabits per second).
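As a worked example of the unit conversion, the sketch below computes the latency added by a full buffer. The function name and the sample figures (an 8 MB buffer and a 100 Mbps average rate) are illustrative assumptions, not measurements.

```python
def max_latency_seconds(buffer_bytes: int, rate_bits_per_sec: float) -> float:
    """Latency added by a full receive buffer: Buffer Size / Average Rate."""
    if rate_bits_per_sec <= 0:
        raise ValueError("average rate must be positive")
    buffer_bits = buffer_bytes * 8  # convert bytes to bits to match the rate
    return buffer_bits / rate_bits_per_sec


# Hypothetical figures: an 8 MB buffer filled by a 100 Mbps stream.
latency = max_latency_seconds(8 * 1024 * 1024, 100e6)
print(f"{latency * 1000:.1f} ms")  # prints "671.1 ms"
```

With these assumed figures, an application that falls a full buffer behind is reading data that is roughly two thirds of a second old.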
Assuming that an average rate is known for a UDP data stream, the buffer size needed to avoid loss for a given worst-case CPU scheduling latency can be computed as:
Buffer Size = Max Latency * Average Rate
Note: Since data rates are often measured in bits per second while buffers are often allocated in bytes, careful conversion may be necessary.
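The conversion can be sketched as follows. The function name, the choice of microseconds for the latency argument, and the sample figures (10 ms of scheduling latency at 500 Mbps) are assumptions made for illustration. Integer arithmetic is used so that the result rounds up to a whole byte without floating-point surprises.

```python
def buffer_size_bytes(max_latency_us: int, rate_bits_per_sec: int) -> int:
    """Buffer Size = Max Latency * Average Rate, expressed in bytes.

    max_latency_us is the worst-case CPU scheduling latency in microseconds;
    rate_bits_per_sec is the average data rate in bits per second.
    """
    if max_latency_us < 0 or rate_bits_per_sec < 0:
        raise ValueError("latency and rate must be non-negative")
    bit_microseconds = max_latency_us * rate_bits_per_sec
    # Divide by 1,000,000 (microseconds to seconds) and by 8 (bits to bytes),
    # rounding up so the buffer is never undersized.
    divisor = 8 * 1_000_000
    return (bit_microseconds + divisor - 1) // divisor


# Hypothetical figures: 10 ms worst-case latency at an average of 500 Mbps.
print(buffer_size_bytes(10_000, 500_000_000))  # prints 625000
```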
The kernel variable that limits the maximum size allowed for a UDP receive buffer has a different name and default value on each kernel. The following table gives, for each kernel, the command needed to set the kernel UDP buffer limit to 8 MB. Root privilege is required to execute these commands.
| Kernel | Command |
|---|---|
| Linux | sysctl -w net.core.rmem_max=8388608 |
| Solaris | ndd -set /dev/udp udp_max_buf 8388608 |
| FreeBSD, Darwin | sysctl -w kern.ipc.maxsockbuf=8388608 |
| AIX | no -o sb_max=8388608 (note: AIX only permits sizes of 1048576, 4194304 or 8388608) |
The AIX command given above will change the current value and automatically modify /etc/tunables/nextboot so that the change will survive rebooting. Other platforms require additional work described below to make changes survive a reboot.
For Linux and FreeBSD, simply add the sysctl variable setting given above to /etc/sysctl.conf leaving off the sysctl -w part.
We haven't found a convention for Solaris, but would love to hear about it if we've missed something. We've had success just adding the ndd command given above to the end of /etc/rc2.d/S20sysetup.
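One way to see the effect of the kernel limit from inside an application is to request a receive buffer and then ask the kernel what it reports. The sketch below does this with the standard SO_RCVBUF socket option; the function name and the 8 MB request are illustrative assumptions.

```python
import socket


def reported_rcvbuf(requested_bytes: int) -> int:
    """Request a UDP receive buffer and return the size the kernel reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        except OSError:
            # Some kernels reject a request above the limit instead of
            # clamping it; the socket then keeps its default buffer size.
            pass
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


requested = 8 * 1024 * 1024  # hypothetical request matching the 8 MB limit
print(f"requested {requested} bytes, kernel reports {reported_rcvbuf(requested)} bytes")
```

Interpret the reported value with care. Linux, for example, doubles the requested value to allow for bookkeeping overhead and reports the doubled figure, so a reported size is not directly comparable to the request on every platform.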
Interpreting the output of netstat is important in detecting UDP loss. Unfortunately, the output varies considerably from one flavor of Unix to another. Hence we can't give one set of instructions that will work with all flavors.
For each Unix flavor, we tested under normal conditions and then under conditions forcing UDP loss while keeping a close eye on the output of netstat -s before and after the tests. This revealed the statistics that appeared to have a relationship with UDP packet loss. Output from Solaris and FreeBSD netstat was the most intuitive; output from Linux and AIX was much less so. The following sections give the command we used and highlight the important output for detecting UDP loss.
On Solaris, use netstat -s. Look for udpInOverflows. It will be in the IPv4 section, not in the UDP section as you might expect. For example:
IPv4:
udpInOverflows = 82427
On Linux, use netstat -su. Look for packet receive errors in the Udp section. For example:
Udp:
38799 packet receive errors
On Microsoft® Windows®, netstat -s does not report the same statistics as it does on other operating systems. Unfortunately, we have not found a way to use it to detect UDP buffer overflow on Windows.
On AIX, use netstat -s. Look for fragments dropped (dup or out of space) in the ip section. For example:
ip:
77070 fragments dropped (dup or out of space)
On FreeBSD and Darwin, use netstat -s. Look for dropped due to full socket buffers in the udp section. For example:
udp:
6343 dropped due to full socket buffers
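Because these counters are cumulative, loss is detected by comparing readings taken before and after a test. The sketch below extracts the counters named above from captured netstat output. The function name and the platform labels used as dictionary keys are assumptions of this sketch, and the patterns match only the exact wording shown in the examples, which may differ between OS releases.

```python
import re
from typing import Optional

# Counter wording taken from the netstat examples above, keyed by platform.
LOSS_PATTERNS = {
    "solaris": r"udpInOverflows\s*=\s*(\d+)",
    "linux": r"(\d+)\s+packet receive errors",
    "aix": r"(\d+)\s+fragments dropped \(dup or out of space\)",
    "freebsd": r"(\d+)\s+dropped due to full socket buffers",
}


def udp_loss_count(netstat_output: str, platform: str) -> Optional[int]:
    """Return the loss counter found in netstat output, or None if absent."""
    match = re.search(LOSS_PATTERNS[platform], netstat_output)
    return int(match.group(1)) if match else None


# Sample text copied from the Linux example above.
sample = "Udp:\n    38799 packet receive errors\n"
print(udp_loss_count(sample, "linux"))  # prints 38799
```

Calling the function on output captured before and after a test run and subtracting the two results gives the number of packets counted as lost during the run.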
Copyright 2004 - 2009 29West, Inc.