The latency of a messaging application is the sum of the latencies of its parts. Often, the primary concern is the application latency, not the latencies of the parts. In theory, this lets some parts of the application take longer than average on a given message, as long as other parts take less than average.
In practice, it seldom works out this way. The same factors that make one part take longer often make other parts take longer too. Increasing message size is the obvious example. Another example is a burst of message traffic, which can cause CPU contention and so add latency throughout a messaging application. Network traffic bursts can sharply increase latency in network queues. See Section 17.8 for details.
The challenge is building a messaging application that consistently meets its latency goals for all workloads. Ideally, any case where a latency goal cannot be met is logged for later analysis. Builders of VoIP telephony systems and others concerned with real-time performance face similar challenges. We have seen successful messaging applications designed by borrowing the concept of a latency budget from other real-time systems.
A latency budget is best applied by first establishing a total latency budget for the messaging application. Then identify all the sources of latency in the application and allocate a portion of the application budget to each source. (See Section 17 for a list of commonly encountered latency sources.) Each latency source is aware of its budget, monitors its own performance, and logs any case where it can't complete its work within the budgeted time.
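The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source names (serialize, send_queue, network, deliver), the microsecond figures, and the helpers check_allocation and budgeted are all hypothetical and are not part of any 29West API. The sketch checks that the per-source budgets fit within the total, then times each source and records any overrun.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("latency_budget")

# Hypothetical figures, in integer microseconds, chosen only for illustration.
TOTAL_BUDGET_US = 1000
BUDGETS_US = {"serialize": 200, "send_queue": 300, "network": 400, "deliver": 100}

# Overrun records (source, elapsed_us, budget_us) kept for later analysis.
overruns = []


def check_allocation(budgets, total_us):
    """Return True if the per-source budgets fit within the total budget."""
    return sum(budgets.values()) <= total_us


@contextmanager
def budgeted(source, budgets=BUDGETS_US, clock=time.perf_counter_ns):
    """Time the enclosed work and record it if it exceeds the source's budget.

    `clock` must return nanoseconds; it is a parameter so a fake clock
    can be substituted when testing.
    """
    budget_us = budgets[source]
    start_ns = clock()
    try:
        yield
    finally:
        elapsed_us = (clock() - start_ns) / 1000
        if elapsed_us > budget_us:
            overruns.append((source, elapsed_us, budget_us))
            logger.warning("%s exceeded its budget: %.1f us > %d us",
                           source, elapsed_us, budget_us)
```

A source would wrap its work in `with budgeted("serialize"): ...`. Each entry in `overruns` names the source that ran over, so the record shows where the time went and not just that the application was late.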
Most "managed" pieces of network hardware already do this. Routers and managed switches have counters that are incremented every time a packet cannot be queued because a queue is already full. This is a queue space budget rather than a time budget, but space is easily converted to time by dividing the queue size by the speed of the interface.
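As a rough worked example of that conversion, the sketch below assumes the queue size is given in bytes and the interface speed in bits per second, so the byte count is multiplied by 8 before dividing. The function name queue_drain_time_s is made up for this illustration. Under these assumptions, a 125,000-byte queue on a 1 Gb/s interface corresponds to about 1 millisecond of queuing delay when full.

```python
def queue_drain_time_s(queue_bytes, link_bits_per_s):
    """Return the time in seconds to drain a full queue at line rate.

    Assumes queue_bytes is the queue capacity in bytes and
    link_bits_per_s is the interface speed in bits per second.
    """
    if link_bits_per_s <= 0:
        raise ValueError("interface speed must be positive")
    return queue_bytes * 8 / link_bits_per_s


# 125,000 bytes is 1,000,000 bits, which takes about 1 ms at 1 Gb/s.
delay_s = queue_drain_time_s(125_000, 1_000_000_000)
```

The result is the worst-case delay added by that one queue, which can be compared directly against the portion of the latency budget allocated to it.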
The 29West LBM messaging system supports latency boundaries in message batching, in latency-bounded TCP, and in our reliable unicast and multicast transport protocols. Messaging applications are notified whenever loss-free delivery cannot be maintained within the latency budget.
One of the big benefits we've seen from establishing a latency budget is that it helps to guide the work and reduce "finger pointing" in large organizations. A messaging application often can't meet its latency goals without the cooperation of many groups within a large organization. The team that maintains the network infrastructure must keep queuing delays to a minimum. The team that administers operating systems must optimize tuning and limit non-essential load on the OS. The team that budgets for hardware purchases must make sure adequate CPU and networking hardware is purchased. The team that administers the messaging system must configure it to make efficient use of available CPU and network resources.
An overall application latency budget that is subdivided over potential latency sources is a good tool for identifying the root cause of latency problems when they occur. If each latency source logs cases where it exceeded its budget or dropped a message, it's much easier to take corrective action.
Copyright 2004 - 2008 29West, Inc.