
Messaging Performance LBM

We often benchmark LBM with simple test programs that send and receive messages as fast as possible. These programs push the underlying hardware and OS at least as hard as any real messaging application could, so the results establish an upper bound on performance for the tested systems, reachable only when all hardware and OS resources are available to run LBM. LBM performance depends heavily on hardware capabilities, OS tuning, and application behavior, so it is impossible to predict accurately how it will perform in an untested environment. Actual measurements such as these can, however, improve the quality of performance estimates.

Figure 1. Message Rate and Payload Rate vs. Payload Size

These tests were run on a pair of Dell Precision 390n workstations connected by a gigabit LAN. Each machine had an Intel® Core™2 Duo E6600 processor running at 2.40 GHz, 2 GB of RAM, and a Broadcom Corporation NetXtreme BCM5754 Gigabit Ethernet PCI Express adapter, and ran Red Hat Enterprise Linux WS v4 for the 64-bit EM64T instruction set.

Figure 1 shows that CPU power is the limiting factor at small message sizes (left side of graph) and that network bandwidth is the limiting factor at large message sizes (right side). The crossover point, where the bottleneck moves from CPU to network, is at a message payload size of about 100 bytes.

Of special interest is the effect of using two threads. With small message sizes, the message rate almost doubles (indicative of a CPU-bound activity). At large message sizes, where the network itself is the bottleneck, the second thread provides no improvement at all. (Note that improvement derived from multi-threading would not be seen on a single-CPU machine.)

In absolute numbers, the tested machines can generate and consume over 2,000,000 messages/second for message sizes below 100 bytes. They can saturate a gigabit LAN for message sizes over 100 bytes. To go faster, additional NICs would be required at large message sizes while more or faster processors would be required at smaller message sizes.
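
The two regimes described above can be approximated with a simple model: the achievable message rate is the smaller of a CPU-bound ceiling and a network-bound ceiling. The sketch below is illustrative only. The function names are hypothetical, the CPU-bound rate is an input you would take from a measurement such as the 2,000,000 messages/second figure above, and the network ceiling ignores all protocol and header overhead, so it will not reproduce the measured crossover point exactly.

```python
# Idealized bottleneck model. Assumption: raw gigabit line rate with no
# Ethernet/IP/UDP/LBM header overhead, so real ceilings are somewhat lower.
GIGABIT_BYTES_PER_SEC = 1_000_000_000 / 8  # 125,000,000 bytes/second


def network_bound_rate(payload_bytes, link_bytes_per_sec=GIGABIT_BYTES_PER_SEC):
    """Messages/second if link bandwidth were the only limit."""
    return link_bytes_per_sec / payload_bytes


def estimated_rate(payload_bytes, cpu_bound_rate,
                   link_bytes_per_sec=GIGABIT_BYTES_PER_SEC):
    """Messages/second under the tighter of the CPU and network limits."""
    return min(cpu_bound_rate,
               network_bound_rate(payload_bytes, link_bytes_per_sec))


def bottleneck(payload_bytes, cpu_bound_rate,
               link_bytes_per_sec=GIGABIT_BYTES_PER_SEC):
    """Name the resource that limits throughput at this payload size."""
    net = network_bound_rate(payload_bytes, link_bytes_per_sec)
    return "cpu" if cpu_bound_rate <= net else "network"


if __name__ == "__main__":
    cpu_rate = 2_000_000  # messages/second, taken from the text above
    for size in (20, 100, 1000):
        print(size, bottleneck(size, cpu_rate), estimated_rate(size, cpu_rate))
```

Under this model a second NIC doubles `link_bytes_per_sec` and so only helps where the result is "network", while more or faster processors raise `cpu_bound_rate` and only help where the result is "cpu".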

These tests were run with LBM version 3.0. The receiving machine ran the command:

  lbmmrcv -C 2 -R 2

The sending machine ran either:

  lbmmsrc -T 1 -S 1 -c perf.conf

or:

  lbmmsrc -T 2 -S 2 -c perf.conf

where perf.conf contained:

  source implicit_batching_minimum_length 8192

This config file option tells LBM to batch up to 8192 bytes of messages before sending them.
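
To illustrate why batching helps at small message sizes, here is a minimal sketch of size-based batching in general. It is not LBM's implementation or API: the `BatchingSender` class and its methods are hypothetical, and the flush rule (send once at least the threshold number of bytes has accumulated) is an assumption based on the option's name.

```python
class BatchingSender:
    """Hypothetical sender that groups small messages into larger sends."""

    def __init__(self, min_batch_bytes, send):
        self.min_batch_bytes = min_batch_bytes  # e.g. 8192, as in perf.conf
        self.send = send                        # callable taking one bytes object
        self.pending = []
        self.pending_bytes = 0

    def publish(self, message):
        """Queue a message; flush once the threshold is reached."""
        self.pending.append(message)
        self.pending_bytes += len(message)
        if self.pending_bytes >= self.min_batch_bytes:
            self.flush()

    def flush(self):
        """Send everything queued so far as a single batch."""
        if self.pending:
            self.send(b"".join(self.pending))
            self.pending = []
            self.pending_bytes = 0


if __name__ == "__main__":
    batches = []
    sender = BatchingSender(8192, batches.append)
    for _ in range(100):
        sender.publish(b"x" * 100)  # 100 messages of 100 bytes each
    sender.flush()                  # push out the final partial batch
    print([len(b) for b in batches])
```

In this sketch, 100 small messages result in two calls to `send` rather than 100, which is the kind of per-message overhead reduction that matters when the CPU is the bottleneck. A real implementation would typically also flush on a timer so that messages are not held indefinitely at low send rates.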

                                            
    © Copyright 2002-2007. 29West Inc. All rights reserved.