Interpreting OpenDDS Performance Testing Results

by
Mike Martinez, Principal Engineer
Object Computing, Inc. (OCI)

Introduction

This document discusses the measurements, statistics, and charts used to describe the performance of OpenDDS. Performance tests can be executed using the OpenDDS-Bench[2] performance testing framework. The framework includes preconfigured tests that can be executed by any user in their own environment. Test results obtained using this framework include the measurements and data described here. OpenDDS-Bench provides scripts to reduce the raw data produced to plottable form and also scripts for charting using the GNUPlot[5] tool. The discussion here uses the existing preconfigured tests from the OpenDDS-Bench framework as the source of the data to analyze.

A test run from early 2010 using OpenDDS version 2.0 provided the data for the latency and jitter examples. Throughput data was taken from tests executed on the more recent version 2.1.2. The OpenDDS website[1] has more recent results and charts. The plotting scripts included in the code archive[4] associated with this article are based on those that are new on the development trunk after the version 2.1.2 release and will be included in the next release of OpenDDS. The code archive scripts plot only within the local application, while those included with the source code plot to image files. You can get the current versions of the source code scripts from the subversion repository[3] trunk prior to the next release.

Measured Values

Measurements for OpenDDS performance testing include time values, delays (the difference between two time values), amounts and sizes of data, and system parameters.

Time

Time values are always in seconds, or some scaled version of seconds, such as microseconds or milliseconds. Linux systems used for testing provide microsecond precision in time measurements. This is sufficient for the delays being measured, which are on the order of tens to hundreds of microseconds and longer. The accuracy of modern Linux kernels is better than this precision, so the measurements are not subject to clock errors.

It is difficult to establish and maintain synchronization of clocks between systems. Measurements that we have made in our lab environment indicate that the clock drift between the systems during a test is of the same magnitude as the delays being measured. This means that errors due to clock skew could range from 0 to +/-100% during the same test. If time measurements will be compared or combined during data reduction and analysis, it is important to account for these synchronization errors. To avoid this error source completely, all measurements to be compared can be made using the same clock, so that a potentially large synchronization error is not present in the processed results. If a network is included in a test, this implies that at least two hops are involved: one to reach a remote host and another to return to the originating host where the measurement is taken. Preconfigured OpenDDS-Bench tests that include delay measurements are structured as a simple two hop loop between a local and a remote host.

Intervals

Interval measurements made during all tests include the test duration, or the time that samples are being sent. This interval is determined by taking a time measurement at the point where samples are starting to be sent and again at the point where the writes are terminated. If the 'wait for acknowledgments' constraint has been specified, then this second measurement includes this post write interval as well. The test duration is the difference between these two measurements.

OpenDDS can measure per hop delay intervals when this feature is enabled. This is done via the latency statistics gathering extension of the DataReader Entity type. Since these per hop measurements suffer from clock skew between the hosts at each end of the hop, they are not used for multi host latency measurements by the OpenDDS-Bench performance test framework.

Interval measurements made during latency tests include individual sample delays. These are measured by determining a value at the time a sample is being constructed and adding that value to the sample being sent. Once a sample has been received at the terminating subscription, another time value is taken and the difference between these measurements is used as the latency delay for that sample.
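The stamping scheme just described can be sketched in a few lines of Python. This is only an illustration of the idea, not the OpenDDS-Bench implementation: the function names and the dictionary used as a sample are hypothetical, and `time.monotonic()` stands in for whatever clock the test process actually reads.

```python
import time

def make_sample(payload: bytes) -> dict:
    # Hypothetical sample: the sender's clock value is stored in the sample itself.
    return {"sent_at": time.monotonic(), "payload": payload}

def latency_on_receipt(sample: dict) -> float:
    # Read the same clock again at the terminating subscription and
    # take the difference as the latency delay for this sample.
    return time.monotonic() - sample["sent_at"]

sample = make_sample(b"x" * 1000)
# ... the sample would travel to the remote host and back here ...
round_trip = latency_on_receipt(sample)
print(f"measured delay: {round_trip * 1e6:.1f} microseconds")
```

Because both time values come from the same clock on the same host, no synchronization error enters the difference.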

Since the latency intervals are for more than a single hop, the single hop latency is estimated by dividing by the number of hops in the measurement path. For the preconfigured latency tests there are two hops, so the measurements are divided by two. Doing so results in slightly different statistical values than would have been obtained if we were able to accurately measure the single hop performance. If we assume that the latencies are Normally distributed (which is actually not likely to be the case, as discussed below), then we can use the results of combining Normally distributed random variables[9] to understand the estimated single hop performance. If each hop is independent with mean μ and variance σ², the two hop sum has mean 2μ and variance 2σ², so the halved measurement has mean μ but variance σ²/2.

The above discussion is based on the use of Normally distributed random variables and measurements. The distribution density for latencies tends to be skewed to the high end and does not reflect Normal statistics. This can be seen intuitively by considering that there are no negative latencies, which would be required to have a Normal distribution tail towards the left. The distribution resembles a Rayleigh distribution[18] more than a Gaussian distribution[17] since it has a fixed lower bound. Since we are using robust statistics, the detailed results will vary from the exact answers above, but they will likely be of the same form.

In addition to the mean and variance of the measurements, the measured extreme values (maximum and minimum) are not likely to accurately reflect the actual extreme values for a single hop. This can be seen intuitively by considering that it is unlikely that a latency will include two extreme values, one from each hop.

For the delay interval measurements of the OpenDDS-Bench preconfigured tests the latency delays are reported as the measurement divided by the number of hops. This should result in an accurate estimate of the average per hop latency and an optimistic estimate of the single hop variance.
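The effect of dividing a two hop measurement by two can be checked with a small simulation. This is a sketch under stated assumptions: the two hops are taken to be independent and Normally distributed, which the article notes is not likely to hold for real latencies, and the mean and standard deviation used here are made-up values.

```python
import random
import statistics

random.seed(2010)  # fixed seed so the sketch is repeatable
HOPS = 2
MEAN_US, STDEV_US = 200.0, 20.0  # made-up single hop parameters (assumption)

# Each measured delay is the sum of two independent single hop delays.
hop_a = [random.gauss(MEAN_US, STDEV_US) for _ in range(20000)]
hop_b = [random.gauss(MEAN_US, STDEV_US) for _ in range(20000)]
reported = [(a + b) / HOPS for a, b in zip(hop_a, hop_b)]

# Compare the reported per hop estimate against one true hop.
mean_ratio = statistics.fmean(reported) / statistics.fmean(hop_a)
variance_ratio = statistics.pvariance(reported) / statistics.pvariance(hop_a)
print(f"mean ratio {mean_ratio:.3f}, variance ratio {variance_ratio:.3f}")
```

The mean ratio comes out close to 1 while the variance ratio comes out close to one half, which is the sense in which the reported variance is optimistic.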

Other Measurements

In addition to time values, the number of samples sent and received and the sample payload size in bytes are measured. System parameters such as CPU and memory usage, swapping, and network stack performance are measured to the extent that the systems used in the testing allow.

When using the OpenDDS-Bench framework preconfigured tests on Linux hosts, the 'vmstat' command is used to gather system parameters during the test, the 'top' command is used to gather individual process information, and the 'netstat' command is used to gather network performance information. Other systems may use different commands to gather similar information.

Derived Values

In addition to the direct measurements described above, other values can be derived. These include the jitter, or first order difference between adjacent sample latencies, and the total throughput.

Jitter

Delay variation between samples is commonly called 'jitter'[15]. RFC 3393[16], "IP Packet Delay Variation Metric for IP Performance Metrics (IPPM)", defines mechanisms for deriving this kind of measurement. From RFC 3393 §1.1: "The variation in packet delay is sometimes called 'jitter'. This term, however, causes confusion because it is used in different ways by different groups of people."

RFC 3393 §4.5 Type-P-One-way-ipdv-jitter describes the mechanism that is used by the OpenDDS-Bench data analysis scripts to derive jitter values from measurement data. The scripts use a selection function that specifies that consecutive samples are selected for the packet pairs used in IPDV computation. The delay variation, or jitter, is then just the difference in our measured latency between consecutive samples.
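The consecutive sample selection described above amounts to a first order difference of the latency series. A minimal Python sketch follows; the function name and the latency values are illustrative and are not taken from the OpenDDS-Bench scripts.

```python
def consecutive_jitter(latencies):
    """Delay variation between each sample and the one sent just before it."""
    return [later - earlier for earlier, later in zip(latencies, latencies[1:])]

latencies_us = [210, 205, 230, 215]  # made-up per hop latencies in microseconds
jitter_us = consecutive_jitter(latencies_us)
print(jitter_us)  # [-5, 25, -15]
```

Note that N latency measurements yield N-1 jitter values, and that jitter can be negative even though latency cannot.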

Throughput

Throughput measurements are derived from the direct measurements of the total time spent sending samples, as described above, and the total number of payload bytes sent during the test. These measurements are normally presented in units of Mbps (not the IEC Mibps[6]) and related to either the rated capacity of the network or the measured maximum capacity of the network to transfer data using the ftp protocol.
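As a worked example of this derivation, the following Python sketch converts a payload byte count and a test duration into Mbps using decimal (not binary) prefixes. The function name and the test parameters are made up for illustration.

```python
def throughput_mbps(payload_bytes: int, duration_s: float) -> float:
    """Payload throughput in megabits per second (1 Mbps = 1,000,000 bits/s)."""
    return payload_bytes * 8 / duration_s / 1_000_000

# Hypothetical test: 1000 byte payloads at 10000 samples/second for 60 seconds.
total_bytes = 1000 * 10_000 * 60
rate = throughput_mbps(total_bytes, 60.0)
print(rate)  # 80.0
```

Only payload bytes are counted, so the result understates the bits actually placed on the wire by the amount of protocol overhead.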

While it is possible to estimate throughput with a finer granularity than the total test and observe how it might vary over the duration of the test, the preconfigured throughput tests do not perform this type of analysis.

Statistics

The collected data are subjected to statistical analysis. For our purposes, this analysis is an attempt to describe properties of the measurements concisely. We assume that the measurements we take come from a population described by a probability distribution and use the parameters of the distribution to summarize the values. This results in estimates[7] of location (what is the most likely value), scale (how spread out is the data), and shape (how symmetrical is the data).

Classical statistical methods tend to be sensitive to data that violates the underlying assumptions (such as being Normally distributed). Robust statistical estimators[8] address this issue and attempt to produce estimate values that are not affected as much by departures from model assumptions. This leads to stable results in the presence of outlier data points.

In the discussions below the data reduction scripts provided as part of the OpenDDS-Bench framework are assumed as the source of the statistical analysis.

Location

Estimates of location include the arithmetic mean and median values. The median value is a more robust estimate of the most likely value of a sample. For example, 9 numbers close to a value and 1 that is 100 times as large will result in an arithmetic mean that is roughly 10 times larger than it would have been if the last value were close to the others. The median value would be unchanged. The data reduction scripts derive median values because of this robustness.
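The outlier example can be reproduced directly with Python's standard `statistics` module. The values are illustrative only.

```python
import statistics

typical = [1.0] * 10               # ten values close to each other
with_outlier = [1.0] * 9 + [100.0]  # one value 100 times as large

mean_typical = statistics.mean(typical)
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print(mean_typical)    # 1.0
print(mean_outlier)    # 10.9
print(median_outlier)  # 1.0
```

A single extreme sample moves the mean by about a factor of ten and leaves the median where it was.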

Scale

The scale, or spread[10], of data can be measured in many different ways, with the classical statistical values of variance and standard deviation being the most common. Some measures of scale are also robust[11], such as the median absolute deviation (MAD)[12].
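One robust scale estimate is the median absolute deviation (MAD)[12], which the summary data reports alongside the standard deviation. A minimal Python sketch of the unscaled MAD is shown below; the function name and data are illustrative, and whether the data reduction scripts apply a scaling constant is not assumed here.

```python
import statistics

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (no scaling constant)."""
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)

values = [1, 2, 3, 4, 100]  # made-up data with one outlier
mad = median_absolute_deviation(values)
stdev = statistics.pstdev(values)
print(mad)                    # 1
print(stdev > 10 * mad)       # True
```

The outlier inflates the standard deviation but barely affects the MAD.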

Shape

The shape of a statistical distribution includes properties describing the 'peakedness' and the asymmetry of the distribution.

The 'peakedness' of a distribution is described by a measure called kurtosis. This describes the contribution of variance from extreme data to the variance of the distribution. Lower values indicate that variance is a result of frequent smaller deviations, while higher values indicate the variance is a result of fewer, more extreme values.

The asymmetry of a distribution is described by a measure called skewness. Positive skew indicates that the distribution has a longer tail on the right, and negative skew indicates a longer tail on the left.

The data reduction scripts do not calculate any shape parameters numerically. Note that qualitatively the shape of the latency distribution appears consistently skewed to the right, as described above, and that the jitter distribution appears Normally distributed with few, if any, contributions from extreme values.

Probability Distribution

In addition to the single value numerical estimators described above, a data set carries information about the underlying probability density distribution and its cumulative distribution function. Neither is observed directly; both are estimated from the measured samples. The plotting scripts described below do this with histograms and kernel density estimates[13] for the density and with quantile[14] plots for the cumulative function.
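Two simple estimators of these functions are a histogram for the density and an empirical cumulative distribution for the cumulative function. The Python sketch below uses made-up latency values and hypothetical function names; it is not the binning used by the OpenDDS-Bench scripts.

```python
from collections import Counter

def histogram(values, bin_width):
    """Map the lowest value of each occupied bin to its sample count."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

def empirical_cdf(values):
    """Pair each sorted value with the cumulative fraction of samples seen so far."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]

latencies_us = [203, 207, 211, 212, 219, 226, 248, 305]  # made-up values
bins = histogram(latencies_us, 10)
cdf = empirical_cdf(latencies_us)
print(bins)     # {200: 2, 210: 3, 220: 1, 240: 1, 300: 1}
print(cdf[-1])  # (305, 1.0)
```

The long gap before the last bin is the kind of right hand tail that a skewed latency distribution produces.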

Charts

The OpenDDS-Bench performance test framework provides scripts to plot the data produced by the tests. These scripts are specific to the GNUPlot[5] plotting tool. The charts produced by these scripts include Latency and Jitter information as well as Throughput data.

Data

The data and scripts used for charting are available in the code archive[4] associated with this article. The data is located in files produced by following the data reduction steps for the preconfigured tests of the OpenDDS-Bench framework, as described in the OpenDDS-Bench User Guide[2]. In general, the plot script files can be modified to create a chart from any of the data files. As provided, the latency plotting scripts use the data/latency-udp-1000.gpd data file and the throughput plotting script uses the data/throughput-data.gpd data file. To run the scripts using GNUPlot, simply use the script file as an argument to the command:

   shell> gnuplot scripts/plot-latency-timeline.gpi -

Some filenames include an indication of the transport and message size used in the test results being reported. The message size is a number indicating the number of payload bytes added to transmitted samples. Sizes used include: 50, 100, 250, 500, 1000, 2500, 5000, 8000, 16000, and 32000 bytes. The transport is indicated by one of the following abbreviations:

   mbe    Best effort multicast transport was used
   mrel   Reliable multicast transport was used
   tcp    TCP transport was used
   udp    UDP transport was used
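When processing many result files, the transport and message size can be recovered from the file name. The following Python sketch assumes the latency-<transport>-<size> naming used by the data files in the article code archive; the function name and the regular expression are hypothetical helpers, not part of OpenDDS-Bench.

```python
import re

TRANSPORTS = {
    "mbe": "best effort multicast",
    "mrel": "reliable multicast",
    "tcp": "TCP",
    "udp": "UDP",
}
# Matches names such as data/latency-udp-1000.gpd or latency-mrel-250.stats.
PATTERN = re.compile(r"latency-(mbe|mrel|tcp|udp)-(\d+)\.(?:gpd|stats)$")

def parse_latency_filename(path):
    """Return (transport description, payload size in bytes) for a data file name."""
    match = PATTERN.search(path)
    if match is None:
        raise ValueError(f"unrecognized latency file name: {path}")
    return TRANSPORTS[match.group(1)], int(match.group(2))

print(parse_latency_filename("data/latency-udp-1000.gpd"))  # ('UDP', 1000)
```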

The latency plot script file names include a type indication that determines whether latency or jitter is plotted. These correspond to the plot script files supplied as part of the OpenDDS-Bench framework, but differ in that these files only produce a local chart and do not create an image file for the charts. The files and scripts included are summarized in the table below.

Data and Script files included in the article code archive

File(s): data/latency-<transport>-<size>.gpd
Contents: Data set of sample measurements and derived values.
Description: These files contain the full set of latency and jitter data for each test case. Tests are executed for different transport and message size combinations. The resulting data is then converted into these files which contain several GNUPlot format indexes. The data consists of:

   Index  Column 1               Column 2
   0      latency measurements   derived jitter data
   1      lowest value of bin    latency sample count in bin
   2      lowest value of bin    jitter sample count in bin
   3      latency quantile data  ---
   4      jitter quantile data   ---

File(s): data/throughput-data.gpd
Contents: Throughput data for tests.
Description: This file contains the summary data for the execution of the steepest ascent test cases for Throughput tests using the UDP transport. The Bidirectional and Publication Bound Throughput tests each have a separate index with the message rate, sample size, and measured throughput included as column data:

   Index  Column 1                              Column 2                    Column 3
   0      sample sending rate (samples/second)  sample size (bytes/sample)  Bidirectional Throughput Test measured throughput (bps)
   1      sample sending rate (samples/second)  sample size (bytes/sample)  Publication Bound Throughput Test measured throughput (bps)

File(s): data/latency.csv
Contents: Summary data for all tests.
Description: This file contains a summary of all the statistical estimates for each transport and size combination for which test data exists. This data is in the form of a CSV file with fields defined as described in the file itself. The statistics include the arithmetic mean and standard deviation, the median and MAD, and the maximum and minimum values measured. This data is used to produce the summary charts for both latency and jitter.

File(s): data/latency-<transport>-<size>.stats
Contents: GNUPlot loadable string variables.
Description: These files contain GNUPlot specific commands to load string variables containing the statistical summary data for the particular transport and size of the file. The string variables are: latency_stats, which is added as a label to the latency histogram chart; and jitter_stats, which is added as a label to the jitter histogram chart.

File(s): scripts/plot-<type>-summary.gpi
Contents: GNUPlot commands to create a summary chart.
Description: These commands will produce a chart containing the summary data for each transport at all measured message sizes.

File(s): scripts/plot-<type>-timeline.gpi
Contents: GNUPlot commands to create a timeline chart.
Description: These commands will produce a timeline chart containing the individual sample data plotted over time.

File(s): scripts/plot-<type>-quantiles.gpi
Contents: GNUPlot commands to create a quantile chart.
Description: These commands will produce a quantile chart containing individual sample data.

File(s): scripts/plot-<type>-histogram.gpi
Contents: GNUPlot commands to create a histogram chart.
Description: These commands will produce a histogram chart of the latency or jitter data.

File(s): scripts/plot-<type>-density.gpi
Contents: GNUPlot commands to create a density chart.
Description: These commands will produce a kernel density estimation chart of the latency or jitter data.

File(s): scripts/plot-throughput.gpi
Contents: GNUPlot commands to create a throughput results chart.
Description: These commands will produce a plot with the results from the throughput test cases in the throughput data file.

Latency and Jitter

The Latency and Jitter data collected from the preconfigured Latency tests of the OpenDDS-Bench framework can be summarized and charted in different forms. This includes summary plots of the entire data set, Timeline plots for individual test runs, Quantile plots, Histograms, and Density plots for both the measured latencies and the jitter.

Throughput

The OpenDDS-Bench framework preconfigured throughput tests include different test topologies, sample rates, and sample sizes. The throughput for each of these cases is extracted and can then be plotted. In addition to the measured throughput data it is typical to include the specified test bit rate and the network capacity on the chart to bound the test data. For tests with bit rates lower than the network capacity, the measured throughput is bounded by the specified rate. For specified bit rates higher than the capacity, the capacity bounds the measured throughput.

The network capacity has both a theoretical and a practical value. The theoretical capacity is the rated network bandwidth, such as 100 Mbps, or 1 Gbps for a Gigabit network. The practical capacity can be estimated by performing a typical transfer at the maximum rate and measuring the achieved throughput. Transfers using the ftp protocol are commonly used to characterize practical limits on usable network bandwidth.

The following chart shows the results of two test topologies, bidirectional and publication bound, using a UDP transport implementation. They were executed in five separate test cases with different message rates and message sizes. The network was rated as a 1 Gbps network, and the practical limit of the network was determined by transferring ftp data as fast as possible. The ftp performance was 595 Mbps, or just under 60% of the theoretical rated capacity.

The black horizontal line is the rated network capacity limit. The red horizontal line is the practical network capacity. The orange diagonal line is the specified test throughput. While this line is below the network capacity lines, it bounds the measured throughput. This is observed for both test topologies. Once the practical network capacity is reached, both tests reach a throughput limit, one higher than the other. Since the bidirectional topology sends message traffic from both the local and the remote host, the traffic will collide as more is added, resulting in dropped packets and a limit similar to that of the ftp transfer, which also includes back flow traffic. The publication bound test topology sends samples from the originating host to two separate terminating hosts. No traffic from the receiving hosts to the sending host is generated, which results in somewhat more of the theoretical bandwidth being used.
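The bounding behavior described in this section can be expressed as the lesser of the specified test rate and the network capacity. The Python sketch below uses the 595 Mbps practical capacity reported above; the function name and the specified rates are made up for illustration, and the sketch ignores the differences between topologies noted in the text.

```python
PRACTICAL_CAPACITY_BPS = 595_000_000  # measured ftp throughput from the article

def expected_throughput_bound(specified_bps, capacity_bps=PRACTICAL_CAPACITY_BPS):
    """Upper bound on measured throughput for a given specified test bit rate."""
    return min(specified_bps, capacity_bps)

# Hypothetical test case: 1000 byte samples sent at 50000 samples/second.
below = expected_throughput_bound(50_000 * 1000 * 8)
# Hypothetical test case: 1000 byte samples sent at 100000 samples/second.
above = expected_throughput_bound(100_000 * 1000 * 8)
print(below)  # 400000000
print(above)  # 595000000
```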

Summary

In this article we have discussed the various types of measurements taken for OpenDDS performance testing and evaluation, and touched on issues with the measurements and their interpretation. We discussed ways to summarize the data as statistical values and as charts, and identified the charts that are available from the OpenDDS-Bench performance testing framework. See the OpenDDS-Bench User Guide[2] for details on how to set up and execute the preconfigured tests. The framework also provides scripts to perform data reduction to the formats needed to produce the charts discussed in this article.

The knowledge of the chart types and what they represent will be valuable in reviewing the existing and future performance testing results for OpenDDS that will be published. OpenDDS users can generate these charts locally for target system environments. They can then evaluate and compare the various performance parameters applicable to the desired system operation. The example charts were generated from testing in a low performance desktop environment and should not be used as a basis for evaluation of OpenDDS. Users are encouraged to use the OpenDDS-Bench performance test framework on their own systems to evaluate OpenDDS for suitability.

References

[1] OpenDDS Portal http://www.opendds.org
[2] OpenDDS-Bench User Guide https://svn.dre.vanderbilt.edu/viewvc/DDS/trunk/performance-tests/Bench/doc/userguide.html?view=co
[3] OpenDDS Subversion Repository https://svn.dre.vanderbilt.edu/viewvc/DDS/trunk
[4] Article Code Archive http://www.ociweb.com/mnb/code/mnb201003-code.tar.gz
[5] GNUPlot http://gnuplot.info/
[6] Binary Prefixes http://en.wikipedia.org/wiki/Binary_prefix
[7] Summary Statistics http://en.wikipedia.org/wiki/Summary_statistics
[8] Robust Statistics http://en.wikipedia.org/wiki/Robust_statistics
[9] Random Variable Sum http://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables
[10] Dispersion http://en.wikipedia.org/wiki/Statistical_dispersion
[11] Robust Scale http://en.wikipedia.org/wiki/Robust_measures_of_scale
[12] Median Absolute Deviation http://en.wikipedia.org/wiki/Median_absolute_deviation
[13] Kernel Density Estimation http://en.wikipedia.org/wiki/Kernel_density_estimation
[14] Quantiles http://en.wikipedia.org/wiki/Quantiles
[15] Packet Delay Variation http://en.wikipedia.org/wiki/Packet_delay_variation
[16] Jitter http://tools.ietf.org/html/rfc3393
[17] Gaussian Distribution http://en.wikipedia.org/wiki/Gaussian_distribution
[18] Rayleigh Distribution http://en.wikipedia.org/wiki/Rayleigh_distribution
