To measure bottleneck link bandwidth, you must tell nettimer the source of the measurements and the destination of the results. The source of the measurements can be either packet capture servers or trace files. The destination of the results is a file. In the following sections, I describe how these parameters are specified.
In this section I describe how to set up the measurement sources. The two kinds of measurement sources are nettimer run in distributed packet capture server mode and tcpdump traces. Use the distributed packet capture servers if you want to use the calculations in real time. Use the traces if you want to be able to repeat calculations for testing or debugging.
You must first decide which hosts you want to take measurements at and then which host will do the actual bandwidth calculation. The bandwidth calculation can consume many CPU cycles, so it may be better to do it on a machine that isn't being measured. On the other hand, the result of the calculation is likely to be most useful on one of the hosts being measured, so it may be better to do the calculation on one of the measured machines. One scenario would be to take measurements at a web client and a web server and then do the calculation at the server.
If you are going to take measurements and do calculation at the same machine, then you don't need to start up a separate distributed packet capture server at that machine. Nettimer run in bandwidth calculation client mode can start one up automatically (see Section 5.2).
On the measurement machines, start up nettimer in packet capture server mode:
nettimer {--run_dpcap_server} [ --interface interface ] [ --dpcap_filter expression ] [ --dpcap_server_port port_num ] [ --dpcap_server_packet_buffer packet_buffer_size ] [ --dpcap_server_send_timeout timeout ] [ --dpcap_cap_len cap_len ]
This starts a program that captures all packets it hears on the network interface named by interface (e.g. "eth0"). If this option isn't specified, the default interface is used, so the option is most useful on multi-homed hosts.
The captured packets are filtered using the tcpdump filter expression given by expression. Filtering the packets cuts down on the amount of data sent to each of the clients.
While it captures packets, the server listens for incoming TCP connections from clients on port_num. The default port is 9090. If the server can't get this port number, it picks a random port number and prints it out. Clients that want to connect to this server must then specify this non-default port number (see Section 5.2.1).
After a client connects, the server transmits the first cap_len bytes of each captured packet to that client. The default value of 60 should be sufficient to capture the IP and TCP headers of most packets on most link layer technologies. However, some link layers and/or packets with many IP options may push the TCP header too far into the packet, so this parameter may need to be increased. On the other hand, the larger this value is, the more data is sent to each client.
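As a rough check on the default, the following Python sketch adds up standard header sizes to see how many bytes are needed to reach the end of the TCP header. The Ethernet link layer and the helper name required_cap_len are assumptions chosen for illustration; nothing here is taken from nettimer itself.

```python
# Standard header sizes in bytes.
ETHERNET_HEADER = 14   # untagged Ethernet II frame header
IPV4_MIN_HEADER = 20   # IPv4 header with no options
IPV4_MAX_HEADER = 60   # IPv4 header with the maximum amount of options
TCP_MIN_HEADER = 20    # TCP header with no options


def required_cap_len(link_header: int, ip_header: int,
                     tcp_header: int = TCP_MIN_HEADER) -> int:
    """Bytes that must be captured to include the whole TCP header."""
    return link_header + ip_header + tcp_header


# Common case: Ethernet, no IP options.
typical = required_cap_len(ETHERNET_HEADER, IPV4_MIN_HEADER)
# Unusual case: Ethernet, maximum IP options.
worst_ip_options = required_cap_len(ETHERNET_HEADER, IPV4_MAX_HEADER)

print("typical:", typical, "with maximum IP options:", worst_ip_options)
```

Under these assumptions the common case fits within the default of 60 bytes, while a packet with the maximum amount of IP options does not, which is the situation where cap_len would need to be raised.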
The server buffers up captured headers until packet_buffer_size bytes have been captured or timeout seconds have passed since the last buffer was sent to the client. The tradeoff here is between the efficiency and timeliness of communication between the server and its clients. The packet_buffer_size can safely be set to be much larger than the TCP maximum segment size. However, setting the timeout to a low value causes timely but possibly small packets to be sent, while setting it to a large value causes less timely but likely larger packets to be sent.
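The size-or-timeout policy described above can be sketched in Python as follows. This is only an illustration of the tradeoff, not nettimer's implementation: the class HeaderBuffer, its method names, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable, List, Optional


class HeaderBuffer:
    """Collects captured headers and releases them in batches.

    A batch is released when at least packet_buffer_size bytes are pending,
    or when timeout seconds have passed since the last batch was released.
    """

    def __init__(self, packet_buffer_size: int, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.packet_buffer_size = packet_buffer_size
        self.timeout = timeout
        self.clock = clock
        self.pending: List[bytes] = []
        self.pending_bytes = 0
        self.last_send = clock()

    def add(self, header: bytes) -> Optional[List[bytes]]:
        """Buffer one header; return a batch if one is now due."""
        self.pending.append(header)
        self.pending_bytes += len(header)
        return self.flush_if_due()

    def flush_if_due(self) -> Optional[List[bytes]]:
        """Return the pending batch if the size or time limit was reached."""
        full = self.pending_bytes >= self.packet_buffer_size
        stale = self.clock() - self.last_send >= self.timeout
        if self.pending and (full or stale):
            batch = self.pending
            self.pending = []
            self.pending_bytes = 0
            self.last_send = self.clock()
            return batch
        return None
```

With a small timeout, flush_if_due releases short batches often; with a large timeout, batches are released mainly when the size limit is reached.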
The server can handle any number of clients. The clients can open and close connections at any time, but they will only receive packets that were captured during the time they were connected.
The traces must be gathered using a version of tcpdump with the same trace file format as the version of libpcap that is linked with nettimer. The name of the trace file must be of the form IPAddress_pcap or IPAddress_*_pcap, where IPAddress is the IP address of the host where the trace was taken. In the absence of synchronized clocks, nettimer needs this information to determine whether a packet is coming or going in a trace.
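Before handing trace files to nettimer, it can help to check that their names follow this convention. The Python sketch below extracts the address from a file name; the function name trace_host_address is hypothetical, and the sketch assumes IPv4 addresses, which is how the examples in this document are written.

```python
import ipaddress
import os


def trace_host_address(path: str) -> ipaddress.IPv4Address:
    """Return the host address encoded in a trace file name.

    Accepts names of the form IPAddress_pcap or IPAddress_<anything>_pcap
    and raises ValueError for anything else.
    """
    name = os.path.basename(path)
    if not name.endswith("_pcap"):
        raise ValueError(f"{name!r} does not end in _pcap")
    # Everything before the first underscore must be the address.
    address_text = name.split("_", 1)[0]
    return ipaddress.IPv4Address(address_text)


for candidate in ["127.0.0.2_pcap", "127.0.0.3_run1_pcap"]:
    print(candidate, "->", trace_host_address(candidate))
```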
In this section I describe how to run nettimer in bandwidth calculation client mode. In this mode, nettimer gathers measurements from measurement sources, calculates the bandwidth, and outputs the result. Before running in this mode, you must set up the measurement sources (see Section 5.1). There are so many options for running the bandwidth calculation client that I have divided them into the general options and the calculation options.
The general client options cover how to specify the verbosity, how to specify the measurement sources, how to get results, and how to control memory usage.
nettimer [ --version ] [ --verbose verbosity_level ] { --dpcap_servers address_or_hostname:portnum | --read_trace file_name } [ --interface interface ] [ --dpcap_filter expression ] [ --log_file_name log_file_name ] [ --update_file_name update_file_name ] [ --update_interval update_interval ] [ --no_convert_addresses ] [ --qualify_host_names ] [ --maximum_number_of_flows max_num_flows ] [ --maximum_flow_length max_flow_len ]
The --version option causes the version number to be printed to stdout.
The --verbose option controls how verbose the output is. The default is 0.
If you are using live distributed packet capture servers, then specify the --dpcap_servers option followed by the addresses or hostnames of the servers as a single command line argument. For example, to take measurements from distributed packet capture servers running on the localhost, listening on the default port on host1, and listening on port 10001 on host2:
nettimer --dpcap_servers "localhost host1 host2:10001"
Nettimer will automatically start up a local distributed packet capture server if "localhost" is specified as one of the servers, so you almost never need to explicitly start up a distributed packet capture server on the same host as the bandwidth calculation client. The --interface option specifies the interface for this local server to capture on.
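If you build the server list in a script, the following Python sketch shows one way to read the space-separated host[:port] form used above. The function name parse_dpcap_servers is hypothetical and this is not nettimer's own parser; the default port of 9090 is the one given in Section 5.1. The sketch does not handle IPv6 literals.

```python
from typing import List, Tuple

DEFAULT_DPCAP_PORT = 9090  # default server port stated in this document


def parse_dpcap_servers(argument: str) -> List[Tuple[str, int]]:
    """Split a server list such as "localhost host1 host2:10001".

    Returns (host, port) pairs in command line order, so the position of
    each pair in the list matches the server's command line index.
    """
    servers = []
    for entry in argument.split():
        host, separator, port_text = entry.partition(":")
        port = int(port_text) if separator else DEFAULT_DPCAP_PORT
        servers.append((host, port))
    return servers


for index, (host, port) in enumerate(
        parse_dpcap_servers("localhost host1 host2:10001")):
    print(index, host, port)
```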
On the other hand, if you are reading trace(s), then you must specify the --read_trace option followed by the filename(s) of the traces as a single command line argument. For example, to read traces from the files 127.0.0.2_pcap and 127.0.0.3_pcap:
nettimer --read_trace "127.0.0.2_pcap 127.0.0.3_pcap"
Regardless of the type of the measurement sources, you can specify a filter expression using the --dpcap_filter option to reduce the amount of data that has to be transferred from the servers to the client and/or the amount of computation the client has to do. If you are using live servers, then the expression is sent to all the servers when the client connects. Note that the server uses both its own filtering expression and the client's expression when determining what packet reports are sent to that client. If you are reading from traces, then the expression is applied to all of the traces.
Nettimer can generate results in three ways: to the screen, to a file that is updated with only the latest results, and to a log file containing all the results. By default the results are written to the screen. You can also redirect stdout to a file, which will then be updated continuously with the latest results, or name the update file explicitly with the --update_file_name option. The flows are sorted from most-recently used to least-recently used. Use the --update_interval option to control how frequently the screen or this file is updated. To exit this mode, press 'q'. Any other key causes the file to be updated. Nettimer uses flock() to lock the update file while it is being updated, so programs that need to read a consistent update file should flock() it before reading. To generate the log file, specify the --log_file_name option. This option can generate very large files if nettimer runs for a long time.
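A program that reads the update file can take the lock as in the following Python sketch, which uses the standard fcntl module on Unix-like systems. The function name read_update_file is hypothetical, and the choice of a shared lock assumes that nettimer holds an exclusive lock while writing; use fcntl.LOCK_EX instead if that assumption does not hold.

```python
import fcntl


def read_update_file(path: str) -> str:
    """Read the whole update file while holding an flock() on it."""
    with open(path, "r") as handle:
        # Blocks until the writer releases its lock (assumed exclusive).
        fcntl.flock(handle, fcntl.LOCK_SH)
        try:
            return handle.read()
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Because flock() locks are advisory, this only gives a consistent view if every reader and writer of the file takes the lock.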
By default, nettimer output uses an unqualified hostname to identify hosts. The --no_convert_addresses option causes nettimer to use IP addresses instead of hostnames. Nettimer will also use IP addresses if it cannot resolve addresses to names while it is calculating (e.g. no network connection or no reverse mapping in the DNS server). The --qualify_host_name option causes nettimer to write fully domain-qualified hostnames.
Use the --maximum_number_of_flows option (default: 100) to control the maximum number of flows that the client keeps track of. If this limit is exceeded, then the oldest flow and all its packets are discarded. You might have to increase this limit if you want to keep track of many flows or decrease it if you are running out of memory. Use the --maximum_flow_length option (default: 1000) to control the maximum number of packets each flow can have. You might have to increase this limit if you want to use information from more than 1000 packets ago or decrease it if you are running out of memory.
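The effect of the two limits on memory can be pictured with the following Python sketch. It is not nettimer's data structure: the class FlowTable is hypothetical, and it assumes that the "oldest" flow means the least-recently used one, in line with the most-recently-used ordering of the output.

```python
from collections import OrderedDict, deque
from typing import Any, Deque, Tuple

FlowKey = Tuple[str, str]  # (source address, destination address)


class FlowTable:
    """Keeps at most max_num_flows flows of at most max_flow_len packets."""

    def __init__(self, max_num_flows: int = 100,
                 max_flow_len: int = 1000) -> None:
        self.max_num_flows = max_num_flows
        self.max_flow_len = max_flow_len
        # Ordered from least-recently used to most-recently used.
        self.flows: "OrderedDict[FlowKey, Deque[Any]]" = OrderedDict()

    def record(self, source: str, dest: str, packet: Any) -> None:
        key = (source, dest)
        if key not in self.flows:
            if len(self.flows) >= self.max_num_flows:
                # Discard the least-recently used flow and all its packets.
                self.flows.popitem(last=False)
            # A bounded deque drops the oldest packet once it is full.
            self.flows[key] = deque(maxlen=self.max_flow_len)
        self.flows.move_to_end(key)
        self.flows[key].append(packet)
```

Memory use in this sketch is bounded by max_num_flows times max_flow_len packet records, which is why lowering either limit helps when memory is short.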
The client calculation options control how the client does the bandwidth calculation. These settings exist mostly to make experimenting with the algorithms easier, so you shouldn't need to tweak them.
nettimer [ --calc_restrict_packet_pair packet_pair_algorithm ] [ --calc_window_size max_flow_len ] [ --calc_kernel_bound_ratio kernel_bound_ratio ] [ --calc_kernel_resolution kernel_resolution ] [ --calc_kernel_function kernel_function ] [ --calc_min_bunch_size min_bunch_size ] [ --calc_max_bunch_size max_bunch_size ] [ --calc_potential_bw potential_bw ] [ --calc_min_delta_rtt ] [ --calc_max_delta_rtt ] [ --calc_no_bw_ratio max_flow_len ]
The --calc_restrict_packet_pair option (default: auto) restricts the client to only calculating one of SBPP (Sender Based Packet Pair), RBPP (Receiver Based Packet Pair), ROPP (Receiver Only Packet Pair), or auto (automatically select the optimal algorithm). This option is only useful to compare the effectiveness of the different algorithms.
The --calc_window_size option controls how many samples of history to use in the calculation of bandwidth. Larger values allow a more stable calculation while smaller values allow faster reaction to bandwidth change. This value cannot exceed the maximum flow length (see Section 5.2.1).
The --calc_kernel_bound_ratio, --calc_kernel_resolution, and --calc_kernel_function options control how the kernel density algorithm works. The --calc_kernel_bound_ratio (default: .05) option controls the ratio of the kernel width to the kernel's x value. A larger value gives a smoother distribution function, while smaller values are faster to calculate. The --calc_kernel_resolution (default: 100) option controls the minimal parameter resolution of the kernel density algorithm. This means that different samples as much as kernel_resolution apart may generate the same kernel value. A larger value will make the algorithm faster, while a smaller value will make the algorithm more accurate. The --calc_kernel_function option (default: triangular) selects what kernel function to use. The choices are "uniform" and "triangular". In almost all cases, the default is sufficient.
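To give a feel for what the bound ratio and the kernel function do, here is a generic kernel density sketch in Python. It is not nettimer's algorithm: the function names are hypothetical, the kernel width is assumed to be bound_ratio times the x value being evaluated, the estimate is simply the sample with the highest density, and the kernel_resolution shortcut is left out.

```python
from typing import Callable, Dict, Sequence


def uniform(u: float) -> float:
    """Uniform kernel: full weight inside the window, none outside."""
    return 1.0 if abs(u) <= 1.0 else 0.0


def triangular(u: float) -> float:
    """Triangular kernel: weight falls off linearly with distance."""
    return max(0.0, 1.0 - abs(u))


KERNELS: Dict[str, Callable[[float], float]] = {
    "uniform": uniform,
    "triangular": triangular,
}


def density(x: float, samples: Sequence[float],
            bound_ratio: float = 0.05, kernel: str = "triangular") -> float:
    """Sum of kernel weights of all samples around x (x must be positive)."""
    width = bound_ratio * x  # assumed meaning of the bound ratio
    weigh = KERNELS[kernel]
    return sum(weigh((x - s) / width) for s in samples)


def densest_sample(samples: Sequence[float], bound_ratio: float = 0.05,
                   kernel: str = "triangular") -> float:
    """Return the sample whose neighbourhood has the highest density."""
    return max(samples,
               key=lambda x: density(x, samples, bound_ratio, kernel))


# Made-up bandwidth samples in bits/second: a cluster plus two outliers.
samples = [9.8e6, 10.0e6, 10.1e6, 10.2e6, 3.0e6, 55.0e6]
print(densest_sample(samples))
```

In this sketch a larger bound_ratio widens each kernel, so more samples contribute to every point and the density curve becomes smoother, at the cost of blurring nearby clusters together.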
In this section, I describe how to interpret the results that nettimer generates.
Nettimer organizes its measurements by flow. A flow is defined as all packets that travel from a specific IP address to another specific IP address. For each flow, nettimer calculates several bandwidth metrics, like Sender Based Packet Pair or Receiver Based Packet Pair. The types and meanings of these metrics depend on the orientation of the distributed packet capture (libdpcap) servers to the flow. A flow could originate from a libdpcap server, terminate at a libdpcap server, both arrive at and leave from a libdpcap server (e.g. the libdpcap server is a router), or bypass a libdpcap server completely.
Nettimer uses some rules to determine which metric it should calculate. If it only has the arrival timings for packets in a flow at a libdpcap server, then it computes Receiver Only Packet Pair to that server. If it has both transmission timings and round trip timings for packets in a flow, then it computes Sender Based Packet Pair from that server. If it has the transmission timings for a flow from one libdpcap server and the arrival timings at another server, then it computes Receiver Based Packet Pair from the transmission timings server to the arrival timings server. For any particular flow, more than one of these rules may apply, and for each rule, there may be multiple combinations of libdpcap servers. As a result, nettimer generates multiple bandwidth metrics for each flow.
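The three rules above can be written down as a small Python sketch. It only restates the rules in code and is not nettimer's selection logic: the function name applicable_metrics and its arguments are hypothetical, the (FI, TI) numbering follows the pattern of the sample output in this document, and the sketch lists every metric that could apply without modelling when nettimer skips Receiver Only Packet Pair.

```python
from typing import List, Set, Tuple

Row = Tuple[int, int, str]  # (FI, TI, metric name)


def applicable_metrics(send_servers: Set[int], rtt_servers: Set[int],
                       arrival_servers: Set[int]) -> List[Row]:
    """List the metrics that the available timings allow for one flow.

    Each argument is a set of server indices (command line order) that
    supplied transmission, round trip, or arrival timings for the flow.
    """
    rows: List[Row] = []
    # Arrival timings alone are enough for Receiver Only Packet Pair.
    for to_index in sorted(arrival_servers):
        rows.append((-1, to_index, "ROPP"))
    # Transmission plus round trip timings at one server give SBPP.
    for index in sorted(send_servers & rtt_servers):
        rows.append((index, index, "SBPP"))
    # Transmission timings at one server and arrival timings at another
    # give Receiver Based Packet Pair between the two.
    for from_index in sorted(send_servers):
        for to_index in sorted(arrival_servers):
            if from_index != to_index:
                rows.append((from_index, to_index, "RBPP"))
    return rows


print(applicable_metrics(send_servers={1}, rtt_servers=set(),
                         arrival_servers={0}))
```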
As described in Section 5.2.1, nettimer can generate results by updating a file with only the latest results or appending to a log file. Both are described in the following sections.
The updated results file consists of a line of column headings followed by a variable number of lines, each describing a metric of a particular flow. The columns are space delimited.
The meanings of the columns are described below. Only the non-verbose columns are documented.
Updated Results File Columns
Heading : FlowSource
Meaning : The hostname or IP address of the source of the flow.
Heading : FlowDest
Meaning : The hostname or IP address of the destination of the flow.
Heading : FI
Meaning : The index (relative to the command line specification) of the distributed packet capture server or trace file where the transmission timings for the packets of this flow are taken. An index of -1 indicates that nettimer has no information about the transmission timings for this flow.
Heading : TI
Meaning : The index (relative to the command line specification) of the distributed packet capture server or trace file to which these measurements are taken.
Heading : Metr
Meaning : The name of the metric that is calculated. "SBPP" = Sender Based Packet Pair, "RBPP" = Receiver Based Packet Pair, "ROPP" = Receiver Only Packet Pair.
Heading : Bandwidth
Meaning : The bandwidth measured in bits/second.
An example of non-verbose output:
Cmds: (q)uit, any other key to update
FlowSource    FlowDest      FI  TI  Metr  Bandwidth
192.168.45.1  10.0.0.2      -1   0  ROPP  87972363.38
192.168.45.1  10.0.0.2       0   0  SBPP  4004618.24
192.168.45.1  10.0.0.2       1   0  RBPP  87833710.03
10.0.0.2      192.168.45.1  -1   0  ROPP  139792.80
10.0.0.2      192.168.45.1   0   1  RBPP  137449.52
10.0.0.2      192.168.45.1   1   1  SBPP  43924.45
Lines 3-5 describe a flow from 192.168.45.1 to 10.0.0.2. The libdpcap servers specified on the command line were "10.0.0.2 192.168.45.1", so 10.0.0.2 is server 0 and 192.168.45.1 is server 1.
Line 3 says that the bandwidth using Receiver Only Packet Pair and the arrival timings taken at libdpcap server 0 (10.0.0.2) is 87972363.38 b/s. In other words, this is an estimate of the bottleneck bandwidth from 192.168.45.1 to 10.0.0.2.
Line 4 says that the bandwidth using Sender Based Packet Pair and the transmission and round trip timings taken at libdpcap server 0 (10.0.0.2) is 4004618.24 b/s. Since the packets of this flow are travelling to 10.0.0.2, this is the bottleneck bandwidth from the driver in 10.0.0.2's operating system to its TCP/IP stack and back to the driver.
Line 5 says that the bandwidth using Receiver Based Packet Pair, the transmission times at libdpcap server 1 (192.168.45.1), and the arrival times at libdpcap server 0 (10.0.0.2) is 87833710.03 b/s. This is another (and probably the most accurate) estimate of the bottleneck bandwidth from 192.168.45.1 to 10.0.0.2.
The last three lines describe the reverse flow, from 10.0.0.2 to 192.168.45.1. They report the same kinds of metrics in the reverse direction, but the bandwidths are much smaller. In fact, this path has symmetric bandwidth, but only TCP acknowledgements are flowing in the reverse direction. Small and slowly sent packets like acknowledgements are insufficient to measure the high (100Mb/s) bottleneck bandwidth that exists on this path.
When both the sending and arrival times of packets are available, then nettimer in most cases will not compute the Receiver Only Packet Pair bandwidth between hosts because it is inferior to the Receiver Based Packet Pair bandwidth. However, sometimes nettimer gets the arrival times for packets before it gets the send times (e.g. the client is at the receiver). In this case, nettimer will calculate a ROPP bandwidth because it has no way of knowing when or if the send timings will arrive. When the send timings do arrive, then nettimer will calculate the RBPP bandwidth, but the ROPP bandwidth will remain.
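A program that consumes the update file can split each line on whitespace, as in the following Python sketch. The names MetricRow and parse_update_file are hypothetical; the sample text is the non-verbose example output from this section, and the sketch assumes the non-verbose column layout documented above.

```python
from typing import List, NamedTuple

METRICS = {"SBPP", "RBPP", "ROPP"}

SAMPLE = """\
Cmds: (q)uit, any other key to update
FlowSource FlowDest FI TI Metr Bandwidth
192.168.45.1 10.0.0.2 -1 0 ROPP 87972363.38
192.168.45.1 10.0.0.2 0 0 SBPP 4004618.24
192.168.45.1 10.0.0.2 1 0 RBPP 87833710.03
10.0.0.2 192.168.45.1 -1 0 ROPP 139792.80
10.0.0.2 192.168.45.1 0 1 RBPP 137449.52
10.0.0.2 192.168.45.1 1 1 SBPP 43924.45
"""


class MetricRow(NamedTuple):
    flow_source: str
    flow_dest: str
    from_index: int      # FI column, -1 when no transmission timings
    to_index: int        # TI column
    metric: str          # SBPP, RBPP, or ROPP
    bandwidth_bps: float


def parse_update_file(text: str) -> List[MetricRow]:
    """Parse non-verbose update file text into one record per metric."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the command hint, the heading line, and anything malformed.
        if len(fields) != 6 or fields[4] not in METRICS:
            continue
        rows.append(MetricRow(fields[0], fields[1], int(fields[2]),
                              int(fields[3]), fields[4], float(fields[5])))
    return rows


rows = parse_update_file(SAMPLE)
# Prefer RBPP rows, which the text describes as the better estimate.
for row in rows:
    if row.metric == "RBPP":
        print(row.flow_source, "->", row.flow_dest, row.bandwidth_bps)
```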
The log file consists of a series of lines, one for each event that could change a bandwidth estimate. This is usually the transmission or arrival of a packet. The columns are space delimited.
The meanings of the columns are described below. Only the non-verbose columns are documented.
Log File Columns
Column : 1
Meaning : The time at which a packet was sent.
Column : 2
Meaning : The time at which a packet was received.
Column : FlowSource
Meaning : The hostname or IP address of the source of the flow.
Column : FlowDest
Meaning : The hostname or IP address of the destination of the flow.
Column : FI
Meaning : The index (relative to the command line specification) of the distributed packet capture server or trace file where the transmission timings for the packets of this flow are taken. An index of -1 indicates that nettimer has no information about the transmission timings for this flow.
Column : TI
Meaning : The index (relative to the command line specification) of the distributed packet capture server or trace file to which these measurements are taken.
Column : ID
Meaning : The IP packet id.
Column : Metr
Meaning : The name of the metric that is calculated. "SBPP" = Sender Based Packet Pair, "RBPP" = Receiver Based Packet Pair, "ROPP" = Receiver Only Packet Pair.
Column : Bandwidth
Meaning : The bandwidth measured in bits/second.