How to check interconnect performance in Greenplum Appliance

Post date: Feb 06, 2014 12:17:34 AM

The gpcheckperf command is used to verify the baseline hardware performance of the specified hosts.

gpcheckperf -d test_directory [-d test_directory ...]

{-f hostfile_gpcheckperf | -h hostname [-h hostname ...]}

[-r ds] [-B block_size] [-S file_size] [-D] [-v|-V]

gpcheckperf -d temp_directory

{-f hostfile_gpchecknet | -h hostname [-h hostname ...]}

[ -r n|N|M [--duration time] [--netperf] ] [-D] [-v|-V]

[gpadmin@sachi]$ gpcheckperf -?

COMMAND NAME: gpcheckperf

Verifies the baseline hardware performance of the specified hosts.

*****************************************************

SYNOPSIS

*****************************************************

gpcheckperf -d <test_directory> [-d <test_directory> ...]

{-f <hostfile_gpcheckperf> | -h <hostname> [-h <hostname> ...]}

[-r ds] [-B <block_size>] [-S <file_size>] [-D] [-v|-V]

gpcheckperf -d <temp_directory>

{-f <hostfile_gpchecknet> | -h <hostname> [-h <hostname> ...]}

[ -r n|N|M [--duration <time>] [--netperf] ] [-D] [-v|-V]

gpcheckperf -? 

gpcheckperf --version

*****************************************************

DESCRIPTION

*****************************************************

The gpcheckperf utility starts a session on the specified hosts and runs the following performance tests:

* Disk I/O Test (dd test) - To test the sequential throughput performance of a logical disk or file system, the utility uses the dd command, which is a standard UNIX utility. It times how long it takes to write and read a large file to and from disk and calculates your disk I/O performance in megabytes (MB) per second. By default, the file size that is used for the test is calculated at two times the total RAM on the host. This ensures that the test is truly testing disk I/O and not using the memory cache.
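
For reference, the kind of timed write/read this test performs can be approximated by hand with dd. The sketch below is only an illustration, not the exact command gpcheckperf issues; the file size, block size, and path are example values (gpcheckperf computes its own file size automatically, see the -S option below):

# Write a file roughly twice the host's RAM using 32KB blocks (the gpcheckperf
# default block size), then time reading it back. Sizes and paths are examples only.
$ time dd if=/dev/zero of=/data1/gpcheckperf_scratch bs=32k count=4194304   # ~128 GB, for a host with 64 GB RAM
$ time dd if=/data1/gpcheckperf_scratch of=/dev/null bs=32k
$ rm /data1/gpcheckperf_scratch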

* Memory Bandwidth Test (stream) - To test memory bandwidth, the utility uses the STREAM benchmark program to measure sustainable memory bandwidth (in MB/s). This tests that your system is not limited in performance by the memory bandwidth of the system in relation to the computational performance of the CPU. In applications where the data set is large (as in Greenplum Database), low memory bandwidth is a major performance issue. If memory bandwidth is significantly lower than the theoretical bandwidth of the CPU, then it can cause the CPU to spend significant amounts of time waiting for data to arrive from system memory.

* Network Performance Test (gpnetbench) - To test network performance (and thereby the performance of the Greenplum Database interconnect), the utility runs a network benchmark program that transfers a 5 second stream of data from the current host to each remote host included in the test. The data is transferred in parallel to each remote host and the minimum, maximum, average and median network transfer rates are reported in megabytes (MB) per second. If the summary transfer rate is slower than expected (less than 100 MB/s), you can run the network test serially using the -r n option to obtain per-host results. To run a full-matrix bandwidth test, you can specify -r M, which causes every host to send and receive data from every other host specified. This test is best used to validate whether the switch fabric can tolerate a full-matrix workload.
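
For example, if the summary rate from a parallel run looks low, the same host file can be re-run serially or in full-matrix mode (the host file name here is the one used in the examples later in this post):

$ gpcheckperf -f hostfile_gpchecknet_ic1 -r n -d /tmp   # serial mode: reports per-host results
$ gpcheckperf -f hostfile_gpchecknet_ic1 -r M -d /tmp   # full-matrix mode: every host sends to every other host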

To specify the hosts to test, use the -f option to specify a file containing a list of host names, or use the -h option to name single host names on the command-line. At least one host name (-h) or a host file (-f) is required.

You must also specify at least one test directory (with -d). The user who runs gpcheckperf must have write access to the specified test directories on all remote hosts. For the disk I/O test, the test directories should correspond to your segment data directories (primary and/or mirrors). For the memory bandwidth and network performance tests, a test directory is still required to copy over the test program files.

Before using gpcheckperf, you must have a trusted host setup between the hosts involved in the performance test. You can use the utility gpssh-exkeys to update the known host files and exchange public keys between hosts if you have not done so already.
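
For example, if the keys have not yet been exchanged, something like the following can be run from the master, using the same host file that will be passed to gpcheckperf:

$ gpssh-exkeys -f hostfile_gpcheckperf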

Note that gpcheckperf calls gpssh and gpscp, so these Greenplum utilities must also be in your $PATH.

*****************************************************

OPTIONS

*****************************************************

-B <block_size>

Specifies the block size (in KB or MB) to use for disk I/O test. The default is 32KB, which is the same as the Greenplum Database page size. The maximum block size is 1 MB.

-d <test_directory> 

For the disk I/O test, specifies the file system directory locations to test. You must have write access to the test directory on all hosts involved in the performance test. You can use the -d option multiple times to specify multiple test directories (for example, to test disk I/O of your primary and mirror data directories).

-d <temp_directory> 

For the network and stream tests, specifies a single directory where the test program files will be copied for the duration of the test. You must have write access to this directory on all hosts involved in the test.

-D (display per-host results)

Reports performance results for each host for the disk I/O tests. The default is to report results for just the hosts with the minimum and maximum performance, as well as the total and average performance of all hosts.

--duration <time>

Specifies the duration of the network test in seconds (s), minutes (m), hours (h), or days (d). The default is 15 seconds.
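
For example, to stretch the network test to one minute instead of the default 15 seconds (the "60s" spelling follows the unit suffixes listed above and is assumed here; the host file is the one from the examples below):

$ gpcheckperf -f hostfile_gpchecknet_ic1 -r N --duration 60s -d /tmp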

-f <hostfile_gpcheckperf>

For the disk I/O and stream tests, specifies the name of a file that contains one host name per host that will participate in the performance test. The host name is required, and you can optionally specify an alternate user name and/or SSH port number per host. The syntax of the host file is one host per line as follows:

[username@]<hostname>[:ssh_port]
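
A hypothetical host file mixing these forms might look like the following (host names, user name, and port are placeholders):

sdw1
sdw2
gpadmin@sdw3
gpadmin@sdw4:2222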

-f <hostfile_gpchecknet>

For the network performance test, all entries in the host file must be for host addresses within the same subnet. If your segment hosts have multiple network interfaces configured on different subnets, run the network test once for each subnet. For example (a host file containing segment host address names for interconnect subnet 1):

sdw1-1

sdw2-1

sdw3-1

-h <hostname>

Specifies a single host name (or host address) that will participate in the performance test. You can use the -h option multiple times to specify multiple host names.

--netperf

Specifies that the netperf binary should be used to perform the network test instead of the Greenplum network test. To use this option, you must download netperf from www.netperf.org and install it into $GPHOME/bin/lib on all Greenplum hosts (master and segments).
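
One way to distribute the binary is with gpscp (mentioned above). This is only a sketch; it assumes netperf has already been downloaded and built locally, that the host file covers every host that will take part in the test, and that $GPHOME resolves to the same path on every host:

# copy the locally built netperf binary into $GPHOME/bin/lib on each host in the file
$ gpscp -f hostfile_gpchecknet_ic1 ./netperf =:$GPHOME/bin/lib/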

-r ds{n|N|M}

Specifies which performance tests to run. The default is dsn:

* Disk I/O test (d)

* Stream test (s)

* Network performance test in sequential (n), parallel (N), or full-matrix (M) mode. The optional --duration option specifies how long (in seconds) to run the network test. To use the parallel (N) mode, you must run the test on an even number of hosts. If you would rather use netperf (www.netperf.org) instead of the Greenplum network test, you can download it and install it into $GPHOME/bin/lib on all Greenplum hosts (master and segments). You would then specify the optional --netperf option to use the netperf binary instead of the default gpnetbench* utilities.

-S <file_size>

Specifies the total file size to be used for the disk I/O test for all directories specified with -d. <file_size> should equal two times total RAM on the host. If not specified, the default is calculated at two times the total RAM on the host where gpcheckperf is executed. This ensures that the test is truly testing disk I/O and not using the memory cache. You can specify sizing in KB, MB, or GB.
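
For example, on a host with 64 GB of RAM the file size could be set explicitly to twice that amount (the 128GB value and the host and directory names below are illustrative only):

# check installed memory, then size the test file at roughly twice that amount
$ free -g
$ gpcheckperf -h sdw1 -d /data1 -r d -S 128GB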

-v (verbose) | -V (very verbose)

Verbose mode shows progress and status messages of the performance tests as they are run. Very verbose mode shows all output messages generated by this utility.

--version

Displays the version of this utility.

-? (help)

Displays the online help.

*****************************************************

EXAMPLES

*****************************************************

Run the disk I/O and memory bandwidth tests on all the hosts in the file hostfile_gpcheckperf, using the test directories /data1 and /data2:

$ gpcheckperf -f hostfile_gpcheckperf -d /data1 -d /data2 -r ds

Run only the disk I/O test on the hosts named sdw1 and sdw2 using the test directory of /data1. Show individual host results and run in verbose mode:

$ gpcheckperf -h sdw1 -h sdw2 -d /data1 -r d -D -v

Run the parallel network test using the test directory /tmp, where each hostfile_gpchecknet_ic* file lists the network interface host names within a single interconnect subnet:

$ gpcheckperf -f hostfile_gpchecknet_ic1 -r N -d /tmp

$ gpcheckperf -f hostfile_gpchecknet_ic2 -r N -d /tmp

Run the same test as above, but use netperf instead of the Greenplum network test (note that netperf must be installed in $GPHOME/bin/lib on all Greenplum hosts):

$ gpcheckperf -f hostfile_gpchecknet_ic1 -r N --netperf -d /tmp

$ gpcheckperf -f hostfile_gpchecknet_ic2 -r N --netperf -d /tmp

gpcheckperf -f /home/gpadmin/gpconfigs/hostfile_gpdb_ic1 -r N -d /tmp     ======> ic1 represents Interconnect 1

Netperf bisection bandwidth test

mdw-1 -> smdw-1 = 1092.190000

sdw1-1 -> sdw2-1 = 1063.040000

sdw3-1 -> sdw4-1 = 1056.180000

sdw5-1 -> sdw6-1 = 1048.130000

sdw7-1 -> sdw8-1 = 1066.540000

sdw9-1 -> sdw10-1 = 1046.260000

sdw11-1 -> sdw12-1 = 1034.860000

sdw13-1 -> sdw14-1 = 1065.730000

sdw15-1 -> sdw16-1 = 1055.030000

smdw-1 -> mdw-1 = 1085.930000

sdw2-1 -> sdw1-1 = 1068.980000

sdw4-1 -> sdw3-1 = 1072.270000

sdw6-1 -> sdw5-1 = 1098.390000

sdw8-1 -> sdw7-1 = 1088.750000

sdw10-1 -> sdw9-1 = 1066.050000

sdw12-1 -> sdw11-1 = 1080.100000

sdw14-1 -> sdw13-1 = 1061.060000

sdw16-1 -> sdw15-1 = 1081.360000

Summary:

sum = 19230.85 MB/sec

min = 1034.86 MB/sec

max = 1098.39 MB/sec

avg = 1068.38 MB/sec

median = 1066.54 MB/sec

gpcheckperf -f /home/gpadmin/gpconfigs/hostfile_gpdb_ic2 -r N -d /tmp     ======> ic2 represents Interconnect 2

Netperf bisection bandwidth test

mdw-2 -> smdw-2 = 1058.600000

sdw1-2 -> sdw2-2 = 1068.230000

sdw3-2 -> sdw4-2 = 1071.720000

sdw5-2 -> sdw6-2 = 1051.070000

sdw7-2 -> sdw8-2 = 1077.870000

sdw9-2 -> sdw10-2 = 1041.330000

sdw11-2 -> sdw12-2 = 1046.020000

sdw13-2 -> sdw14-2 = 1068.650000

sdw15-2 -> sdw16-2 = 1043.890000

smdw-2 -> mdw-2 = 1082.050000

sdw2-2 -> sdw1-2 = 1092.410000

sdw4-2 -> sdw3-2 = 1090.450000

sdw6-2 -> sdw5-2 = 1071.020000

sdw8-2 -> sdw7-2 = 1094.280000

sdw10-2 -> sdw9-2 = 1079.700000

sdw12-2 -> sdw11-2 = 1059.900000

sdw14-2 -> sdw13-2 = 1076.560000

sdw16-2 -> sdw15-2 = 1071.720000

Summary:

sum = 19245.47 MB/sec

min = 1041.33 MB/sec

max = 1094.28 MB/sec

avg = 1069.19 MB/sec

median = 1071.72 MB/sec

Here are the steps to run gpcheckperf:

1 - Shut down the database:

gpstop -af

2 - On mdw, run gpcheckperf once for each interconnect:

gpcheckperf -f /home/gpadmin/gpconfigs/hostfile_gpdb_ic1 -r N -d /tmp

gpcheckperf -f /home/gpadmin/gpconfigs/hostfile_gpdb_ic2 -r N -d /tmp

3 - Restart the database:

gpstart -a
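
The three steps above can also be kept in a small convenience script run as gpadmin on mdw; this is just the same commands in sequence, using the host file paths shown earlier:

# stop the database so the interconnect is idle during the test
gpstop -af

# run the parallel network test once per interconnect subnet
gpcheckperf -f /home/gpadmin/gpconfigs/hostfile_gpdb_ic1 -r N -d /tmp
gpcheckperf -f /home/gpadmin/gpconfigs/hostfile_gpdb_ic2 -r N -d /tmp

# bring the database back up
gpstart -a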