Installing the Greenplum Load Tools on a Commodity Server

Running the Load Tools Installer

The Greenplum Database load tools installer installs the following data loading tools:

• Greenplum parallel file distribution program (gpfdist)

• Greenplum data loading utility (gpload)

To install the Greenplum Database load tools

1. Download the appropriate greenplum-loaders-4.2.x.x-PLATFORM.bin.zip installer package for your platform from the EMC Download Center. The currently supported platforms are Red Hat Linux 32-bit, Red Hat Linux 64-bit, and Solaris 64-bit.

2. Unzip the installer: 

unzip greenplum-loaders-4.2.x.x-PLATFORM.bin.zip

3. Run the installer:

/bin/bash greenplum-loaders-4.2.x.x-PLATFORM.bin

4. The installer prompts you to accept the license agreement and to provide an installation path. If you choose not to accept the default location, be sure to enter an absolute path (for example, /mydir/gp-loader-tools). By default, the load tools are installed into greenplum-db-4.2.x.x.

Note: Your Greenplum Database load tools installation contains the following files and directories:

• bin — data loading command-line tools and library files (gpfdist and gpload)

• docs — documentation files

• greenplum_loaders_path.sh — environment variables
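As a quick sanity check after installation, the layout listed above can be verified with a small shell function. This is a minimal sketch, not part of the Greenplum tooling: the function name check_loaders_layout is hypothetical, and the example path is the illustrative one from step 4.

```shell
# Hypothetical helper (not shipped with Greenplum): report any of the
# expected entries that are missing from an installation directory.
check_loaders_layout() {
    dir="$1"
    status=0
    for entry in bin docs greenplum_loaders_path.sh; do
        if [ ! -e "$dir/$entry" ]; then
            echo "missing: $dir/$entry"
            status=1
        fi
    done
    # Returns 0 when all three entries exist, 1 otherwise.
    return "$status"
}

# Example call (the path is an assumption; use your actual install path):
# check_loaders_layout /mydir/gp-loader-tools
```

The function prints nothing and returns success when bin, docs, and greenplum_loaders_path.sh are all present.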

Configuring the Command-Line Load Tools

As a convenience, a greenplum_loaders_path.sh file is provided in your load tools installation directory. It sets the following environment variables:

GREENPLUM_LOADERS_HOME — The installation directory of the Greenplum Database load tools.

PATH — The path to the data loading command-line utilities.

LD_LIBRARY_PATH — The path to additional Python library files needed for gpload.

You can source this file in your user’s startup shell profile (such as .bashrc or .bash_profile).

For example, you could add a line similar to the following to your chosen profile file (making sure the correct installation path is used):

source greenplum-db-4.2.x.x/greenplum_loaders_path.sh

After editing the chosen profile file, source it as the correct user to make the changes active. For example:

source ~/.bashrc
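If the profile is edited by a script rather than by hand, it helps to avoid appending the same line twice. The following is a minimal sketch of one way to do that; the function name add_loaders_to_profile is hypothetical, and both arguments are values you supply.

```shell
# Hypothetical helper: append a "source" line to a profile file only if
# that exact line is not already present.
add_loaders_to_profile() {
    env_script="$1"   # full path to greenplum_loaders_path.sh
    profile="$2"      # profile file to edit, for example "$HOME/.bashrc"
    line="source $env_script"
    touch "$profile"  # create the profile if it does not exist yet
    # -x matches whole lines, -F treats the pattern as a fixed string
    if ! grep -qxF -- "$line" "$profile"; then
        printf '%s\n' "$line" >> "$profile"
    fi
}

# Example call (the install path is an assumption):
# add_loaders_to_profile /mydir/gp-loader-tools/greenplum_loaders_path.sh "$HOME/.bashrc"
```

Running the function a second time with the same arguments leaves the profile unchanged.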

Additional Connection Environment Variables

The Greenplum load tools require several connection parameters to connect to a Greenplum Database instance. To save typing on the command line, you can set the following environment variables in your preferred profile file (such as .bashrc).

• PGDATABASE — The name of the default Greenplum database to connect to.

• PGHOST — The Greenplum master host name or IP address. 

• PGPORT — The port number that the Greenplum master instance is running on.

• PGUSER — The default database role name to use for login.
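For illustration, the four variables could be set in a profile file as shown here. Every value is a placeholder chosen for this sketch (the database name, host name, and role are assumptions); substitute the values for your own system.

```shell
# Placeholder values only; replace each with the settings for your system.
export PGDATABASE=mydb            # default database to connect to
export PGHOST=mdw.example.com     # Greenplum master host name or IP address
export PGPORT=5432                # port the master instance listens on
export PGUSER=gpadmin             # database role used for login
```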

Enabling Greenplum Database for Remote Client Connections

For Greenplum Database to accept remote client connections, you must configure the Greenplum Database master to allow connections from the client hosts and database users that will connect to it.

To enable remote client connections

1. Make sure that the pg_hba.conf file of the Greenplum Database master is correctly configured to allow connections from the users to the database(s) using the authentication method you want. See the section on Editing the pg_hba.conf File in the Greenplum Database Administrator Guide and the section on Client Authentication in the PostgreSQL documentation for details. Make sure the authentication method you choose is supported by the client tool you are using.

2. If you edited the pg_hba.conf file, reload the server configuration (using the gpstop -u command) for the change to take effect.

3. Make sure that the databases and roles you are using to connect exist in the system and that the roles have the correct privileges to the database objects.
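To make step 1 more concrete, the sketch below prints a single pg_hba.conf "host" record in the field order used by the PostgreSQL format (connection type, database, user, client address, authentication method). The function name hba_host_entry is hypothetical, and the database, role, network, and method in the example are placeholders, not recommendations.

```shell
# Hypothetical helper: print one pg_hba.conf "host" record.
# Field order: host  database  user  address  auth-method
hba_host_entry() {
    printf 'host  %s  %s  %s  %s\n' "$1" "$2" "$3" "$4"
}

# Placeholder values: database mydb, role etl_user, client network
# 192.0.2.0/24, md5 password authentication.
hba_host_entry mydb etl_user 192.0.2.0/24 md5

# After adding such a line to the master's pg_hba.conf, reload the
# configuration as described in step 2:
#   gpstop -u
```

Review the generated line against the guides referenced in step 1 before adding it to the master's pg_hba.conf.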