gpfdists (a secured version of gpfdist protocol)


The gpfdists protocol is a secure version of gpfdist, which enables encrypted communication and secure identification of the file server and the Greenplum Database to protect against attacks such as eavesdropping and man-in-the-middle attacks.

- There may be a little performance impact seen on the gpfdists process, but i would suggest to test it on your cluster for a similar load while using gpfdist and gpfdists. Apologies, but we do not have a benchmark numbers, and the variation percentage may vary based on the volume of data, cluster size etc.

- Below are the steps which are required to implement gpfdists. 

Step 1: Create a folder under the segment data directory & master data directory with a name called gpfdists and move the below files already created:

- The client certificate file, client.crt

- The client private key file, client.key

- The trusted certificate authorities, root.crt

Note: You can identify the segment data directory loaction using the below sql:

select fselocation,hostname from pg_filespace_entry pf, gp_segment_configuration gp where pf.fsedbid=gp.dbid;

Step 2 : Create an external table with gpfdists protocol. Example:

CREATE EXTERNAL TABLE ext_expenses ( name text,

date date, amount float4, category text, desc1 text )

LOCATION ('gpfdists://etlhost-1:8081/*.txt',



Step 3: Put data under the path specified. Example:

Create a file call a1.txt under /var/load_files with the required delimiters

Step 4: Execute gpfdist service:

gpfdist -d /var/load_files -p 8081 --ssl $MASTER_DATA_DIRECTORY/gpfdists

Step 5: Fetch data from external table.

Note: You can also use gpload with ssl option true, More details on YAML structure on the administration guide.

- security is a feature / option provided by gpfdist service. We will not be able to block the usage of execution of gpfdist. Use having the priviliges to execute gpfdist can run the service without ssl options. Security must be implemented at user level as well.

Note: You may further use iptables to strengthen the access to IP / port on which data is served via gpfdist.


Testing gpfdists - Nov 01, 2014 4:12:2 PM