GPDB Data Unloading Options

posted Nov 21, 2014, 6:24 AM by Sachchida Ojha
The gadget spec URL could not be found
1. COPY TO: Copy to command is quick and easy postgresql way to unload data. For example  

COPY ( SELECT * FROM sales) TO ‘/tmp/sales.dat’;
The gadget spec URL could not be found
Advantage:
  1. Quick and easy
  2. Recommended for small unloads. This method is not recommended for bulk unloads
  3. Can filter output using SELECT
  4. Unload to file or standard output
  5. Is not parallel uses a single process on the master
  6. Source file must be accessible by the master
The gadget spec URL could not be found
The gadget spec URL could not be found
2. External Writeable: The advantage of Greenplum writable external table are,
  1.  High speed bulk unloads
  2.  File based or web based
  3.  Uses parallel unloading using gpfdist protocol 
  4.  Local file, named pipes, applications 
Allows only INSERT operations
Unload Using a File Based Writable External Table
=# CREATE WRITABLE EXTERNAL TABLE unload_expenses
( LIKE expenses )
LOCATION ('gpfdist://etlhost-1:8081/expenses1.out',
'gpfdist://etlhost-2:8081/expenses2.out')
FORMAT 'TEXT' (DELIMITER ',')
DISTRIBUTED BY (exp_id);
psql –c ‘INSERT INTO unload_expenses SELECT * FROM expenses;’
The gadget spec URL could not be found
Unload Using a Writable External Web Table
Command based
– Use EXECUTE to specify a shell command, script, or application
=# CREATE WRITABLE EXTERNAL WEB TABLE output (output text)
EXECUTE 'export PATH=$PATH:/home/gpadmin/programs; myprogram.sh'
FORMAT 'TEXT'
DISTRIBUTED RANDOMLY;
The gadget spec URL could not be found
3. gpcrondump: 

backup data using Greenplum crondump which dmps database into SQL script files then restore. For example,

gpcrondump -x sachi

The gadget spec URL could not be found

The gadget spec URL could not be found


Comments