How to alter table distribution policy in greenplum?

Post date: Nov 02, 2012 11:25:17 PM

Altering Table Distribution

ALTER TABLE provides options to change the distribution policy of a table. When the distribution options of a table change, the table data is redistributed on disk, which can be resource intensive. There is also an option to redistribute table data using the existing distribution policy.

Changing the Distribution Policy

You can use the ALTER TABLE command to change the distribution policy for a table. For partitioned tables, changes to the distribution policy recursively apply to the child partitions. This operation preserves the ownership and all other attributes of the table. For example, the following command redistributes the table sales across all segments using the customer_id column as the distribution key:

ALTER TABLE sales SET DISTRIBUTED BY (customer_id);

When you change the hash distribution of a table, table data is automatically redistributed. However, changing the distribution policy to a random distribution will not cause the data to be redistributed. For example:

ALTER TABLE sales SET DISTRIBUTED RANDOMLY;

Redistributing Table Data

To redistribute table data for tables with a random distribution policy (or when the hash distribution policy has not changed) use REORGANIZE=TRUE. This sometimes may be necessary to correct a data skew problem, or when new segment resources have been added to the system. For example:

ALTER TABLE sales SET WITH (REORGANIZE=TRUE);

This command rebalances a table evenly across all segments using the current distribution policy (including random distribution).