tablespace in Greenplum database

Post date: Dec 06, 2012 7:54:4 PM

GP uses tablespace and filespace to maintain corresponding space. Including commands for your reference (took out additional text):

[gpadmin@mdw sachi]$ gpfilespace -o gpfilespace_config

Enter a name for this filespace

> sachi_disk1

Checking your configuration:

Your system has 4 hosts with 6 primary and 6 mirror segments per host.

Your system has 1 hosts with 0 primary and 0 mirror segments per host.

Configuring hosts: [sdw4, sdw1, sdw2, sdw3]

primary location 1> /data1

primary location 2> /data1

primary location 3> /data1

primary location 4> /data1

primary location 5> /data1

primary location 6> /data1

Please specify 6 locations for the mirror segments, one per line:

mirror location 1> /data1

mirror location 2> /data1

mirror location 3> /data1

mirror location 4> /data1

mirror location 5> /data1

mirror location 6> /data1

Configuring hosts: [mdw]

Enter a file system location for the master

master location> /data

Creating configuration file... [created]

To add this filespace to the database please run the command:

gpfilespace --config /home/gpadmin/sachi/gpfilespace_config

[gpadmin@mdw sachi]$ ls -ltr

total 4

-rw-rw-r-- 1 gpadmin gpadmin 1118 Jul 17 18:26 gpfilespace_config

[gpadmin@mdw sachi]$  gpfilespace --config /home/gpadmin/sachi/gpfilespace_config

A tablespace requires a file system location to store its database files. A filespace is a collection of file system locations for all components in a Greenplum system (primary segment, mirror segment and master instances). Once a filespace is created, it can be used by one or more tablespaces.


Reading Configuration file: '/home/gpadmin/sachi/gpfilespace_config'

Performing validation on paths


Connecting to database

Filespace "sachi_disk1" successfully created

[gpadmin@mdw sachi]$ psql -d postgres

psql (8.2.15)

Type "help" for help.

postgres=# create tablespace sachi_tbsp filespace sachi_disk1;


postgres=# grant create on tablespace sachi_tbsp to gpadmin;


postgres=# create table ummc_1 ( x integer) tablespace sachi_tbsp;

NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'x' as the Greenplum Database data distribution key for this table.

HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.


Most of the above is self explanatory, attaching a PDF doc for your reference (pages 72 and 76).

Database partitioning for UMMC

Connection to database is based on user, defined by Postgres role. Role is synonymous to user. pg_hba covers external connectivity from database point of view and similar to a simple firewall from database perspective. First record in pg_hba.conf determines connections, not the last entry. We are probably a bit familiar with pg_hba.conf file by now.

User is authenticated by password or as a Linux OS user or through SSL certificate. There is a corresponding backend process for each connection and only serves one active session at a time. Access is based on:

User need to have login for accessing GP (Postgres) and connect for reaching a database. Users and groups are roles in Postgres, the major difference is that user can have login access. 

General recommendation: