tablespace in Greenplum database

Post date: Dec 06, 2012 7:54:4 PM

GP uses tablespace and filespace to maintain corresponding space. Including commands for your reference (took out additional text):

[gpadmin@mdw sachi]$ gpfilespace -o gpfilespace_config

Enter a name for this filespace

> sachi_disk1

Checking your configuration:

Your system has 4 hosts with 6 primary and 6 mirror segments per host.

Your system has 1 hosts with 0 primary and 0 mirror segments per host.

Configuring hosts: [sdw4, sdw1, sdw2, sdw3]

primary location 1> /data1

primary location 2> /data1

primary location 3> /data1

primary location 4> /data1

primary location 5> /data1

primary location 6> /data1

Please specify 6 locations for the mirror segments, one per line:

mirror location 1> /data1

mirror location 2> /data1

mirror location 3> /data1

mirror location 4> /data1

mirror location 5> /data1

mirror location 6> /data1

Configuring hosts: [mdw]

Enter a file system location for the master

master location> /data

Creating configuration file... [created]

To add this filespace to the database please run the command:

gpfilespace --config /home/gpadmin/sachi/gpfilespace_config

[gpadmin@mdw sachi]$ ls -ltr

total 4

-rw-rw-r-- 1 gpadmin gpadmin 1118 Jul 17 18:26 gpfilespace_config

[gpadmin@mdw sachi]$  gpfilespace --config /home/gpadmin/sachi/gpfilespace_config

A tablespace requires a file system location to store its database files. A filespace is a collection of file system locations for all components in a Greenplum system (primary segment, mirror segment and master instances). Once a filespace is created, it can be used by one or more tablespaces.

 

Reading Configuration file: '/home/gpadmin/sachi/gpfilespace_config'

Performing validation on paths

..............................................................................

Connecting to database

Filespace "sachi_disk1" successfully created

[gpadmin@mdw sachi]$ psql -d postgres

psql (8.2.15)

Type "help" for help.

postgres=# create tablespace sachi_tbsp filespace sachi_disk1;

CREATE TABLESPACE

postgres=# grant create on tablespace sachi_tbsp to gpadmin;

GRANT

postgres=# create table ummc_1 ( x integer) tablespace sachi_tbsp;

NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'x' as the Greenplum Database data distribution key for this table.

HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.

CREATE TABLE

Most of the above is self explanatory, attaching a PDF doc for your reference (pages 72 and 76).

Database partitioning for UMMC

Connection to database is based on user, defined by Postgres role. Role is synonymous to user. pg_hba covers external connectivity from database point of view and similar to a simple firewall from database perspective. First record in pg_hba.conf determines connections, not the last entry. We are probably a bit familiar with pg_hba.conf file by now.

User is authenticated by password or as a Linux OS user or through SSL certificate. There is a corresponding backend process for each connection and only serves one active session at a time. Access is based on:

User need to have login for accessing GP (Postgres) and connect for reaching a database. Users and groups are roles in Postgres, the major difference is that user can have login access. 

General recommendation: