Greenplum DBA FAQ


MADlib and the MADlib pivot function

posted Oct 6, 2016, 12:25 PM by Sachchida Ojha

Apache MADlib is a SQL-based open source library for scalable in-database analytics that supports Greenplum Database. The library offers data scientists numerous distributed implementations of mathematical, statistical and machine learning methods, including many utilities for data transformation.

New utilities have been added in the recent MADlib 1.9.1 release, including:

Pivot: data summarization tool that can do basic OLAP type operations
Sessionization: time-oriented session reconstruction on a data set comprising a sequence of events
Prediction metrics: set of metrics to evaluate the quality of predictions of a model

For more details, see:

https://madlib.incubator.apache.org/docs/latest/group__grp__pivot.html

https://blog.pivotal.io/big-data-pivotal/products/new-tools-to-shape-data-in-apache-madlib
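For a quick feel of the new pivot utility, here is a sketch (the table and column names sales_fact, id, quarter, and amount are illustrative, not from the MADlib docs):

```sql
-- madlib.pivot(source_table, output_table, index, pivot_cols, pivot_values)
-- pivots to one row per id, with one output column per distinct quarter value
SELECT madlib.pivot(
    'sales_fact',     -- source table (hypothetical)
    'sales_pivoted',  -- output table to create
    'id',             -- index: column(s) identifying an output row
    'quarter',        -- pivot column whose values become new columns
    'amount'          -- value column to aggregate (AVG by default)
);
```

The output table sales_pivoted then contains one row per id and one column per distinct quarter value, which is the "basic OLAP type operation" mentioned above.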

How to Install MADlib in the Greenplum database


Checking list of security definer functions in GPDB

posted May 11, 2016, 4:55 AM by Sachchida Ojha

testdb=# \d pg_proc
       Table "pg_catalog.pg_proc"
     Column     |   Type    | Modifiers 
----------------+-----------+-----------
 proname        | name      | not null
 pronamespace   | oid       | not null
 proowner       | oid       | not null
 prolang        | oid       | not null
 proisagg       | boolean   | not null
 prosecdef      | boolean   | not null
 proisstrict    | boolean   | not null
 proretset      | boolean   | not null
 provolatile    | "char"    | not null
 pronargs       | smallint  | not null
 prorettype     | oid       | not null
 proiswin       | boolean   | not null
 proargtypes    | oidvector | not null
 proallargtypes | oid[]     | 
 proargmodes    | "char"[]  | 
 proargnames    | text[]    | 
 prosrc         | text      | 
 probin         | bytea     | 
 proacl         | aclitem[] | 
 prodataaccess  | "char"    | not null
Indexes:
    "pg_proc_oid_index" UNIQUE, btree (oid)
    "pg_proc_proname_args_nsp_index" UNIQUE, btree (proname, proargtypes, pronamespace)


testdb=# 

psql> SELECT proname FROM pg_proc WHERE prosecdef IS TRUE;
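The one-liner above lists only function names. To also see each function's schema and owner, a variant of the same catalog query (a sketch using standard catalog joins) is:

```sql
-- List security definer functions with their schema and owner
SELECT n.nspname AS schema_name,
       p.proname AS function_name,
       pg_get_userbyid(p.proowner) AS owner
  FROM pg_proc p
  JOIN pg_namespace n ON n.oid = p.pronamespace
 WHERE p.prosecdef IS TRUE
 ORDER BY 1, 2;
```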

Connecting as another user without their password

posted Jan 15, 2016, 12:41 AM by Sachchida Ojha

In Oracle, we used to run (from a privileged user):

alter session set current_schema=abc;

Let's discuss how we can achieve the same thing in Greenplum using SET ROLE and SET SESSION AUTHORIZATION.

The SET ROLE command sets the current role identifier of the current session.

This command sets the current role identifier of the current SQL-session context to be rolename. The role name may be written as either an identifier or a string literal. After SET ROLE, permissions checking for SQL commands is carried out as though the named role were the one that had logged in originally.

The specified rolename must be a role that the current session user is a member of. If the session user is a superuser, any role can be selected. The NONE and RESET forms reset the current role identifier to be the current session role identifier. These forms may be executed by any user.

Parameters
SESSION
Specifies that the command takes effect for the current session. This is the default.
LOCAL
Specifies that the command takes effect for only the current transaction. After COMMIT or ROLLBACK, the session-level setting takes effect again. Note that SET LOCAL will appear to have no effect if it is executed outside of a transaction.
rolename
The name of a role to use for permissions checking in this session.
NONE
RESET
Reset the current role identifier to be the current session role identifier (that of the role used to log in).

Using this command, it is possible to either add privileges or restrict privileges. If the session user role has the INHERIT attribute, then it automatically has all the privileges of every role that it could SET ROLE to; in this case SET ROLE effectively drops all the privileges assigned directly to the session user and to the other roles it is a member of, leaving only the privileges available to the named role. On the other hand, if the session user role has the NOINHERIT attribute, SET ROLE drops the privileges assigned directly to the session user and instead acquires the privileges available to the named role.

In particular, when a superuser chooses to SET ROLE to a non-superuser role, she loses her superuser privileges.

SET ROLE has effects comparable to SET SESSION AUTHORIZATION, but the privilege checks involved are quite different. Also, SET SESSION AUTHORIZATION determines which roles are allowable for later SET ROLE commands, whereas changing roles with SET ROLE does not change the set of roles allowed to a later SET ROLE.

Examples

gpadmin=# SELECT SESSION_USER, CURRENT_USER;
 session_user | current_user 
--------------+--------------
 gpadmin      | gpadmin
(1 row)

gpadmin=# SET ROLE 'sachi';
SET
gpadmin=> SELECT SESSION_USER, CURRENT_USER;
 session_user | current_user 
--------------+--------------
 gpadmin      | sachi
(1 row)

gpadmin=> 

Another way of doing this, with a little extra capability, is the SET SESSION AUTHORIZATION command.

The SET SESSION AUTHORIZATION command sets the session role identifier and the current role identifier of the current session.

Synopsis
SET [SESSION | LOCAL] SESSION AUTHORIZATION rolename
SET [SESSION | LOCAL] SESSION AUTHORIZATION DEFAULT
RESET SESSION AUTHORIZATION

Description
This command sets the session role identifier and the current role identifier of the current SQL-session context to be rolename. The role name may be written as either an identifier or a string literal. Using this command, it is possible, for example, to temporarily become an unprivileged user and later switch back to being a superuser.

The session role identifier is initially set to be the (possibly authenticated) role name provided by the client. The current role identifier is normally equal to the session user identifier, but may change temporarily in the context of setuid functions and similar mechanisms; it can also be changed by SET ROLE. The current user identifier is relevant for permission checking.

The session user identifier may be changed only if the initial session user (the authenticated user) had the superuser privilege. Otherwise, the command is accepted only if it specifies the authenticated user name.

The DEFAULT and RESET forms reset the session and current user identifiers to be the originally authenticated user name. These forms may be executed by any user.

Parameters
SESSION
Specifies that the command takes effect for the current session. This is the default.
LOCAL
Specifies that the command takes effect for only the current transaction. After COMMIT or ROLLBACK, the session-level setting takes effect again. Note that SET LOCAL will appear to have no effect if it is executed outside of a transaction.
rolename
The name of the role to assume.
NONE
RESET
Reset the session and current role identifiers to be that of the role used to log in.

Examples

gpadmin=# SELECT SESSION_USER, CURRENT_USER;
 session_user | current_user 
--------------+--------------
 gpadmin      | gpadmin
(1 row)

gpadmin=# SET ROLE 'sachi';
SET
gpadmin=> SELECT SESSION_USER, CURRENT_USER;
 session_user | current_user 
--------------+--------------
 gpadmin      | sachi
(1 row)

gpadmin=> SET ROLE NONE;
SET
gpadmin=# SELECT SESSION_USER, CURRENT_USER;
 session_user | current_user 
--------------+--------------
 gpadmin      | gpadmin
(1 row)

gpadmin=# SET SESSION AUTHORIZATION 'sachi';
SET
gpadmin=> SELECT SESSION_USER, CURRENT_USER;
 session_user | current_user 
--------------+--------------
 sachi        | sachi
(1 row)

gpadmin=> 


Enable PQO for a specific user/role

posted Jan 14, 2016, 7:19 PM by Sachchida Ojha   [ updated May 20, 2016, 5:53 PM ]

gpadmin=# select * from pg_roles where rolname='sachi';
-[ RECORD 1 ]-----+---------
rolname           | sachi
rolsuper          | f
rolinherit        | t
rolcreaterole     | f
rolcreatedb       | f
rolcatupdate      | f
rolcanlogin       | f
rolconnlimit      | -1
rolpassword       | ********
rolvaliduntil     | 
rolconfig         | 
rolresqueue       | 6055
oid               | 26097
rolcreaterextgpfd | f
rolcreaterexthttp | f
rolcreatewextgpfd | f
rolcreaterexthdfs | f
rolcreatewexthdfs | f
gpadmin=# alter role sachi set optimizer=on;
ALTER ROLE
gpadmin=#  select * from pg_roles where rolname='sachi';
-[ RECORD 1 ]-----+---------------
rolname           | sachi
rolsuper          | f
rolinherit        | t
rolcreaterole     | f
rolcreatedb       | f
rolcatupdate      | f
rolcanlogin       | f
rolconnlimit      | -1
rolpassword       | ********
rolvaliduntil     | 
rolconfig         | {optimizer=on}
rolresqueue       | 6055
oid               | 26097
rolcreaterextgpfd | f
rolcreaterexthttp | f
rolcreatewextgpfd | f
rolcreaterexthdfs | f
rolcreatewexthdfs | f

gpadmin=# 

optimizer_minidump:

The PQO generates minidump files to describe the optimization context for a given query. The minidump files are used by Pivotal support to analyze Greenplum Database issues. The information in the file is not in a format that can be easily used by customers for debugging or troubleshooting. The minidump file is located under the master data directory and uses the following naming format:

Minidump_date_time.mdp

The minidump file contains this query related information:
  • Catalog objects including data types, tables, operators, and statistics required by the Pivotal Query Optimizer
  • An internal representation (DXL) of the query
  • An internal representation (DXL) of the plan produced by the Pivotal Query Optimizer
  • System configuration information passed to the Pivotal Query Optimizer such as server configuration parameters, cost and statistics configuration, and number of segments
  • A stack trace of errors generated while optimizing the query
Setting this parameter to ALWAYS generates a minidump for all queries. Pivotal recommends that you set this parameter to ONERROR in production environments to minimize total optimization time.

Value Range: ONERROR | ALWAYS
Default: ONERROR
Set Classifications: master, session, reload
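Since optimizer_minidump is session-settable, the usual way to capture a minidump for one problematic query is to flip it within a session (a sketch; values ALWAYS/ONERROR as documented above):

```sql
-- Capture a minidump for every query in this session
SET optimizer_minidump = always;

-- ... run the query being investigated ...

-- Return to the recommended production setting
SET optimizer_minidump = onerror;
```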

Enable PQO (Pivotal Query Optimizer) at the database level

posted Jan 14, 2016, 5:53 PM by Sachchida Ojha   [ updated May 20, 2016, 5:52 PM ]

Step 1: Check the status of the optimizer

gpadmin=# show optimizer;
 optimizer 
-----------
 off
(1 row)

gpadmin=# show optimizer_control;
 optimizer_control 
-------------------
 on
(1 row)

Step 2: Check the state of the database (optional)

[gpadmin@gpdb-sandbox ~]$ gpstate
20160114:12:38:42:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-Starting gpstate with args: 
20160114:12:38:42:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.6.1 build 2'
20160114:12:38:42:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.6.1 build 2) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Oct  1 2015 15:14:22'
20160114:12:38:42:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-Obtaining Segment details from master...
20160114:12:38:42:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-Gathering data from segments...
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-Greenplum instance status summary
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-----------------------------------------------------
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Master instance                                = Active
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Master standby                                 = No master standby configured
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total segment instance count from metadata     = 2
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-----------------------------------------------------
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Primary Segment Status
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-----------------------------------------------------
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total primary segments                         = 2
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total primary segment valid (at master)        = 2
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total primary segment failures (at master)     = 0
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total number of postmaster.pid files missing   = 0
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total number of postmaster.pid files found     = 2
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs missing    = 0
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs found      = 2
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total number of /tmp lock files missing        = 0
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total number of /tmp lock files found          = 2
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total number postmaster processes missing      = 0
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Total number postmaster processes found        = 2
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-----------------------------------------------------
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Mirror Segment Status
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-----------------------------------------------------
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-   Mirrors not configured on this array
20160114:12:38:43:169214 gpstate:gpdb-sandbox:gpadmin-[INFO]:-----------------------------------------------------

Step 3: Turn on optimizer_analyze_root_partition 
[gpadmin@gpdb-sandbox ~]$ gpconfig -c optimizer_analyze_root_partition -v on --masteronly
20160114:12:39:18:169310 gpconfig:gpdb-sandbox:gpadmin-[INFO]:-completed successfully

Step 4: Turn on the optimizer
[gpadmin@gpdb-sandbox ~]$ gpconfig -c optimizer -v on --masteronly
20160114:12:39:59:169532 gpconfig:gpdb-sandbox:gpadmin-[INFO]:-completed successfully

Step 5: Apply the changes
[gpadmin@gpdb-sandbox ~]$ gpstop -u
20160114:12:40:23:169600 gpstop:gpdb-sandbox:gpadmin-[INFO]:-Starting gpstop with args: -u
20160114:12:40:23:169600 gpstop:gpdb-sandbox:gpadmin-[INFO]:-Gathering information and validating the environment...
20160114:12:40:23:169600 gpstop:gpdb-sandbox:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20160114:12:40:23:169600 gpstop:gpdb-sandbox:gpadmin-[INFO]:-Obtaining Segment details from master...
20160114:12:40:23:169600 gpstop:gpdb-sandbox:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 4.3.6.1 build 2'
20160114:12:40:23:169600 gpstop:gpdb-sandbox:gpadmin-[INFO]:-Signalling all postmaster processes to reload
Step 6: Check the status
[gpadmin@gpdb-sandbox ~]$ psql
psql (8.2.15)
Type "help" for help.

gpadmin=# show optimizer;
 optimizer 
-----------
 on
(1 row)

gpadmin=# 
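The gpconfig steps above turn the optimizer on system-wide (master only). To scope it to a single database instead, which is often what "database level" means in practice, a sketch (the database name testdb is illustrative):

```sql
-- New sessions connecting to this database will use the Pivotal Query Optimizer
ALTER DATABASE testdb SET optimizer = on;

-- Undo, falling back to the system-wide setting
ALTER DATABASE testdb RESET optimizer;
```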

SELECT runs but INSERT fails in Greenplum database

posted Jun 13, 2015, 8:23 AM by Sachchida Ojha   [ updated May 20, 2016, 5:52 PM ]

A few days back I faced a unique problem. In one Greenplum database environment, inserts into a table (ITAS) were failing, but selects were not. When I executed the same ITAS in another Greenplum environment, it succeeded.

Let's understand the problem and then the potential solution.


PostgreSQL's/Greenplum Database's MVCC transaction semantics depend on being able to compare transaction ID (XID) numbers: a row version with an insertion XID greater than the current transaction's XID is "in the future" and should not be visible to the current transaction. But since transaction IDs have limited size (32 bits at this writing), a cluster that runs for a long time (more than 4 billion transactions) would suffer transaction ID wraparound: the XID counter wraps around to zero, and all of a sudden transactions that were in the past appear to be in the future, which means their outputs become invisible. Refer to http://www.postgresql.org/docs/9.4/static/routine-vacuuming.html

There could be three possible reasons for the insert failure:

1. Missing table privilege / no access to the table.
2. No space on disk to accept new rows.
3. Something else.

We checked the first two conditions, but neither applied, so we started thinking about "something else." Before we go further, we need to understand that GP (an MPP database) is built on top of PostgreSQL. Both provide MVCC (multiversion concurrency control).

Multiversion concurrency control (MCC or MVCC) is a concurrency control method commonly used by database management systems to provide concurrent access to the database, and in programming languages to implement transactional memory.

If someone is reading from a database at the same time as someone else is writing to it, it is possible that the reader will see a half-written or inconsistent piece of data. There are several ways of solving this problem, known as concurrency control methods. The simplest way is to make all readers wait until the writer is done, which is known as a lock. This can be very slow, so MVCC takes a different approach: each user connected to the database sees a snapshot of the database at a particular instant in time. Any changes made by a writer will not be seen by other users of the database until the changes have been completed (or, in database terms: until the transaction has been committed.)

When an MVCC database needs to update an item of data, it will not overwrite the old data with new data, but instead mark the old data as obsolete and add the newer version elsewhere. Thus there are multiple versions stored, but only one is the latest. This allows readers to access the data that was there when they began reading, even if it was modified or deleted part way through by someone else. It also allows the database to avoid the overhead of filling in holes in memory or disk structures but requires (generally) the system to periodically sweep through and delete the old, obsolete data objects. For a document-oriented database it also allows the system to optimize documents by writing entire documents onto contiguous sections of disk—when updated, the entire document can be re-written rather than bits and pieces cut out or maintained in a linked, non-contiguous database structure.

MVCC provides point in time consistent views. Read transactions under MVCC typically use a timestamp or transaction ID to determine what state of the DB to read, and read these versions of the data. Read and write transactions are thus isolated from each other without any need for locking. Writes create a newer version, while concurrent reads access the older version.

Before we proceed further, lets understand these 2 parameters associated with transaction.

xid_stop_limit

The number of transaction IDs prior to the ID where transaction ID wraparound occurs. When this limit is reached, Greenplum Database stops creating new transactions to avoid data loss due to transaction ID wraparound.
Value Range: integer 10000000 - 2000000000
Default: 1000000000
Set Classifications: local, system, restart


xid_warn_limit
The number of transaction IDs prior to the limit specified by xid_stop_limit. When Greenplum Database reaches this limit, it issues a warning to perform a VACUUM operation to avoid data loss due to transaction ID wraparound.
Value Range: integer 10000000 - 2000000000
Default: 500000000
Set Classifications: local, system, restart


When the stop limit is reached, Greenplum Database stops accepting connections and clients receive the following error:

Database is not accepting commands to avoid wraparound data loss in database "dbname"

You may also see the following warning:

Server: host.localdomain, database: pgdb WARNING: database "pgdb" must be vacuumed within 1461689882 transactions (seg65 slice2 myhost:16727 pid=156528) HINT: To avoid a database shutdown, execute a full-database VACUUM in "pgdb"

There are two configuration settings in GPDB that affect this: xid_warn_limit and xid_stop_limit. xid_warn_limit defaults to 500 million and controls when the warnings are generated; xid_stop_limit defaults to 1 billion and controls when the database stops accepting connections. Both are hidden configuration parameters, and modifying them is not recommended in most cases.

To find out which tables and databases need to be vacuumed, you can run the following queries.

--GP Version

SELECT -1, datname, age(datfrozenxid) 
FROM pg_database 
UNION ALL 
SELECT gp_segment_id, datname, age(datfrozenxid)
FROM gp_dist_random('pg_database') 
ORDER BY 3 DESC ;


-- PostgreSQL version

 SELECT datname, age(datfrozenxid) FROM pg_database;


--GP Version

SELECT coalesce(n.nspname, '<Missing schema>'), relname, relkind, relstorage, age(relfrozenxid) 
FROM pg_class c  LEFT JOIN pg_namespace n ON c.relnamespace = n.oid 
WHERE relkind = 'r' 
AND relstorage NOT IN ('x') 
ORDER BY 5 DESC ;

-- PostgreSQL version

SELECT c.oid::regclass as table_name, greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age 
FROM pg_class c 
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid 
WHERE c.relkind IN ('r', 'm'); 

Note: The age column measures the number of transactions from the cutoff XID to the current transaction's XID.

Please remember to vacuum every table in every database on the Greenplum cluster at least once every two billion transactions. Some tables won't be vacuumed by the default VACUUM command in psql. Instead, they could be: tables under a schema/namespace pg_temp_xxxx, where xxxx is a number, or some external tables generated by gpload.
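Once the aging tables are identified by the queries above, the remediation is a VACUUM against them; VACUUM FREEZE aggressively advances relfrozenxid (a sketch; the table name is illustrative):

```sql
-- Freeze one aging table, advancing its relfrozenxid
VACUUM FREEZE public.mytable;

-- Or freeze every table in the current database
VACUUM FREEZE;
```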

New Parameters in GPDB 4.3 and onward

posted May 27, 2015, 4:13 PM by Sachchida Ojha

There were 22 new parameters introduced in Greenplum starting with GPDB 4.3.0. Most of these parameters are related to specific features. Click here to learn more about these 22 parameters.

List Append Only / Append Organized table details in Greenplum

posted May 27, 2015, 3:12 PM by Sachchida Ojha   [ updated May 30, 2015, 1:39 PM ]

sachi=# SELECT n.nspname AS schema_name ,c.relname AS table_name FROM pg_catalog.pg_appendonly a ,pg_catalog.pg_class c ,pg_catalog.pg_namespace n WHERE c.oid = a.relid AND n.oid = c.relnamespace AND a.compresstype IS NULL ;
 schema_name | table_name 
-------------+------------
 public      | bar
 public      | bar1
(2 rows)

sachi=# \d bar
Append-Only Table "public.bar"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | 
 b      | text    | 
Compression Type: None
Compression Level: 0
Block Size: 32768
Checksum: t
Distributed by: (a)
sachi=# \d bar1
Append-Only Table "public.bar1"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | 
 b      | text    | 
Compression Type: None
Compression Level: 0
Block Size: 32768
Checksum: t
Distributed by: (a)


sachi=# SELECT n.nspname AS schema_name ,c.relname AS table_name FROM pg_catalog.pg_appendonly a ,pg_catalog.pg_class c ,pg_catalog.pg_namespace n WHERE c.oid = a.relid AND n.oid = c.relnamespace ;
 schema_name | table_name 
-------------+------------
 public      | bar
 public      | bar1
 public      | foo
 public      | sales
 public      | xyz
(5 rows)
sachi=# \d foo
Append-Only Table "public.foo"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | 
 b      | text    | 
Compression Type: zlib
Compression Level: 5
Block Size: 32768
Checksum: t
Distributed by: (a)

sachi=# \d sales
Append-Only Columnar Table "public.sales"
 Column |     Type      | Modifiers 
--------+---------------+-----------
 id     | integer       | 
 date   | date          | 
 amt    | numeric(10,2) | 
Checksum: t
Distributed by: (id)

sachi=# \d xyz
       Append-Only Columnar Table "public.xyz"
  Column   |            Type             | Modifiers 
-----------+-----------------------------+-----------
 date_hour | timestamp without time zone | 
 mv        | character varying(255)      | 
 visits    | numeric                     | 
Checksum: t
Distributed by: (date_hour, mv, visits)


sachi=# SELECT n.nspname AS schema_name ,c.relname AS table_name, a.compresstype FROM pg_catalog.pg_appendonly a ,pg_catalog.pg_class c ,pg_catalog.pg_namespace n WHERE c.oid = a.relid AND n.oid = c.relnamespace ;
 schema_name | table_name | compresstype 
-------------+------------+--------------
 public      | bar        | 
 public      | bar1       | 
 public      | foo        | zlib
 public      | sales      | quicklz
 public      | xyz        | quicklz
(5 rows)

sachi=# \d pg_catalog.pg_appendonly
    Table "pg_catalog.pg_appendonly"
     Column      |   Type   | Modifiers 
-----------------+----------+-----------
 relid           | oid      | not null
 blocksize       | integer  | not null
 safefswritesize | integer  | not null
 compresslevel   | smallint | not null
 majorversion    | smallint | not null
 minorversion    | smallint | not null
 checksum        | boolean  | not null
 compresstype    | text     | 
 columnstore     | boolean  | 
 segrelid        | oid      | 
 segidxid        | oid      | 
 blkdirrelid     | oid      | 
 blkdiridxid     | oid      | 
 version         | integer  | 
 visimaprelid    | oid      | 
 visimapidxid    | oid      | 
Indexes:
    "pg_appendonly_relid_index" UNIQUE, btree (relid)
sachi=# \d pg_catalog.pg_class                                                                                                                                    
Table "pg_catalog.pg_class"
     Column     |   Type    | Modifiers 
----------------+-----------+-----------
 relname        | name      | not null
 relnamespace   | oid       | not null
 reltype        | oid       | not null
 relowner       | oid       | not null
 relam          | oid       | not null
 relfilenode    | oid       | not null
 reltablespace  | oid       | not null
 relpages       | integer   | not null
 reltuples      | real      | not null
 reltoastrelid  | oid       | not null
 reltoastidxid  | oid       | not null
 relaosegrelid  | oid       | not null
 relaosegidxid  | oid       | not null
 relhasindex    | boolean   | not null
 relisshared    | boolean   | not null
 relkind        | "char"    | not null
 relstorage     | "char"    | not null
 relnatts       | smallint  | not null
 relchecks      | smallint  | not null
 reltriggers    | smallint  | not null
 relukeys       | smallint  | not null
 relfkeys       | smallint  | not null
 relrefs        | smallint  | not null
 relhasoids     | boolean   | not null
 relhaspkey     | boolean   | not null
 relhasrules    | boolean   | not null
 relhassubclass | boolean   | not null
 relfrozenxid   | xid       | not null
 relacl         | aclitem[] | 
 reloptions     | text[]    | 
Indexes:
    "pg_class_oid_index" UNIQUE, btree (oid)
    "pg_class_relname_nsp_index" UNIQUE, btree (relname, relnamespace)

sachi=# SELECT n.nspname AS schema_name ,c.relname AS table_name, c.reltype,c.relowner,c.relpages, c.relkind,c.relstorage,c.relhasindex,a.compresstype, a.compresslevel,a.columnstore FROM pg_catalog.pg_appendonly a ,pg_catalog.pg_class c ,pg_catalog.pg_namespace n WHERE c.oid = a.relid AND n.oid = c.relnamespace ;
 schema_name | table_name | reltype | relowner | relpages | relkind | relstorage | relhasindex | compresstype | compresslevel | columnstore 
-------------+------------+---------+----------+----------+---------+------------+-------------+--------------+---------------+-------------
 public      | bar        |   37322 |    17146 |        0 | r       | a          | f           |              |             0 | f
 public      | bar1       |   37354 |    17146 |        0 | r       | a          | f           |              |             0 | f
 public      | foo        |   37386 |    17146 |        0 | r       | a          | f           | zlib         |             5 | f
 public      | sales      |   37446 |    17146 |        1 | r       | c          | f           | quicklz      |             1 | t
 public      | xyz        |   37476 |    17146 |        0 | r       | c          | f           | quicklz      |             1 | t
(5 rows)

sachi=# 


Spill or workfile configuration parameters

posted May 25, 2015, 9:17 AM by Sachchida Ojha

Greenplum introduced two server configuration parameters:

1. gp_workfile_limit_per_query – Sets the maximum disk size (in kilobytes) an individual query is allowed to use for creating temporary spill files at each segment. The default value is 0, which means no limit is enforced.

2. gp_workfile_limit_per_segment – Sets the maximum total disk size (in kilobytes) that all running queries are allowed to use for creating temporary spill files at each segment. The default value is 0, which means no limit is enforced.
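A sketch of how these limits are typically applied with gpconfig (the kilobyte values below are illustrative, not recommendations):

```shell
# Limit each query to ~10 GB of spill files per segment
gpconfig -c gp_workfile_limit_per_query -v 10485760

# Cap total spill files across all queries at ~50 GB per segment
gpconfig -c gp_workfile_limit_per_segment -v 52428800

# Reload the configuration without restarting the cluster
gpstop -u
```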

Summary Of New Features in Greenplum Database 4.3.X.X

posted May 24, 2015, 6:24 AM by Sachchida Ojha   [ updated May 20, 2016, 5:25 PM ]


Summary of new Features (Family 4.3.X.X)
1. Query Optimizer and Management of query execution (4.3.5.0, 4.3.4, 4.3.2 )

2. Incremental Analyze (4.3.5.0)

3. High Availability (4.3)

4. Append-Optimized Tables and compaction information (4.3 and 4.3.5)

5. Workfile Disk Spill Space (4.3, 4.3.1, 4.3.2)

6. Database Security, Authentication and Encryption (4.3.2, 4.3.4, 4.3.5.1)

7. Utilities ( gpload, gpfdist, gpssh) enhancements
(4.3.3, 4.3.4,  4.3.4.1, 4.3.5.1)

8. New Server Configuration parameters
(4.3.3 and 4.3.4.1)

9. Table storage options and distribution policy
(4.3.4)

10. External table support to Hadoop distributions and MapR
(4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.4.1, 4.3.5.1)

11. Greenplum Database Extension Enhancements (4.3.3)

12. Enhancements in Managing and Migrating Greenplum Database Objects and Data (4.3.1, 4.3.2, 4.3.3)

13. Improved Command Center Performance (4.3.2 and 4.3.4.1)

14. Enhancement in Restoring Data (4.3.1)

15. Organization of Greenplum Database Administration Documentation (4.3.1)

