Greenplum Database New Features -  Append-Optimized Tables and compaction information

Append-optimized tables were introduced in GPDB 4.3.0.0.  Append-optimized tables are similar to Append-only tables with additional benefits which allows UPDATE and DELETE operations on the table data. 

Note: When migrating Greenplum Database from 4.2.x.x to 4.3, append-only tables are migrated to append-optimized tables. GPDB 4.3.2.1 Fixes the problem of incorrect conversion of append-only tables to append-optimized tables during an upgrade from Greenplum Database 4.2.x to a Greenplum Database 4.3.x release prior to 4.3.2.1. The incorrect conversion causes append-optimizied table inconsistencies in the upgraded Greenplum Database system.

Append-Optimized Tables and compaction information (4.3 and 4.3.5)

Append-optimized tables are best with denormalized fact tables in a data warehouse environment, where the data is static after it is loaded. Moving large fact tables to an append-only storage model eliminates the storage overhead of the per-row update visibility information, saving about 20 bytes per row.

Limitations: Append-optimized tables cannot be used with the following functionalities.

a) Transactions with serializable isolation levels

b) Updatable cursors

With append-optimized tables, you use the VACUUM command to reclaim the storage capacity from table data that was deleted or updated.

Lets look at views and tables which provides more information about these tables.

sachi=# select * from pg_views where viewname like '%append%';

schemaname | viewname | viewowner | definition

------------+---------------------+-----------+-------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------

gp_toolkit | __gp_is_append_only | gpadmin | SELECT pgc.oid AS iaooid, CASE WHEN (pgao.relid IS NULL) THEN false ELSE true END AS iaotype FROM (pg_class

pgc LEFT JOIN pg_appendonly pgao ON ((pgc.oid = pgao.relid)));

(1 row)

sachi=# \d gp_toolkit.__gp_is_append_only

View "gp_toolkit.__gp_is_append_only"

Column | Type | Modifiers

---------+---------+-----------

iaooid | oid |

iaotype | boolean |

View definition:

SELECT pgc.oid AS iaooid,

CASE

WHEN pgao.relid IS NULL THEN false

ELSE true

END AS iaotype

FROM pg_class pgc

LEFT JOIN pg_appendonly pgao ON pgc.oid = pgao.relid;

sachi=# select * from pg_tables where tablename like '%append%';

schemaname | tablename | tableowner | tablespace | hasindexes | hasrules | hastriggers

------------+----------------------------+------------+------------+------------+----------+-------------

pg_catalog | pg_appendonly | gpadmin | | t | f | f

pg_catalog | pg_appendonly_alter_column | gpadmin | | t | f | f

(2 rows)

sachi=# select * from pg_catalog.pg_appendonly;

relid | blocksize | safefswritesize | compresslevel | majorversion | minorversion | checksum | compresstype | columnstore | segrelid | segidxid | blkdirrel

id | blkdiridxid | version | visimaprelid | visimapidxid

-------+-----------+-----------------+---------------+--------------+--------------+----------+--------------+-------------+----------+----------+----------

---+-------------+---------+--------------+--------------

(0 rows)

sachi=# \d pg_proc

Table "pg_catalog.pg_proc"

Column | Type | Modifiers

----------------+-----------+-----------

proname | name | not null

pronamespace | oid | not null

proowner | oid | not null

prolang | oid | not null

proisagg | boolean | not null

prosecdef | boolean | not null

proisstrict | boolean | not null

proretset | boolean | not null

provolatile | "char" | not null

pronargs | smallint | not null

prorettype | oid | not null

proiswin | boolean | not null

proargtypes | oidvector | not null

proallargtypes | oid[] |

proargmodes | "char"[] |

proargnames | text[] |

prosrc | text |

probin | bytea |

proacl | aclitem[] |

prodataaccess | "char" | not null

Indexes:

"pg_proc_oid_index" UNIQUE, btree (oid)

"pg_proc_proname_args_nsp_index" UNIQUE, btree (proname, proargtypes, pronamespace)

sachi=# select * from pg_proc where proname like '%append%';

proname | pronamespace | proowner | prolang | proisagg | prosecdef | proisstrict | proretset | provolatile | pronargs | prorettype | proiswin | proarg

types | proallargtypes | proargmodes | proargnames | prosrc | probin | proacl | prodataaccess

--------------+--------------+----------+---------+----------+-----------+-------------+-----------+-------------+----------+------------+----------+-------

------+----------------+-------------+-------------+------------+--------+--------+---------------

array_append | 11 | 10 | 12 | f | f | f | f | i | 2 | 2277 | f | 2277 2

283 | | | | array_push | - | | n

(1 row)

sachi=#

When upgrading from Greenplum Database 4.2.x to a 4.3.x release, Pivotal recommends upgrading to Greenplum Database 4.2.3.1. Upgrading from 4.2.x to 4.3.2.1 does not causes append-optimizied table inconsistencies.

If you are upgrading Greenplum Database from 4.3.x to 4.3.2.1, you run the fix_ao_upgrade.py utility to check Greenplum Database for the upgrade issue and fix the upgrade issue. The utility is in this Greenplum Database 4.3.2.1 directory:

$GPHOME/share/postgresql/upgrade

Syntax for fix_ao_upgrade.py:

fix_ao_upgrade.py {-h master_host | --host=master_host} {-p master_port | --port=master_port} [-u user | --user=user ] [--report] [-v | --verbose] [--help]

Options

-r | --report

Report inconsistencies without making any changes.

-h master_host | --host=master_host

Greenplum Database master hostname or IP address.

-p master_port | --port= master_port

Greenplum Database master port.

-u user | --user= user

User name to connect to Greenplum Database. The user must be a Greenplum Database superuser. Default is gpadmin.

v | --verbose

Verbose output that includes table names.

--help

Show the help message and exit.

If you specify the optional --report option, the utility displays a report of inconsistencies in the Greenplum Database system. No changes to Greenplum Database system are made. If you specify the --verbose option with --report, the table names that are affected by the inconsistencies are included in the report.

Append-optimized Compaction Information

In Greenplum Database 4.3.5.0, the new the function __gp_aovisimap_compaction_info(oid) displays compaction information for an append-optimized table. The information is for the on-disk data files on Greenplum Database segments that store the table data. You can use the information to determine the data files that will be compacted by a VACUUM operation on an append-optimized table.

Note: Until a VACUUM operation deletes the row from the data file, deleted or updated data rows occupy physical space on disk even though they are hidden to new transactions. The configuration parameter gp_appendonly_compactioncontrols the functionality of the VACUUM command.