Greenplum Database New Features - Append-Optimized Tables and compaction information
Append-optimized tables were introduced in GPDB 4.3.0.0. Append-optimized tables are similar to Append-only tables with additional benefits which allows UPDATE and DELETE operations on the table data.
Note: When migrating Greenplum Database from 4.2.x.x to 4.3, append-only tables are migrated to append-optimized tables. GPDB 4.3.2.1 Fixes the problem of incorrect conversion of append-only tables to append-optimized tables during an upgrade from Greenplum Database 4.2.x to a Greenplum Database 4.3.x release prior to 4.3.2.1. The incorrect conversion causes append-optimizied table inconsistencies in the upgraded Greenplum Database system.
Append-Optimized Tables and compaction information (4.3 and 4.3.5)
Append-optimized tables are best with denormalized fact tables in a data warehouse environment, where the data is static after it is loaded. Moving large fact tables to an append-only storage model eliminates the storage overhead of the per-row update visibility information, saving about 20 bytes per row.
Limitations: Append-optimized tables cannot be used with the following functionalities.
a) Transactions with serializable isolation levels
b) Updatable cursors
With append-optimized tables, you use the VACUUM command to reclaim the storage capacity from table data that was deleted or updated.
Lets look at views and tables which provides more information about these tables.
sachi=# select * from pg_views where viewname like '%append%';
schemaname | viewname | viewowner | definition
------------+---------------------+-----------+-------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------
gp_toolkit | __gp_is_append_only | gpadmin | SELECT pgc.oid AS iaooid, CASE WHEN (pgao.relid IS NULL) THEN false ELSE true END AS iaotype FROM (pg_class
pgc LEFT JOIN pg_appendonly pgao ON ((pgc.oid = pgao.relid)));
(1 row)
sachi=# \d gp_toolkit.__gp_is_append_only
View "gp_toolkit.__gp_is_append_only"
Column | Type | Modifiers
---------+---------+-----------
iaooid | oid |
iaotype | boolean |
View definition:
SELECT pgc.oid AS iaooid,
CASE
WHEN pgao.relid IS NULL THEN false
ELSE true
END AS iaotype
FROM pg_class pgc
LEFT JOIN pg_appendonly pgao ON pgc.oid = pgao.relid;
sachi=# select * from pg_tables where tablename like '%append%';
schemaname | tablename | tableowner | tablespace | hasindexes | hasrules | hastriggers
------------+----------------------------+------------+------------+------------+----------+-------------
pg_catalog | pg_appendonly | gpadmin | | t | f | f
pg_catalog | pg_appendonly_alter_column | gpadmin | | t | f | f
(2 rows)
sachi=# select * from pg_catalog.pg_appendonly;
relid | blocksize | safefswritesize | compresslevel | majorversion | minorversion | checksum | compresstype | columnstore | segrelid | segidxid | blkdirrel
id | blkdiridxid | version | visimaprelid | visimapidxid
-------+-----------+-----------------+---------------+--------------+--------------+----------+--------------+-------------+----------+----------+----------
---+-------------+---------+--------------+--------------
(0 rows)
sachi=# \d pg_proc
Table "pg_catalog.pg_proc"
Column | Type | Modifiers
----------------+-----------+-----------
proname | name | not null
pronamespace | oid | not null
proowner | oid | not null
prolang | oid | not null
proisagg | boolean | not null
prosecdef | boolean | not null
proisstrict | boolean | not null
proretset | boolean | not null
provolatile | "char" | not null
pronargs | smallint | not null
prorettype | oid | not null
proiswin | boolean | not null
proargtypes | oidvector | not null
proallargtypes | oid[] |
proargmodes | "char"[] |
proargnames | text[] |
prosrc | text |
probin | bytea |
proacl | aclitem[] |
prodataaccess | "char" | not null
Indexes:
"pg_proc_oid_index" UNIQUE, btree (oid)
"pg_proc_proname_args_nsp_index" UNIQUE, btree (proname, proargtypes, pronamespace)
sachi=# select * from pg_proc where proname like '%append%';
proname | pronamespace | proowner | prolang | proisagg | prosecdef | proisstrict | proretset | provolatile | pronargs | prorettype | proiswin | proarg
types | proallargtypes | proargmodes | proargnames | prosrc | probin | proacl | prodataaccess
--------------+--------------+----------+---------+----------+-----------+-------------+-----------+-------------+----------+------------+----------+-------
------+----------------+-------------+-------------+------------+--------+--------+---------------
array_append | 11 | 10 | 12 | f | f | f | f | i | 2 | 2277 | f | 2277 2
283 | | | | array_push | - | | n
(1 row)
sachi=#
When upgrading from Greenplum Database 4.2.x to a 4.3.x release, Pivotal recommends upgrading to Greenplum Database 4.2.3.1. Upgrading from 4.2.x to 4.3.2.1 does not causes append-optimizied table inconsistencies.
If you are upgrading Greenplum Database from 4.3.x to 4.3.2.1, you run the fix_ao_upgrade.py utility to check Greenplum Database for the upgrade issue and fix the upgrade issue. The utility is in this Greenplum Database 4.3.2.1 directory:
$GPHOME/share/postgresql/upgrade
Syntax for fix_ao_upgrade.py:
fix_ao_upgrade.py {-h master_host | --host=master_host} {-p master_port | --port=master_port} [-u user | --user=user ] [--report] [-v | --verbose] [--help]
Options
-r | --report
Report inconsistencies without making any changes.
-h master_host | --host=master_host
Greenplum Database master hostname or IP address.
-p master_port | --port= master_port
Greenplum Database master port.
-u user | --user= user
User name to connect to Greenplum Database. The user must be a Greenplum Database superuser. Default is gpadmin.
v | --verbose
Verbose output that includes table names.
--help
Show the help message and exit.
If you specify the optional --report option, the utility displays a report of inconsistencies in the Greenplum Database system. No changes to Greenplum Database system are made. If you specify the --verbose option with --report, the table names that are affected by the inconsistencies are included in the report.
Append-optimized Compaction Information
In Greenplum Database 4.3.5.0, the new the function __gp_aovisimap_compaction_info(oid) displays compaction information for an append-optimized table. The information is for the on-disk data files on Greenplum Database segments that store the table data. You can use the information to determine the data files that will be compacted by a VACUUM operation on an append-optimized table.
Note: Until a VACUUM operation deletes the row from the data file, deleted or updated data rows occupy physical space on disk even though they are hidden to new transactions. The configuration parameter gp_appendonly_compactioncontrols the functionality of the VACUUM command.