What is Free Space Map? How to size it?

Post date: Jan 24, 2014 12:42:17 AM

What is Free Space Map? Expired rows are held in the free space map. The free space map must be sized large enough to hold all expired rows in your database. If not, a regular VACUUM command cannot reclaim space occupied by expired rows that overflow the free space map.

VACUUM FULL reclaims all expired row space, but it is an expensive operation and can take an unacceptably long time to finish on large, distributed Greenplum Database tables. If the free space map overflows, you can recreate the table with a CREATE TABLE AS statement and drop the old table. Greenplum recommends not using VACUUM FULL.

For append-optimized tables, VACUUM requires enough available disk space to accommodate the new segment file during the VACUUM process. If the ratio of hidden rows to total rows in a segment file is less than a threshold value (10, by default), the segment file is not compacted. The threshold value can be configured with the gp_appendonly_compaction_threshold Greenplum Database server configuration parameter. 

VACUUM FULL ignores the threshold and rewrites the segment file regardless of the ratio. VACUUM can be disabled for append-optimized tables using the gp_appendonly_compaction server configuration parameter. 

If a concurrent serializable transaction is detected when an append-optimized table is being vacuumed, the current and subsequent segment files are not compacted. If a segment file has been compacted but a concurrent serializable transaction is detected in the transaction that drops the original segment file, the drop is skipped. This could leave one or two segment files in an “awaiting drop” state after the vacuum has completed.

Free Space Map Parameters

These parameters control the sizing of the free space map, which contains expired rows. Use VACUUM to reclaim the free space map disk space.

1. max_fsm_pages

2. max_fsm_relations

 

Notes : VACUUM cannot be executed inside a transaction block.

Greenplum recommends that active production databases be vacuumed frequently (at least nightly), in order to remove expired rows. After adding or deleting a large number of rows, it may be a good idea to issue a VACUUM ANALYZE command for the affected table. This will update the system catalogs with the results of all recent changes, and allow the Greenplum query planner to make better choices in planning queries.

Important: Regular PostgreSQL has a separate optional server process called the autovacuum daemon, whose purpose is to automate the execution of VACUUM and ANALYZE commands. This feature is currently disabled in Greenplum Database.

VACUUM causes a substantial increase in I/O traffic, which can cause poor performance for other active sessions. Therefore, it is advisable to vacuum the database at low usage times.

A VACUUM FULL will reclaim all expired row space, but is a very expensive operation and may take an unacceptably long time to finish on large, distributed Greenplum Database tables. If you do get into a situation where the free space map has overflowed, it may be more timely to recreate the table with a CREATE TABLE AS statement and drop the old table. A VACUUM FULL is not recommended in Greenplum Database.