What is Free Space Map? How to size it?

posted Jan 23, 2014, 4:42 PM by Sachchida Ojha
What is Free Space Map? Expired rows are held in the free space map. The free space map must be sized large enough to hold all expired rows in your database. If not, a regular VACUUM command cannot reclaim space occupied by expired rows that overflow the free space map.

VACUUM FULL reclaims all expired row space, but it is an expensive operation and can take an unacceptably long time to finish on large, distributed Greenplum Database tables. If the free space map overflows, you can recreate the table with a CREATE TABLE AS statement and drop the old table. Greenplum recommends not using VACUUM FULL.

For append-optimized tables, VACUUM requires enough available disk space to accommodate the new segment file during the VACUUM process. If the ratio of hidden rows to total rows in a segment file is less than a threshold value (10, by default), the segment file is not compacted. The threshold value can be configured with the gp_appendonly_compaction_threshold Greenplum Database server configuration parameter. 

VACUUM FULL ignores the threshold and rewrites the segment file regardless of the ratio. VACUUM can be disabled for append-optimized tables using the gp_appendonly_compaction server configuration parameter. 

If a concurrent serializable transaction is detected when an append-optimized table is being vacuumed, the current and subsequent segment files are not compacted. If a segment file has been compacted but a concurrent serializable transaction is detected in the transaction that drops the original segment file, the drop is skipped. This could leave one or two segment files in an “awaiting drop” state after the vacuum has completed.

Free Space Map Parameters
These parameters control the sizing of the free space map, which contains expired rows. Use VACUUM to reclaim the free space map disk space.

1. max_fsm_pages
2. max_fsm_relations

 Parameter Value Range Default  Description Set Classifications
 max_fsm_pages integer > 16 * max_fsm_relations 200000 Sets the maximum number of disk pages for which free space will be tracked in the shared free-space map. Six bytes of shared memory are consumed for each page slot.local
system
restart
 max_fsm_relations integer 1000 Sets the maximum number of relations for which free space will be tracked in the shared memory free-space map. Should be set to a value larger than the total number of: tables+indexes+system tables. It costs about 60 bytes of memory for each relation per segment instance. It is better to allow some room for overhead and set too high rather than too low.local
system
restart

 
Notes : VACUUM cannot be executed inside a transaction block.
Greenplum recommends that active production databases be vacuumed frequently (at least nightly), in order to remove expired rows. After adding or deleting a large number of rows, it may be a good idea to issue a VACUUM ANALYZE command for the affected table. This will update the system catalogs with the results of all recent changes, and allow the Greenplum query planner to make better choices in planning queries.

Important: Regular PostgreSQL has a separate optional server process called the autovacuum daemon, whose purpose is to automate the execution of VACUUM and ANALYZE commands. This feature is currently disabled in Greenplum Database.

VACUUM causes a substantial increase in I/O traffic, which can cause poor performance for other active sessions. Therefore, it is advisable to vacuum the database at low usage times.

A VACUUM FULL will reclaim all expired row space, but is a very expensive operation and may take an unacceptably long time to finish on large, distributed Greenplum Database tables. If you do get into a situation where the free space map has overflowed, it may be more timely to recreate the table with a CREATE TABLE AS statement and drop the old table. A VACUUM FULL is not recommended in Greenplum Database.


Comments