gpcheckcat

[gpadmin@sachi lib]$ pwd

/usr/local/greenplum-db/bin/lib

[gpadmin@sachi lib]$ ./gpcheckcat

Usage: gpcheckcat [<option>] [dbname]

-?

-B parallel: number of worker threads

-g dir : generate SQL to rectify catalog corruption, put it in dir

-p port : DB port number

-P passwd : DB password

-U uname : DB User Name

-v : verbose

-A : all databases

-S option : shared table options (none, only)

-O : Online

-l : list all tests

-R test : run this particular test

-C catname : run cross consistency, FK and ACL tests for this catalog table

Run gpcheckcat on Greenplum Database

gpcheckcat runs multiple tests. Most of them cannot run concurrently with other workload, or the results will not be usable. We recommend restarting the database in restricted mode before running gpcheckcat; otherwise gpcheckcat may report spurious inconsistencies caused by ongoing database operations and a higher number of inconsistencies overall. If you run gpcheckcat without stopping database activity, run it with the "-O" (online) parameter. See the end of this article for which tests are safe to run in online mode.
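For example, an online run of just the duplicate test against a single database could look like the following sketch (the port 5432 and the database name mydb are placeholders; the options are the ones listed in the usage output above and should be confirmed against your version):

$GPHOME/bin/lib/gpcheckcat -p 5432 -O -R duplicate mydb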

We recommend running gpcheckcat periodically so that inconsistencies are identified early and can be repaired before they affect database functionality.

a. Stop the database using the command below:

gpstop -M fast

b. Restart the database in restricted mode:

gpstart -aR

(Note: Check for scripts or tools that connect to the database with superuser privileges and stop them; superusers can still connect while the database is in restricted mode, which is not advisable during a gpcheckcat run.)

c. Identify orphaned temp schemas and create related scripts
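The exact query varies between environments; the sketch below assumes Greenplum's pg_temp_<session_id> naming for temporary schemas and the sess_id column in pg_stat_activity, and it only covers the master (segments may need a similar check, for example through gp_dist_random). Run it in each database and save the output to one script file per database:

SELECT 'DROP SCHEMA IF EXISTS ' || nspname || ' CASCADE;'
FROM pg_namespace
WHERE nspname ~ '^pg_temp_[0-9]+$'
  AND split_part(nspname, '_', 3)::int NOT IN
      (SELECT sess_id FROM pg_stat_activity);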

d. Review and run the script for each database, as generated in the previous step (c.)
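For example, if step c produced a file named drop_orphan_schemas_DBNAME.sql (a hypothetical name; use whatever you saved), it can be reviewed and then executed with psql:

psql -d DBNAME -f drop_orphan_schemas_DBNAME.sql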

e. After the DROP statements are executed, verify that all orphaned temporary schemas have been cleaned up by re-running step c; it should not generate any further DROP statements.

f. The gpcheckcat log file is located at /home/gpadmin/gpAdminLogs/gpcheckcat_.log. There is only one log file per day; gpcheckcat appends log entries if the file already exists. If you want a log file covering only the current execution, rename the existing log file (for example, add ".1" to the name); gpcheckcat will then create the file again.
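For example, keeping the file name exactly as it appears in your gpAdminLogs directory:

mv /home/gpadmin/gpAdminLogs/gpcheckcat_.log /home/gpadmin/gpAdminLogs/gpcheckcat_.log.1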

g. Run gpcheckcat (we recommend running it with nohup, in the background, and redirecting stdout/stderr to a file).

Replace DBNAME below with the actual name of your database:

export PGDATABASE=DBNAME

nohup $GPHOME/bin/lib/gpcheckcat -v > /home/gpadmin/gpcheckcat_$(date +%Y%m%d).log 2>&1 &

h. After the above gpcheckcat run completes:

the log file /home/gpadmin/gpcheckcat_`date +"%Y%m%d"`.log will contain the standard output for the tool and can be reviewed for errors

the log file /home/gpadmin/gpAdminLogs/gpcheckcat_.log will contain the details of the checks (row-by-row inconsistencies, etc.)

After gpcheckcat finishes, collect both log files together with the gpcheckcat command you used and send them to Support for analysis. Depending on the outcome, a maintenance window may be needed to fix catalog issues. In some cases the database can continue functioning properly despite the inconsistencies; in other cases operations may fail because of them.

Note: gpcheckcat tests, marked with whether they can be run online or only offline:

duplicate: Check for duplicate entries (online OK)

missing_extraneous: Cross consistency check for missing or extraneous entries (online OK)

inconsistent: Cross consistency check for master-segment inconsistency (online OK)

foreign_key: Check foreign keys (online OK)

acl: Cross consistency check for access control privileges (online OK)

persistent: Check persistent tables (offline only)

pgclass: Check for pg_class entries that do not have any corresponding pg_attribute entry (offline only)

namespace: Check for leaked temporary schemas and missing schema definitions (offline only)

distribution_policy: Check constraints on randomly distributed tables (offline only)

dependency: Check for dependency on non-existent objects (offline only)

owner: Check table ownership that is inconsistent with the master database (online OK)

part_integrity: Check pg_partition branch integrity, partition with oids, partition distribution policy (online OK)

part_constraint: Check constraints on partitioned tables (online OK)

duplicate_persistent: Check for duplicate gp_persistent_relation_node entries (online OK)

Background

What are catalog inconsistencies? Catalog inconsistencies are inconsistencies within or between database system tables at different levels. In general there are three types of inconsistencies:

Inconsistencies in system tables at the segment level (for example, an inconsistency between the system table that contains table data and the system table that contains column data; or a system table that contains duplicates in a column that is supposed to be unique)

Inconsistencies in the same system table across segments (for example, a system table is missing a row on one segment while all other segments have it; or the values of specific columns in a row differ across segments, such as table owner or table access privileges)

Persistent Table inconsistencies - inconsistencies between persistent object state and filesystem objects on a segment (for example, there are no running transactions and all transactions are complete, but an object is still marked as "creation incomplete" in the Persistent Tables; or a file exists in the database directory but there is no corresponding object in the database system tables)
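As an illustration of the second type (the same system table differing across segments), a manual cross-segment comparison can be sketched with Greenplum's gp_dist_random() function, which reads a catalog table from every primary segment. The table name my_table below is only a placeholder:

SELECT gp_segment_id, relname, relowner, relacl
FROM gp_dist_random('pg_class')
WHERE relname = 'my_table'
ORDER BY gp_segment_id;

A row that is missing on some segments, or owner/ACL values that differ between segments, corresponds to what the missing_extraneous, owner and acl tests report.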