MONITORING DATABASE LOG FILES

Post date: Dec 04, 2012 5:31:44 PM

Monitoring database logs is an important activity: it helps in investigating issues and can catch database problems before they escalate.

Greenplum Database maintains log files in which it records events during database activity. Monitoring the Greenplum Database log files for errors can reveal important information about issues in the database. Understanding and providing the proper log files to EMC Support and Greenplum Engineering usually speeds up the investigation process.

Greenplum Database provides gp_toolkit to allow database administrators to examine log file entries. You can mine log file data using external tables mapped to the log files. See the Greenplum Database Administration Guide for details.
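For example, a quick way to pull recent high-severity entries is to query one of the gp_toolkit log views. The following is a minimal sketch, assuming your release provides the gp_toolkit.gp_log_system view with logtime, logseverity, and logmessage columns:

    -- Recent high-severity entries across the whole system
    -- (gp_log_system reads the log files through external tables)
    SELECT logtime, logseverity, logmessage
    FROM gp_toolkit.gp_log_system
    WHERE logseverity IN ('ERROR', 'FATAL', 'PANIC')
    ORDER BY logtime DESC
    LIMIT 50;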

Because Greenplum Database is a Massively Parallel Processing (MPP) database, it consists of multiple instances. Every instance (master, standby master, and each segment) has its own log file. Greenplum Database rotates log files every day at midnight and also generates a new log file each time the database starts.

The timestamp in the log file name reflects the time at which the log file was rotated (midnight = 00:00:00) or when the database started.

The instance log file is located in the pg_log subdirectory of the data directory: <data_directory>/pg_log. See pg_filespace_entry for data directory locations. The format for database log file names is gpdb-YYYY-MM-DD_HHMISS.log, where YYYY is the year, MM is the month, DD is the day, HH is the hour, MI is the minutes, and SS is the seconds.
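For example, to list the data directory locations (and therefore the pg_log locations) for every instance, you can query the catalog directly. A minimal sketch; the join to gp_segment_configuration is included here only to label each directory with its segment:

    -- Data directory (and thus pg_log) location for each instance
    SELECT c.content AS segment, c.hostname, f.fselocation AS data_directory
    FROM pg_filespace_entry f
    JOIN gp_segment_configuration c ON c.dbid = f.fsedbid
    ORDER BY c.content;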

The master log file is located in $MASTER_DATA_DIRECTORY/pg_log. The master log file contains log entries about activity on the master instance: statements, errors, and informational messages such as FTS (fault tolerance service) messages.

Database log files are CSV (comma-separated values) text files. Entries can span multiple lines; for example, a multi-line SQL query or a multi-line stack trace. The log file elements, in order, are:

<timestamp>, <database user>, <database name>, <process pid>, <thread id>, <host>, <port>, <session time>, <transaction id>, <session id> (in conXXX format), <command number within session>, <segment>, <plan slice>, <distributed transaction id>, <local transaction id>, <subtransaction id>, <severity level>, <state>, <message text>, <details>, <hint>, <query>, <query position>, <context>, <debugging information>, <cursor position>, <function>, <file>, <line>, <process stack>
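To mine log entries with plain SQL, the elements above can be mapped to the columns of an external table. The sketch below is illustrative only: the column names mirror those used by the gp_toolkit log views, every column is declared text for simplicity, and the cat command and path are hypothetical master-only examples, not the official DDL:

    -- Illustrative sketch: map the master log files to an external web table
    CREATE EXTERNAL WEB TABLE master_log (
        logtime text, loguser text, logdatabase text, logpid text,
        logthread text, loghost text, logport text, logsessiontime text,
        logtransaction text, logsession text, logcmdcount text,
        logsegment text, logslice text, logdistxact text,
        loglocalxact text, logsubxact text, logseverity text,
        logstate text, logmessage text, logdetail text, loghint text,
        logquery text, logquerypos text, logcontext text, logdebug text,
        logcursorpos text, logfunction text, logfile text, logline text,
        logstack text
    )
    EXECUTE 'cat $MASTER_DATA_DIRECTORY/pg_log/*.log' ON MASTER  -- adjust for your environment
    FORMAT 'CSV';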

One of the most important values is <severity level>. Possible values are PANIC, FATAL, ERROR, WARNING, LOG, NOTICE, and INFO.
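As a quick health check, you can count entries by severity; again a sketch assuming the gp_toolkit.gp_log_system view:

    -- Distribution of log entries by severity
    SELECT logseverity, count(*) AS entries
    FROM gp_toolkit.gp_log_system
    GROUP BY logseverity
    ORDER BY entries DESC;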

You can control logging actions based on severity, such as FATAL, or by topic, such as setting the log_duration parameter to on to log statement durations. See the log_min_messages parameter in the Greenplum Database Administration Guide for information about logging actions based on severity level, and the other log_* parameters in the same guide for information about logging actions based on topics.
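For instance, a superuser can adjust these settings for the current session; a minimal sketch (cluster-wide changes are normally made in postgresql.conf or with gpconfig):

    -- Session-level logging changes (superuser); sketch only
    SET log_min_messages = warning;  -- log entries of WARNING severity and above
    SET log_duration = on;           -- log the duration of each completed statement
    SHOW log_min_messages;           -- verify the current setting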

Use the <session id> value to correlate log entries with sessions in pg_stat_activity.

The conXXX value in the log file corresponds to pg_stat_activity.sess_id (XXX). For example, a session with pg_stat_activity.sess_id=12345 has log entries marked con12345.
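For example, after spotting con12345 in the log, you can look up the live session; a sketch using the sess_id column of pg_stat_activity:

    -- Find the session behind log entries marked con12345
    SELECT sess_id, usename, datname, current_query
    FROM pg_stat_activity
    WHERE sess_id = 12345;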

The <segment> value is most useful in the master log file, as it identifies the segment from which each log entry was received. The master itself is segment=-1.
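For example, to see which segments have been reporting errors, you can group on the segment field; a sketch assuming gp_log_system exposes it as logsegment:

    -- Error counts per reporting segment (the master appears as segment -1)
    SELECT logsegment, count(*) AS error_entries
    FROM gp_toolkit.gp_log_system
    WHERE logseverity IN ('ERROR', 'FATAL', 'PANIC')
    GROUP BY logsegment
    ORDER BY error_entries DESC;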

The best way to analyze and review log files for errors is to examine them directly and to provide the files and selected log entries to EMC Support for further analysis.