Monitoring Disk Space in Greenplum

Post date: Oct 09, 2012 8:36:13 PM

It is essential to ensure that Greenplum Database host file system usage does not exceed 70% of the available space. Disk space monitoring helps protect database performance and avoid failures caused by full disks.
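As a quick local check of that guideline, the sketch below flags any file system above 70% full. It is a minimal sketch, assuming plain POSIX df output; the threshold value is yours to adjust.

#!/bin/bash
# warn_disk.sh - flag any local file system above a usage threshold (sketch).
THRESHOLD=70
df -P | awk -v t="$THRESHOLD" 'NR > 1 {
    use = $5; sub(/%/, "", use)          # strip the % sign from the Capacity column
    if (use + 0 > t)
        printf "WARNING: %s (%s) is %s%% full\n", $6, $1, use
}'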

General Monitoring

gp_toolkit Administrative Schema: The Greenplum Database includes the gp_toolkit schema, which collects system information through log files and operating system commands. You can run a query against gp_toolkit to quickly view free disk space. Results are reported in kilobytes.

=# SELECT DISTINCT dfhostname, dfdevice, dfspace
   FROM gp_toolkit.gp_disk_free
   ORDER BY dfhostname;

 dfhostname |  dfdevice  |  dfspace
------------+------------+-----------
 fips3      |  /dev/sdb1 | 145282396
 fips4      |  /dev/sdb1 | 145284456
(2 rows)

Learn more about the gp_toolkit schema in the Greenplum Database Administration Guide: The gp_toolkit Administrative Schema, available on Powerlink.
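To automate that check, you can wrap the view in a shell loop. A minimal sketch, assuming default psql connection settings and an arbitrary 100 GB floor (both are assumptions, not part of the original post):

#!/bin/bash
# gp_free_check.sh - report segments whose free space drops below a floor (sketch).
MIN_FREE_GB=100
psql -d postgres -At -F ' ' -c \
  "SELECT DISTINCT dfhostname, dfdevice, dfspace FROM gp_toolkit.gp_disk_free;" |
while read host device kb; do
    gb=$(( kb / 1024 / 1024 ))           # dfspace is reported in kilobytes
    if [ "$gb" -lt "$MIN_FREE_GB" ]; then
        echo "LOW SPACE: $host $device has only ${gb}G free"
    fi
done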

Shell Commands: Check free disk space with the shell command df -h. From the master server, run the gpssh (broadcast) utility to check free disk space on all hosts in your cluster:

$ gpssh -f hostfile df -h

[rh55-qavm62] Filesystem  Size  Used Avail Use% Mounted on
[rh55-qavm62] /dev/sda1   9.7G  2.9G  6.4G  32% /
[rh55-qavm62] /dev/sdb1   140G  137G  3.5G  98% /data
...
[rh55-qavm62]             916G  421G  448G  49% /home/malshp
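The same gpssh output can be screened automatically. A sketch (the hostfile path and the 70% cutoff are assumptions) that prints only the over-threshold lines:

# Flag any file system on any host above 70% used; the [host] prefix is preserved.
gpssh -f hostfile df -hP | awk '{
    for (i = 1; i <= NF; i++)
        if ($i ~ /^[0-9]+%$/ && $i + 0 > 70) { print; next }
}'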

Data Computing Appliance Disk Monitoring

ConnectEMC Dial-Home: The Data Computing Appliance (DCA) supports dial-home functionality through the ConnectEMC support utility that collects and sends event data indicating system errors to EMC Global Services.

Greenplum Performance Monitor: Performance Monitor includes a web-based user interface that displays query performance and health metrics. The "Health" tab displays disk utilization information. Greenplum Performance Monitor is pre-installed on every DCA.

DCA Health Monitoring: Each DCA from DCA Software Version 1.1.0.0 includes extensive health monitoring software. This software collects Greenplum Database and system hardware information, including disk utilization. When free disk space reaches certain thresholds, the software sends the following data to EMC Support:

Disk Space Warning: If a host's file system usage exceeds 80% of capacity, the system sends a message with the severity of Warning.

Disk Space Error: If a host's file system usage exceeds 90% of capacity, the system sends a message with the severity of Error.

Note: The DCA health monitoring software checks all of the hosts' file systems. (DCAs have multiple file systems.)
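On a non-DCA cluster you can approximate the same Warning/Error severities yourself. A rough sketch (the hostfile path and the syslog tag are assumptions; the 80%/90% thresholds come from the text above):

# Emulate the DCA severities across all hosts and forward hits to syslog (sketch).
gpssh -f hostfile df -P | awk '
$(NF-1) ~ /%$/ {
    use = $(NF - 1); sub(/%/, "", use)     # next-to-last field is Capacity in POSIX df output
    if (use + 0 > 90)      print "Error: " $0
    else if (use + 0 > 80) print "Warning: " $0
}' | logger -t dca_disk_check              # wire to email/SNMP instead if preferred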

Disk Space History, using the diskspace_history table in the gpperfmon database

select weekday,
       "month",
       "year",
       "day",
       -- hostname,
       -- filesystem,
       sum(round((bytes_used / power(1024,4))::numeric,2)) used_tb,
       sum(round((bytes_available / power(1024,4))::numeric,2)) available_tb,
       sum(round((total_bytes / power(1024,4))::numeric,2)) total_tb
from
(
  select date_trunc('hour', ctime) ctime_hour,
         to_char(ctime, 'mm') as "month",
         to_char(ctime, 'yyyy') as "year",
         to_char(ctime, 'ddd') as "day",
         to_char(ctime, 'Day') as weekday,
         to_char(ctime, 'hh24') as "hour",
         hostname,
         filesystem,
         bytes_used,
         bytes_available,
         total_bytes,
         row_number() over (partition by to_char(ctime, 'yyyy'), to_char(ctime, 'ddd'), hostname
                            order by total_bytes desc) space_used_rank
  from diskspace_history
  where ctime >= '2010-01-01'::timestamp
  --ctime between date_trunc('day', localtimestamp - interval '6 day') and date_trunc('day', localtimestamp)
    and filesystem in ('/data1','/data2')
    and hostname not like 'etl%'
) a
where space_used_rank = 1
group by weekday, day, month, year
order by year, month, day;
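To track this over time, save the query above as a file and run it against the gpperfmon database, for example from cron. A sketch; the file and report paths are assumptions:

# Run the history report and keep a dated copy of the output (sketch).
psql -d gpperfmon -f diskspace_history.sql \
     -o /home/gpadmin/reports/diskspace_history_$(date +%Y%m%d).txt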

==========================================================================

Sample disk space data for Full DCA

==========================================================================

select hostname, filesystem,
       round(total_bytes/power(1024,4)::numeric,2) totbyteinTB,
       round(bytes_used/power(1024,4)::numeric,2) usedinTB,
       round(bytes_available/power(1024,4)::numeric,2) availinTB
from diskspace_now
where filesystem in ('/data1','/data2')
order by substr(hostname,4,3)::numeric, filesystem;

"sdw1";"/data1";2.68;1.29;1.39

"sdw1";"/data2";2.68;1.29;1.39

"sdw2";"/data1";2.68;1.29;1.39

"sdw2";"/data2";2.68;1.30;1.38

"sdw3";"/data1";2.68;1.29;1.39

"sdw3";"/data2";2.68;1.28;1.40

"sdw4";"/data1";2.68;1.30;1.38

"sdw4";"/data2";2.68;1.29;1.39

"sdw5";"/data1";2.68;1.31;1.37

"sdw5";"/data2";2.68;1.29;1.39

"sdw6";"/data1";2.68;1.33;1.35

"sdw6";"/data2";2.68;1.30;1.38

"sdw7";"/data1";2.68;1.34;1.34

"sdw7";"/data2";2.68;1.31;1.37

"sdw8";"/data1";2.68;1.30;1.38

"sdw8";"/data2";2.68;1.38;1.30

"sdw9";"/data1";2.68;1.28;1.40

"sdw9";"/data2";2.68;1.29;1.39

"sdw10";"/data1";2.68;1.28;1.40

"sdw10";"/data2";2.68;1.28;1.40

"sdw11";"/data1";2.68;1.28;1.40

"sdw11";"/data2";2.68;1.29;1.39

"sdw12";"/data1";2.68;1.30;1.38

"sdw12";"/data2";2.68;1.28;1.40

"sdw13";"/data1";2.68;1.28;1.39

"sdw13";"/data2";2.68;1.28;1.40

"sdw14";"/data1";2.68;1.30;1.38

"sdw14";"/data2";2.68;1.29;1.39

"sdw15";"/data1";2.68;1.28;1.40

"sdw15";"/data2";2.68;1.28;1.40

"sdw16";"/data1";2.68;1.28;1.40

"sdw16";"/data2";2.68;1.30;1.38

================================================================================================

Note: The Greenplum DCA manuals say that a full-rack DCA has 4 GPDB modules, and each module has:

1. 9 TB or 31 TB capacity (uncompressed)
2. 4 servers per module

Each server contains:

1. 2 sockets / 12 cores, 48 GB RAM
2. 12 x 600 GB or 2 TB storage

If you sum the /data1 and /data2 mount sizes of all segment servers and divide by 2 (8 primary, 8 mirror), you get 42.88 TB for all primaries. Greenplum reserves approximately 20% of the space for transaction logs, spill files, and so on, so the usable database capacity is approximately 36 TB.

================================================================================================
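The arithmetic behind that note can be verified in a couple of lines, using the sample data above (16 segment servers, each with two 2.68 TB data mounts):

# 16 servers x 2 mounts x 2.68 TB, halved because mirrors consume the other half:
echo "16 * 2 * 2.68 / 2" | bc -l        # => 42.88 TB for primaries
# Minus roughly 20% reserved for transaction logs, spill files, etc.:
echo "42.88 * 0.8" | bc -l              # => ~34.3 TB usable (the note above rounds to ~36 TB)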

The script below works per database (you need to specify the database name); adjust the paths to match your primary segment directories.

. /usr/local/greenplum/greenplum-db/greenplum_path.sh

( echo -n "<pre>" && \
psql -e -X -h hostname -d dbname -f <(cat <<EOF
\\timing on

DROP EXTERNAL WEB TABLE IF EXISTS dbsysreports.disk_usage;

CREATE EXTERNAL WEB TABLE dbsysreports.disk_usage (
    size_bytes int8,
    path text
)
EXECUTE
E'find /data/ -type d -iname base | xargs -I{} du --block-size 1 --all --separate-dirs {} 2>/dev/null || true' ON HOST FORMAT 'TEXT' (DELIMITER E'\\t');

DROP TABLE IF EXISTS dbsysreports.disk_usage_aot;

CREATE TABLE dbsysreports.disk_usage_aot
WITH (ORIENTATION=COLUMN, APPENDONLY=TRUE, COMPRESSTYPE=QUICKLZ, COMPRESSLEVEL=1)
AS SELECT * FROM dbsysreports.disk_usage
DISTRIBUTED RANDOMLY;

-- db size
INSERT INTO dbsysreports.cluster_size (ctime, bytes)
SELECT
    now(),
    SUM(size_bytes) AS raw_size
FROM dbsysreports.disk_usage_aot
WHERE path ~ E'^/data/vol[0-9]+/gpdb_p[0-9]/.*/base/[0-9]+/[0-9]+\\\\.?[0-9]*?$';

-- schema report
INSERT INTO dbsysreports.schema_sizes (ctime, schemaname, bytes)
SELECT
    now(),
    REGEXP_REPLACE(pn.nspname, '^pg_temp_[0-9]+$', 'pg_temp') AS schema,
    SUM(size_bytes)::BIGINT AS bytes
FROM
(
    SELECT
        SUM(size_bytes) AS size_bytes, /* to merge 52345.1 style relfilenodes */
        REGEXP_REPLACE(path,
            E'^/data/vol[0-9]+/gpdb_p([0-9])/.*/base/[0-9]+/[0-9]+\\\\.?[0-9]*?$',
            E'\\\\1') AS content,
        REGEXP_REPLACE(path,
            E'^/data/vol[0-9]+/gpdb_p[0-9]/.*/base/([0-9]+)/[0-9]+\\\\.?[0-9]*?$',
            E'\\\\1') AS database_oid,
        REGEXP_REPLACE(path,
            E'^/data/vol[0-9]+/gpdb_p[0-9]/.*/base/[0-9]+/([0-9]+)\\\\.?[0-9]*?$',
            E'\\\\1') AS relfilenode
    FROM dbsysreports.disk_usage_aot
    WHERE path ~ E'^/data/vol[0-9]+/gpdb_p[0-9]/.*/base/[0-9]+/[0-9]+\\\\.?[0-9]*?$'
    GROUP BY 2, 3, 4
) AS rfd
LEFT JOIN pg_class pc ON pc.relfilenode = rfd.relfilenode
LEFT JOIN pg_database pd ON pd.oid = rfd.database_oid
LEFT JOIN pg_namespace pn ON pn.oid = pc.relnamespace
WHERE pd.datname = 'dbname'
GROUP BY 1, 2;
EOF
)    # closes the here-document input to psql
)    # closes the subshell opened before echo
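Once the dbsysreports tables exist, the script can run unattended. A minimal cron sketch; the script path, log path, and schedule are all assumptions:

# /etc/cron.d/gp_disk_report (sketch): nightly space report, run as gpadmin
30 1 * * * gpadmin /home/gpadmin/scripts/disk_usage_report.sh >> /home/gpadmin/gpAdminLogs/disk_usage_report.log 2>&1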

Disk Cleanup Activity - Finding space used by backup, database, and log files

========================================================

The following commands help identify what is consuming space. Run all of them as gpadmin, logged in to the master (mdw).

1. If you have used gpssh before, you may already have a seghosts file; if not, create one from the output of the SQL below (the file should contain the segment host names, one per line; see the sketch after this step):

select hostname from gp_segment_configuration where content >= 0 group by hostname order by replace(hostname, 'sdw', '')::numeric;

In the subsequent scripts I will assume this file is $GPHOME/hosts.seg.
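A one-liner sketch that writes the file, assuming default psql connection settings:

# Write the segment host names, one per line, to $GPHOME/hosts.seg.
psql -d postgres -Atc "select hostname from gp_segment_configuration where content >= 0 group by hostname order by replace(hostname, 'sdw', '')::numeric;" > $GPHOME/hosts.seg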

2. Find the total space used by the primary segment databases (excluding log files and local backup files):

[gpadmin@sachi ~]$ gpssh -f $GPHOME/hosts.seg "du -h --exclude=*pg_log* --exclude=*db_dump* -s /data[12]/primary/gpseg*"

3. Find the total space used by the log files of the primary segment databases:

[gpadmin@sachi ~]$ gpssh -f $GPHOME/hosts.seg "du -h -s /data[12]/primary/gpseg*/pg_log*"

4. Find the total space used by the backup files of the primary segment databases:

[gpadmin@sachi ~]$ gpssh -f $GPHOME/hosts.seg "du -h -s /data[12]/primary/gpseg*/db_dumps*"

5. Find the total space used by leftover temporary spill files:

gpssh -f ~/hostfile "du -h -s /data[12]/primary/gpseg*/base/*/pgsql_tmp"

Sample output:

$ gpssh -f ~/hostfile "du -h -s /data[12]/primary/gpseg*/pg_log*"
..............
[sdw10] 40M /data1/primary/gpseg54/pg_log
[sdw10] 40M /data1/primary/gpseg55/pg_log
[sdw10] 40M /data1/primary/gpseg56/pg_log
[sdw10] 40M /data2/primary/gpseg57/pg_log
................
[ sdw1] 39M /data1/primary/gpseg1/pg_log
[ sdw1] 40M /data1/primary/gpseg2/pg_log
..............

[gpadmin@ms00597 ~]$ gpssh -f ~/hostfile "du -h -s /data[12]/primary/gpseg*/db_dumps*"
[sdw16] 20K /data1/primary/gpseg90/db_dumps
[sdw16] 20K /data1/primary/gpseg91/db_dumps
[sdw16] 28K /data1/primary/gpseg92/db_dumps
[sdw16] 28K /data2/primary/gpseg93/db_dumps
.........
[ sdw9] 24K /data2/primary/gpseg52/db_dumps
[ sdw9] 28K /data2/primary/gpseg53/db_dumps
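The pgsql_tmp files found in step 5 can be reclaimed once you are sure no queries are running. A deliberately cautious sketch; the 2-day age cutoff is an assumption, and the delete is left commented out:

# Dry run: list spill files older than 2 days before deleting anything.
gpssh -f ~/hostfile "find /data[12]/primary/gpseg*/base/*/pgsql_tmp -type f -mtime +2 -ls"
# Only after reviewing the list, and with no sessions connected:
# gpssh -f ~/hostfile "find /data[12]/primary/gpseg*/base/*/pgsql_tmp -type f -mtime +2 -delete"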

 

If the space reported by the database does not match what du shows on disk, possible causes include:

1. There is catalog corruption resulting in tables at the segment level that are not detected by the master.

2. The gp_toolkit SQL was not run as the gpadmin database user, so it failed to capture all tables and schemas.

3. There are multiple databases and we analyzed the space of only one database. For example:

sachi=# \l+
                                        List of databases
   Name    |  Owner  | Encoding |  Access privileges  |  Size  | Tablespace |        Description
-----------+---------+----------+---------------------+--------+------------+---------------------------
 gpadmin   | gpadmin | UTF8     |                     | 28 MB  | pg_default |
 gpperfmon | gpadmin | UTF8     | gpadmin=CTc/gpadmin | 283 MB | pg_default |
                                : =c/gpadmin
 himanshu  | gpadmin | UTF8     |                     | 28 MB  | pg_default |
 postgres  | gpadmin | UTF8     |                     | 28 MB  | pg_default |
 sachi     | gpadmin | UTF8     | =Tc/gpadmin         | 30 MB  | pg_default |
                                : gpadmin=CTc/gpadmin
                                : sachi=CTc/gpadmin
                                : gpuser=CTc/gpadmin
 template0 | gpadmin | UTF8     | =c/gpadmin          | 27 MB  | pg_default |
                                : gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | =c/gpadmin          | 28 MB  | pg_default | Default template database
                                : gpadmin=CTc/gpadmin
(7 rows)
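To quantify cause 3 without psql meta-commands, the standard pg_database_size() function can total every database. A sketch; run from the master as gpadmin:

# Size of every database, largest first.
psql -d postgres -Atc "
SELECT datname || ': ' || pg_size_pretty(pg_database_size(datname))
FROM pg_database ORDER BY pg_database_size(datname) DESC;"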

===============================================================================================

List of databases and their sizes

===============================================================================================

Use \l for the basic list, or \l+ (shown above) to include sizes:

sachi=# \l
                List of databases
   Name    |  Owner  | Encoding |  Access privileges
-----------+---------+----------+---------------------
 gpadmin   | gpadmin | UTF8     |
 gpperfmon | gpadmin | UTF8     | gpadmin=CTc/gpadmin
                                : =c/gpadmin
 himanshu  | gpadmin | UTF8     |
 postgres  | gpadmin | UTF8     |
 sachi     | gpadmin | UTF8     | =Tc/gpadmin
                                : gpadmin=CTc/gpadmin
                                : sachi=CTc/gpadmin
                                : gpuser=CTc/gpadmin
 template0 | gpadmin | UTF8     | =c/gpadmin
                                : gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | =c/gpadmin
                                : gpadmin=CTc/gpadmin
(7 rows)


===============================================================================================


Drilling down on a single segment host, from the master:

$ ssh sdw1
[gpadmin@sdw1 ~]$ du -h -s /data1
[gpadmin@sdw1 ~]$ du -h -s /data1/primary
[gpadmin@sdw1 ~]$ du -h -s /data1/mirror
[gpadmin@sdw1 ~]$ du -h -s /data1/primary/gpseg*
[gpadmin@sdw1 ~]$ du -h -s /data1/primary/gpseg0/*
[gpadmin@sdw1 ~]$ ls -lR /data1/primary/gpseg0/--ddboost
[gpadmin@sdw1 ~]$ du -h -s /data1/primary/gpseg0/--ddboost

Or run the same checks remotely from the master:

[gpadmin@ms001 ~]$ ssh sdw1 du -h -s /data1/primary/gpseg*/gp_dump*
[gpadmin@ms0017 ~]$ gpssh -f seg_hostfile "du -h -s /data1/primary/gpseg*/--ddboost"
[gpadmin@ms0017 ~]$ gpssh -f seg_hostfile "du -h -s /data1/primary/gpseg*/gp_dump*20130415131834*"