Backup and Restore in Greenplum

Various options are used by Greenplum DBAs to back up and restore databases:

1. PostgreSQL ways (non-parallel backup and restore)
2. Greenplum ways (parallel backup and restore)
3. Using Data Domain

gpcrondump and gpdbrestore

posted Oct 17, 2016, 10:40 AM by Sachchida Ojha

[gpadmin@mdw ~]$ gpcrondump --help

COMMAND NAME: gpcrondump


Writes out a database to SQL script files. The script files can be used

to restore the database using the gpdbrestore utility. The gpcrondump

utility can be called directly or from a crontab entry.


*****************************************************

SYNOPSIS

*****************************************************


gpcrondump -x database_name

     [-s <schema> | -S <schema> | -t <schema>.<table> | -T <schema>.<table>]

     [--table-file=<filename> | --exclude-table-file=<filename>]

     [--schema-file=<filename> | --exclude-schema-file=<filename>]

     [-u backup_directory] [-R post_dump_script] [--incremental]

     [ -K <timestamp> [--list-backup-files] ]

     [--prefix <prefix_string> [--list-filter-tables] ]

     [ -c  [ --cleanup-date yyyymmdd  |  --cleanup-total n ] ]

     [-z] [-r] [-f <free_space_percent>] [-b] [-h] [-j | -k]

     [-g] [-G] [-C] [-d <master_data_directory>] [-B <parallel_processes>]

     [-a] [-q] [-y <reportfile>] [-l <logfile_directory>]

     [--email-file <path_to_file> ] [-v]

     { [-E encoding] [--inserts | --column-inserts] [--oids]

     [--no-owner | --use-set-session-authorization]

     [--no-privileges] [--rsyncable]

     { [--ddboost [--replicate --max-streams <max_IO_streams>

     [--ddboost-skip-ping] ] ] } |

     { [--netbackup-service-host <netbackup_server>

     --netbackup-policy <netbackup_policy>

     --netbackup-schedule <netbackup_schedule>

     [--netbackup-block-size <size> ]

     [--netbackup-keyword <keyword> ] } }


gpcrondump --ddboost-host <ddboost_hostname>

    [--ddboost-host <ddboost_hostname> ... ]

     --ddboost-user <ddboost_user>

     --ddboost-backupdir <backup_directory>

     [--ddboost-remote] [--ddboost-skip-ping]


gpcrondump --ddboost-config-remove


gpcrondump -o  [ --cleanup-date yyyymmdd  |  --cleanup-total n ]


gpcrondump -?


gpcrondump --version



*****************************************************

DESCRIPTION

*****************************************************


The gpcrondump utility dumps the contents of a database into SQL script

files, which can then be used to restore the database schema and user

data at a later time using gpdbrestore. During a dump operation, users

will still have full access to the database.


By default, dump files are created in their respective master and

segment data directories in a directory named db_dumps/YYYYMMDD. The

data dump files are compressed by default using gzip.


gpcrondump allows you to schedule routine backups of a Greenplum

database using cron (a scheduling utility for UNIX operating systems).

Cron jobs that call gpcrondump should be scheduled on the master host.


WARNING: Backing up a database with gpcrondump while simultaneously

running ALTER TABLE might cause gpcrondump to fail.


**********************

Data Domain Boost

**********************


gpcrondump is used to schedule Data Domain Boost backup and restore

operations. gpcrondump is also used to set or remove one-time

credentials for Data Domain Boost.



**********************

NetBackup

**********************


Greenplum Database must be configured to communicate with the Symantec

NetBackup master server that is used to backup the database. See the

"Greenplum Database Administrator Guide" for information on

configuring Greenplum Database and NetBackup and backing up and

restoring with NetBackup.



**********************

Return Codes

**********************


The following is a list of the codes that gpcrondump returns.

   0 - Dump completed with no problems

   1 - Dump completed, but one or more warnings were generated

   2 - Dump failed with a fatal error



**********************

EMAIL NOTIFICATIONS

**********************


To have gpcrondump send email notifications with the completion status

after a backup operation completes, you must place a file named

mail_contacts in the home directory of the Greenplum superuser (gpadmin)

or in the same directory as the gpcrondump utility ($GPHOME/bin). This

file should contain one email address per line. gpcrondump will issue a

warning if it cannot locate a mail_contacts file in either location. If

both locations have a mail_contacts file, then the one in $HOME takes

precedence.
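
Setting up this file can be sketched as below; the addresses are
placeholders, and a demo path under /tmp is used here (on a real system
write to "$HOME/mail_contacts" instead):

```shell
# Create a mail_contacts file: one email address per line.
# On a real cluster this would be "$HOME/mail_contacts" for gpadmin,
# which takes precedence over $GPHOME/bin/mail_contacts.
MAIL_CONTACTS=/tmp/demo_mail_contacts
cat > "$MAIL_CONTACTS" <<'EOF'
dba-team@example.com
oncall-dba@example.com
EOF
```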


You can customize the email Subject and From lines of the email

notifications that gpcrondump sends after a backup completes for a

database. You specify the option --email-file with the location of a

YAML file that contains email Subject and From lines that gpcrondump

uses. See the section "File Format for Customized Emails" for

information about the format of the YAML file.


NOTE: The UNIX mail utility must be running on Greenplum Database host

and must be configured to allow the Greenplum superuser (gpadmin) to

send email.



**********************

Limitations

**********************


* NetBackup is not compatible with DDBoost. NetBackup and DDBoost cannot

  both be used in a single backup operation.


* An incremental backup set (a full backup and its associated incremental

  backups) must reside on a single device. For example, a backup set cannot

  have some backups on the local file system and others on a Data Domain

  system or a NetBackup system.



*****************************************************

OPTIONS

*****************************************************


-a (do not prompt)


 Do not prompt the user for confirmation.



-b (bypass disk space check)


 Bypass disk space check. The default is to check for available disk

 space, unless --ddboost is specified. When using Data Domain Boost, this

 option is always enabled.


 Note: Bypassing the disk space check generates a warning message. With a

 warning message, the return code for gpcrondump is 1 if the dump is

 successful. (If the dump fails, the return code is 2, in all cases.)



-B <parallel_processes>


 The number of segments to check in parallel for pre/post-dump

 validation. If not specified, the utility will start up to 60 parallel

 processes depending on how many segment instances it needs to dump.



-c (clear old dump files first)


 Specify this option to delete old backups before performing a backup.

 In the db_dumps directory, the directory whose name is the oldest date

 is deleted. If the directory name is the current date, the directory is

 not deleted. The default is to not delete old backup files.


 The deleted directory might contain files from one or more backups.


 WARNING: Before using this option, ensure that incremental backups

 required to perform the restore are not deleted. The gpdbrestore utility

 option --list-backup lists the backup sets required to perform a restore.


 You can specify the option --cleanup-date or --cleanup-total to specify

 backup sets to delete.


 If --ddboost is specified, only the old files on Data Domain Boost are

 deleted.


 This option is not supported with the -u option.



-C (clean catalog before restore)


 Clean out the catalog schema prior to restoring database objects.

 gpcrondump adds the DROP command to the SQL script files when creating

 the backup files. When the script files are used by the gpdbrestore

 utility to restore database objects, the DROP commands remove existing

 database objects before restoring them.


 If --incremental is specified and the files are on NFS storage, the -C

 option is not supported. The database objects are not dropped if the -C

 option is specified.



--cleanup-date=yyyymmdd


 Remove the backup sets created on the specified date. The date format is yyyymmdd.

 If multiple backup sets were created on the date, all the backup sets

 for that date are deleted. If no backup sets are found, gpcrondump

 returns a warning message and no backup sets are deleted. If the -c

 option is specified, the backup process continues.


 Valid only with the -c or -o option.


 WARNING: Before using this option, ensure that incremental backups

 required to perform the restore are not deleted. The gpdbrestore utility

 option --list-backup lists the backup sets required to perform a restore.



--cleanup-total=n


 Remove the n oldest backup sets based on the backup timestamp. If there

 are fewer than n backup sets, gpcrondump returns a warning message and

 no backup sets are deleted. If the -c option is specified, the backup

 process continues.


 Valid only with the -c or -o option.


 WARNING: Before using this option, ensure that incremental backups

 required to perform the restore are not deleted. The gpdbrestore utility

 option --list-backup lists the backup sets required to perform a restore.



--column-inserts


 Dump data as INSERT commands with column names.


 If --incremental is specified, this option is not supported.



-d <master_data_directory>


 The master host data directory. If not specified, the value set for

 $MASTER_DATA_DIRECTORY will be used.



--ddboost [--replicate --max-streams <max_IO_streams>

  [--ddboost-skip-ping] ]


 Use Data Domain Boost for this backup. Before using Data Domain Boost,

 set up the Data Domain Boost credential, as described in the next option

 below.


 The following option is recommended if --ddboost is specified.


 * -z option (uncompressed)

    Backup compression (turned on by default) should be turned off

    with the -z option. Data Domain Boost will deduplicate and

    compress the backup data before sending it to the Data Domain system.


 --replicate --max-streams <max_IO_streams> is optional. If you specify

 this option, gpcrondump replicates the backup on the remote Data Domain

 server after the backup is complete on the primary Data Domain server.

 <max_IO_streams> specifies the maximum number of Data Domain I/O streams

 that can be used when replicating the backup set on the remote Data

 Domain server from the primary Data Domain server.


 You can use gpmfr to replicate a backup if replicating a backup with

 gpcrondump takes a long time and prevents other backups from occurring.

 Only one instance of gpcrondump can be running at a time. While

 gpcrondump is being used to replicate a backup, it cannot be used to

 create a backup.


 You can run a mixed backup that writes to both a local disk and Data

 Domain. If you want to use a backup directory on your local disk other

 than the default, use the -u option. Mixed backups are not supported

 with incremental backups. For more information about mixed backups and

 Data Domain Boost, see "Backing Up and Restoring Databases" in the

 "Greenplum Database Administrator Guide."


 IMPORTANT: Never use the Greenplum Database default backup options with

 Data Domain Boost.


 To maximize Data Domain deduplication benefits, retain at least 30 days

 of backups.


 NOTE: The -b, -c, -f, -G, -g, -R, and -u options change if --ddboost is

 specified. See the options for details.


 The DDBoost backup options are not supported if the NetBackup options

 are specified.



--ddboost-host <ddboost_hostname>

  [--ddboost-host <ddboost_hostname> ... ]

  --ddboost-user <ddboost_user>

  --ddboost-backupdir <backup_directory>

  [--ddboost-remote] [--ddboost-skip-ping]


 Sets the Data Domain Boost credentials. Do not combine these options with

 any other gpcrondump options. Do not enter just one part of this option.


 <ddboost_hostname> is the IP address (or the hostname associated with the IP)

 of the host. There is a 30-character limit. If you use two or more

 network connections to connect to the Data Domain system, specify each

 connection with the --ddboost-host option.


 <ddboost_user> is the Data Domain Boost user name. There is a 30-character

 limit.


 <backup_directory> is the location for the backup files, configuration

 files, and global objects on the Data Domain system. The location on the

 system is GPDB/<backup_directory>.


 --ddboost-remote is optional. Indicates that the configuration

 parameters are for the remote Data Domain system that is used for backup

 replication, that is, Data Domain Boost managed file replication.


 Example:

   gpcrondump --ddboost-host 172.28.8.230 --ddboost-user

     ddboostusername --ddboost-backupdir gp_production


 After running gpcrondump with these options, the system verifies the

 limits on the host and user names and prompts for the Data Domain Boost

 password. Enter the password when prompted; the password is not echoed

 on the screen. There is a 40-character limit on the password that can

 include lowercase letters (a-z), uppercase letters (A-Z), numbers (0-9),

 and special characters ($, %, #, +, etc.).


 The system verifies the password. After the password is verified, the

 system creates encrypted DDBOOST_CONFIG files in the user's home

 directory.


 In the example, the --ddboost-backupdir option specifies the backup

 directory gp_production in the Data Domain Storage Unit GPDB.


 NOTE: If there is more than one operating system user using Data Domain

 Boost for backup and restore operations, repeat this configuration

 process for each of those users.


 IMPORTANT: Set up the Data Domain Boost credential before running any

 Data Domain Boost backups with the --ddboost option, described above.



--ddboost-config-remove


 Removes all Data Domain Boost credentials from the master and all

 segments on the system. Do not enter this option with any other

 gpcrondump option.



--ddboost-skip-ping


 Specify this option to skip the ping of a Data Domain system. When

 working with a Data Domain system, ping is used to ensure that the Data

 Domain system is reachable. If the Data Domain system is configured to

 block ICMP ping probes, specify this option.



--dump-stats


 Dump optimizer statistics from pg_statistic. Statistics are dumped in the

 master data directory to db_dumps/YYYYMMDD/gp_statistics_1_1_<timestamp>.


 If --ddboost is specified, the backup is located on the default storage

 unit in the directory specified by --ddboost-backupdir when the Data

 Domain Boost credentials were set.



-E <encoding>


 Character set encoding of dumped data. Defaults to the encoding of the

 database being dumped. See the Greenplum Database Reference Guide for

 the list of supported character sets.



--email-file <path_to_file>


 Specify the fully-qualified location of the YAML file that contains the

 customized Subject and From lines that are used when gpcrondump sends

 notification emails about a database backup.


 See the section "File Format for Customized Emails" for information

 about the format of the YAML file.



-f <free_space_percent>


 When checking that there is enough free disk space to create the dump

 files, specifies a percentage of free disk space that should remain

 after the dump completes. The default is 10 percent.


 NOTE: This option is not supported if --ddboost or --incremental is

 specified.



-g (copy config files)


 Secure a copy of the master and segment configuration files

 postgresql.conf, pg_ident.conf, and pg_hba.conf. These configuration

 files are dumped in the master or segment data directory to

 db_dumps/YYYYMMDD/config_files_<timestamp>.tar.


 If --ddboost is specified, the backup is located on the default storage

 unit in the directory specified by --ddboost-backupdir when the Data

 Domain Boost credentials were set.



-G (dump global objects)


 Use pg_dumpall to dump global objects such as roles and tablespaces.

 Global objects are dumped in the master data directory to

 db_dumps/YYYYMMDD/gp_global_1_1_<timestamp>.


 If --ddboost is specified, the backup is located on the default storage

 unit in the directory specified by --ddboost-backupdir when the Data

 Domain Boost credentials were set.



-h (record dump details)


 Record details of the database dump in the table

 public.gpcrondump_history in the database supplied via the -x option. The

 utility creates the table if it does not already exist.



--incremental (backup changes to append-optimized tables)


 Adds an incremental backup to a backup set. When performing an

 incremental backup, the complete backup set created prior to the

 incremental backup must be available. The complete backup set includes

 the following backup files:


 * The last full backup before the current incremental backup


 * All incremental backups created between the time of the full backup

   and the current incremental backup


 An incremental backup is similar to a full backup except for

 append-optimized tables, including column-oriented tables. An

 append-optimized table is backed up only if at least one of the

 following operations was performed on the table after the last backup.

   ALTER TABLE

   INSERT

   UPDATE

   DELETE

   TRUNCATE

   DROP and then re-create the table


 For partitioned append-optimized tables, only the changed table

 partitions are backed up.


 The -u option must be used consistently within a backup set that

 includes full and incremental backups. If you use the -u option with a

 full backup, you must use the -u option when you create incremental

 backups that are part of the backup set that includes the full backup.


 You can create an incremental backup for a full backup of a set of

 database tables. When you create the full backup, specify the --prefix

 option to identify the backup. To include a set of tables in the full

 backup, use either the -t option or --table-file option. To exclude a

 set of tables, use either the -T option or the --exclude-table-file

 option. See the description of the option for more information on its

 use.


 To create an incremental backup based on the full backup of the set of

 tables, specify the option --incremental and the --prefix option with

 the string specified when creating the full backup. The incremental

 backup is limited to only the tables in the full backup.


 WARNING: gpcrondump does not check for available disk space prior to

 performing an incremental backup.


 IMPORTANT: An incremental backup set, a full backup and associated

 incremental backups, must be on a single device. For example, the

 backups in a backup set must all be on a file system or must all be on a

 Data Domain system.



--inserts


 Dump data as INSERT, rather than COPY commands.


 If --incremental is specified, this option is not supported.



-j (vacuum before dump)


 Run VACUUM before the dump starts.



-K <timestamp> [--list-backup-files]


 Specify the <timestamp> that is used when creating a backup. The

 <timestamp> is a 14-digit string that specifies a date and time in the

 format yyyymmddhhmmss. The date is used for the backup directory name. The

 date and time are used in the backup file names. If -K <timestamp> is not

 specified, a timestamp is generated based on the system time.


 When adding a backup to a set of backups, gpcrondump returns an error if

 the <timestamp> does not specify a date and time that is more recent

 than all other backups in the set.


 --list-backup-files is optional. When you specify both this option and

 the -K <timestamp> option, gpcrondump does not perform a backup.

 gpcrondump creates two text files that contain the names of the files

 that will be created when gpcrondump backs up a Greenplum database. The

 text files are created in the same location as the backup files.


 The file names use the timestamp specified by the -K <timestamp> option

 and have the suffix _pipes and _regular_files. For example:


  gp_dump_20130514093000_pipes

  gp_dump_20130514093000_regular_files


 The _pipes file contains a list of file names that can be created as

 named pipes. When gpcrondump performs a backup, the backup data is

 written to the named pipes. The _regular_files file contains a list

 of backup files that must remain regular files. gpcrondump and

 gpdbrestore use the information in the regular files during backup and

 restore operations. To back up a complete set of Greenplum Database

 backup files, the files listed in the _regular_files file must also be

 backed up after the completion of the backup job.


 To use named pipes for a backup, you need to create the named pipes on

 all the Greenplum Database hosts and make them writeable before running

 gpcrondump.


 If --ddboost is specified, -K <timestamp> [--list-backup-files] is not

 supported.
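
Preparing the pipes can be sketched as below. The stand-in list under /tmp
is for illustration only; on a real cluster you would loop over the
gp_dump_<timestamp>_pipes file produced on each host:

```shell
# Create the named pipes listed in a _pipes file before running gpcrondump.
# The paths below are demo placeholders, not real Greenplum dump locations.
PIPES_FILE=/tmp/gp_dump_demo_pipes
printf '%s\n' /tmp/gp_demo_dumps/gp_dump_0_2_demo > "$PIPES_FILE"

while IFS= read -r path; do
  mkdir -p "$(dirname "$path")"      # ensure the dump directory exists
  [ -p "$path" ] || mkfifo "$path"   # create the named pipe if missing
  chmod u+w "$path"                  # gpcrondump must be able to write to it
done < "$PIPES_FILE"
```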



-k (vacuum after dump)


 Run VACUUM after the dump has completed successfully.



-l <logfile_directory>


 The directory to write the log file. Defaults to ~/gpAdminLogs.



--netbackup-block-size <size>


 Specify the block size of data being transferred to the Symantec

 NetBackup server. The default is 512 bytes. NetBackup options are not

 supported if DDBoost backup options are specified.



--netbackup-keyword <keyword>


 Specify a keyword for the backup that is transferred to the Symantec

 NetBackup server. NetBackup adds the keyword property and the specified

 <keyword> value to the NetBackup .img files that are created for the

 backup.


 The minimum length is 1 character, and the maximum length is 100

 characters.


 NetBackup options are not supported if DDBoost backup options are

 specified.



--netbackup-service-host <netbackup_server>


 The NetBackup master server that Greenplum Database connects to when

 backing up to NetBackup. NetBackup options are not supported if DDBoost

 backup options are specified.



--netbackup-policy <netbackup_policy>


 The name of the NetBackup policy created for backing up Greenplum

 Database. NetBackup options are not supported if DDBoost backup options

 are specified.



--netbackup-schedule <netbackup_schedule>


 The name of the NetBackup schedule created for backing up Greenplum

 Database. NetBackup options are not supported if DDBoost backup options

 are specified.



--no-owner


 Do not output commands to set object ownership.



--no-privileges


 Do not output commands to set object privileges (GRANT/REVOKE commands).



-o (clear old dump files only)


 Clear out old dump files only, but do not run a dump. This will remove

 the oldest dump directory except the current date's dump directory.

 All dump sets within that directory will be removed.


 WARNING: Before using this option, ensure that incremental backups

 required to perform the restore are not deleted. The gpdbrestore utility

 option --list-backup lists the backup sets required to perform a

 restore.


 You can specify the option --cleanup-date or --cleanup-total to specify

 backup sets to delete.


 If --ddboost is specified, only the old files on Data Domain Boost are

 deleted.


 If --incremental is specified, this option is not supported.



--oids


 Include object identifiers (oid) in dump data.


 If --incremental is specified, this option is not supported.



--prefix <prefix_string> [--list-filter-tables ]


 Prepends <prefix_string> followed by an underscore character (_) to the

 names of all the backup files created during a backup.


 --list-filter-tables is optional. When you specify both options,

 gpcrondump does not perform a backup. For the full backup created by

 gpcrondump that is identified by the <prefix-string>, the tables that were

 included or excluded for the backup are listed. You must also specify

 the --incremental option if you specify the --list-filter-tables option.


 If --ddboost is specified, --prefix <prefix_string> [--list-filter-tables]

 is not supported.



-q (no screen output)


 Run in quiet mode. Command output is not displayed on the screen, but is

 still written to the log file.



-r (rollback on failure)


 Roll back the dump files (delete a partial dump) if a failure is

 detected. The default is not to roll back.


 Note: This option is not supported if --ddboost is specified.



-R <post_dump_script>


 The absolute path of a script to run after a successful dump operation.

 For example, you might want a script that moves completed dump files to

 a backup host. This script must reside in the same location on the

 master and all segment hosts.



--rsyncable


 Passes the --rsyncable flag to the gzip utility to synchronize the

 output occasionally, based on the input during compression. This

 synchronization increases the file size by less than 1% in most cases.

 When this flag is passed, the rsync(1) program can synchronize

 compressed files much more efficiently. The gunzip utility cannot

 differentiate between a compressed file created with this option, and

 one created without it.
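
The effect can be sketched directly with gzip, assuming a gzip build that
supports the --rsyncable flag (most Linux distributions ship one); the file
paths are placeholders:

```shell
# What gpcrondump passes through to gzip when --rsyncable is specified.
seq 1 1000 > /tmp/demo_dump.sql
gzip --rsyncable -c /tmp/demo_dump.sql > /tmp/demo_dump.sql.gz

# gunzip cannot tell the difference; the file decompresses normally.
gunzip -c /tmp/demo_dump.sql.gz | wc -l
```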


-s <schema_name>


 Dump all the tables that are qualified by the specified schema in the

 database. The -s option can be specified multiple times. System catalog

 schemas are not supported. If you want to specify multiple schemas, you

 can also use the --schema-file=<filename> option in order not to exceed

 the maximum token limit.


 Only a set of tables or set of schemas can be specified. For example,

 the -s option cannot be specified with the -t option.


 If --incremental is specified, this option is not supported.
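
Building a list file for --schema-file instead of repeating -s can be
sketched as below; the schema names and file path are placeholders:

```shell
# One schema name per line, as --schema-file expects.
SCHEMA_FILE=/tmp/schemas_to_dump.txt
printf '%s\n' sales finance hr > "$SCHEMA_FILE"

# Then, on a real cluster:
#   gpcrondump -x mydatabase --schema-file=/tmp/schemas_to_dump.txt
```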



-S <schema_name>


 A schema name to exclude from the database dump. The -S option can be

 specified multiple times. If you want to specify multiple schemas, you

 can also use the --exclude-schema-file=<filename> option in order not to

 exceed the maximum token limit.


 Only a set of tables or set of schemas can be specified. For example,

 this option cannot be specified with the -t option.


 If --incremental is specified, this option is not supported.



-t <schema>.<table>


 Dump only the named table in this database. The -t option can be

 specified multiple times. If you want to specify multiple tables, you

 can also use the --table-file=<filename> option in order not to exceed the

 maximum token limit.


 Only a set of tables or set of schemas can be specified. For example,

 this option cannot be specified with the -s option.


 If --incremental is specified, this option is not supported.



-T <schema>.<table>


 A table name to exclude from the database dump. The -T option can be

 specified multiple times. If you want to specify multiple tables, you

 can also use the --exclude-table-file=<filename> option in order not to

 exceed the maximum token limit.


 Only a set of tables or set of schemas can be specified. For example,

 this option cannot be specified with the -s option.


 If --incremental is specified, this option is not supported.



--exclude-schema-file=<filename>


 Excludes all the tables that are qualified by the specified schemas

 listed in the <filename> from the database dump. The file <filename>

 contains any number of schemas, listed one per line.


 Only a set of tables or set of schemas can be specified. For example,

 this option cannot be specified with the -t option.


 If --incremental is specified, this option is not supported.



--exclude-table-file=<filename>


 Excludes all tables listed in the <filename> from the database dump. The

 file <filename> contains any number of tables, listed one per line.


 Only a set of tables or set of schemas can be specified. For example,

 this option cannot be specified with the -s option.


 If --incremental is specified, this option is not supported.



--schema-file=<filename>


 Dumps only the tables that are qualified by the schemas listed in the

 <filename>. The file <filename> contains any number of schemas, listed one

 per line.


 Only a set of tables or set of schemas can be specified. For example,

 this option cannot be specified with the -t option.


 If --incremental is specified, this option is not supported.



--table-file=<filename>


 Dumps only the tables listed in the <filename>. The file <filename> contains

 any number of tables, listed one per line.


 If --incremental is specified, this option is not supported.



-u <backup_directory>


 Specifies the absolute path where the backup files will be placed on

 each host. If the path does not exist, it will be created, if possible.

 If not specified, defaults to the data directory of each instance to be

 backed up. Using this option may be desirable if each segment host has

 multiple segment instances as it will create the dump files in a

 centralized location rather than the segment data directories.


 Note: This option is not supported if --ddboost is specified.



--use-set-session-authorization


 Use SET SESSION AUTHORIZATION commands instead of ALTER OWNER commands

 to set object ownership.



-v | --verbose


 Specifies verbose mode.



--version (show utility version)


 Displays the version of this utility.



-x <database_name>


 Required. The name of the Greenplum database to dump. Specify multiple times for

 multiple databases.



-y <reportfile>


 This option is deprecated and will be removed in a future release. If

 specified, a warning message is returned stating that the -y option is

 deprecated.


 Specifies the full path name where a copy of the backup job log file is

 placed on the master host. The job log file is created in the master

 data directory or, if running remotely, in the current working directory.



-z (no compression)


 Do not use compression. Default is to compress the dump files using

 gzip.


 We recommend this option (-z) be used for NFS and Data Domain Boost

 backups.



-? (help)


 Displays the online help.



*****************************************************

File Format for Customized Emails

*****************************************************


You can configure gpcrondump to send an email notification after a

backup operation completes for a database. To customize the From and Subject

lines of the email that are sent for a database, you create a YAML file

and specify the location of the file with the option --email-file. In

the YAML file, you can specify a different From and Subject line for

each database that gpcrondump backs up. This is the format of the YAML

file to specify a custom From and Subject line for a database:


EMAIL_DETAILS:

    -

        DBNAME: <database_name>

        FROM: <from_user>

        SUBJECT: <subject_text>



When email notification is configured for gpcrondump, the <from_user> and

the <subject_text> are the strings that gpcrondump uses in the email

notification after completing the backup for <database_name>.


This example YAML file specifies different From and Subject lines for

the databases testdb100 and testdb200.


EMAIL_DETAILS:

    -

        DBNAME: testdb100

        FROM: RRP_MPE2_DCA_1

        SUBJECT: backup completed for Database 'testdb100'


    -

        DBNAME: testdb200

        FROM: Report_from_DCDDEV_host

        SUBJECT: Completed backup for database 'testdb200'



*****************************************************

EXAMPLES

*****************************************************


Call gpcrondump directly and dump mydatabase (and global objects):


  gpcrondump -x mydatabase -c -g -G



A crontab entry that runs a backup of the sales database (and global

objects) nightly at one minute past midnight:


  01 0 * * * /home/gpadmin/gpdump.sh >> gpdump.log



The content of dump script gpdump.sh is:


  #!/bin/bash

  export GPHOME=/usr/local/greenplum-db

  export MASTER_DATA_DIRECTORY=/data/gpdb_p1/gp-1

  . $GPHOME/greenplum_path.sh

  gpcrondump -x sales -c -g -G -a -q



This example creates two text files, one with the suffix _pipes and the

other with _regular_files. The _pipes file contains the file names that

can be created as named pipes when you back up the Greenplum database mytestdb.


  gpcrondump -x mytestdb -K 20131030140000 --list-backup-files



To use incremental backup with a set of database tables, you must

create a full backup of the set of tables and specify the --prefix

option to identify the backup set. The following example uses the

--table-file option to create a full backup of the set of files listed

in the file user-tables. The prefix user_backup identifies the backup

set.


  gpcrondump -x mydatabase --table-file=user-tables --prefix user_backup



To create an incremental backup for the full backup created in the

previous example, specify the --incremental option and the option

--prefix user_backup to identify the backup set. This example creates an

incremental backup.


  gpcrondump -x mydatabase --incremental --prefix user_backup



This command lists the tables that were included or excluded for the

full backup.


  gpcrondump -x mydatabase --incremental --prefix user_backup --list-filter-tables



This command backs up the database customer and specifies a NetBackup

policy and schedule that are defined on the NetBackup master server

nbu_server1. A block size of 1024 bytes is used to transfer data to the

NetBackup server.


  gpcrondump -x customer --netbackup-service-host=nbu_server1

    --netbackup-policy=gpdb_cust --netbackup-schedule=gpdb_backup

    --netbackup-block-size=1024



*****************************************************

SEE ALSO

*****************************************************


gpdbrestore

[gpadmin@mdw ~]$



===============================================================================================================================


[gpadmin@mdw ~]$ gpdbrestore --help

COMMAND NAME: gpdbrestore


Restores a database from a set of dump files generated by gpcrondump.


*****************************************************

SYNOPSIS

*****************************************************


gpdbrestore { -t <timestamp_key> { [-L]

              | [--netbackup-service-host <netbackup_server>

                [--netbackup-block-size <size>] ] }

              | -b <YYYYMMDD>

              | -R <hostname>:<path_to_dumpset>

              | -s <database_name> }

     [--noplan]  [--noanalyze]  [-u <backup_directory>] [--list-backup]

     [--prefix <prefix_string> ] [--report-status-dir <report_directory> ]

     [-T <schema>.<table> [-T ...]] [--table-file <file_name>]

     [--truncate] [-e] [-G]

     [-B <parallel_processes>]

     [-d <master_data_directory>] [-a] [-q] [-l <logfile_directory>]

     [-v] [--ddboost ]

     [-S <schema_name> [-S ...]]

     [--redirect <database_name> ]

     [--change-schema=<schema_name> ]


gpdbrestore -?


gpdbrestore --version



*****************************************************

DESCRIPTION

*****************************************************


The gpdbrestore utility recreates the data definitions (schema) and userdata in a Greenplum database using the script files created by

gpcrondump operations.


When you restore from an incremental backup, the gpdbrestore utility assumes the complete backup set is available. The complete backup set

includes the following backup files:


* The last full backup before the specified incremental backup


* All incremental backups created between the time of the full backup

  and the specified incremental backup


The gpdbrestore utility provides the following functionality:


* Automatically reconfigures for compression.


* Validates that the number of dump files is correct (for primary only,

  mirror only, primary and mirror, or a subset consisting of some mirror

  and primary segment dump files).


* If a failed segment is detected, restores to active segment instances.


* Except when restoring data from a NetBackup server, you do not need to

  know the complete timestamp key (-t) of the backup set to restore.

  Additional options are provided to instead give just a date (-b), backup

  set directory location (-R), or database name (-s) to restore.


* The -R option allows you to restore from a backup set located

  on a host outside of the Greenplum Database array (archive host).

  It ensures that the correct dump file goes to the correct segment instance.


* Identifies the database name automatically from the backup set.


* Allows you to restore particular tables only (-T option) instead of

  the entire database. Note that single tables are not automatically

  dropped or truncated prior to restore.


  Performs an ANALYZE operation on the tables that are restored. You can

  disable the ANALYZE operation by specifying the option --noanalyze.



* Can restore global objects such as roles and tablespaces (-G option).


* Detects if the backup set is primary segments only or primary and

  mirror segments and passes the appropriate options to gp_restore.


* Allows you to drop the target database before a restore in a single

  operation.



*****************************************************

Restoring a Database from NetBackup

*****************************************************


Greenplum Database must be configured to communicate with the Symantec

NetBackup master server that is used to restore database data. See

"Backing Up and Restoring Databases" in the "Greenplum Database

Administrator Guide" for information about configuring Greenplum

Database and NetBackup.


When restoring from a NetBackup server, you must specify the timestamp of

the backup with the -t option.


NetBackup is not compatible with DDBoost; the two cannot be used in a

single backup operation.



*****************************************************

Restoring a Database with Named Pipes

*****************************************************


If you used named pipes when you backed up a database with gpcrondump,

named pipes with the backup data must be available when restoring the

database from the backup.
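
The *_pipes file produced by gpcrondump --list-backup-files (see the -K example above) lists the paths that may be created as named pipes. A minimal sketch of preparing those pipes before a restore — the directory, file name, and listed paths below are stand-ins, not real gpcrondump output:

```shell
# Demo setup: a tiny stand-in for the *_pipes file that
# "gpcrondump --list-backup-files" writes (real names differ).
workdir=$(mktemp -d)
cat > "$workdir/backup_pipes" <<EOF
$workdir/db_dumps/20131030/gp_dump_0_2_20131030140000
$workdir/db_dumps/20131030/gp_dump_0_3_20131030140000
EOF

# Create each listed path as a named pipe before starting the restore.
while IFS= read -r p; do
  mkdir -p "$(dirname "$p")"
  [ -p "$p" ] || mkfifo "$p"
done < "$workdir/backup_pipes"

ls -l "$workdir/db_dumps/20131030"
```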



*****************************************************

Error Reporting

*****************************************************


gpdbrestore does not report errors automatically. After the restore is

completed, check the report status files to verify that there are no

errors. The restore status files are stored in the db_dumps/<date>/

directory by default.



*****************************************************

OPTIONS

*****************************************************

-a (do not prompt)


 Do not prompt the user for confirmation.



-b <YYYYMMDD>


 Looks for dump files in the segment data directories on the Greenplum

 Database array of hosts in db_dumps/<YYYYMMDD>. If --ddboost is specified,

 the system looks for dump files on the Data Domain Boost host.



-B <parallel_processes>


 The number of segments to check in parallel for pre/post-restore

 validation. If not specified, the utility will start up to 60 parallel

 processes depending on how many segment instances it needs to restore.



--change-schema=schema-name


 Optional. Restores tables from a backup created with gpcrondump to a

 different schema. The specified schema must exist in the database.

 If the schema does not exist, the utility returns an error. System

 catalog schemas are not supported.


 You must specify tables to restore with the -T or --table-file option.

 If a table that is being restored exists in schema-name, the utility

 returns a warning and attempts to append the data to the table from the

 backup. You can specify the --truncate option to truncate table data

 before restoring data to the table from the backup.



-d <master_data_directory>


 Optional. The master host data directory. If not specified, the value

 set for $MASTER_DATA_DIRECTORY will be used.



--ddboost


 Use Data Domain Boost for this restore, if the --ddboost option was

 passed when the data was dumped. Before using Data Domain Boost, make

 sure the one-time Data Domain Boost credential setup is complete. See

 the Greenplum Database System Administrator Guide for details.


 If you backed up Greenplum Database configuration files with the

 gpcrondump option -g and specified the --ddboost option, you must

 manually restore the backup from the Data Domain system. The

 configuration files must be restored for the Greenplum Database master

 and all the hosts and segments. The backup location on the Data Domain

 system is the directory GPDB/<backup_directory>/<date>. The

 <backup_directory> is set when you specify the Data Domain credentials

 with gpcrondump.


 This option is not supported if --netbackup-service-host is specified.



-e (drop target database before restore)


 Drops the target database before doing the restore and then recreates

 it.



-G [include|only]


 Restores global objects such as roles and tablespaces if the global

 object dump file db_dumps/<date>/gp_global_1_1_<timestamp> is found in

 the master data directory.


 Specify either "-G only" to only restore the global objects dump file

 or "-G include" to restore global objects along with a normal restore.

 Defaults to "include" if neither argument is provided.



-l <logfile_directory>


 The directory to write the log file. Defaults to ~/gpAdminLogs.



--list-backup


 Lists the set of full and incremental backup sets required to perform a

 restore based on the <timestamp_key> specified with the -t option and the

 location of the backup set.


 This option is supported only if the <timestamp_key> is for an incremental

 backup.



-L (list tablenames in backup set)


 When used with the -t option, lists the table names that exist in the

 named backup set and exits. Does not perform a restore.



-m (restore metadata only)


 Performs a restore of database metadata (schema and table definitions, SET

 statements, and so forth) without restoring data.  If the --restore-stats or

 -G options are provided as well, statistics or globals will also be restored.


 The --noplan and --noanalyze options are not supported in conjunction with

 this option, as they affect the restoration of data and no data is restored.



--netbackup-block-size <size>


 Specify the block size, in bytes, of data being transferred from the

 Symantec NetBackup server. The default is 512 bytes.


 NetBackup options are not supported if DDBoost backup options are

 specified.



--netbackup-service-host <netbackup_server>


 The NetBackup master server that Greenplum Database connects to when

 backing up to NetBackup. If you specify this option, you must specify

 the timestamp of the backup with the -t option.


 This option is not supported with any of these options:

 -R, -s, -b, -L, or --ddboost.



--noanalyze


 The ANALYZE command is not run after a successful restore. The default

 is to run the ANALYZE command on restored tables. This option is useful

 if running ANALYZE on the tables requires a significant amount of time.


 If this option is specified, you should run ANALYZE manually on

 restored tables. Failure to run ANALYZE following a restore might result

 in poor database performance.



--noplan


 Restores only the data backed up during the incremental backup specified

 by the timestamp_key. No other data from the complete backup set is

 restored. The full backup set containing the incremental backup must be

 available.


 If the timestamp_key specified with the -t option does not reference an

 incremental backup, an error is returned.



--prefix <prefix_string>


 If you specified the gpcrondump option --prefix <prefix_string> to create

 the backup, you must specify this option with the <prefix_string> when

 restoring the backup.


 If you created a full backup of a set of tables with gpcrondump and

 specified a prefix, you can use gpcrondump with the options

 --list-filter-tables and --prefix <prefix_string> to list the tables

 that were included or excluded for the backup.



-q (no screen output)


 Run in quiet mode. Command output is not displayed on the screen, but is

 still written to the log file.



-R <hostname>:<path_to_dumpset>


 Allows you to provide a hostname and full path to a set of dump files.

 The host does not have to be in the Greenplum Database array of hosts,

 but must be accessible from the Greenplum master.



--redirect <database_name>


 The name of the database where the data is restored. Specify this option

 to restore data to a database that is different from the database

 specified during backup. If <database_name> does not exist, it is

 created.



--report-status-dir <report_directory>


 Specifies the absolute path to the directory on each Greenplum

 Database host (master and segment hosts) where gpdbrestore writes report

 status files for a restore operation. If <report_directory> does not exist

 or is not writable, gpdbrestore returns an error and stops.


 If this option is not specified and the -u option is specified, report

 status files are written to the location specified by the -u option if

 the -u location is writable. If the location specified by -u option is

 not writable, the report status files are written to segment data

 directories.



--restore-stats [include|only]


 Restores optimizer statistics if the statistics dump file

 db_dumps/<date>/gp_statistics_1_1_<timestamp> is found in the master data

 directory. Setting this option automatically skips the final analyze step,

 so it is not necessary to also set the --noanalyze flag in conjunction with

 this one.


 Specify "--restore-stats only" to only restore the statistics dump file or

 "--restore-stats include" to restore statistics along with a normal restore.

 Defaults to "include" if neither argument is provided.


 If "--restore-stats only" is specified along with the -e option, an error

 is returned.



-s <database_name>


 Looks for the latest set of dump files for the given database name in the

 segment data directories db_dumps directory on the Greenplum Database

 array of hosts.



-t <timestamp_key>


 The 14-digit timestamp key that uniquely identifies a backup set of data

 to restore. It is of the form YYYYMMDDHHMMSS. Looks for dump files

 matching this timestamp key in the segment data directories db_dumps

 directory on the Greenplum Database array of hosts.
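
The timestamp key is simply the backup's start time rendered as digits. A quick sketch of generating and shape-checking one in the shell:

```shell
# A timestamp key is the backup start time formatted as YYYYMMDDHHMMSS.
ts=$(date +%Y%m%d%H%M%S)

# Validate the shape: exactly 14 digits.
echo "$ts" | grep -Eq '^[0-9]{14}$' && echo "well-formed timestamp key: $ts"
```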



-T <schema>.<table_name>


 Table names to restore; specify the option multiple times for multiple tables. The

 named table(s) must exist in the backup set of the database being restored.

 Existing tables are not automatically truncated before data is restored

 from backup. If your intention is to replace existing data in the table

 from backup, truncate the table prior to running gpdbrestore -T.



-S <schema>


 Schema names to restore; specify the option multiple times for multiple schemas.

 Existing tables are not automatically truncated before data is restored

 from backup. If your intention is to replace existing data in the table

 from backup, truncate the table prior to running gpdbrestore -S.



--table-file <file_name>


 Specify a file <file_name> that contains a list of table names to restore.

 The file contains any number of table names, listed one per line. See

 the -T option for information about restoring specific tables.



--truncate


 Truncate table data before restoring data to the table from the backup.

 This option is supported only when restoring a set of tables with the

 option -T or --table-file.

 This option is not supported with the -e option.



-u <backup_directory>


 Specifies the absolute path to the directory containing the db_dumps

 directory on each host. If not specified, defaults to the data directory

 of each instance to be backed up. Specify this option if you specified a

 backup directory with the gpcrondump option -u when creating a backup

 set.


 If <backup_directory> is not writable, backup operation report status

 files are written to segment data directories. You can specify a

 different location where report status files are written with the

 --report-status-dir option.


 NOTE: This option is not supported if --ddboost is specified.



-v | --verbose


 Specifies verbose mode.



--version (show utility version)


 Displays the version of this utility.



-? (help)


 Displays the online help.



*****************************************************

EXAMPLES

*****************************************************


Restore the sales database from the latest backup files generated by

gpcrondump (assumes backup files are in the segment data directories in

db_dumps):


  gpdbrestore -s sales



Restore a database from backup files that reside on an archive host

outside the Greenplum Database array (command issued on the Greenplum

master host):


  gpdbrestore -R archivehostname:/data_p1/db_dumps/20080214



Restore global objects only (roles and tablespaces):


  gpdbrestore -G



NOTE: The -R option is not supported when restoring a backup set

that includes incremental backups.


If you restore from a backup set that contains an incremental backup, all the files in

the backup set must be available to gpdbrestore. For example, the following

timestamp keys specify a backup set. 20120514054532 is the full backup and the

others are incremental.


  20120514054532

  20120714095512

  20120914081205

  20121114064330

  20130114051246


The following gpdbrestore command specifies the timestamp key 20121114064330.

The incremental backups with the timestamps 20120714095512 and 20120914081205

and the full backup must be available to perform the restore.


  gpdbrestore -t 20121114064330



The following gpdbrestore command uses the --noplan option to restore only the

data that was backed up during the incremental backup with the timestamp key

20121114064330. Data in the previous incremental backups and the data in the full

backup are not restored.


  gpdbrestore -t 20121114064330 --noplan



This gpdbrestore command restores Greenplum Database data from the data

managed by NetBackup master server nbu_server1. The option -t

20130530090000 specifies the timestamp generated by gpcrondump when the

backup was created. The -e option specifies that the target database is

dropped before it is restored.


  gpdbrestore -t 20130530090000 -e  --netbackup-service-host=nbu_server1



*****************************************************

SEE ALSO

*****************************************************


gpcrondump

[gpadmin@mdw ~]$

How to copy the gpperfmon data from DCA1 to DCA2

posted Jan 15, 2016, 10:53 AM by Sachchida Ojha   [ updated May 20, 2016, 5:48 PM ]

Task: copy the gpperfmon V1 data from DCA1 (V1) to DCA2 (V2). Both DCAs are accessible from a commodity server.

Here are the steps. Since these tables are not big, we used the power of the COPY command.

1. Update .bashrc and create two aliases, psqldca1gpperfmon and psqldca2gpperfmon.

2. Connect to DCA2 (V2) and create the V1 tables you want to move. For example:

create table system_history_v1 as select * from system_history where 1=2;

create table database_history_v1 as select * from database_history where 1=2;

create table emcconnect_history_v1 as select * from emcconnect_history where 1=2;

create table queries_history_v1 as select * from queries_history where 1=2;

create table log_alert_history_v1 as select * from log_alert_history where 1=2;

create table diskspace_history_v1 as select * from diskspace_history where 1=2;

 

 3. Connect to DCA1 (V1) and execute the following commands:

 

 psqldca1gpperfmon -c "copy public.system_history to stdout " | psqldca2gpperfmon -c "copy system_history_v1 from stdin";

 psqldca1gpperfmon -c "copy public.queries_history  to stdout " | psqldca2gpperfmon -c "copy queries_history_v1 from stdin";

 psqldca1gpperfmon -c "copy public.database_history to stdout " | psqldca2gpperfmon -c "copy database_history_v1 from stdin";

 psqldca1gpperfmon -c "copy public.emcconnect_history to stdout " | psqldca2gpperfmon -c "copy emcconnect_history_v1 from stdin";

 psqldca1gpperfmon -c "copy public.log_alert_history to stdout " | psqldca2gpperfmon -c "copy log_alert_history_v1 from stdin";

 psqldca1gpperfmon -c "copy public.diskspace_history to stdout " | psqldca2gpperfmon -c "copy diskspace_history_v1 from stdin";
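
The step 1 aliases are not shown above; a minimal sketch of what they might look like in ~/.bashrc — the master hostnames and port here are assumptions:

```shell
# Hypothetical master hostnames and port for the two DCAs.
alias psqldca1gpperfmon='psql -h dca1-mdw -p 5432 -U gpadmin gpperfmon'
alias psqldca2gpperfmon='psql -h dca2-mdw -p 5432 -U gpadmin gpperfmon'
```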



Note: The same procedure can be used to copy small tables from one database to another on the same server or a different server. DO NOT use this approach for large tables; use external tables, parallel backup/restore, or the gptransfer utility for large tables.


Dump files created during (gp_dump) greenplum parallel backup command

posted Jan 17, 2015, 12:37 PM by Sachchida Ojha   [ updated May 20, 2016, 5:49 PM ]

Note: gp_dump is deprecated in later versions; use gpcrondump.

Dump files on the master:
  gp_catalog_1_<dbid>_<timestamp>       System catalog tables
  gp_cdatabase_1_<dbid>_<timestamp>     CREATE DATABASE statement
  gp_dump_1_<dbid>_<timestamp>          Database schemas -- DDL to create schema and database objects
  gp_dump_status_1_<dbid>_<timestamp>   Log file

Dump files on each segment:
  gp_dump_0_<dbid>_<timestamp>          Data for the segment -- by default uses COPY statements
  gp_dump_status_0_<dbid>_<timestamp>   Log file

Note: each dump file name includes the timestamp key that identifies the backup set. A restore needs this timestamp to identify the backup set.

GPDBRESTORE Example 1- backup set of tables and restore it to a new database

posted Jan 11, 2015, 4:22 PM by Sachchida Ojha   [ updated May 20, 2016, 5:49 PM ]

a) Backup list of tables using gpcrondump and a specified location

[gpadmin@sachi ~]$ gpcrondump  -c --table-file="tablist" -u /disk4/backup
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Starting gpcrondump with args: -c --table-file=tablist -u /disk4/backup
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Setting dump database to value of $PGDATABASE which is sachi
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Master Greenplum Instance dump parameters
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump type                            = Single database, specific table
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Table file name                      = tablist
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Database to be dumped                = sachi
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump directory                       = /disk4/backup
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Master port                          = 5432
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Master data directory                = /home/gpmaster/gpsne-1
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Run post dump program                = Off
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Rollback dumps                       = Off
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump file compression                = On
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Clear old dump files                 = On
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Update history table                 = Off
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Secure config files                  = Off
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump global objects                  = Off
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Vacuum mode type                     = Off
20150111:17:23:47:022685 gpcrondump:sachi:gpadmin-[INFO]:-Ensuring remaining free disk         > 10

Continue with Greenplum dump Yy|Nn (default=N):
> y
20150111:17:23:49:022685 gpcrondump:sachi:gpadmin-[INFO]:-Directory /disk4/backup/db_dumps/20150111 exists
20150111:17:23:49:022685 gpcrondump:sachi:gpadmin-[INFO]:-Checked /disk4/backup on master
20150111:17:23:49:022685 gpcrondump:sachi:gpadmin-[INFO]:-Directory /home/gpmaster/gpsne-1/db_dumps/20150111 exists
20150111:17:23:49:022685 gpcrondump:sachi:gpadmin-[INFO]:-Checked /home/gpmaster/gpsne-1 on master
20150111:17:23:50:022685 gpcrondump:sachi:gpadmin-[INFO]:-Configuring for single-database, include-table dump
20150111:17:23:50:022685 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.DIM_TAXPAYER in sachi database
20150111:17:23:50:022685 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.Foo in sachi database
20150111:17:23:50:022685 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.a in sachi database
20150111:17:23:50:022685 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.abc in sachi database
20150111:17:23:50:022685 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.abctest in sachi database
20150111:17:23:51:022685 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.dynamic_analyze_config in sachi database
20150111:17:23:51:022685 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.h in sachi database
20150111:17:23:51:022685 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.i in sachi database
20150111:17:23:51:022685 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.sales in sachi database
20150111:17:23:51:022685 gpcrondump:sachi:gpadmin-[INFO]:-Adding compression parameter
20150111:17:23:51:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump command line gp_dump -p 5432 -U gpadmin --gp-d=/disk4/backup/db_dumps/20150111 --gp-r=/home/gpmaster/gpsne-1/db_dumps/20150111 --gp-s=p --gp-c sachi --table-file=tablist
20150111:17:23:51:022685 gpcrondump:sachi:gpadmin-[INFO]:-Starting dump process
20150111:17:24:02:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump process returned exit code 0
20150111:17:24:02:022685 gpcrondump:sachi:gpadmin-[INFO]:-Timestamp key = 20150111172357
20150111:17:24:02:022685 gpcrondump:sachi:gpadmin-[INFO]:-Checked master status file and master dump file.
20150111:17:24:02:022685 gpcrondump:sachi:gpadmin-[INFO]:-Preparing to remove dump 20150109 from all hosts
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[WARNING]:-Error encountered during deletion of /disk1/gpdata1/gpsne0/db_dumps/20150109 on sachi
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[WARNING]:-Error encountered during deletion of /disk2/gpdata2/gpsne1/db_dumps/20150109 on sachi
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump status report
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Target database                          = sachi
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump subdirectory                        = 20150111
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Clear old dump directories               = On
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Backup set deleted                       = 20150109
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump start time                          = 17:23:51
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump end time                            = 17:24:02
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Status                                   = COMPLETED
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump key                                 = 20150111172357
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Dump file compression                    = On
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Vacuum mode type                         = Off
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-Exit code zero, no warnings generated
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[WARNING]:-Found neither /usr/local/greenplum-db/./bin/mail_contacts nor /home/gpadmin/mail_contacts
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[WARNING]:-Unable to send dump email notification
20150111:17:24:03:022685 gpcrondump:sachi:gpadmin-[INFO]:-To enable email notification, create /usr/local/greenplum-db/./bin/mail_contacts or /home/gpadmin/mail_contacts containing required email addresses

b) Restore tables using gpdbrestore from a specified location and create a new database

[gpadmin@sachi ~]$ gpdbrestore -R sachi:/disk4/backup/db_dumps/20150111
20150111:17:31:13:023833 gpdbrestore:sachi:gpadmin-[INFO]:-Starting gpdbrestore with args: -R sachi:/disk4/backup/db_dumps/20150111
20150111:17:31:14:023833 gpdbrestore:sachi:gpadmin-[ERROR]:-gpdbrestore error: Database test1 does not exist and -e option not supplied

Solution:
1. Open gp_cdatabase_1_1_20150111172357 and change the database name to the new database name, test1.

[gpadmin@sachi 20150111]$ cat gp_cdatabase_1_1_20150111172357
--
-- Database creation
--
CREATE DATABASE test1 WITH TEMPLATE = template0 ENCODING = 'UTF8' OWNER = gpadmin;
[gpadmin@sachi 20150111]$ 

2. Use the -e option to create the database.

Now run the restore:
[gpadmin@sachi ~]$ gpdbrestore -e test1 -R sachi:/disk4/backup/db_dumps/20150111
20150111:17:32:57:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Starting gpdbrestore with args: -e test1 -R sachi:/disk4/backup/db_dumps/20150111
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-------------------------------------------
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Greenplum database restore parameters
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-------------------------------------------
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Restore type               = Full Database
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Database to be restored    = test1
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Drop and re-create db      = On
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Restore method             = Remote host
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Recovery hostname          = sachi
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Remote recovery path       = /disk4/backup/db_dumps/20150111
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Restore timestamp          = 20150111172357
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Restore compressed dump    = On
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Restore global objects     = Off
20150111:17:32:58:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Array fault tolerance      = n

Continue with Greenplum restore Yy|Nn (default=N):
> y
20150111:17:33:11:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Commencing remote database dump file recovery process, please wait...
20150111:17:33:11:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Commencing remote copy from sachi to sachi:/home/gpmaster/gpsne-1/db_dumps/20150111
20150111:17:33:11:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Creating directory /disk1/gpdata1/gpsne0/db_dumps/20150111 on sachi
20150111:17:33:11:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Commencing remote copy from sachi to sachi:/disk1/gpdata1/gpsne0/db_dumps/20150111
20150111:17:33:11:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Creating directory /disk2/gpdata2/gpsne1/db_dumps/20150111 on sachi
20150111:17:33:12:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Commencing remote copy from sachi to sachi:/disk2/gpdata2/gpsne1/db_dumps/20150111
20150111:17:33:13:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Invoking /home/gpmaster/gpsne-1/db_dumps/20150111/gp_cdatabase_1_1_20150111172357
20150111:17:33:34:023990 gpdbrestore:sachi:gpadmin-[INFO]:-gp_restore -i -h sachi -p 5432 -U gpadmin --gp-d=db_dumps/20150111 --gp-i --gp-k=20150111172357 --gp-r=db_dumps/20150111 --gp-l=p --gp-c -d test1
20150111:17:34:11:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Commencing analyze of test1 database, please wait
20150111:17:34:17:023990 gpdbrestore:sachi:gpadmin-[INFO]:-Analyze of test1 completed without error
[gpadmin@sachi ~]$ 



Exploring gpdbrestore command

posted Jan 11, 2015, 10:27 AM by Sachchida Ojha   [ updated May 20, 2016, 5:50 PM ]

Get the syntax

[gpadmin@sachi ~]$ gpdbrestore --help
COMMAND NAME: gpdbrestore

A wrapper utility around gp_restore. Restores a database from a set of dump files generated by gpcrondump.
*****************************************************
SYNOPSIS
*****************************************************
gpdbrestore { -t <timestamp_key> [-L] 
              | -b YYYYMMDD 
              | -R <hostname>:<path_to_dumpset> 
              | -s <database_name> } 
     [-T <schema>.<table> [,...]] [-e] [-G] [-B <parallel_processes>] 
     [-d <master_data_directory>] [-a] [-q] [-l <logfile_directory>] 
     [-v] [--ddboost]

gpdbrestore -? 

gpdbrestore --version
*****************************************************
DESCRIPTION
*****************************************************
gpdbrestore is a wrapper around gp_restore, which provides some convenience and flexibility in restoring from a set of backup files created by gpcrondump. This utility provides the following additional functionality on top of gp_restore:

1. Automatically reconfigures for compression. 

2. Validates that the number of dump files is correct (for primary only, mirror only, primary and mirror, or a subset consisting of some mirror and primary segment dump files). 

3. If a failed segment is detected, restores to active segment instances.

4. Does not require you to know the complete timestamp key (-t) of the backup set to restore. 
Additional options let you instead supply just a date (-b), a backup set directory location (-R), or a database name (-s).

5. The -R option allows you to restore from a backup set located on a host outside of the Greenplum Database array (archive host), and ensures that the correct dump file goes to the correct segment instance.

6. Identifies the database name automatically from the backup set.

7. Allows you to restore particular tables only (-T option) instead of the entire database. Note that single tables are not automatically dropped or truncated prior to restore.

8. Can restore global objects such as roles and tablespaces (-G option).

9. Detects if the backup set is primary segments only or primary and mirror segments and passes the appropriate options to gp_restore.

10. Allows you to drop the target database before a restore in a single operation. 

Error Reporting

gpdbrestore does not report errors automatically. After the restore is completed, check the report status files to verify that there are no errors. The restore status files are stored in the db_dumps/<date>/ directory by default. 
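As a sketch of that post-restore check, the function below scans a dump directory for error lines. The gp_restore_status_* file names follow the db_dumps/&lt;date&gt; layout described above, but the "ERROR" marker is an assumption about the status file contents; verify against your own files.

```shell
# Sketch: scan gpdbrestore status files in a dump directory for errors.
# Assumes status files match gp_restore_status_* and that problems are
# flagged with the string "ERROR".
check_restore_status() {
    status_dir="$1"
    if grep -l "ERROR" "$status_dir"/gp_restore_status_* >/dev/null 2>&1; then
        echo "errors found"
        return 1
    fi
    echo "clean"
}
```

Run it against the db_dumps/&lt;date&gt; directory on the master after a restore, for example: check_restore_status /home/gpmaster/gpsne-1/db_dumps/20150111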

*****************************************************
OPTIONS
*****************************************************
-a (do not prompt)
 Do not prompt the user for confirmation.

-b YYYYMMDD
 Looks for dump files in the segment data directories on the Greenplum Database array of hosts in db_dumps/YYYYMMDD.
 If --ddboost is specified, the system looks for dump files on the DD Boost host. 

-B <parallel_processes>
 The number of segments to check in parallel for pre/post-restore  validation. If not specified, the utility will start up to 60 parallel processes depending on how many segment instances it needs to restore.

-d <master_data_directory>
 Optional. The master host data directory. If not specified, the value set for $MASTER_DATA_DIRECTORY will be used.

--ddboost
 Use Data Domain DD Boost for this restore, if the --ddboost option was passed when the data was dumped. Before using this option, make sure the one-time DD Boost credential setup is complete.

-e (drop target database before restore)
 Drops the target database before doing the restore and then recreates it.

-G (restore global objects)
 Restores global objects such as roles and tablespaces if the global object dump file db_dumps/<date>/gp_global_1_1_<timestamp> is found  in the master data directory.

-l <logfile_directory>
 The directory to write the log file. Defaults to ~/gpAdminLogs.

-L (list tablenames in backup set)
 When used with the -t option, lists the table names that exist in the named backup set and exits. Does not do a restore.

-q (no screen output)
 Run in quiet mode. Command output is not displayed on the screen, but is still written to the log file.

-R <hostname>:<path_to_dumpset>
 Allows you to provide a hostname and full path to a set of dump files. The host does not have to be in the Greenplum Database array of hosts, but must be accessible from the Greenplum master.

-s <database_name>
 Looks for latest set of dump files for the given database name in  the segment data directories db_dumps directory on the Greenplum Database array of hosts.

-t <timestamp_key>
 The 14 digit timestamp key that uniquely identifies a backup set of data to restore. It is of the form YYYYMMDDHHMMSS. Looks for dump files matching this timestamp key in the segment data directories db_dumps directory on the Greenplum Database array of hosts.

-T <schema>.<table_name>
 A comma-separated list of specific table names to restore. The named table(s) must exist in the backup set of the database being restored. Existing tables are not automatically truncated before data is restored from backup. If your intention is to replace existing data in the table from backup, truncate the table prior 
 to running gpdbrestore -T.
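A minimal sketch of that truncate-then-restore sequence follows. The database, table, and timestamp are placeholders, and the function only prints the commands (a dry run) so you can review them before executing anything.

```shell
# Dry-run sketch: truncate a table, then restore it from a backup set.
# Prints the commands instead of executing them; remove the echoes to run.
restore_single_table() {
    db="$1"; table="$2"; ts="$3"
    echo "psql -d $db -c \"TRUNCATE TABLE $table;\""
    echo "gpdbrestore -t $ts -T $table -a"
}
```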

-v | --verbose
Specifies verbose mode.

--version (show utility version)
 Displays the version of this utility.

-? (help)
 Displays the online help.


*****************************************************
EXAMPLES
*****************************************************

Example 1: 
Restore the sales database from the latest backup files generated by gpcrondump (assumes backup files are in the segment data directories in db_dumps):

[gpadmin@sachi ~]$gpdbrestore -s sales


Example 2: 
Restore a database from backup files that reside on an archive host outside the Greenplum Database array (command issued on the Greenplum master host):

[gpadmin@sachi ~]$gpdbrestore -R archivehostname:/data_p1/db_dumps/20080214


Example 3:
Restore global objects only (roles and tablespaces):

[gpadmin@sachi ~]$gpdbrestore -G







Backup a table or list of tables in Greenplum using gpcrondump

posted Jan 9, 2015, 2:16 PM by Sachchida Ojha   [ updated May 20, 2016, 5:51 PM ]

gpcrondump allows you to dump a specific table or list of tables.  
Note: By default (if no location option -u is specified), dump files are created in the master and segment data directories. If a location is specified with -u, backup files are created in that directory instead.
[gpadmin@sachi disk4]$ mkdir backup
[gpadmin@sachi disk4]$ cd backup
[gpadmin@sachi backup]$ ls
[gpadmin@sachi backup]$ pwd
/disk4/backup
[gpadmin@sachi backup]$ cd
[gpadmin@sachi ~]$ cat tablist
public.DIM_TAXPAYER
public.Foo
public.a
public.abc
public.abctest
public.dynamic_analyze_config
public.h
public.i
public.sales
[gpadmin@sachi ~]$ gpcrondump --table-file="tablist" -u /disk4/backup
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Starting gpcrondump with args: --table-file=tablist -u /disk4/backup
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Setting dump database to value of $PGDATABASE which is sachi
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Master Greenplum Instance dump parameters
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Dump type                            = Single database, specific table
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Table file name                      = tablist
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Database to be dumped                = sachi
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Dump directory                       = /disk4/backup
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Master port                          = 5432
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Master data directory                = /home/gpmaster/gpsne-1
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Run post dump program                = Off
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Rollback dumps                       = Off
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Dump file compression                = On
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Clear old dump files                 = Off
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Update history table                 = Off
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Secure config files                  = Off
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Dump global objects                  = Off
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Vacuum mode type                     = Off
20150109:16:22:18:013929 gpcrondump:sachi:gpadmin-[INFO]:-Ensuring remaining free disk         > 10
Continue with Greenplum dump Yy|Nn (default=N):
> y
20150109:16:22:36:013929 gpcrondump:sachi:gpadmin-[INFO]:-Directory /disk4/backup/db_dumps/20150109 not found, will try to create
20150109:16:22:36:013929 gpcrondump:sachi:gpadmin-[INFO]:-Created /disk4/backup/db_dumps/20150109
20150109:16:22:36:013929 gpcrondump:sachi:gpadmin-[INFO]:-Checked /disk4/backup on master
20150109:16:22:36:013929 gpcrondump:sachi:gpadmin-[INFO]:-Directory /home/gpmaster/gpsne-1/db_dumps/20150109 not found, will try to create
20150109:16:22:36:013929 gpcrondump:sachi:gpadmin-[INFO]:-Created /home/gpmaster/gpsne-1/db_dumps/20150109
20150109:16:22:36:013929 gpcrondump:sachi:gpadmin-[INFO]:-Checked /home/gpmaster/gpsne-1 on master
20150109:16:22:37:013929 gpcrondump:sachi:gpadmin-[INFO]:-Configuring for single-database, include-table dump
20150109:16:22:38:013929 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.DIM_TAXPAYER in sachi database
20150109:16:22:38:013929 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.Foo in sachi database
20150109:16:22:38:013929 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.a in sachi database
20150109:16:22:38:013929 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.abc in sachi database
20150109:16:22:38:013929 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.abctest in sachi database
20150109:16:22:38:013929 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.dynamic_analyze_config in sachi database
20150109:16:22:38:013929 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.h in sachi database
20150109:16:22:39:013929 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.i in sachi database
20150109:16:22:39:013929 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.sales in sachi database
20150109:16:22:39:013929 gpcrondump:sachi:gpadmin-[ERROR]:-gpcrondump error: No schema name supplied for table 
This error occurred because there was a blank line at the bottom of the tablist file. Once the blank line was removed, the error went away.
[gpadmin@sachi ~]$ cat tablist
public.DIM_TAXPAYER
public.Foo
public.a
public.abc
public.abctest
public.dynamic_analyze_config
public.h
public.i
public.sales

[gpadmin@sachi ~]$ vi tablist 
remove the blank line.
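Alternatively, blank and whitespace-only lines can be stripped non-interactively. A small sed sketch (edits the file in place, so keep a copy if the list matters):

```shell
# Remove blank and whitespace-only lines from a table-list file in place,
# avoiding gpcrondump's "No schema name supplied for table" error.
clean_table_list() {
    sed -i '/^[[:space:]]*$/d' "$1"
}
```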
[gpadmin@sachi ~]$ gpcrondump  -c --table-file="tablist" -u /disk4/backup
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Starting gpcrondump with args: -c --table-file=tablist -u /disk4/backup
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Setting dump database to value of $PGDATABASE which is sachi
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Master Greenplum Instance dump parameters
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump type                            = Single database, specific table
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Table file name                      = tablist
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Database to be dumped                = sachi
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump directory                       = /disk4/backup
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Master port                          = 5432
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Master data directory                = /home/gpmaster/gpsne-1
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Run post dump program                = Off
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Rollback dumps                       = Off
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump file compression                = On
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Clear old dump files                 = On
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Update history table                 = Off
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Secure config files                  = Off
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump global objects                  = Off
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Vacuum mode type                     = Off
20150109:16:29:19:014787 gpcrondump:sachi:gpadmin-[INFO]:-Ensuring remaining free disk         > 10
Continue with Greenplum dump Yy|Nn (default=N):
> y
20150109:16:29:24:014787 gpcrondump:sachi:gpadmin-[INFO]:-Directory /disk4/backup/db_dumps/20150109 exists
20150109:16:29:24:014787 gpcrondump:sachi:gpadmin-[INFO]:-Checked /disk4/backup on master
20150109:16:29:24:014787 gpcrondump:sachi:gpadmin-[INFO]:-Directory /home/gpmaster/gpsne-1/db_dumps/20150109 exists
20150109:16:29:24:014787 gpcrondump:sachi:gpadmin-[INFO]:-Checked /home/gpmaster/gpsne-1 on master
20150109:16:29:24:014787 gpcrondump:sachi:gpadmin-[INFO]:-Configuring for single-database, include-table dump
20150109:16:29:24:014787 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.DIM_TAXPAYER in sachi database
20150109:16:29:24:014787 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.Foo in sachi database
20150109:16:29:24:014787 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.a in sachi database
20150109:16:29:24:014787 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.abc in sachi database
20150109:16:29:25:014787 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.abctest in sachi database
20150109:16:29:25:014787 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.dynamic_analyze_config in sachi database
20150109:16:29:25:014787 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.h in sachi database
20150109:16:29:25:014787 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.i in sachi database
20150109:16:29:25:014787 gpcrondump:sachi:gpadmin-[INFO]:-Located table public.sales in sachi database
20150109:16:29:26:014787 gpcrondump:sachi:gpadmin-[INFO]:-Adding compression parameter
20150109:16:29:26:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump command line gp_dump -p 5432 -U gpadmin --gp-d=/disk4/backup/db_dumps/20150109 --gp-r=/home/gpmaster/gpsne-1/db_dumps/20150109 --gp-s=p --gp-c sachi --table-file=tablist
20150109:16:29:26:014787 gpcrondump:sachi:gpadmin-[INFO]:-Starting dump process
20150109:16:29:37:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump process returned exit code 0
20150109:16:29:37:014787 gpcrondump:sachi:gpadmin-[INFO]:-Timestamp key = 20150109162932
20150109:16:29:37:014787 gpcrondump:sachi:gpadmin-[INFO]:-Checked master status file and master dump file.
20150109:16:29:37:014787 gpcrondump:sachi:gpadmin-[INFO]:-Preparing to remove dump 20150108 from all hosts
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump status report
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Target database                          = sachi
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump subdirectory                        = 20150109
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Clear old dump directories               = On
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Backup set deleted                       = 20150108
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump start time                          = 16:29:26
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump end time                            = 16:29:37
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Status                                   = COMPLETED
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump key                                 = 20150109162932
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Dump file compression                    = On
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Vacuum mode type                         = Off
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-Exit code zero, no warnings generated
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[WARNING]:-Found neither /usr/local/greenplum-db/./bin/mail_contacts nor /home/gpadmin/mail_contacts
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[WARNING]:-Unable to send dump email notification
20150109:16:29:38:014787 gpcrondump:sachi:gpadmin-[INFO]:-To enable email notification, create /usr/local/greenplum-db/./bin/mail_contacts or /home/gpadmin/mail_contacts containing required email addresses
[gpadmin@sachi ~]$ 
[gpadmin@sachi disk4]$ cd backup
[gpadmin@sachi backup]$ ls
db_dumps
[gpadmin@sachi backup]$ cd db_dumps/
[gpadmin@sachi db_dumps]$ ls
20150109
[gpadmin@sachi db_dumps]$ cd 20150109/
[gpadmin@sachi 20150109]$ ls
[gpadmin@sachi 20150109]$ cd ..
[gpadmin@sachi db_dumps]$ ls
20150109
[gpadmin@sachi db_dumps]$ cd 20150109/
[gpadmin@sachi 20150109]$ ls
gp_cdatabase_1_1_20150109162932  gp_dump_0_3_20150109162932.gz  gp_dump_1_1_20150109162932_post_data.gz  gp_dump_status_0_3_20150109162932
gp_dump_0_2_20150109162932.gz    gp_dump_1_1_20150109162932.gz  gp_dump_status_0_2_20150109162932        gp_dump_status_1_1_20150109162932
[gpadmin@sachi 20150109]$ ls -ltr
total 40
-rw-------. 1 gpadmin gpadmin  111 Jan  9 16:29 gp_cdatabase_1_1_20150109162932
-rw-------. 1 gpadmin gpadmin 4916 Jan  9 16:29 gp_dump_1_1_20150109162932.gz
-rw-------. 1 gpadmin gpadmin 2061 Jan  9 16:29 gp_dump_status_1_1_20150109162932
-rw-------. 1 gpadmin gpadmin  324 Jan  9 16:29 gp_dump_1_1_20150109162932_post_data.gz
-rw-------. 1 gpadmin gpadmin  944 Jan  9 16:29 gp_dump_status_0_3_20150109162932
-rw-------. 1 gpadmin gpadmin  944 Jan  9 16:29 gp_dump_status_0_2_20150109162932
-rw-------. 1 gpadmin gpadmin 5209 Jan  9 16:29 gp_dump_0_3_20150109162932.gz
-rw-------. 1 gpadmin gpadmin 3614 Jan  9 16:29 gp_dump_0_2_20150109162932.gz
[gpadmin@sachi 20150109]$ 





Exploring GPDB gpcrondump command and files created by gpcrondump

posted Jan 9, 2015, 7:07 AM by Sachchida Ojha

gp_dump is deprecated in later versions of GPDB; gpcrondump is the utility typically used for GPDB parallel backups.
1. Getting the command syntax

[gpadmin@sachi ~]$ gpcrondump --help
COMMAND NAME: gpcrondump

A wrapper utility for gp_dump, which can be called directly or from a crontab entry.

*****************************************************
SYNOPSIS
*****************************************************
gpcrondump -x <database_name> 
     [-s <schema> | -t <schema>.<table> | -T <schema>.<table>] 
     [--table-file="<filename>" | --exclude-table-file="<filename>"] 
     [-u <backup_directory>] [-R <post_dump_script>] 
     [-c] [-z] [-r] [-f <free_space_percent>] [-b] [-h] [-j | -k] 
     [-g] [-G] [-C] [-d <master_data_directory>] [-B <parallel_processes>] 
     [-a] [-q] [-y <reportfile>] [-l <logfile_directory>] [-v]
     { [-E <encoding>] [--inserts | --column-inserts] [--oids] 
       [--no-owner | --use-set-session-authorization] 
       [--no-privileges] [--rsyncable] [--ddboost] }
     
gpcrondump --ddboost-host <ddboost_hostname> --ddboost-user <ddboost_user>

gpcrondump --ddboost-config-remove

gpcrondump -o

gpcrondump -? 

gpcrondump --version

*****************************************************
DESCRIPTION
*****************************************************
gpcrondump is a wrapper utility for gp_dump. By default, dump files are created in their respective master and segment data directories in a directory named db_dumps/YYYYMMDD. The data dump files are compressed by default using gzip.

gpcrondump allows you to schedule routine backups of a Greenplum database using cron (a scheduling utility for UNIX operating systems). Cron jobs 
that call gpcrondump should be scheduled on the master host.

gpcrondump is used to schedule Data Domain Boost backup and restore operations. gpcrondump is also used to set or remove one-time 
credentials for Data Domain Boost.

**********************
Return Codes
**********************

The following is a list of the codes that gpcrondump returns.
   0 - Dump completed with no problems
   1 - Dump completed, but one or more warnings were generated
   2 - Dump failed with a fatal error
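These codes can be acted on from a cron wrapper. A sketch (the status messages are illustrative, not produced by gpcrondump itself):

```shell
# Map gpcrondump's documented return codes to a human-readable status.
report_dump_status() {
    case "$1" in
        0) echo "dump completed with no problems" ;;
        1) echo "dump completed with warnings" ;;
        2) echo "dump failed with a fatal error" ;;
        *) echo "unexpected return code: $1" ;;
    esac
}
```

For example, in a backup script: gpcrondump -x sales -a -q; report_dump_status $?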

**********************************************
EMAIL NOTIFICATIONS
**********************************************
To have gpcrondump send out status email notifications, you must place a file named mail_contacts in the home directory of the Greenplum superuser (gpadmin) or in the same directory as the gpcrondump utility ($GPHOME/bin). This file should contain one email address per line. gpcrondump will issue a warning if it cannot locate a mail_contacts file in either location. If both locations have a mail_contacts file, then the one in $HOME takes precedence.
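Creating that file is a one-time step; a sketch (the addresses and target directory are placeholders):

```shell
# Write a mail_contacts file, one email address per line, so gpcrondump
# can send status notifications. Pass the directory ($HOME or $GPHOME/bin)
# followed by the addresses.
create_mail_contacts() {
    dir="$1"; shift
    printf '%s\n' "$@" > "$dir/mail_contacts"
}
```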
*****************************************************
OPTIONS
*****************************************************
-a (do not prompt)
Do not prompt the user for confirmation.

-b (bypass disk space check)
Bypass disk space check. The default is to check for available disk space.

Note: Bypassing the disk space check generates a warning message.  With a warning message, the return code for gpcrondump is 1 if the  dump is successful. (If the dump fails, the return code is 2, in all cases.)

-B <parallel_processes>
 The number of segments to check in parallel for pre/post-dump validation.  If not specified, the utility will start up to 60 parallel processes  depending on how many segment instances it needs to dump.

-c (clear old dump files first)
 Clear out old dump files before doing the dump. The default is not to  clear out old dump files. This will remove all old dump directories in  the db_dumps directory, except for the dump directory of the current date.

-C (clean old catalog dumps)
 Clean out old catalog schema dump files prior to create.

--column-inserts
Dump data as INSERT commands with column names.

-d <master_data_directory>
 The master host data directory. If not specified, the value set for $MASTER_DATA_DIRECTORY will be used.

--ddboost
 Use Data Domain Boost for this backup. Before using Data Domain Boost, set up the Data Domain Boost credential, as described in the next option below. 
 The following option is recommended if --ddboost is specified.
* -z option (uncompressed)
Backup compression (turned on by default) should be turned off with the -z option. Data Domain Boost will deduplicate and compress the backup data before sending it to the Data Domain System.            
      
When running a mixed backup that backs up to both a local disk and to Data Domain, use the -u option to specify that the backup to the local disk does not  use the default directory. 

The -f, -G, -g, -R, and -u options change if --ddboost is specified. See the options for details.

Important: Never use the Greenplum Database default backup options with Data Domain Boost.  

To maximize Data Domain deduplication benefits, retain at least 30 days of backups.
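Putting those recommendations together, a DD Boost invocation might be built like this (a dry-run sketch; sales is a placeholder database and the command is only printed, not executed):

```shell
# Build a gpcrondump command line per the DD Boost guidance above:
# -z disables local gzip (Data Domain deduplicates and compresses),
# and -a -q make the command suitable for unattended cron runs.
build_ddboost_dump_cmd() {
    echo "gpcrondump -x $1 -z -a -q --ddboost"
}
```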

--ddboost-host <ddboost_hostname> --ddboost-user <ddboost_user>

Sets the Data Domain Boost credentials. Do not combine these options with any other gpcrondump options. Do not enter only part of this option pair. 

<ddboost_hostname> is the IP address of the host. There is a 30-character limit.

<ddboost_user> is the Data Domain Boost user name. There is a 30-character limit.

Example:
gpcrondump --ddboost-host 172.28.8.230 --ddboost-user ddboostusername

After running gpcrondump with these options, the system verifies the limits on the host and user names and prompts for the Data Domain Boost password. 
Enter the password when prompted; the password is not echoed on the screen. There is a 40-character limit on the password that can include lowercase 
letters (a-z), uppercase letters (A-Z), numbers (0-9), and special characters ($, %, #, +, etc.).

The system verifies the password. After the password is verified, the system creates a file .ddconfig and copies it to all segments.

Note: If there is more than one operating system user using Data Domain Boost for backup and restore operations, repeat this configuration process for 
each of those users.

Important: Set up the Data Domain Boost credential before running any Data Domain Boost backups with the --ddboost option, described above.

--ddboost-config-remove
Removes all Data Domain Boost credentials from the master and all segments on the system. Do not enter this option with any other gpcrondump option.

-E encoding
 Character set encoding of dumped data. Defaults to the encoding of  the database being dumped.

-f <free_space_percent>
 When doing the check to ensure that there is enough free disk space to create the dump files, specifies a percentage of free disk space that should remain after the dump completes. The default is 10 percent.
 -f is not supported if --ddboost is specified.

-g (copy config files)
 Secure a copy of the master and segment configuration files  postgresql.conf, pg_ident.conf, and pg_hba.conf. These  configuration files are dumped in the master or segment data directory to db_dumps/YYYYMMDD/config_files_<timestamp>.tar
 If --ddboost is specified, the files are located in the  db_dumps directory on the default storage unit. 


-G (dump global objects)
 Use pg_dumpall to dump global objects such as roles and tablespaces. Global objects are dumped in the master data directory to  db_dumps/YYYYMMDD/gp_global_1_1_<timestamp>.
 If --ddboost is specified, the files are located in the db_dumps directory on the default storage unit. 


-h (record dump details)
 Record details of the database dump in the table public.gpcrondump_history in the database supplied via the -x option. The utility will create the table if it does not currently exist.

--inserts
 Dump data as INSERT, rather than COPY commands.

-j (vacuum before dump)
 Run VACUUM before the dump starts.

-k (vacuum after dump)
Run VACUUM after the dump has completed successfully.

-l <logfile_directory>
 The directory to write the log file. Defaults to ~/gpAdminLogs.

--no-owner
Do not output commands to set object ownership.

--no-privileges
 Do not output commands to set object privileges (GRANT/REVOKE commands).

-o (clear old dump files only)
Clear out old dump files only, but do not run a dump. This will remove the oldest dump directory except the current date's dump directory. All dump sets within that directory will be removed.
 If --ddboost is specified, only the old files on DD Boost are deleted.

--oids
 Include object identifiers (oid) in dump data.

-q (no screen output)
 Run in quiet mode. Command output is not displayed on the screen,  but is still written to the log file.

-r (rollback on failure)
 Rollback the dump files (delete a partial dump) if a failure is detected. The default is to not rollback.
 -r is not supported if --ddboost is specified.

-R <post_dump_script>
 The absolute path of a script to run after a successful dump operation.  For example, you might want a script that moves completed dump files 
 to a backup host. This script must reside in the same location on  the master and all segment hosts.

--rsyncable
 Passes the --rsyncable flag to the gzip utility to synchronize the output occasionally, based on the input during compression. This synchronization increases the file size by less than 1% in most cases. When this flag is passed, the rsync(1) program can synchronize compressed files much more efficiently. The gunzip utility cannot differentiate between a compressed file created with this option, and one created without it.  

 -s <schema_name>
 Dump only the named schema in the named database.

-t <schema>.<table_name>
 Dump only the named table in this database.
 The -t option can be specified multiple times.

-T <schema>.<table_name>
 A table name to exclude from the database dump. The -T option can be specified multiple times.

--exclude-table-file="<filename>"
Exclude all tables listed in <filename> from the database dump. The file <filename> contains any number of tables, listed one per line.

--table-file="<filename>"
Dump only the tables listed in <filename>.  The file <filename> contains any  number of tables, listed one per line.

-u <backup_directory>
 Specifies the absolute path where the backup files will be placed on each host. If the path does not exist, it will be created, if possible. If not specified, defaults to the data directory of each instance to be backed up. Using this option may be desirable if each segment host has multiple segment 
 instances as it will create the dump files in a centralized location rather than the segment data directories.
 -u is not supported if --ddboost is specified.

--use-set-session-authorization
Use SET SESSION AUTHORIZATION commands instead of ALTER OWNER commands to set object ownership.

-v | --verbose
 Specifies verbose mode.

--version (show utility version)
 Displays the version of this utility.

-x <database_name>
 Required. The name of the Greenplum database to dump. Multiple databases can be specified in a comma-separated list.

-y <reportfile>
 Specifies the full path name where the backup job log file will be placed on the master host. If not specified, defaults to the master data directory or if running remotely, the current working directory.

-z (no compression)
Do not use compression. Default is to compress the dump files using gzip. We recommend using this option for NFS and Data Domain Boost backups.

-? (help)
 Displays the online help.

*****************************************************
EXAMPLES
*****************************************************
Example 1: Call gpcrondump directly and dump sachi (and global objects):

$gpcrondump -x sachi -c -g -G

here the options used are
-x -> database name
-c ->clear old dump files first
-g -> copy config files
-G -> dump global objects


Example 2: Using a cron job and shell script
A crontab entry that runs a backup of the sales database (and global objects) nightly at one minute past midnight:

01 0 * * * /home/gpadmin/gpdump.sh >> gpdump.log

The content of dump script gpdump.sh is:

  #!/bin/bash
  export GPHOME=/usr/local/greenplum-db
  export MASTER_DATA_DIRECTORY=/data/gpdb_p1/gp-1
  . $GPHOME/greenplum_path.sh  
  gpcrondump -x sales -c -g -G -a -q 

The options used here are:
-x -> database name
-c -> clear old dump files first
-g -> copy config files
-G -> dump global objects
-a -> do not prompt
-q -> no screen output
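The script above exits silently whether the dump succeeds or fails, so a cron failure can go unnoticed. Below is a minimal hardening sketch; the `run_with_log` wrapper and the log path are our own illustrative additions, not part of Greenplum:

```shell
#!/bin/bash
# Sketch of a hardened cron wrapper. LOGFILE is a hypothetical location; in a
# real script, export GPHOME and MASTER_DATA_DIRECTORY first as shown above.
LOGFILE=${LOGFILE:-/tmp/gpdump.log}

run_with_log() {
    # Run the given command and append a timestamped PASS/FAIL line to the log.
    "$@"
    local rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "$(date +%Y%m%d:%H%M%S) PASS: $*" >> "$LOGFILE"
    else
        echo "$(date +%Y%m%d:%H%M%S) FAIL rc=$rc: $*" >> "$LOGFILE"
    fi
    return $rc
}

# Real usage from cron would be something like:
#   run_with_log gpcrondump -x sales -c -g -G -a -q
```

This keeps the crontab entry itself trivial while making failed dumps greppable in one place.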


Example 3: Back up a schema (sachi2014) of database sachi and explore the output files

 $gpcrondump -x sachi -s sachi2014 

Output on the screen:
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Starting gpcrondump with args: -x sachi -s sachi2014
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Master Greenplum Instance dump parameters
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump type                            = Single database
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Database to be dumped                = sachi
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Schema to be dumped                  = sachi2014
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Master port                          = 5432
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Master data directory                = /home/gpmaster/gpsne-1
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Run post dump program                = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Rollback dumps                       = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump file compression                = On
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Clear old dump files                 = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Update history table                 = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Secure config files                  = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump global objects                  = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Vacuum mode type                     = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Ensuring remaining free disk         > 10

Continue with Greenplum dump Yy|Nn (default=N):
> y
20150108:18:46:35:017728 gpcrondump:sachi:gpadmin-[INFO]:-Directory /home/gpmaster/gpsne-1/db_dumps/20150108 exists ===>mark1
20150108:18:46:35:017728 gpcrondump:sachi:gpadmin-[INFO]:-Checked /home/gpmaster/gpsne-1 on master
20150108:18:46:35:017728 gpcrondump:sachi:gpadmin-[INFO]:-Configuring for single database dump
20150108:18:46:36:017728 gpcrondump:sachi:gpadmin-[INFO]:-Adding compression parameter
20150108:18:46:36:017728 gpcrondump:sachi:gpadmin-[INFO]:-Adding schema name sachi2014
20150108:18:46:36:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump command line gp_dump -p 5432 -U gpadmin --gp-d=db_dumps/20150108 --gp-r=/home/gpmaster/gpsne-1/db_dumps/20150108 --gp-s=p --gp-c -n "\"sachi2014\"" sachi
20150108:18:46:36:017728 gpcrondump:sachi:gpadmin-[INFO]:-Starting dump process
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump process returned exit code 0
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Timestamp key = 20150108184636
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Checked master status file and master dump file.
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump status report
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Target database                          = sachi
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump subdirectory                        = 20150108
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Clear old dump directories               = Off
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump start time                          = 18:46:36
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump end time                            = 18:46:48
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Status                                   = COMPLETED
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump key                                 = 20150108184636
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump file compression                    = On
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Vacuum mode type                         = Off
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Exit code zero, no warnings generated
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[WARNING]:-Found neither /usr/local/greenplum-db/./bin/mail_contacts nor /home/gpadmin/mail_contacts
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[WARNING]:-Unable to send dump email notification
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-To enable email notification, create /usr/local/greenplum-db/./bin/mail_contacts or /home/gpadmin/mail_contacts containing required email addresses

Now let's go to the master and segment servers and look at the files and contents created.

Note: gpcrondump creates a subdirectory named after the current date, as shown at mark1 above. If this directory already exists, it is reused.
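The naming convention can be reproduced in the shell; the default for MASTER_DATA_DIRECTORY below is simply the path used in this walkthrough:

```shell
# Build today's dump directory path the way gpcrondump names it:
# <data_directory>/db_dumps/YYYYMMDD
MASTER_DATA_DIRECTORY=${MASTER_DATA_DIRECTORY:-/home/gpmaster/gpsne-1}
DUMP_DIR="$MASTER_DATA_DIRECTORY/db_dumps/$(date +%Y%m%d)"
echo "$DUMP_DIR"
```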
[gpadmin@sachi 20150108]$ pwd
/home/gpmaster/gpsne-1/db_dumps/20150108
[gpadmin@sachi 20150108]$ ls -ltr
total 180
-rw-------. 1 gpadmin gpadmin 111 Jan 8 18:46 gp_cdatabase_1_1_20150108184636
-rw-------. 1 gpadmin gpadmin 12452 Jan 8 18:46 gp_dump_1_1_20150108184636.gz
-rw-------. 1 gpadmin gpadmin 893 Jan 8 18:46 gp_dump_status_1_1_20150108184636  ---> status of the dump process
-rw-------. 1 gpadmin gpadmin 432 Jan 8 18:46 gp_dump_1_1_20150108184636_post_data.gz
-rw-rw-r--. 1 gpadmin gpadmin 961 Jan 8 18:46 gp_dump_20150108184636.rpt  ----> dump report file


-rw-------. 1 gpadmin gpadmin 1108 Jan 8 18:51 gp_restore_status_1_1_20150108184636 --> discussed later; created when you restore this backup using gpdbrestore

Note: the following files are copies of the original dump files, made so their contents can be viewed without affecting the originals:
-rw-------. 1 gpadmin gpadmin 136936 Jan 8 21:12 sachi_1_1_20150108184636
-rw-------. 1 gpadmin gpadmin 992 Jan 8 21:12 sachi_1_1_20150108184636_post_data
-rw-------. 1 gpadmin gpadmin 111 Jan 8 21:17 sachitest_1_1_20150108184636
[gpadmin@sachi 20150108]$

There are 3 main files.
-rw-------. 1 gpadmin gpadmin 111 Jan 8 18:46 gp_cdatabase_1_1_20150108184636
-rw-------. 1 gpadmin gpadmin 12452 Jan 8 18:46 gp_dump_1_1_20150108184636.gz
-rw-------. 1 gpadmin gpadmin 432 Jan 8 18:46 gp_dump_1_1_20150108184636_post_data.gz
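Every file name above embeds the same fields: a content marker (1 for the master, 0 for segments), the dbid, and the 14-digit timestamp key. A small parsing sketch (`parse_dump_name` is our own helper, not a Greenplum utility):

```shell
# Split a gp_dump file name of the form gp_dump_<c>_<dbid>_<timestamp>
# (optionally with a _post_data suffix and/or .gz extension) into its fields.
parse_dump_name() {
    local base=${1##*/}        # drop any leading directory
    base=${base%.gz}           # drop the compression suffix
    base=${base%_post_data}    # drop the post-data marker
    local c dbid ts
    IFS=_ read -r _ _ c dbid ts <<< "$base"
    echo "content=$c dbid=$dbid timestamp=$ts"
}
```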

1. gp_cdatabase_1_1_20150108184636: This file contains the script to create the database. If the database already exists, the statement is skipped; the database is not dropped and recreated.

[gpadmin@sachi 20150108]$ cat gp_cdatabase_1_1_20150108184636
--
-- Database creation
--
CREATE DATABASE sachi WITH TEMPLATE = template0 ENCODING = 'UTF8' OWNER = gpadmin;
[gpadmin@sachi 20150108]$ 

2. gp_dump_1_1_20150108184636.gz: This file contains the DDL for the schema objects.

Here are a few lines from the top:

--
-- Greenplum Database database dump
--

SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;

SET default_with_oids = false;

--
-- Name: sachi2014; Type: SCHEMA; Schema: -; Owner: sachi
--

CREATE SCHEMA sachi2014;


ALTER SCHEMA sachi2014 OWNER TO sachi;

SET search_path = sachi2014, pg_catalog;

SET default_tablespace = '';

--
-- Name: foo; Type: TABLE; Schema: sachi2014; Owner: gpadmin; Tablespace:
--

CREATE TABLE foo (
    x timestamp without time zone
) DISTRIBUTED RANDOMLY;


ALTER TABLE sachi2014.foo OWNER TO gpadmin;

--
-- Name: foo1; Type: TABLE; Schema: sachi2014; Owner: gpadmin; Tablespace:
--

CREATE TABLE foo1 (
    x timestamp without time zone
) DISTRIBUTED RANDOMLY;


ALTER TABLE sachi2014.foo1 OWNER TO gpadmin;

3. gp_dump_1_1_20150108184636_post_data.gz: This file contains the DDL to create constraints, indexes, and other post-data objects.

Here are the contents of this file.

--
-- Greenplum Database database dump
--

SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;

SET search_path = sachi2014, pg_catalog;

SET default_tablespace = '';

SET default_with_oids = false;

--
-- Name: firstkey; Type: CONSTRAINT; Schema: sachi2014; Owner: gpadmin; Tablespace: 
--

ALTER TABLE ONLY films
    ADD CONSTRAINT firstkey PRIMARY KEY (code);


--
-- Name: abc_id; Type: INDEX; Schema: sachi2014; Owner: gpadmin; Tablespace: 
--

CREATE INDEX abc_id ON abc USING btree (id);


--
-- Name: idx_bloattest_id; Type: INDEX; Schema: sachi2014; Owner: gpadmin; Tablespace: 
--

CREATE INDEX idx_bloattest_id ON bloattest USING btree (id);


--
-- Name: index_abc; Type: INDEX; Schema: sachi2014; Owner: gpadmin; Tablespace: 
--

CREATE INDEX index_abc ON abc USING btree (id);


--
-- Greenplum Database database dump complete
--

[gpadmin@sachi 20150108]$ 

4. gp_dump_status_1_1_20150108184636: This file records the status of the dump process on the master.

[gpadmin@sachi 20150108]$ cat gp_dump_status_1_1_20150108184636
20150108:18:46:36|gp_dump_agent-[INFO]:-Starting monitor thread
20150108:18:46:36|gp_dump_agent-[INFO]:-Dumping database "sachi"...
20150108:18:46:36|gp_dump_agent-[INFO]:-Dumping CREATE DATABASE statement for database "sachi"
20150108:18:46:38|gp_dump_agent-[INFO]:-TASK_SET_SERIALIZABLE
20150108:18:46:38|gp_dump_agent-[INFO]:-TASK_GOTLOCKS
20150108:18:46:38|gp_dump_agent-[INFO]:-Succeeded
20150108:18:46:38|gp_dump_agent-[INFO]:-Finished pre-data schema successfully
20150108:18:46:38|gp_dump_agent-[INFO]:-Finished successfully
20150108:18:46:38|gp_dump_agent-[INFO]:-Starting monitor thread
20150108:18:46:38|gp_dump_agent-[INFO]:-Dumping database "sachi"...
20150108:18:46:40|gp_dump_agent-[INFO]:-TASK_SET_SERIALIZABLE
20150108:18:46:40|gp_dump_agent-[INFO]:-TASK_GOTLOCKS
20150108:18:46:40|gp_dump_agent-[INFO]:-Succeeded
20150108:18:46:40|gp_dump_agent-[INFO]:-Finished successfully
[gpadmin@sachi 20150108]$ 

5. gp_dump_20150108184636.rpt: The report file, also used for the email notification sent to the mail_contacts list.

[gpadmin@sachi 20150108]$ cat gp_dump_20150108184636.rpt

Greenplum Database Backup Report
Timestamp Key: 20150108184636
gp_dump Command Line: -p 5432 -U gpadmin --gp-d=db_dumps/20150108 --gp-r=/home/gpmaster/gpsne-1/db_dumps/20150108 --gp-s=p --gp-c -n ""sachi2014"" sachi
Pass through Command Line Options: -n "\"sachi2014\""
Compression Program: gzip

Individual Results
segment 1 (dbid 3) Host sachi Port 40001 Database sachi BackupFile /disk2/gpdata2/gpsne1/db_dumps/20150108/gp_dump_0_3_20150108184636.gz: Succeeded
segment 0 (dbid 2) Host sachi Port 40000 Database sachi BackupFile /disk1/gpdata1/gpsne0/db_dumps/20150108/gp_dump_0_2_20150108184636.gz: Succeeded

Master (dbid 1) Host sachi Port 5432 Database sachi BackupFile /home/gpmaster/gpsne-1/db_dumps/20150108/gp_dump_1_1_20150108184636.gz: Succeeded
Master (dbid 1) Host sachi Port 5432 Database sachi BackupFile /home/gpmaster/gpsne-1/db_dumps/20150108/gp_dump_1_1_20150108184636.gz_post_data: Succeeded

gp_dump utility finished successfully.
[gpadmin@sachi 20150108]$
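Since every instance writes its own gp_dump_status_* file, a quick post-backup check is to confirm each one reports success. A sketch of such a check (`check_dump_status` is our own helper; point it at the db_dumps/<date> directory on each host):

```shell
# Scan all gp_dump_status_* files in a dump directory and report whether each
# one contains the "Finished successfully" line seen in the logs above.
check_dump_status() {
    local dir=$1 bad=0 f
    for f in "$dir"/gp_dump_status_*; do
        [ -e "$f" ] || { echo "no status files in $dir"; return 2; }
        if grep -q "Finished successfully" "$f"; then
            echo "OK: $f"
        else
            echo "FAILED: $f"
            bad=1
        fi
    done
    return $bad
}
```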


Now let's move to the segment servers and look at the files and their contents.
As shown in the gp_dump_20150108184636.rpt file, the following files are created on the segments:

segment 1 (dbid 3) Host sachi Port 40001 Database sachi BackupFile /disk2/gpdata2/gpsne1/db_dumps/20150108/gp_dump_0_3_20150108184636.gz: Succeeded 
segment 0 (dbid 2) Host sachi Port 40000 Database sachi BackupFile /disk1/gpdata1/gpsne0/db_dumps/20150108/gp_dump_0_2_20150108184636.gz: Succeeded 


[gpadmin@sachi 20150108]$ cd /disk2/gpdata2/gpsne1/db_dumps/20150108/
[gpadmin@sachi 20150108]$ ls
gp_dump_0_3_20150108184636.gz  gp_dump_status_0_3_20150108184636  gp_restore_status_0_3_20150108184636
[gpadmin@sachi 20150108]$ ls -ltr
total 83648
-rw-------. 1 gpadmin gpadmin      360 Jan  8 18:46 gp_dump_status_0_3_20150108184636
-rw-------. 1 gpadmin gpadmin 85644160 Jan  8 18:46 gp_dump_0_3_20150108184636.gz
-rw-------. 1 gpadmin gpadmin      549 Jan  8 18:51 gp_restore_status_0_3_20150108184636
[gpadmin@sachi 20150108]$ 

1. gp_dump_status_0_3_20150108184636: Tracks status of the dump command.

[gpadmin@sachi 20150108]$ cat gp_dump_status_0_3_20150108184636
20150108:18:46:36|gp_dump_agent-[INFO]:-Starting monitor thread
20150108:18:46:36|gp_dump_agent-[INFO]:-Dumping database "sachi"...
20150108:18:46:38|gp_dump_agent-[INFO]:-TASK_SET_SERIALIZABLE
20150108:18:46:38|gp_dump_agent-[INFO]:-TASK_GOTLOCKS
20150108:18:46:48|gp_dump_agent-[INFO]:-Succeeded
20150108:18:46:48|gp_dump_agent-[INFO]:-Finished successfully
[gpadmin@sachi 20150108]$ 

2. gp_dump_0_3_20150108184636.gz: This file contains the COPY commands that reload the data into the tables.

[gpadmin@sachi 20150108]$ cp gp_dump_0_3_20150108184636.gz sachi_0_3_20150108184636.gz
[gpadmin@sachi 20150108]$ gunzip sachi_0_3_20150108184636.gz
[gpadmin@sachi 20150108]$ ls -ltr
total 1435332
-rw-------. 1 gpadmin gpadmin 360 Jan 8 18:46 gp_dump_status_0_3_20150108184636
-rw-------. 1 gpadmin gpadmin 85644160 Jan 8 18:46 gp_dump_0_3_20150108184636.gz
-rw-------. 1 gpadmin gpadmin 549 Jan 8 18:51 gp_restore_status_0_3_20150108184636
-rw-------. 1 gpadmin gpadmin 1384123537 Jan 9 09:59 sachi_0_3_20150108184636

[gpadmin@sachi 20150108]$ view sachi_0_3_20150108184636
-- Greenplum Database database dump
--

SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;

SET search_path = sachi2014, pg_catalog;

SET default_with_oids = false;

--
-- Data for Name: abc; Type: TABLE DATA; Schema: sachi2014; Owner: gpadmin
--

COPY abc (id, name) FROM stdin;
\.


--
-- Data for Name: bloattest; Type: TABLE DATA; Schema: sachi2014; Owner: gpadmin
--

COPY bloattest (id, int_1, int_2, int_3, ts_1, ts_2, ts_3, text_1, text_2, text_3) FROM stdin;
2       6527368 1774693 6621748 2014-08-07 12:39:38.854845-04   2013-12-24 15:23:49.059645-05   2013-12-15 05:59:51.795645-05   text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1  text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2  text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2
4       4495918 2751623 8940072 2014-09-16 22:40:41.523645-04   2013-12-14 08:11:29.274045-05   2014-08-22 00:13:21.018045-04   text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1 text_1    text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2     text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2 text_2

...

...
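The uncompressed segment dump is just a sequence of COPY ... FROM stdin blocks, each terminated by a line containing "\.", so per-table row counts can be recovered from it. A small awk sketch (`count_copy_rows` is our own helper):

```shell
# Count the data rows for each table in an uncompressed segment dump by
# scanning for COPY ... FROM stdin blocks terminated by a "\." line.
count_copy_rows() {
    awk '/^COPY .* FROM stdin;$/ { table = $2; n = 0; incopy = 1; next }
         incopy && /^\\\.$/      { print table, n; incopy = 0; next }
         incopy                  { n++ }' "$1"
}
```

Comparing these counts against the source tables is a cheap sanity check on a backup.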



How to make a copy of a Greenplum database without backup and restore

posted Jan 8, 2015, 4:45 PM by Sachchida Ojha

Here is a quick way to make a duplicate copy of a database in Greenplum.
[gpadmin@sachi ~]$ psql -d sachi
psql (8.2.15)
Type "help" for help.

sachi=# \l+
                                             List of databases
   Name    |  Owner  | Encoding |   Access privileges    |  Size   | Tablespace |        Description        
-----------+---------+----------+------------------------+---------+------------+---------------------------
 bd_dev    | gpadmin | UTF8     | =Tc/gpadmin            | 29 MB   | pg_default | 
                                : gpadmin=CTc/gpadmin                             
                                : w8xpb=c/gpadmin                                 
                                : bdacc_capsvc=c/gpadmin                          
                                : tev_bdasvc=c/gpadmin                            
 bd_prd    | gpadmin | UTF8     |                        | 28 MB   | pg_default | 
 bd_tst    | gpadmin | UTF8     |                        | 28 MB   | pg_default | 
 gpadmin   | gpadmin | UTF8     |                        | 29 MB   | pg_default | 
 gpperfmon | gpadmin | UTF8     | gpadmin=CTc/gpadmin    | 284 MB  | pg_default | 
                                : =c/gpadmin                                      
 himanshu  | gpadmin | UTF8     |                        | 28 MB   | pg_default | 
 postgres  | gpadmin | UTF8     |                        | 29 MB   | pg_default | 
 sachi     | gpadmin | UTF8     | =Tc/gpadmin            | 5453 MB | pg_default | 
                                : gpadmin=CTc/gpadmin                             
                                : sachi=CTc/gpadmin                               
                                : gpuser=CTc/gpadmin                              
 template0 | gpadmin | UTF8     | =c/gpadmin             | 27 MB   | pg_default | 
                                : gpadmin=CTc/gpadmin                             
 template1 | gpadmin | UTF8     | =c/gpadmin             | 28 MB   | pg_default | Default template database
                                : gpadmin=CTc/gpadmin                             
(10 rows)
sachi=# create database copyofsachi template sachi;
ERROR:  source database "sachi" is being accessed by other users
sachi=# \q

CREATE DATABASE ... TEMPLATE requires that no other sessions are connected to the source database, so connect to a different database (template1 here) and run it from there:

[gpadmin@sachi ~]$ psql -d template1
psql (8.2.15)
Type "help" for help.

template1=# create database copyofsachi template sachi;
CREATE DATABASE

template1=# \l+
                                              List of databases
    Name     |  Owner  | Encoding |   Access privileges    |  Size   | Tablespace |        Description        
-------------+---------+----------+------------------------+---------+------------+---------------------------
 bd_dev      | gpadmin | UTF8     | =Tc/gpadmin            | 29 MB   | pg_default | 
                                  : gpadmin=CTc/gpadmin                             
                                  : w8xpb=c/gpadmin                                 
                                  : bdacc_capsvc=c/gpadmin                          
                                  : tev_bdasvc=c/gpadmin                            
 bd_prd      | gpadmin | UTF8     |                        | 28 MB   | pg_default | 
 bd_tst      | gpadmin | UTF8     |                        | 28 MB   | pg_default | 
 copyofsachi | gpadmin | UTF8     |                        | 5452 MB | pg_default | 
 gpadmin     | gpadmin | UTF8     |                        | 29 MB   | pg_default | 
 gpperfmon   | gpadmin | UTF8     | gpadmin=CTc/gpadmin    | 284 MB  | pg_default | 
                                  : =c/gpadmin                                      
 himanshu    | gpadmin | UTF8     |                        | 28 MB   | pg_default | 
 postgres    | gpadmin | UTF8     |                        | 29 MB   | pg_default | 
 sachi       | gpadmin | UTF8     | =Tc/gpadmin            | 5453 MB | pg_default | 
                                  : gpadmin=CTc/gpadmin                             
                                  : sachi=CTc/gpadmin                               
                                  : gpuser=CTc/gpadmin                              
 template0   | gpadmin | UTF8     | =c/gpadmin             | 27 MB   | pg_default | 
                                  : gpadmin=CTc/gpadmin                             
 template1   | gpadmin | UTF8     | =c/gpadmin             | 28 MB   | pg_default | Default template database
                                  : gpadmin=CTc/gpadmin                             
(11 rows)

template1-# 


How to make a copy of a schema in the same database in Greenplum?

posted Jan 8, 2015, 4:22 PM by Sachchida Ojha   [ updated Jan 8, 2015, 4:23 PM ]

Many times we need to make a copy of a schema to preserve test data. In Oracle we would use the remap_schema option to make such a copy.

The Greenplum restore utility (gpdbrestore) does not offer an option to remap schemas, so here is a trick you can use instead. Let's make a copy of schema sachi2014 as sachi2015 in database sachi:

1. Back up the sachi2014 schema.
2. Make a checksum report (number of tables, etc. in the sachi2014 schema).
3. Rename schema sachi2014 to sachi2015.
4. Restore the sachi2014 schema from the backup.
5. Run the checksum report again to validate the schema objects and data.
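For steps 2 and 5, object listings captured from psql (one name per line, for example a list of table names redirected to a file) can be compared with a small sketch; `compare_lists` is our own helper, not a Greenplum tool:

```shell
# Compare two object listings (one name per line) regardless of order and
# report MATCH or MISMATCH; a simple checksum for steps 2 and 5.
compare_lists() {
    local t1 t2
    t1=$(mktemp); t2=$(mktemp)
    sort "$1" > "$t1"
    sort "$2" > "$t2"
    if cmp -s "$t1" "$t2"; then echo MATCH; else echo MISMATCH; fi
    rm -f "$t1" "$t2"
}
```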

[gpadmin@sachi db_dumps]$ gpcrondump -x sachi -s sachi2014 
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Starting gpcrondump with args: -x sachi -s sachi2014
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Master Greenplum Instance dump parameters
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump type                            = Single database
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Database to be dumped                = sachi
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Schema to be dumped                  = sachi2014
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Master port                          = 5432
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Master data directory                = /home/gpmaster/gpsne-1
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Run post dump program                = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Rollback dumps                       = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump file compression                = On
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Clear old dump files                 = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Update history table                 = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Secure config files                  = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump global objects                  = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Vacuum mode type                     = Off
20150108:18:46:31:017728 gpcrondump:sachi:gpadmin-[INFO]:-Ensuring remaining free disk         > 10

Continue with Greenplum dump Yy|Nn (default=N):
> y
20150108:18:46:35:017728 gpcrondump:sachi:gpadmin-[INFO]:-Directory /home/gpmaster/gpsne-1/db_dumps/20150108 exists
20150108:18:46:35:017728 gpcrondump:sachi:gpadmin-[INFO]:-Checked /home/gpmaster/gpsne-1 on master
20150108:18:46:35:017728 gpcrondump:sachi:gpadmin-[INFO]:-Configuring for single database dump
20150108:18:46:36:017728 gpcrondump:sachi:gpadmin-[INFO]:-Adding compression parameter
20150108:18:46:36:017728 gpcrondump:sachi:gpadmin-[INFO]:-Adding schema name sachi2014
20150108:18:46:36:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump command line gp_dump -p 5432 -U gpadmin --gp-d=db_dumps/20150108 --gp-r=/home/gpmaster/gpsne-1/db_dumps/20150108 --gp-s=p --gp-c -n "\"sachi2014\"" sachi
20150108:18:46:36:017728 gpcrondump:sachi:gpadmin-[INFO]:-Starting dump process
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump process returned exit code 0
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Timestamp key = 20150108184636
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Checked master status file and master dump file.
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump status report
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Target database                          = sachi
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump subdirectory                        = 20150108
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Clear old dump directories               = Off
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump start time                          = 18:46:36
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump end time                            = 18:46:48
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Status                                   = COMPLETED
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump key                                 = 20150108184636
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Dump file compression                    = On
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Vacuum mode type                         = Off
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-Exit code zero, no warnings generated
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:----------------------------------------------------
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[WARNING]:-Found neither /usr/local/greenplum-db/./bin/mail_contacts nor /home/gpadmin/mail_contacts
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[WARNING]:-Unable to send dump email notification
20150108:18:46:48:017728 gpcrondump:sachi:gpadmin-[INFO]:-To enable email notification, create /usr/local/greenplum-db/./bin/mail_contacts or /home/gpadmin/mail_contacts containing required email addresses

[gpadmin@sachi db_dumps]$ psql -d sachi
psql (8.2.15)
Type "help" for help.

sachi=# alter schema sachi2014 rename to sachi2015;
ALTER SCHEMA
sachi=# \q
[gpadmin@sachi db_dumps]$ gpdbrestore -s sachi
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Starting gpdbrestore with args: -s sachi
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Scanning Master host for latest dump file set for database sachi
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Located dump file gp_cdatabase_1_1_20150108183946 for database sachi, adding to list
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Located dump file gp_cdatabase_1_1_20150108184636 for database sachi, adding to list
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Identified latest dump timestamp for sachi as 20150108184636
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-------------------------------------------
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Greenplum database restore parameters
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-------------------------------------------
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Restore type               = Full Database
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Database to be restored    = sachi
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Drop and re-create db      = Off
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Restore method             = Search for latest
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Restore timestamp          = 20150108184636
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Restore compressed dump    = On
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Restore global objects     = Off
20150108:18:48:09:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Array fault tolerance      = n

Continue with Greenplum restore Yy|Nn (default=N):
> y
20150108:18:48:28:018102 gpdbrestore:sachi:gpadmin-[INFO]:-gp_restore -i -h sachi -p 5432 -U gpadmin --gp-d=db_dumps/20150108 --gp-i --gp-k=20150108184636 --gp-r=db_dumps/20150108 --gp-l=p --gp-c -d sachi
20150108:18:51:41:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Commencing analyze of sachi database, please wait
20150108:18:52:14:018102 gpdbrestore:sachi:gpadmin-[INFO]:-Analyze of sachi completed without error
[gpadmin@sachi db_dumps]$ 


Running a Parallel Backup in Greenplum

posted Oct 30, 2014, 12:58 PM by Sachchida Ojha   [ updated Nov 3, 2014, 7:21 PM ]

Parallel backups are issued with the gp_dump command. When this command is executed:

  1. Each active segment is dumped in parallel.
  2. A single dump file is created in the data directory of each segment. The dump file contains the data for that individual segment instance.
  3. The backup files for a single gp_dump run are identified by a unique 14-digit timestamp key.
  4. The master dumps the configuration of the database (the system catalog) as well as the DDL statements for the database and schemas.

Data can be compressed to save space using the --gp-c option to gp_dump. The dump files are created on the file system, so you must ensure that there is sufficient disk space for them on both the master and the segments.
Dump Files Created During Parallel Backup

Here is an overview of the files created by gp_dump. By default, the dump files are created in the data directory of the instance that was dumped. On the master, the following are dumped:

  1. System catalog data
  2. The CREATE DATABASE statement
  3. DDL to recreate the schema and database objects

and the following files are created on the master:

  • gp_catalog_1_<dbid>_<timestamp> -- System catalog tables
  • gp_cdatabase_1_<dbid>_<timestamp> -- CREATE DATABASE statement
  • gp_dump_1_<dbid>_<timestamp> -- Database schemas
  • gp_dump_status_1_<dbid>_<timestamp> -- Log file

On the segments, the COPY statements and user data are dumped, and the following files are created:

  • gp_dump_0_<dbid>_<timestamp> -- Data for the segment
  • gp_dump_status_0_<dbid>_<timestamp> -- Log file
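Given this naming scheme, the full set of files a backup should have produced can be derived from the timestamp key and the segment dbids. A sketch (`expected_files` is our own helper; it omits the optional gp_catalog and post_data files for brevity):

```shell
# Print the expected dump file names for one backup: the master set
# (dbid 1, prefix gp_dump_1_*) plus one data/status pair per segment dbid.
expected_files() {
    local ts=$1 dbid
    shift
    echo "gp_cdatabase_1_1_$ts"
    echo "gp_dump_1_1_$ts"
    echo "gp_dump_status_1_1_$ts"
    for dbid in "$@"; do
        echo "gp_dump_0_${dbid}_$ts"
        echo "gp_dump_status_0_${dbid}_$ts"
    done
}
```

Checking each name against the dump directories (remembering that segment files live on the segment hosts) is a quick way to confirm nothing is missing before relying on a backup.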
