Setting up Greenplum Chorus 2.4

You may need to configure the following properties to run Greenplum Chorus:

1. The chorus.properties file

2. To configure or change the HTTP port number

3. To configure or change the PostgreSQL Database port number

4. To configure or change the Solr port number

5. To configure parameters for the Java Virtual Machine

6. To configure the indexing frequency of database instances

7. To configure an external server to import data with gpfdist

8. To run data_import

9. To enable use of Oracle databases

1. chorus.properties file

Much of the Chorus configuration is defined in the chorus.properties file. It is located in the <installation directory>/shared/ directory.

In this same directory, there is a file named chorus.properties.example. This file contains all the attributes that can be included in the chorus.properties file. The attributes in the actual configuration file, chorus.properties, may be a subset of the attributes in chorus.properties.example. You can include any attribute you find in chorus.properties.example in your chorus.properties configuration file.
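
To see which attributes from chorus.properties.example are not yet set in chorus.properties, the key names of the two files can be compared. The following is only a rough sketch: the helper names prop_keys and missing_keys are made up for illustration, it assumes every attribute is written as an uncommented "key = value" line, and the demo runs on throwaway files with sample content rather than on a real installation.

```shell
# Sketch: list attribute names that appear in chorus.properties.example
# but not in chorus.properties. Helper names are illustrative only.

prop_keys() {
    # Print the key of every uncommented "key = value" line in file $1.
    sed -n 's/^[[:space:]]*\([A-Za-z_][A-Za-z0-9_.]*\)[[:space:]]*=.*/\1/p' "$1" | sort -u
}

missing_keys() {
    # $1 = example file, $2 = active configuration file.
    prop_keys "$1" | while IFS= read -r key; do
        prop_keys "$2" | grep -qxF -- "$key" || printf '%s\n' "$key"
    done
}

# Demo on throwaway files so that no real configuration is touched.
demo_dir=$(mktemp -d)
printf 'server_port = 8080\noracle.enabled = false\n' > "$demo_dir/chorus.properties.example"
printf 'server_port = 8080\n' > "$demo_dir/chorus.properties"
missing_keys "$demo_dir/chorus.properties.example" "$demo_dir/chorus.properties"
rm -rf "$demo_dir"
```

On a real host you would pass the two files from <installation directory>/shared/ instead of the demo files.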

[chorus@sachi greenplum-chorus]$ ls

chorus_control.sh  chorus_path.sh  chorus_psql.sh  chorus_rails_console.sh  current  install.log  releases  shared

[chorus@sachi greenplum-chorus]$ pwd

/usr/local/greenplum-chorus

[chorus@sachi greenplum-chorus]$ cd shared

[chorus@sachi shared]$ ls -l

total 40

-rw-------. 1 chorus chorus 1950 Oct 30 10:56 chorus.properties

-rw-r--r--. 1 chorus chorus 4051 Oct 30 10:56 chorus.properties.example

-rw-rw-r--. 1 chorus chorus  247 Oct 30 11:21 database.yml

lrwxrwxrwx. 1 chorus chorus   26 Oct 30 10:58 db -> /disk3/greenplum-chorus/db

drwxrwxr-x. 2 chorus chorus 4096 Oct 30 10:56 demo_data

drwxrwxr-x. 2 chorus chorus 4096 Oct 30 10:56 libraries

lrwxrwxrwx. 1 chorus chorus   27 Oct 30 10:58 log -> /disk3/greenplum-chorus/log

-rw-------. 1 chorus chorus   45 Oct 30 10:58 secret.key

-rw-------. 1 chorus chorus  128 Oct 30 10:58 secret.token

drwxrwxr-x. 2 chorus chorus 4096 Oct 30 10:58 solr

-rw-rw-r--. 1 chorus chorus  147 Oct 30 10:56 sunspot.yml

lrwxrwxrwx. 1 chorus chorus   30 Oct 30 10:58 system -> /disk3/greenplum-chorus/system

drwxrwxr-x. 3 chorus chorus 4096 Oct 30 10:56 tmp

Let's look at what's inside the chorus.properties file.

[chorus@sachi shared]$ cat chorus.properties

# Only defaults that should apply to all applications should go in here,

# optional configurations should only go into chorus.properties.example

# Server Settings

server_port = 8080

postgres_port = 8543

solr_port = 8983

java_options = -Djava.library.path=$CHORUS_HOME/vendor/hadoop/lib/ -server -Xmx2048m -Xms512m -XX:MaxPermSize=128m

# Runtime Settings

# The default session timeout time (length of time that you need to remain

# inactive for you to be logged out) is 8 hours.

session_timeout_minutes = 480

instance_poll_interval_minutes = 5

delete_unimported_csv_files_interval_hours = 6

delete_unimported_csv_files_after_hours = 24

reindex_search_data_interval_hours = 24

reset_counter_cache_interval_hours = 24

sandbox_recommended_size_in_gb = 5

# The number of rows to be shown in a preview by default.

default_preview_row_limit = 500

# Maximum execution time of visualizations and workfiles, in minutes

execution_timeout_in_minutes = 10

# Concurrency Settings

# Configure thread pool size of webserver and worker processes.

#

# The # of webserver threads determines the maximum number of simultaneous web

# requests. The # of worker threads determines the maximum number of

# asychronous jobs, such as table copying or importing, that can be run

# simultaneously.

#

# Each web or worker thread may use its own connection to the local Postgresql

# database, thus the sum of 'worker_threads' + 'webserver_threads' must be less

# than the 'max_connections' configured in postgresql.conf.

#

# The 'max_connections' parameter may be based on your operating system's kernel

# shared memory size. For example, on OS X this parameter will default to 20.

worker_threads = 10

database_threads = 100

webserver_threads = 40

# File Size Settings

file_sizes_mb.workfiles = 10

file_sizes_mb.csv_imports = 100

file_sizes_mb.user_icon = 5

file_sizes_mb.workspace_icon = 5

file_sizes_mb.attachment = 10

# Logging Settings

logging.loglevel = info

[chorus@sachi shared]$ 
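
The comment in the Concurrency Settings block says that worker_threads plus webserver_threads must stay below max_connections from postgresql.conf. A quick check of that rule might look like the sketch below. The function names get_prop and check_thread_budget are hypothetical, and the max_connections value (100 in the demo) is an assumption that you would replace with the number from your own postgresql.conf.

```shell
# Sketch: check that worker_threads + webserver_threads < max_connections.

get_prop() {
    # Print the value of the last uncommented "key = value" line
    # for key $2 in file $1, with trailing whitespace removed.
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | sed 's/[[:space:]]*$//' | tail -n 1
}

check_thread_budget() {
    # $1 = chorus.properties, $2 = max_connections (looked up by hand).
    # Assumes both thread settings are present in the file.
    workers=$(get_prop "$1" worker_threads)
    web=$(get_prop "$1" webserver_threads)
    total=$((workers + web))
    if [ "$total" -lt "$2" ]; then
        echo "ok: $total threads is below max_connections ($2)"
    else
        echo "problem: $total threads is not below max_connections ($2)"
        return 1
    fi
}

# Demo with the thread values from the listing above, on a throwaway file.
demo=$(mktemp)
printf 'worker_threads = 10\ndatabase_threads = 100\nwebserver_threads = 40\n' > "$demo"
check_thread_budget "$demo" 100
rm -f "$demo"
```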

2. To configure or change the HTTP port number

The default HTTP port for Greenplum Chorus is 8080. You can change it to any free port number above 1024.

1. Edit the <installation directory>/shared/chorus.properties file. Change the server_port entry to the port number you want. For example:

server_port= 1550

2. Restart Greenplum Chorus. 

Note: If SSL is enabled and configured, requests to this HTTP port are redirected to the ssl_server_port.
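
If you prefer to script the edit instead of opening the file in an editor, something along these lines could work. It is a sketch under assumptions: set_prop is a hypothetical helper, it only handles simple values such as port numbers (no "|", "&" or backslashes), and the demo edits a throwaway file. The restart command in the last comment assumes that chorus_control.sh from the installation directory accepts a restart argument.

```shell
# Sketch: change one "key = value" entry in a properties file in place.

set_prop() {
    # $1 = properties file, $2 = key, $3 = new value.
    cp -p "$1" "$1.bak" || return 1        # keep a backup next to the file
    tmp=$(mktemp) || return 1
    if grep -q "^[[:space:]]*$2[[:space:]]*=" "$1"; then
        # Key exists: rewrite that line.
        sed "s|^[[:space:]]*$2[[:space:]]*=.*|$2 = $3|" "$1" > "$tmp"
    else
        # Key missing: append it.
        cat "$1" > "$tmp"
        printf '%s = %s\n' "$2" "$3" >> "$tmp"
    fi
    cat "$tmp" > "$1"                      # overwrite in place, keeping mode and owner
    rm -f "$tmp"
}

# Demo on a throwaway file.
demo=$(mktemp)
printf 'server_port = 8080\npostgres_port = 8543\n' > "$demo"
set_prop "$demo" server_port 1550
cat "$demo"
rm -f "$demo" "$demo.bak"

# On a real host (assumed invocation): ./chorus_control.sh restart
```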

3. To configure or change the PostgreSQL Database port number

The default port on which the PostgreSQL database listens is 8543. You can change it to any free port number above 1024.

1. Edit the <installation directory>/shared/chorus.properties file. Change the postgres_port entry to the port number you want. For example:

postgres_port= 9000

2. Restart Greenplum Chorus.
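
Before writing a new port into chorus.properties it can help to check that the value is at least in the allowed range. The sketch below uses a hypothetical helper, valid_port, that accepts only whole numbers above 1024 and up to 65535 (the upper bound is the TCP limit, not something stated by Chorus). It does not test whether the port is actually free on the host; that still has to be checked separately.

```shell
# Sketch: accept only port numbers in the range 1025..65535.

valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or not purely digits
    esac
    [ "$1" -gt 1024 ] && [ "$1" -le 65535 ]
}

for p in 9000 80 abc; do
    if valid_port "$p"; then
        echo "$p: in the usable range"
    else
        echo "$p: rejected"
    fi
done
```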

4. To configure or change the Solr port number

The default port number for Solr is 8983. You can change it to any free port number above 1024.

1. Edit the <installation directory>/shared/chorus.properties file. Change the solr_port entry to the port number you want. For example:

solr_port= 9001

2. Restart Greenplum Chorus.
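
Once the HTTP, PostgreSQL and Solr ports can all be changed, it is easy to assign the same number twice by accident. A small guard such as the following sketch can catch that. The function name ports_distinct is made up for illustration, and the demo values are the defaults shown in chorus.properties above.

```shell
# Sketch: succeed only if no port number is listed more than once.

ports_distinct() {
    dup=$(printf '%s\n' "$@" | sort | uniq -d)
    [ -z "$dup" ]
}

# server_port, postgres_port, solr_port
if ports_distinct 8080 8543 8983; then
    echo "no port clash"
else
    echo "at least one port is used twice"
fi
```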

5. To configure parameters for the Java Virtual Machine

1. Edit the <installation directory>/shared/chorus.properties file. Change the java_options entry as needed. For example:

java_options=-Djava.library.path=$CHORUS_HOME/vendor/hadoop/lib/ -server -Xmx1024m -Xms512m -XX:MaxPermSize=128m

2. Restart Greenplum Chorus.
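
When changing the heap flags, the initial heap (-Xms) should not end up larger than the maximum heap (-Xmx), or the JVM will refuse to start. The sketch below pulls both sizes out of a java_options string and compares them. heap_mb is a hypothetical helper that only understands sizes with a k, m or g suffix, and the option string is the example from step 1.

```shell
# Sketch: read -Xmx / -Xms from a java_options string, in megabytes.

heap_mb() {
    # $1 = java_options string, $2 = flag name without the dash (Xmx or Xms).
    raw=$(printf '%s\n' "$1" | tr ' ' '\n' |
          sed -n "s/^-$2\([0-9][0-9]*\)\([kKmMgG]\)\$/\1 \2/p" | tail -n 1)
    [ -n "$raw" ] || return 1          # flag missing or in an unsupported form
    num=${raw% *}
    unit=${raw#* }
    case "$unit" in
        k|K) echo $((num / 1024)) ;;
        m|M) echo "$num" ;;
        g|G) echo $((num * 1024)) ;;
    esac
}

# Single quotes keep $CHORUS_HOME literal, as it is in chorus.properties.
opts='-Djava.library.path=$CHORUS_HOME/vendor/hadoop/lib/ -server -Xmx1024m -Xms512m -XX:MaxPermSize=128m'
xmx=$(heap_mb "$opts" Xmx)
xms=$(heap_mb "$opts" Xms)
if [ "$xms" -le "$xmx" ]; then
    echo "heap ok: initial ${xms} MB, maximum ${xmx} MB"
else
    echo "heap problem: -Xms (${xms} MB) exceeds -Xmx (${xmx} MB)"
fi
```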

6. To configure the indexing frequency of database instances

1. Edit the <installation directory>/shared/chorus.properties file. Change the reindex_datasets_interval_hours entry to the time interval you want. For example:

reindex_datasets_interval_hours= 24

2. Restart Greenplum Chorus.

7. To configure an external server to import data with gpfdist

To enable data movement between databases, gpfdist must be installed and running on the Chorus host. Start two gpfdist processes, one for writing and one for reading, each on a different port but pointing to the same directory. See the Greenplum Database Administrator Guide for how to configure gpfdist.

1. Download the gpfdist package and install it.

2. Examine the gpfdist entries in <installation directory>/shared/chorus.properties. For example:

gpfdist.ssl.enabled= false

Note: Set gpfdist.ssl.enabled to true if gpfdist is configured with SSL certificates. SSL certificates must be installed on all segment servers.

gpfdist.url= sample-gpfdist-server

Note: This URL must be externally accessible, that is, it must be resolvable from the source and destination servers.

gpfdist.write_port= 8000

gpfdist.read_port= 8001

gpfdist.data_dir= /tmp

3. Start gpfdist with the write_port value and the data_dir value.

4. Start gpfdist with the read_port value and the data_dir value.

5. Restart Greenplum Chorus to activate the changes.
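
Steps 3 and 4 can be derived directly from the gpfdist entries in chorus.properties. The sketch below only prints the two commands so that they can be reviewed before being run. get_prop and gpfdist_commands are hypothetical helpers, the demo reads a throwaway file containing the example values from step 2, and only the -d (directory) and -p (port) options of gpfdist are used. Any further options, such as SSL settings, are left to the Greenplum Database Administrator Guide.

```shell
# Sketch: print the gpfdist start commands implied by chorus.properties.

get_prop() {
    # Print the value for key $2 in file $1; dots in the key are escaped
    # so that they match literally in the regular expression.
    key=$(printf '%s' "$2" | sed 's/\./\\./g')
    sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*//p" "$1" | sed 's/[[:space:]]*$//' | tail -n 1
}

gpfdist_commands() {
    # $1 = chorus.properties
    dir=$(get_prop "$1" gpfdist.data_dir)
    for name in write_port read_port; do
        port=$(get_prop "$1" "gpfdist.$name")
        printf 'gpfdist -d %s -p %s &\n' "$dir" "$port"
    done
}

# Demo with the example values from step 2, on a throwaway file.
demo=$(mktemp)
printf 'gpfdist.write_port= 8000\ngpfdist.read_port= 8001\ngpfdist.data_dir= /tmp\n' > "$demo"
gpfdist_commands "$demo"
rm -f "$demo"
```

Printing rather than executing is a deliberate choice here: both processes run in the background and share a directory, so it is worth checking the ports and path first.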

8. To run data_import

For more complete information about gpfdist, go to Support Zone and refer to the Greenplum Database Administrator Guide 4.2.

9. To enable use of Oracle databases

1. Place the Oracle client driver jar, ojdbc6.jar, in <installation directory>/shared/libraries. You can find the Oracle client driver at:

http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html

2. Set the permissions and ownership on the file:

chmod 644 ojdbc6.jar

chown chorus:chorus ojdbc6.jar

3. Set oracle.enabled to true in the chorus.properties file (see “1. chorus.properties file”).

4. Restart Greenplum Chorus.
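
Steps 1 and 2 can be combined into a small helper, sketched below. install_driver is a hypothetical name, the owner argument is optional so that the demo can run without a chorus user, and the demo uses an empty placeholder file in a temporary directory instead of a real ojdbc6.jar.

```shell
# Sketch: copy the Oracle driver into the libraries directory and fix its mode.

install_driver() {
    # $1 = path to ojdbc6.jar, $2 = libraries directory,
    # $3 = optional owner such as chorus:chorus
    cp "$1" "$2/" || return 1
    target="$2/$(basename "$1")"
    chmod 644 "$target" || return 1
    if [ -n "${3:-}" ]; then
        chown "$3" "$target" || return 1
    fi
}

# Demo in a temporary directory; the jar is an empty placeholder.
demo=$(mktemp -d)
mkdir "$demo/libraries"
: > "$demo/ojdbc6.jar"
install_driver "$demo/ojdbc6.jar" "$demo/libraries"
ls -l "$demo/libraries/ojdbc6.jar"
rm -rf "$demo"
```

On a real host the call would pass <installation directory>/shared/libraries as the second argument and chorus:chorus as the third.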