Setting up Greenplum Chorus 2.4
You may need to configure the following properties to run Greenplum Chorus:
1. The chorus.properties file
2. To configure or change the HTTP port number
3. To configure or change the PostgreSQL Database port number
4. To configure or change the Solr port number
5. To configure parameters for the Java Virtual Machine
6. To configure the indexing frequency of database instances
7. To configure an external server to import data with gpfdist
8. To run data_import
9. To enable use of Oracle databases
1. chorus.properties file
Much of the Chorus configuration is defined in the chorus.properties file. It is located in the <installation directory>/shared/ directory.
In this same directory, there is a file named chorus.properties.example. This file contains all the attributes that can be included in the chorus.properties file. The
attributes in the actual configuration file, chorus.properties, may be a subset of the attributes in chorus.properties.example. You can include any attribute you
find in chorus.properties.example in your chorus.properties configuration file.
[chorus@sachi greenplum-chorus]$ ls
chorus_control.sh chorus_path.sh chorus_psql.sh chorus_rails_console.sh current install.log releases shared
[chorus@sachi greenplum-chorus]$ pwd
/usr/local/greenplum-chorus
[chorus@sachi greenplum-chorus]$ cd shared
[chorus@sachi shared]$ ls -l
total 40
-rw-------. 1 chorus chorus 1950 Oct 30 10:56 chorus.properties
-rw-r--r--. 1 chorus chorus 4051 Oct 30 10:56 chorus.properties.example
-rw-rw-r--. 1 chorus chorus 247 Oct 30 11:21 database.yml
lrwxrwxrwx. 1 chorus chorus 26 Oct 30 10:58 db -> /disk3/greenplum-chorus/db
drwxrwxr-x. 2 chorus chorus 4096 Oct 30 10:56 demo_data
drwxrwxr-x. 2 chorus chorus 4096 Oct 30 10:56 libraries
lrwxrwxrwx. 1 chorus chorus 27 Oct 30 10:58 log -> /disk3/greenplum-chorus/log
-rw-------. 1 chorus chorus 45 Oct 30 10:58 secret.key
-rw-------. 1 chorus chorus 128 Oct 30 10:58 secret.token
drwxrwxr-x. 2 chorus chorus 4096 Oct 30 10:58 solr
-rw-rw-r--. 1 chorus chorus 147 Oct 30 10:56 sunspot.yml
lrwxrwxrwx. 1 chorus chorus 30 Oct 30 10:58 system -> /disk3/greenplum-chorus/system
drwxrwxr-x. 3 chorus chorus 4096 Oct 30 10:56 tmp
Let's look at what's inside the chorus.properties file.
[chorus@sachi shared]$ cat chorus.properties
# Only defaults that should apply to all applications should go in here,
# optional configurations should only go into chorus.properties.example
# Server Settings
server_port = 8080
postgres_port = 8543
solr_port = 8983
java_options = -Djava.library.path=$CHORUS_HOME/vendor/hadoop/lib/ -server -Xmx2048m -Xms512m -XX:MaxPermSize=128m
# Runtime Settings
# The default session timeout time (length of time that you need to remain
# inactive for you to be logged out) is 8 hours.
session_timeout_minutes = 480
instance_poll_interval_minutes = 5
delete_unimported_csv_files_interval_hours = 6
delete_unimported_csv_files_after_hours = 24
reindex_search_data_interval_hours = 24
reset_counter_cache_interval_hours = 24
sandbox_recommended_size_in_gb = 5
# The number of rows to be shown in a preview by default.
default_preview_row_limit = 500
# Maximum execution time of visualizations and workfiles, in minutes
execution_timeout_in_minutes = 10
# Concurrency Settings
# Configure thread pool size of webserver and worker processes.
#
# The # of webserver threads determines the maximum number of simultaneous web
# requests. The # of worker threads determines the maximum number of
# asychronous jobs, such as table copying or importing, that can be run
# simultaneously.
#
# Each web or worker thread may use its own connection to the local Postgresql
# database, thus the sum of 'worker_threads' + 'webserver_threads' must be less
# than the 'max_connections' configured in postgresql.conf.
#
# The 'max_connections' parameter may be based on your operating system's kernel
# shared memory size. For example, on OS X this parameter will default to 20.
worker_threads = 10
database_threads = 100
webserver_threads = 40
# File Size Settings
file_sizes_mb.workfiles = 10
file_sizes_mb.csv_imports = 100
file_sizes_mb.user_icon = 5
file_sizes_mb.workspace_icon = 5
file_sizes_mb.attachment = 10
# Logging Settings
logging.loglevel = info
[chorus@sachi shared]$
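Since chorus.properties may hold only a subset of the attributes in chorus.properties.example, it can help to list the attributes that appear in the example but not in the active file. The following is a minimal sketch, not part of Chorus itself; the function names prop_keys and missing_keys are made up for illustration, and lines that are commented out in either file are ignored.

```shell
# prop_keys FILE: print the sorted, unique attribute names found in FILE.
# Blank lines and lines starting with '#' are skipped.
prop_keys() {
    grep -Ev '^[[:space:]]*(#|$)' "$1" | cut -d= -f1 | tr -d ' \t' | sort -u
}

# missing_keys EXAMPLE ACTUAL: print names present in EXAMPLE but not in ACTUAL.
missing_keys() {
    tmp=$(mktemp) || return 1
    prop_keys "$2" > "$tmp"
    # -F fixed strings, -x whole-line match, -v keep only the non-matching names
    prop_keys "$1" | grep -Fxv -f "$tmp"
    rm -f "$tmp"
}

# Usage (path taken from the listing above):
#   cd /usr/local/greenplum-chorus/shared
#   missing_keys chorus.properties.example chorus.properties
```

Any name it prints is a candidate to copy from chorus.properties.example into chorus.properties.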
2. To configure or change the HTTP port number
The default HTTP port for Greenplum Chorus is 8080. You can change it to any free port number above 1024.
1. Edit the <installation directory>/shared/chorus.properties file. Change the server_port entry to the port number you want. For example:
server_port= 1550
2. Restart Greenplum Chorus.
Note: If SSL is enabled and configured, this HTTP port redirects to the ssl_server_port.
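The edit in step 1 can also be scripted. The sketch below is one possible way to do it and is not a Chorus tool: set_prop is a hypothetical helper that rewrites an existing "key = value" line and keeps a .bak copy of the file. It assumes the value contains no '|', '&' or backslash characters, and it does not add a key that is missing from the file.

```shell
# set_prop FILE KEY VALUE: rewrite the existing "KEY = ..." line in FILE.
# A copy of the original is kept as FILE.bak. Missing keys are not added.
set_prop() {
    file=$1 key=$2 value=$3
    # Escape dots so a key such as gpfdist.url is matched literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    cp -p "$file" "$file.bak" || return 1
    # Write back into the original file so its permissions are preserved.
    sed "s|^[[:space:]]*${pattern}[[:space:]]*=.*|${key} = ${value}|" "$file.bak" > "$file"
}

# Usage (path taken from the listing in section 1):
#   set_prop /usr/local/greenplum-chorus/shared/chorus.properties server_port 1550
# Then restart Greenplum Chorus, for example with chorus_control.sh restart
# (assumed here; check the arguments your chorus_control.sh accepts).
```

The same helper applies to the postgres_port and solr_port entries described in the next sections.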
3. To configure or change the PostgreSQL Database port number
The PostgreSQL database listens on port 8543 by default. You can change it to any free port number above 1024.
1. Edit the <installation directory>/shared/chorus.properties file. Change the postgres_port entry to the port number you want. For example:
postgres_port= 9000
2. Restart Greenplum Chorus.
4. To configure or change the Solr port number
The default port number for Solr is 8983. You can change it to any free port number above 1024.
1. Edit the <installation directory>/shared/chorus.properties file. Change the solr_port entry to the port number you want. For example:
solr_port= 9001
2. Restart Greenplum Chorus.
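All three port settings above call for a free port number above 1024. Before editing the file, a candidate port can be checked with something like the sketch below. The helper names valid_port and port_free are hypothetical, and port_free assumes the ss utility (from iproute2) is installed, as it is on most Linux hosts.

```shell
# valid_port PORT: succeed if PORT is a whole number above 1024 and at most 65535.
valid_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "${#1}" -le 5 ] || return 1  # reject absurdly long numbers early
    [ "$1" -gt 1024 ] && [ "$1" -le 65535 ]
}

# port_free PORT: succeed if no TCP socket is listening on PORT.
# Returns 2 if ss is not available, so "unknown" is not mistaken for "free".
port_free() {
    command -v ss >/dev/null 2>&1 || return 2
    # Column 4 of "ss -ltn" is the local address and port, e.g. 0.0.0.0:8080
    ! ss -ltn | awk '{print $4}' | grep -Eq "[:.]$1\$"
}

# Usage:
#   valid_port 9001 && port_free 9001 && echo "9001 can be used"
```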
5. To configure parameters for the Java Virtual Machine
1. Edit the <installation directory>/shared/chorus.properties file. Change the java_options entry as you wish. For example:
java_options=-Djava.library.path=$CHORUS_HOME/vendor/hadoop/lib/ -server -Xmx1024m -Xms512m -XX:MaxPermSize=128m
2. Restart Greenplum Chorus.
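When changing the heap sizes in java_options, keep in mind that the JVM will not start if the initial heap (-Xms) is larger than the maximum heap (-Xmx). The sketch below checks a java_options string for that mistake before Chorus is restarted. The helpers heap_mb and heap_ok are hypothetical names, and only sizes written with a k, m or g suffix are handled.

```shell
# heap_mb OPTIONS FLAG: print the size given to FLAG (-Xmx or -Xms) in megabytes.
# Prints nothing if FLAG is absent or has no k/m/g suffix.
heap_mb() {
    printf '%s\n' "$1" | tr ' ' '\n' |
    sed -n "s/^$2\([0-9][0-9]*\)\([kKmMgG]\)\$/\1 \2/p" | head -n 1 |
    while read -r n unit; do
        case $unit in
            k|K) echo $((n / 1024)) ;;
            m|M) echo "$n" ;;
            g|G) echo $((n * 1024)) ;;
        esac
    done
}

# heap_ok OPTIONS: succeed if both flags are present and -Xms does not exceed -Xmx.
heap_ok() {
    max=$(heap_mb "$1" -Xmx)
    min=$(heap_mb "$1" -Xms)
    [ -n "$max" ] && [ -n "$min" ] && [ "$min" -le "$max" ]
}

# Usage, with the example value from step 1:
#   heap_ok '-server -Xmx1024m -Xms512m -XX:MaxPermSize=128m' && echo "heap sizes are consistent"
```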
6. To configure the indexing frequency of database instances
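1. Edit the <installation directory>/shared/chorus.properties file. Change the reindex_search_data_interval_hours entry (the name that appears in the chorus.properties listing above) to the time interval you want, in hours. If your file uses the name reindex_datasets_interval_hours instead, change that entry. For example:
reindex_search_data_interval_hours = 24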
2. Restart Greenplum Chorus.
7. To configure an external server to import data with gpfdist
To enable data movement between databases, gpfdist must be installed and running on the Chorus host. Two gpfdist processes must be started, one for writing and one for reading. They use different ports but point to the same directory. See the Greenplum Database Administrator Guide for how to configure gpfdist.
1. Download the gpfdist package and install it.
2. Examine the gpfdist entries in <installation directory>/shared/chorus.properties. For example:
gpfdist.ssl.enabled= false
Note: Set gpfdist.ssl.enabled to true if gpfdist is configured with SSL certificates. The SSL certificates must be installed on all segment servers.
gpfdist.url= sample-gpfdist-server
Note: This URL must be externally accessible and resolvable from the source and destination servers.
gpfdist.write_port= 8000
gpfdist.read_port= 8001
gpfdist.data_dir= /tmp
3. Start gpfdist with the write_port value and the data_dir value.
4. Start gpfdist with the read_port value and the data_dir value.
5. Restart Greenplum Chorus to activate the changes.
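Steps 3 and 4 can be sketched as follows. The script only prints the two gpfdist command lines implied by the values in chorus.properties, so they can be reviewed before being run. It uses the standard gpfdist options -d (directory), -p (port) and -l (log file). The helper names and the log file names are assumptions made for illustration, and the SSL case (gpfdist.ssl.enabled= true) is not covered.

```shell
# get_prop FILE KEY_REGEX: print the value of the first matching key,
# with surrounding spaces removed. Dots in KEY_REGEX should be escaped.
get_prop() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" |
    sed 's/[[:space:]]*$//' | head -n 1
}

# gpfdist_commands FILE: print one gpfdist command for the write port and
# one for the read port, both serving the same data directory.
gpfdist_commands() {
    dir=$(get_prop "$1" 'gpfdist\.data_dir')
    wport=$(get_prop "$1" 'gpfdist\.write_port')
    rport=$(get_prop "$1" 'gpfdist\.read_port')
    # Log file names below are hypothetical; choose any writable location.
    echo "gpfdist -d $dir -p $wport -l $dir/gpfdist_write.log &"
    echo "gpfdist -d $dir -p $rport -l $dir/gpfdist_read.log &"
}

# Usage (path taken from the listing in section 1):
#   gpfdist_commands /usr/local/greenplum-chorus/shared/chorus.properties
```

With the example values above, this prints one command using port 8000 and one using port 8001, both with /tmp as the data directory.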
8. To run data_import
For more complete information about gpfdist, go to Support Zone and refer to the Greenplum Database Administrator Guide 4.2.
9. To enable use of Oracle databases
1. Place the Oracle client driver JAR, ojdbc6.jar, in <installation directory>/shared/libraries. You can find the Oracle client driver at:
http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html
2. Set the permissions and ownership on the file:
chmod 644 ojdbc6.jar
chown chorus:chorus ojdbc6.jar
3. Set oracle.enabled to true in the chorus.properties file (see section 1, "chorus.properties file").
4. Restart Greenplum Chorus.
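Before restarting, the result of steps 1 to 3 can be checked with a short script. The sketch below is illustrative only: oracle_ready is a hypothetical helper that takes the shared directory as its argument, confirms that the driver file exists with mode 644, and confirms that oracle.enabled is set to true. It does not check ownership, since that depends on the account Chorus runs under.

```shell
# oracle_ready SHARED_DIR: succeed if the Oracle driver is in place with
# mode 644 and oracle.enabled is true in SHARED_DIR/chorus.properties.
oracle_ready() {
    jar=$1/libraries/ojdbc6.jar
    [ -f "$jar" ] || { echo "missing $jar"; return 1; }
    # The first ten characters of "ls -l" output are the file type and mode bits.
    [ "$(ls -ld "$jar" | cut -c1-10)" = "-rw-r--r--" ] ||
        { echo "mode of $jar is not 644"; return 1; }
    grep -Eq '^[[:space:]]*oracle\.enabled[[:space:]]*=[[:space:]]*true[[:space:]]*$' \
        "$1/chorus.properties" ||
        { echo "oracle.enabled is not set to true"; return 1; }
}

# Usage (path taken from the listing in section 1):
#   oracle_ready /usr/local/greenplum-chorus/shared && echo "ready to restart"
```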