Greenplum Chorus enables Big Data agility for your data science team. The first solution of its kind, Greenplum Chorus provides an analytic productivity platform that enables the team to search, explore, visualize, and import data from anywhere in the organization. It provides rich social network features that revolve around datasets, insights, methods, and workflows, allowing data analysts, data scientists, IT staff, DBAs, executives, and other stakeholders to participate and collaborate on Big Data. Customers deploy Chorus to create a self-service agile analytic infrastructure; teams can create workspaces on the fly with self-service provisioning, and then instantly start creating and sharing insights.
Greenplum Chorus is a collaborative platform for data science. Chorus users iterate faster and finish projects sooner through secure access to data and by sharing content and findings within their organization through a platform especially built for this purpose. Organizations will value the empowerment of data science, along with the reduction of IT operational involvement and one-off infrastructure costs. This chapter focuses on how you can prepare your environment for Greenplum Chorus. In particular, this chapter describes the system requirements.
Supported platforms and browsers
Greenplum Chorus operates on the following platforms and browsers:
• Red Hat Enterprise Linux 5.5, 5.7, 6.2 (64 bit)
• CentOS 5.5, 5.7, 6.2 (64 bit)
• SuSE Linux Enterprise Server 11 (64 bit)
• OS X Lion (x86_64)
• Firefox 17.0 or later
• Google Chrome 23 or later
• Internet Explorer 8.0 with Google Chrome Frame
• Internet Explorer 9.0 (Google Chrome Frame not required)
Note: IE 9 can simulate IE 7 or IE 8 in its “compatibility mode.”
Chorus does not work with IE 7, or with IE 8 without Google Chrome Frame, so you must disable compatibility mode. To do this:
a. Press the Alt key to open the IE9 menu bar.
b. Choose the Tools menu.
c. If Compatibility View is unchecked, do nothing.
d. If Compatibility View is checked, select it to uncheck it.
New features in Greenplum Chorus 2.4
You can now add CDH4 as a data source in Chorus. This integration simplifies sharing of information and self-provisioning of HDFS files into a Greenplum Database by way of external tables.
When Chorus detects that the credentials for a data source are invalid, it marks them as invalid and does not use them again until the user updates the mapping.
Usability improvements in Greenplum Chorus 2.4
A number of usability improvements have been made to Chorus, including:
• Improved feedback to the user when attempting to access a data source through Chorus with credentials that are not mapped or are considered invalid.
• Improved drag-and-drop experience in the work file editor. The editor now starts with a larger canvas and automatically highlights the current line.
• Improved data preview functionality in Chorus.
• Improved performance when previewing and scrolling datasets with a large number of columns.
• You can now adjust column widths in the data preview.
• You can now sort previewed data by clicking the column name in the header row.
• You can now delete unused data sources and associated objects from Chorus.
• A new optional Chorus property allows you to add a configurable prefix to the file names of all CSV downloads of datasets from Chorus.
• A new optional Chorus property allows you to add a configurable text string to the bottom left corner of all visualizations generated in Chorus.
• Upgraded Chorus' PostgreSQL database to version 9.2.4 and the nginx web server to version 1.2.8.
To report issues, defects, and recommendations for improvement, create or log into your Feedback Central account (https://feedbackcentral.emc.com/).
When you report an issue with Chorus, please include the log files from the appropriate date and time. Log files are located in $CHORUS_HOME/shared/log.
The most relevant log entries are in the production.log file. For issues during installation, please provide the install.log file located under $CHORUS_HOME/.
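Gathering the two log files named above can be scripted before filing a report. The following is a minimal Python sketch, not a tool shipped with Chorus: the function name collect_chorus_logs and the archive layout are hypothetical, and only the two file locations come from the text above.

```python
import tarfile
from pathlib import Path


def collect_chorus_logs(chorus_home, archive_path):
    """Bundle production.log and install.log (when present) into a .tar.gz.

    Returns the archive member names that were added, so the caller can
    see which of the two files actually existed.
    """
    home = Path(chorus_home)
    # File locations as stated in the release notes; everything else here
    # (function name, archive layout) is an assumption for illustration.
    candidates = [
        home / "shared" / "log" / "production.log",
        home / "install.log",
    ]
    added = []
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for path in candidates:
            if path.is_file():
                # Store paths relative to CHORUS_HOME inside the archive.
                arcname = path.relative_to(home).as_posix()
                tar.add(str(path), arcname=arcname)
                added.append(arcname)
    return added
```

A typical call would pass the value of the CHORUS_HOME environment variable, for example collect_chorus_logs(os.environ["CHORUS_HOME"], "chorus_logs.tar.gz") after importing os. The sketch copies whole files; trimming production.log to the relevant date and time is left to the reporter.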
Greenplum Chorus version compatibility matrix
Greenplum Chorus interoperates with Greenplum Database, Greenplum Hadoop (both available at the EMC Download Center), Oracle 10g/11g, and Tableau Server 7.0.
Greenplum Database: 4.0.5.x, 4.1.x, 4.2.x
Apache Hadoop (optional): 0.20.2, 0.20.203, 0.20.205
Greenplum HD: 1.1, 1.2
Greenplum MR: 1.0, 1.2.x
Oracle Database: 10g, 11g
Tableau Server: 7.0
Alpine Data Labs Workflows: Alpine
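The matrix above can be encoded as data for a pre-installation check. The Python sketch below is illustrative only and is not part of Chorus: the names SUPPORTED, matches, and is_supported are hypothetical, and it assumes that a trailing ".x" means "any release in that series". The Alpine entry is omitted because the matrix lists no version number for it.

```python
# Version patterns copied from the compatibility matrix. Reading a trailing
# ".x" as "any release in that series" is an assumption about the notation.
SUPPORTED = {
    "Greenplum Database": ["4.0.5.x", "4.1.x", "4.2.x"],
    "Apache Hadoop": ["0.20.2", "0.20.203", "0.20.205"],
    "Greenplum HD": ["1.1", "1.2"],
    "Greenplum MR": ["1.0", "1.2.x"],
    "Oracle Database": ["10g", "11g"],
    "Tableau Server": ["7.0"],
}


def matches(pattern, version):
    """Return True if version satisfies a single matrix pattern."""
    if pattern.endswith(".x"):
        # Keep the trailing dot ("4.2.") so that "4.20.1" does not match "4.2.x".
        return version.startswith(pattern[:-1])
    return version == pattern


def is_supported(product, version):
    """Return True if the product/version pair appears in the matrix."""
    return any(matches(p, version) for p in SUPPORTED.get(product, []))
```

For example, is_supported("Greenplum Database", "4.2.3.1") is true under this reading, while is_supported("Apache Hadoop", "0.20.204") is false because the Hadoop entries are exact versions.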