MADlib and the MADlib pivot function

Post date: Oct 06, 2016 7:25:1 PM

Apache MADlib is a SQL-based open source library for scalable in-database analytics that supports Greenplum Database. The library offers data scientists numerous distributed implementations of mathematical, statistical and machine learning methods, including many utilities for data transformation.

New utilities have been added in the recent MADlib 1.9.1 release, including:

Pivot: data summarization tool that can do basic OLAP type operations

Sessionization: time-oriented session reconstruction on a data set comprising a sequence of events

Prediction metrics: set of metrics to evaluate the quality of predictions of a model
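To make the pivot idea concrete, here is a minimal sketch in plain Python of the kind of summarization a pivot performs: rows in "long" format are grouped by an index column, and each distinct value of the pivot column becomes its own output column holding an aggregate (an average here). This is only a conceptual illustration and not MADlib's implementation or API. The function name pivot_avg, the sample rows, and the output column naming are all assumptions made for this example.

```python
from collections import defaultdict


def pivot_avg(rows, index, pivot_col, value_col):
    """Turn long-format rows into one wide row per index value.

    Each distinct value of `pivot_col` becomes an output column that holds
    the average of `value_col` for that (index, pivot value) pair.
    Combinations with no input rows are filled with None.
    Hypothetical helper for illustration; not part of MADlib.
    """
    totals = defaultdict(float)   # (index value, pivot value) -> sum of values
    counts = defaultdict(int)     # (index value, pivot value) -> number of rows
    index_values = set()
    pivot_values = set()

    for row in rows:
        key = (row[index], row[pivot_col])
        totals[key] += row[value_col]
        counts[key] += 1
        index_values.add(row[index])
        pivot_values.add(row[pivot_col])

    wide = {}
    for idx in sorted(index_values):
        wide[idx] = {}
        for piv in sorted(pivot_values):
            key = (idx, piv)
            # Illustrative column name; MADlib's own naming scheme may differ.
            column = f"{value_col}_{piv}"
            if counts.get(key, 0) > 0:
                wide[idx][column] = totals[key] / counts[key]
            else:
                wide[idx][column] = None
    return wide


# Made-up sample data: two rows share id=0 and piv=10, so they are averaged.
rows = [
    {"id": 0, "piv": 10, "val": 1.0},
    {"id": 0, "piv": 10, "val": 2.0},
    {"id": 0, "piv": 20, "val": 3.0},
    {"id": 1, "piv": 20, "val": 4.0},
    {"id": 1, "piv": 30, "val": 5.0},
]

wide = pivot_avg(rows, index="id", pivot_col="piv", value_col="val")
for idx, columns in wide.items():
    print(idx, columns)
```

In MADlib the same reshaping is done inside the database with SQL, so the data never has to leave Greenplum; the documentation linked below describes the actual function and its arguments.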

For more details, see:

https://madlib.incubator.apache.org/docs/latest/group__grp__pivot.html

https://blog.pivotal.io/big-data-pivotal/products/new-tools-to-shape-data-in-apache-madlib

How to install MADlib in Greenplum Database:

https://discuss.pivotal.io/hc/en-us/articles/204865798-How-to-install-or-uninstall-MADlib