Window Expressions in Greenplum

Window Expressions

Window expressions allow application developers to more easily compose complex online analytical processing (OLAP) queries using standard SQL commands. For example, moving averages or sums can be calculated over various intervals; aggregations and ranks can be reset as selected column values change; and complex ratios can be expressed in simple terms. 

A window expression represents the application of a window function applied to a window frame, which is defined in a special OVER() clause. A window partition is a set of rows that are grouped together for the purpose of applying an window function. 

Unlike aggregate functions, which return a result value for each group of rows, window functions return a result value for every row, but that value is calculated with respect to the rows in a particular window partition. If no partition is specified, the window function is computed over the complete intermediate result set.

The syntax of a window expression is:

window_function ( [expression [, ...]] ) OVER ( window_specification )

Where window_function is one of the functions listed in “Window Functions”, expression is any value expression that does not itself contain a window expression, and window_specification is:

[window_name]

[PARTITION BY expression [, ...]]

[[ORDER BY expression [ASC | DESC | USING operator] [, ...]

[{RANGE | ROWS} 

{ UNBOUNDED PRECEDING

| expression PRECEDING

| CURRENT ROW

| BETWEEN window_frame_bound AND window_frame_bound }]]

and where window_frame_bound can be one of:

UNBOUNDED PRECEDING

expression PRECEDING

CURRENT ROW

expression FOLLOWING

UNBOUNDED FOLLOWING

A window expression may only appear in the select list of a SELECT command. For example:

SELECT count(*) OVER(PARTITION BY customer_id), * FROM sales;

The OVER clause is what differentiates window functions from other aggregate or reporting functions. The OVER clause defines the window_specification to which the window function is applied. A window specification has the following characteristics:

1. The PARTITION BY clause, which defines the window partitions to which the window function is applied. If omitted, the entire result set is treated as one partition.

2. The ORDER BY clause defines the expression(s) for sorting rows within a window partition. Note that the ORDER BY of a window specification is separate and distinct from the ORDER BY clause of a regular query expression (see “The ORDER BY Clause” on page 590). The ORDER BY clause is required for the window functions that calculate rankings, as it identifies the measure(s) for the ranking values. For OLAP aggregations, the ORDER BY clause is required in order to use window frames (the ROWS | RANGE clause). 

Note: Columns of data types that lack a coherent ordering, such as time, are not good candidates for use in the ORDER BY clause of a window specification. Time, with or without time zone, lacks a coherent ordering because addition and subtraction do not have the expected effects. For example, the following is not generally true: x::time < x::time + '2 hour'::interval

3. The ROWS/RANGE clause is used to define a window frame for aggregate (non-ranking) window functions. A window frame defines a set of rows within a window partition. When a window frame is defined, the window function is computed with respect to the contents of this moving frame rather than the fixed contents of the entire window partition. Window frames can be row-based (ROWS) or value-based (RANGE).