Data Types to use in a Distribution Key
Post date: Oct 31, 2014 11:43:4 AM
The type smallint, integer, and bigint store whole numbers, that is, numbers without fractional components, of various ranges. Attempts to store values outside of the allowed range will result in an error.
The type integer is the common choice, as it offers the best balance between range, storage size, and performance. The smallint type is generally only used if disk space is at a premium. The bigint type should only be used if the integer range is insufficient, because the latter is definitely faster.
On very minimal operating system the bigint type might not function correctly, because it relies on compiler support for eight-byte integers. On such machines, bigint acts the same as integer, but still takes up eight bytes of storage. (We are not aware of any modern platform where this is the case.)
SQL only specifies the integer types integer (or int), smallint, and bigint. The type names int2, int4, and int8 are extensions, which are also used by some other SQL based DBMS.
CHAR is allowed but must contain a length that is reasonable with high cardinality to mimic the same distribution effect that of DATE.
VARCHAR is allowed but must contain a length that is reasonable with high cardinality to mimic the same distribution effect that of DATE.
NUMERIC has no limit but it can store numbers with up to 1000 digits of precision.
BYTEA 1 or 4 bytes plus the actual binary string. It is variable-length.
TIMESTAMP behaves just as DATE does
TIME behaves just as DATE and TIMESTAMP does
DATE fields are also favorable for these act as if the data is basically distributed simulating that of a partition. Do not use it alone in the distribution.