Understanding and Defining Database Performance in Greenplum

Greenplum Database is a data warehouse database system. Before defining its performance, it helps to understand the key factors behind it, because knowing them can help you avoid performance problems and identify opportunities for improvement.

Several factors influence Greenplum Database performance:

1. System Resources

Database performance relies heavily on disk I/O and memory usage. Knowing the baseline performance of the hardware on which your DBMS is deployed is essential to setting performance expectations. The performance of hardware components such as CPUs, hard disks, disk controllers, RAM, and network interfaces, and the interaction of these resources, has a profound effect on how fast your database performs.

2. Workload

Your workload is the total demand placed on the DBMS. It is a combination of ad-hoc user queries, applications, batch jobs, transactions, and system commands directed through the DBMS at any given time. Workload, or demand, can change over time. For example, it may increase when month-end reports need to be run, or decrease on weekends when most users are out of the office. Workload is a major influence on database performance. Knowing your workload and peak demand times helps you plan for the most efficient use of your system resources and enables the largest possible workload to be processed.
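As a rough illustration of finding peak demand times, the sketch below counts queries per hour of day. It assumes you already have query start times from some log; the function name peak_hours and the sample timestamps are hypothetical and are not part of Greenplum.

```python
from collections import Counter
from datetime import datetime


def peak_hours(query_start_times, top_n=3):
    """Return the top_n busiest hours of day as (hour, query_count) pairs."""
    counts = Counter(ts.hour for ts in query_start_times)
    # most_common() sorts by count, highest first.
    return counts.most_common(top_n)


# Hypothetical sample: start times of queries taken from a query log.
sample_starts = [
    datetime(2024, 1, 31, 9, 5),
    datetime(2024, 1, 31, 9, 40),
    datetime(2024, 1, 31, 9, 55),
    datetime(2024, 1, 31, 14, 10),
    datetime(2024, 1, 31, 14, 30),
    datetime(2024, 1, 31, 23, 0),
]

print(peak_hours(sample_starts, top_n=2))  # [(9, 3), (14, 2)]
```

Grouping by hour of day is only one choice; grouping by weekday or by day of month would surface weekly or month-end peaks in the same way.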

3. Throughput

A system’s throughput defines its overall capability to process data. DBMS throughput can be measured in queries per second, transactions per second, or average response times. DBMS throughput is closely related to the processing capacity of the underlying systems (disk I/O, CPU speed, memory bandwidth, and so on), so it is important to know the throughput capacity of your hardware when setting DBMS throughput goals.
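To make these measures concrete, here is a minimal sketch that derives queries per second and average response time from a list of query start and end times. The function name throughput_stats and the sample intervals are assumptions made for illustration; how you collect the timings depends on your own monitoring setup.

```python
def throughput_stats(intervals):
    """Compute (queries_per_second, average_response_seconds).

    intervals: list of (start_seconds, end_seconds), one per completed query.
    """
    if not intervals:
        raise ValueError("need at least one completed query")
    window_start = min(start for start, _ in intervals)
    window_end = max(end for _, end in intervals)
    elapsed = window_end - window_start
    if elapsed <= 0:
        raise ValueError("measurement window must be longer than zero seconds")
    # Throughput: completed queries divided by the wall-clock window.
    qps = len(intervals) / elapsed
    # Response time: mean duration of the individual queries.
    avg_response = sum(end - start for start, end in intervals) / len(intervals)
    return qps, avg_response


# Hypothetical timings in seconds since the start of the measurement.
sample_intervals = [(0.0, 2.0), (1.0, 4.0), (3.0, 5.0), (6.0, 10.0)]
qps, avg_response = throughput_stats(sample_intervals)
print(qps, avg_response)  # 0.4 2.75
```

The two numbers answer different questions: queries per second describes the system as a whole, while average response time describes what an individual query experiences.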

4. Contention

Contention is the condition in which two or more components of the workload attempt to use the system in a conflicting way. Examples include trying to update the same piece of data at the same time, or running multiple large workloads at once that compete with each other for system resources. As contention increases, throughput decreases.

5. Optimization

Optimizations that you make to your DBMS can affect the overall performance of your system. SQL formulation, database configuration parameters, table design, data distribution, and so on can enable the database query planner and optimizer to create the most efficient access plans.
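Data distribution is one of the items above that lends itself to a quick numeric check. The sketch below is an illustration only: it assumes you have already obtained the number of rows stored on each segment for a table, and it uses the coefficient of variation (standard deviation divided by mean) as a simple skew measure. The function name distribution_skew and the sample counts are hypothetical.

```python
from statistics import mean, pstdev


def distribution_skew(rows_per_segment):
    """Coefficient of variation of per-segment row counts.

    0.0 means rows are spread perfectly evenly; larger values mean
    some segments hold far more rows than others.
    """
    average = mean(rows_per_segment)
    if average == 0:
        return 0.0  # an empty table has no skew to report
    return pstdev(rows_per_segment) / average


# Hypothetical row counts for a table spread over four segments.
even_counts = [1000, 1000, 1000, 1000]
skewed_counts = [3700, 100, 100, 100]

print(distribution_skew(even_counts))              # 0.0
print(round(distribution_skew(skewed_counts), 2))  # 1.56
```

In a system that processes data in parallel, a query generally finishes only when the most heavily loaded segment finishes, so a high skew value suggests that the choice of distribution key is worth revisiting.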