Analyzing Query plans
Post date: Aug 25, 2014 8:47:33 PM
Query plans are a right to left tree plan of nodes that are read from bottom to top with each node passing its result to the node directly above. There is one line for each node in the plan tree that represents a single operation, for example a scan, join, aggregation or sort operation. The node will identify the method used to perform the operation. For example a scan operation may perform a sequential scan or index scan. A join operation may perform a hash join or nested loop join.
The query plan will also include motion nodes. The motion operations are responsible for moving rows between segments required to process the query. The node will identify the method used to perform the motion operation for example, a redistribution motion or broadcast motion. A gather motion is when segments send the resulting rows to the master host. The last operation for most query plans will be a gather motion.
Scan operators include Seq Scan on heap tables, Append-Only Scan on row oriented AO tables, Append-Only Columnar Scan on column oriented AO tables, Index scan, and Bitmap Append-Only Row-Oriented Scan.
Join operators include hash join, nested loop join and merge join. Hash joins are typically the fastest joins in PDB. A nested loop join requires each outer tuple to be compared with each inner tuple that might join to it and requires one of the tables to be broadcasted. The nested loop join requires the broadcast of one of the tables so that all rows in one table can be compared to all rows in the other table. Nested loop joins perform well for small tables or tables limited by index use. It is also used for Cartesian joins and range joins. There are performance implications when using a nested loop join with large tables. For plan nodes that contain a nested loop join operator validate the SQL and ensure that the results are what is intended.
Poorly written or incorrect SQL is often times the primary offender affecting query performance. Also, joining two keys with different data types can result in a nested loop join. A merge join sorts both datasets and merges it together. A merge join is fast for pre-ordered data and is very rare in the real world.
Motion operators include broadcast motion, redistribute motion and gather motion. In a broadcast motion every segment performs a broadcast (or sends) its own, individual rows to all other segments. This will result in every segment instance having its own complete and local copy of the entire table.
A broadcast motion may not be as optimal as a redistribute motion therefore the optimizer typically only selects a broadcast motion for small tables. A broadcast motion is not acceptable for large tables. In the case where data was not distributed on the join key, a dynamic redistribution of the needed rows from one of the tables to another segment will be performed.
A gather motion is the last operation for most query plans.
To analyze query plans first identify plan nodes where the estimated cost to perform the operation is very high. Determine if the estimated number of rows seems reasonable and the cost relative to the number of rows for the operation performed.
If using partitioning validate that partition elimination is achieved. To achieve partition elimination the query predicate (WHERE clause) must be the same as the partitioning criteria. It is also important to note in order to eliminate partitions the WHERE clause must contain an explicit value.
The WHERE clause cannot contain a subquery if partitions are to be eliminated.
Review the execution order of the query plan tree. Review the estimated number of rows. We want the execution order to build on the smaller tables or hash join result and probe with larger tables.
Optimally the largest table is used for the final join or probe to reduce the number of rows being passed up the tree to the topmost plan nodes. If the analysis reveals that the order of execution builds and/or probes is not optimal ensure that database statistics are up to date. Running
ANALYZE will likely address this and produce an optimal query plan.
Identify plan nodes where a sort or aggregate operation is performed. Hidden inside an aggregate operation is sort. If the sort or aggregate operation involves a large number of rows there is opportunity to improve query performance. A HashAggregate operation is preferred over sort and aggregate operations when a large number of rows are required to be sorted. Usually a sort operation is chosen by the optimizer due to the SQL construct, that is, due to the way the SQL is written. Most sort operations can be replaced with a HashAgg if the query is rewritten. To favor a HashAggregate operation over a sort and aggregate operation ensure the enable_groupagg GUC is set to ON.
Eliminate large table broadcast motions for redistribute motions. If the number of rows is large then consider using the gp_segments_for_planner GUC to increase the cost of the motion. This sets the number of primary segments for the planner to assume in its cost and size estimates. If gp_segments_for_planner is set to zero (0) the default, then the value used is the actual number of primary segments. This variable affects the query planner’s estimates of the number of rows handled by each sending and receiving process in motion operators. Increasing the number of primary segments will increase the cost of the motion thereby favoring a redistribute motion over a broadcast motion. For example, setting gp_segments_for_planner = 100000 tells the planner that there are 100000 segments. Conversely to influence the optimizer to broadcast a table and not redistribute it, set gp_segments_for_planner to a low number, for example two (2).