Debugging SQL Query Performance Issues

When application user complaints that SQL query performance is degrading over time, here are the useful steps to debug the issue in detail.

1. Check the stats/last analyzed of the tables: This is to ensure that Query plan is using up-to-date stats. If Stats are stale, run analyze on table with stale statistics.  

select objname,actionname,statime from pg_stat_operations a

where actionname='ANALYZE'

and upper(objname) in ('list of table name in the SQL query')

and statime=(select max(statime) from pg_stat_operations b where a.objname=b.objname and actionname='ANALYZE') 

order by objname

2. Check  gp_bloat_diag view for bloat 

select * from gp_toolkit.gp_bloat_diag where bdinspname='schema_name' and upper(bdirelname) in ('list of table name in the SQL query')

3.  Check gp_stats_missing view for missing stats 

select * from gp_toolkit.gp_stats_missing where smischema='schema_name' and upper(smitable) in ('list of table name in the SQL query');

4. Check distribution keys of the tables

select table_owner,table_name,coalesce(distribution_keys, 'RANDOMLY')

from

(

SELECT pgn.nspname as table_owner,pgc.relname as table_name,pga.attname as distribution_keys

FROM (SELECT gdp.localoid,

CASE

WHEN ( Array_upper(gdp.attrnums, 1) > 0 ) THEN

Unnest(gdp.attrnums)

ELSE NULL

END AS attnum

FROM gp_distribution_policy gdp

ORDER BY gdp.localoid) AS distrokey

INNER JOIN pg_class AS pgc

ON distrokey.localoid = pgc.oid

INNER JOIN pg_namespace pgn

ON pgc.relnamespace = pgn.oid

LEFT OUTER JOIN pg_attribute pga

ON distrokey.attnum = pga.attnum

AND distrokey.localoid = pga.attrelid

ORDER BY pgn.nspname,

pgc.relname

) a

where table_owner='schema_name'

and upper(table_name) in ('list of table name in the SQL query');

5. Check distributions/Skew of the table

SELECT COUNT(*), gp_segment_id FROM <table_name> GROUP BY gp_segment_id;

SELECT 'table_name' as "Table Name",max(c) as "Max Seg Rows",

min(c) as "Min Seg Rows",

(max(c)-min(c))*100.0/max(c) as "Percentage Difference Between Max & Min"

from (SELECT count(*) c, gp_segment_id FROM <table_name> group by 2) as a;

select skcoid,skcnamespace,skcrelname,skccoeff

from gp_toolkit.gp_skew_coefficients

where skcnamespace='schema_name'

and upper(skcrelname) in <table_name>;

you can also check gp_skew_idle_fractions view.

6. Run Explain/explain analyze plan on the SQL.