Collecting and monitoring table statistics in Greenplum
Collecting Table/Column level Statistics in Greenplum:
sachi=# vacuum verbose sachi3 ;
INFO: vacuuming "public.sachi3" (seg0 sachi:40000 pid=19169)
INFO: vacuuming "public.sachi3" (seg1 sachi:40001 pid=19171)
INFO: "sachi3": found 0 removable, 0 nonremovable row versions in 0 pages (seg0 sachi:40000 pid=19169)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "sachi3": found 0 removable, 0 nonremovable row versions in 0 pages (seg1 sachi:40001 pid=19171)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_52386" (seg0 sachi:40000 pid=19169)
INFO: vacuuming "pg_toast.pg_toast_52386" (seg1 sachi:40001 pid=19171)
INFO: index "pg_toast_52386_index" now contains 0 row versions in 1 pages (seg0 sachi:40000 pid=19169)
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_toast_52386_index" now contains 0 row versions in 1 pages (seg1 sachi:40001 pid=19171)
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_52386": found 0 removable, 0 nonremovable row versions in 0 pages (seg0 sachi:40000 pid=19169)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_52386": found 0 removable, 0 nonremovable row versions in 0 pages (seg1 sachi:40001 pid=19171)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
sachi=#
sachi=# vacuum verbose analyze sachi3 ;
INFO: vacuuming "public.sachi3" (seg0 sachi:40000 pid=19240)
INFO: vacuuming "public.sachi3" (seg1 sachi:40001 pid=19242)
INFO: "sachi3": found 0 removable, 0 nonremovable row versions in 0 pages (seg1 sachi:40001 pid=19242)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "sachi3": found 0 removable, 0 nonremovable row versions in 0 pages (seg0 sachi:40000 pid=19240)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_52386" (seg0 sachi:40000 pid=19240)
INFO: vacuuming "pg_toast.pg_toast_52386" (seg1 sachi:40001 pid=19242)
INFO: index "pg_toast_52386_index" now contains 0 row versions in 1 pages (seg0 sachi:40000 pid=19240)
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_52386": found 0 removable, 0 nonremovable row versions in 0 pages (seg0 sachi:40000 pid=19240)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_toast_52386_index" now contains 0 row versions in 1 pages (seg1 sachi:40001 pid=19242)
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_52386": found 0 removable, 0 nonremovable row versions in 0 pages (seg1 sachi:40001 pid=19242)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: Executing SQL: select sum(gp_statistics_estimate_reltuples_relpages_oid(c.oid))::float4[] from gp_dist_random('pg_class') c where c.oid=52386
INFO: ANALYZE estimated reltuples=0.000000, relpages=0.000000 for table sachi3
INFO: ANALYZE skipping computing statistics on table sachi3 because it has no rows.
VACUUM
sachi=#
sachi=# vacuum verbose foo ;
INFO: vacuuming "public.foo" (seg0 sachi:40000 pid=19750)
INFO: "foo": found 0 removable, 4000 nonremovable row versions in 5 pages (seg0 sachi:40000 pid=19750)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
1 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "public.foo" (seg1 sachi:40001 pid=19752)
INFO: "foo": found 0 removable, 4000 nonremovable row versions in 5 pages (seg1 sachi:40001 pid=19752)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
1 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
sachi=#
sachi=# explain select * from foo;
QUERY PLAN
--------------------------------------------------------------------------------
Gather Motion 2:1 (slice1; segments: 2) (cost=0.00..90.00 rows=4000 width=8)
-> Seq Scan on foo (cost=0.00..90.00 rows=4000 width=8)
(2 rows)
sachi=# explain analyze select * from foo;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Gather Motion 2:1 (slice1; segments: 2) (cost=0.00..90.00 rows=4000 width=8)
Rows out: 8000 rows at destination with 1.376 ms to first row, 4.920 ms to end, start offset by 9.103 ms.
-> Seq Scan on foo (cost=0.00..90.00 rows=4000 width=8)
Rows out: Avg 4000.0 rows x 2 workers. Max 4000 rows (seg0) with 0.110 ms to first row, 0.697 ms to end, start offset by 10 ms.
Slice statistics:
(slice0) Executor memory: 247K bytes.
(slice1) Executor memory: 181K bytes avg x 2 workers, 181K bytes max (seg0).
Statement statistics:
Memory used: 128000K bytes
Total runtime: 14.878 ms
(10 rows)
sachi=# select relname,relpages,reltuples from pg_class where relname='foo';
relname | relpages | reltuples
---------+----------+-----------
foo | 10 | 8000
(1 row)
sachi=# select count(*) from foo;
count
-------
8000
(1 row)
So the estimated cost here is
(disk pages read * seq_page_cost ) + (tuples scanned * cpu_tuple_cost )
Where , default
seq_page_cost = 1.0
cpu_tuple_cost = 0.01
so the estimated cost = ( 10 * 1.0 ) + ( 8000 * 0.01 ) = 10+80=90
So this helps you understand why statistics are really imported for the planner to make in correct decision.
sachi=# vacuum verbose analyze foo ;
INFO: vacuuming "public.foo" (seg0 sachi:40000 pid=19349)
INFO: vacuuming "public.foo" (seg1 sachi:40001 pid=19351)
INFO: "foo": found 0 removable, 4000 nonremovable row versions in 5 pages (seg0 sachi:40000 pid=19349)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
1 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "foo": found 0 removable, 4000 nonremovable row versions in 5 pages (seg1 sachi:40001 pid=19351)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
1 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: Executing SQL: select sum(gp_statistics_estimate_reltuples_relpages_oid(c.oid))::float4[] from gp_dist_random('pg_class') c where c.oid=33435
INFO: ANALYZE estimated reltuples=8000.000000, relpages=10.000000 for table foo
INFO: ANALYZE computing statistics on attribute x
INFO: Executing SQL: select count(*)::float4 from only public.foo as Ta where Ta.x is null
INFO: nullfrac = 0.00000000000000000000000000000000000000
INFO: avgwidth = 8.000000
INFO: Executing SQL: select count(*)::float4 from (select Ta.x from only public.foo as Ta group by Ta.x) as Tb
INFO: count(ndistinct()) gives 1.000000 values.
INFO: Executing SQL: select count(v)::float4 from (select Ta.x as v, count(Ta.x) as f from only public.foo as Ta group by Ta.x) as foo where f > 1
INFO: ndistinct = 1.000000
INFO: Executing SQL: select Ta.x as v, count(Ta.x)::float4/8000.000000::float4 as f from only public.foo as Ta where Ta.x is not null group by (Ta.x) order by f desc limit 25
INFO: mcv={"2013-08-22 00:00:00"}
INFO: freq={1}
INFO: Executing SQL: select v from (select Ta.x as v, row_number() over (order by Ta.x) as r from only public.foo as Ta where x is not null union select max(Tb.x) as v, 1 as r from only public.foo as Tb where x is not null) as foo where r % 320 = 1 group by v order by v
INFO: hist={"2013-08-22 00:00:00"}
VACUUM
sachi=#
Finding missing statistics of a table
To monitor the missing stats of a table, we use gp_toolkit.gp_stats_missing view. Lets explain it in more details. To check the table that do not have statistics and therefore may require an ANALYZE be run on the table. It looks simple, but actually that is not the case. To explain have a look at the below example.
sachi=# \d gp_toolkit.gp_stats_missing
View "gp_toolkit.gp_stats_missing"
Column | Type | Modifiers
-----------+---------+-----------
smischema | name |
smitable | name |
smisize | boolean |
smicols | bigint |
smirecs | bigint |
View definition:
SELECT aut.autnspname AS smischema, aut.autrelname AS smitable,
CASE
WHEN aut.autrelpages = 0 OR aut.autreltuples = 0::double precision THEN false
ELSE true
END AS smisize, attrs.attcnt AS smicols, COALESCE(bar.stacnt, 0::bigint) AS smirecs
FROM gp_toolkit.__gp_user_tables aut
JOIN ( SELECT pg_attribute.attrelid, count(*) AS attcnt
FROM pg_attribute
WHERE pg_attribute.attnum > 0
GROUP BY pg_attribute.attrelid) attrs ON aut.autoid = attrs.attrelid
LEFT JOIN ( SELECT pg_statistic.starelid, count(*) AS stacnt
FROM pg_statistic
GROUP BY pg_statistic.starelid) bar ON aut.autoid = bar.starelid
WHERE (aut.autrelkind = 'r'::"char" AND (aut.autrelpages = 0 OR aut.autreltuples = 0::double precision)) OR (bar.stacnt IS NOT NULL AND attrs.attcnt > bar.stacnt);
sachi=#
sachi=# create table sachi (id numeric, name varchar(32));
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
sachi=# select * from pg_stat_last_operation where objid = 'sachi' ::regclass order by statime;
classid | objid | staactionname | stasysid | stausename | stasubtype | statime
---------+-------+---------------+----------+------------+------------+-------------------------------
1259 | 52411 | CREATE | 10 | gpadmin | TABLE | 2014-09-25 12:29:20.730637-04
(1 row)
sachi=# select * from gp_toolkit.gp_stats_missing ;
smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
public | usha | f | 1 | 0
public | usha1 | f | 1 | 0
public | sachi3 | f | 1 | 0
public | sachi | f | 2 | 0
(4 rows)
as expected gp_stats_missing view tells that table sachi doesn't have stats.
sachi=# vacuum verbose sachi;
INFO: vacuuming "public.sachi" (seg0 sachi:40000 pid=20978)
INFO: vacuuming "public.sachi" (seg1 sachi:40001 pid=20980)
INFO: "sachi": found 0 removable, 0 nonremovable row versions in 0 pages (seg1 sachi:40001 pid=20980)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "sachi": found 0 removable, 0 nonremovable row versions in 0 pages (seg0 sachi:40000 pid=20978)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_52411" (seg1 sachi:40001 pid=20980)
INFO: vacuuming "pg_toast.pg_toast_52411" (seg0 sachi:40000 pid=20978)
INFO: index "pg_toast_52411_index" now contains 0 row versions in 1 pages (seg1 sachi:40001 pid=20980)
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_52411": found 0 removable, 0 nonremovable row versions in 0 pages (seg1 sachi:40001 pid=20980)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_toast_52411_index" now contains 0 row versions in 1 pages (seg0 sachi:40000 pid=20978)
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_52411": found 0 removable, 0 nonremovable row versions in 0 pages (seg0 sachi:40000 pid=20978)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
sachi=#
sachi=# select * from gp_toolkit.gp_stats_missing ;
smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
public | usha | f | 1 | 0
public | usha1 | f | 1 | 0
public | sachi3 | f | 1 | 0
public | sachi | f | 2 | 0
(4 rows)
sachi=#
sachi=# vacuum verbose analyze sachi;
INFO: vacuuming "public.sachi" (seg0 sachi:40000 pid=21017)
INFO: "sachi": found 0 removable, 0 nonremovable row versions in 0 pages (seg0 sachi:40000 pid=21017)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_52411" (seg0 sachi:40000 pid=21017)
INFO: index "pg_toast_52411_index" now contains 0 row versions in 1 pages (seg0 sachi:40000 pid=21017)
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_52411": found 0 removable, 0 nonremovable row versions in 0 pages (seg0 sachi:40000 pid=21017)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "public.sachi" (seg1 sachi:40001 pid=21019)
INFO: "sachi": found 0 removable, 0 nonremovable row versions in 0 pages (seg1 sachi:40001 pid=21019)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_52411" (seg1 sachi:40001 pid=21019)
INFO: index "pg_toast_52411_index" now contains 0 row versions in 1 pages (seg1 sachi:40001 pid=21019)
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_52411": found 0 removable, 0 nonremovable row versions in 0 pages (seg1 sachi:40001 pid=21019)
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: Executing SQL: select sum(gp_statistics_estimate_reltuples_relpages_oid(c.oid))::float4[] from gp_dist_random('pg_class') c where c.oid=52411
INFO: ANALYZE estimated reltuples=0.000000, relpages=0.000000 for table sachi
INFO: ANALYZE skipping computing statistics on table sachi because it has no rows.
VACUUM
sachi=#
sachi=# select * from gp_toolkit.gp_stats_missing ;
smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
public | usha | f | 1 | 0
public | usha1 | f | 1 | 0
public | sachi3 | f | 1 | 0
public | sachi | f | 2 | 0
(4 rows)
Note: the vacuum and then analyze did not clear up the entry. The reason for the above case is in the p_toolkit.gp_stats_missing view definition. If you check the definition of the view , it tells you its basically checking if the table ( that is created above ) has zero rows for either relpages / reltuples in the pg_class table , that is the reason you will find the smisize column under the gp_stats_missing view with value false "f" stating either the relpages or reltuples has zero rows.
sachi=# select oid,relname,reltuples,relpages from pg_class where relname='sachi';
oid | relname | reltuples | relpages
-------+---------+-----------+----------
52411 | sachi | 0 | 1
(1 row)
============================================================================================
============================================================================================
sachi=# show gp_autostats_mode;
gp_autostats_mode
-------------------
ON_NO_STATS
(1 row)
sachi=# show gp_autostats_on_change_threshold;
gp_autostats_on_change_threshold
----------------------------------
2147483647
(1 row)
Automatic Statistics Collection Parameters
When automatic statistics collection is enabled, you can run ANALYZE automatically in the same transaction as an INSERT, UPDATE, DELETE, COPY or CREATE TABLE...AS SELECT statement when a certain threshold of rows is affected (on_change), or when a newly generated table has no statistics (on_no_stats). To enable this feature, set the following server configuration parameters in your Greenplum master postgresql.conf file and restart Greenplum Database:
• gp_autostats_mode
• log_autostats
Warning: Depending on the specific nature of your database operations, automatic statistics collection can have a negative performance impact. Carefully evaluate whether the default setting of on_no_stats is appropriate for your system.
gp_autostats_mode
This specifies when an automatic ANALYZE should take place. The values for this configuration parameter can be:
none
on_change
on_no_stats
on_no_stats
The default is on_no_stats and the database will automatically perform an ANALYZE for you. If you INSERT data into an empty table (new table or a table you just did a TRUNCATE on), the database will automatically gather statistics with an ANALYZE.
Typically in an Analytics Data Warehouse, you will insert data once and then read it many times. So the default of on_no_stats will automatically give you statistics as needed with no additional work from you.
none
Self explanatory. You have to execute ANALYZE if you want statistics.
on_change
With this setting, if perform an INSERT, UPDATE, or DELETE that exceeds the gp_autostats_on_change_threshold value, then an automatic ANALYZE will happen. Note for HAWQ, you can only INSERT.
In Greenplum and in the rare case where you are doing a DELETE or UPDATE to a table, you will need to execute an ANALYZE statement. Or, you can set the database to use on_change and the ANALYZE will happen automatically.
gp_autostats_on_change_threshold
This is only relevant for on_change and it is the number of rows that need to change before an automatic ANALYZE will happen. The default is over 2 billion rows so if you really want to use on_change, then you will also need to reduce this configuration parameter to something more realistic.
Examples
Example 1 – The default
gp_autostats_mode = on_no_stats
gp_autostats_on_change_threshold = 2147483647
============================================================================================
============================================================================================
sachi=# insert into sachi values(1, 'abc');
INSERT 0 1
sachi=# select oid,relname,reltuples,relpages from pg_class where relname='sachi';
oid | relname | reltuples | relpages
-------+---------+-----------+----------
52411 | sachi | 0 | 1
(1 row)
sachi=# sachi=# select * from gp_toolkit.gp_stats_missing;
smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
public | usha | f | 1 | 0
public | usha1 | f | 1 | 0
public | sachi3 | f | 1 | 0
public | sachi | f | 2 | 0
(4 rows)
Note: it did not remove the data from gp_toolkit.gp_stats_missing. The reason for it lies in the definition of the gp_stats_missing view "aut.autrelpages = 0 OR aut.autreltuples = 0::double precision " which states both the relpages and reltuples should have a entry .
sachi=# select count(*) from sachi;
count
-------
1
(1 row)
sachi=# select oid,relname,reltuples,relpages from pg_class where relname='sachi';
oid | relname | reltuples | relpages
-------+---------+-----------+----------
52411 | sachi | 0 | 1
(1 row)
So if we now run a analyze after the data insert
sachi=# analyze sachi;
ANALYZE
sachi=#
sachi=# select oid,relname,reltuples,relpages from pg_class where relname='sachi';
oid | relname | reltuples | relpages
-------+---------+-----------+----------
52411 | sachi | 1 | 1
(1 row)
sachi=# select * from gp_toolkit.gp_stats_missing;
smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
public | usha | f | 1 | 0
public | usha1 | f | 1 | 0
public | sachi3 | f | 1 | 0
(3 rows)
Note: Now it did remove the data from gp_toolkit.gp_stats_missing
Now lets do truncate.
sachi=# truncate table sachi;
TRUNCATE TABLE
sachi=# select * from gp_toolkit.gp_stats_missing;
smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
public | usha | f | 1 | 0
public | usha1 | f | 1 | 0
public | sachi3 | f | 1 | 0
public | sachi | f | 2 | 2
(4 rows)
sachi=#
Note: Now it came back.
sachi=# insert into sachi values(1, 'abc');
INSERT 0 1
sachi=# select * from gp_toolkit.gp_stats_missing;
smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
public | usha | f | 1 | 0
public | usha1 | f | 1 | 0
public | sachi3 | f | 1 | 0
(3 rows)
sachi=#
Note: Now it did remove the data from gp_toolkit.gp_stats_missing
sachi=# alter table sachi add column address varchar(128);
ALTER TABLE
sachi=#
sachi=# select * from gp_toolkit.gp_stats_missing;
smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
public | usha | f | 1 | 0
public | usha1 | f | 1 | 0
public | sachi3 | f | 1 | 0
public | sachi | t | 3 | 2
(4 rows)
sachi=#
Note: it came back again.
sachi=# analyze sachi;
ANALYZE
sachi=# select * from gp_toolkit.gp_stats_missing;
smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
public | usha | f | 1 | 0
public | usha1 | f | 1 | 0
public | sachi3 | f | 1 | 0
(3 rows)
Note: Now it is gone.
sachi=#
sachi=# delete from sachi;
DELETE 1
sachi=# select count(*) from sachi;
count
-------
0
(1 row)
sachi=# select * from gp_toolkit.gp_stats_missing;
smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
public | usha | f | 1 | 0
public | usha1 | f | 1 | 0
public | sachi3 | f | 1 | 0
(3 rows)
So
gp_stats_missing tells you which table needs statistics/analyze if table with data have not been analyzed anytime in its lifecycle / or when the table has altered to add / drop column.
gp_stats_missing doesn't remove the entry from its view when the table has no rows / or when tuples / pages has been reduced to zero after deleting of data .
gp_stats_missing doesn't tell you if the table needs statistics after updates / delete / bulk loading of data , since it only checks for relpages / reltuples = 0.
Checking the Statistics
Queries you can use to see the statics information.
SELECT c.oid, c.relpages, c.reltuples FROM pg_class c JOIN pg_namespace n ON c.relnamespace = n.oid WHERE n.nspname = 'public' AND c.relname = 'my_table';
SELECT * FROM pg_statistic WHERE starelid = 298610;