GCP – What’s new in PostgreSQL 16: New features available in Cloud SQL today
In an effort to improve usability and facilitate informed decision-making, Cloud SQL customers can now use PostgreSQL 16, which introduces new features for deeper insights into database operations and enhanced usability.
In this blog post we cover some of the highlights of the PostgreSQL 16 version, including:
Improvements in observability
Performance improvements
Vacuum efficiency
Replication improvements
Let’s take a deeper look at each of these areas.
Observability improvements
Observability is an important aspect of databases, helping operators optimize resource consumption by providing insights into how resources are being utilized. Here are some important observability enhancements introduced in PostgreSQL 16.
PG_STAT_IO
PostgreSQL16 adds a new view pg_stat_io that provides insights into the Input/Output (IO) behavior of a PostgreSQL database. We can use this view to make informed decisions to optimize database performance, improve resource utilization and ensure the overall health and scalability of the database system. This view presents the stats for the entire instance.
What can we infer from this view?
Like most other pg_stat_* views, the statistics in the view are cumulative. To track changes in the pg_stat_io view over a specific time period, record the values at the beginning and end of the workload.
This view tracks the stats mainly by the columns in backend_type, io_context and io_object.
The backend_type is a connection process and can be one of client backend, background worker, checkpointer, standalone backend, autovacuum launcher, autovacuum worker. The io_context is classified based on the load as normal, bulk read, bulk write, or vacuum.
The actual stats to be considered for knowing the I/O status of the instance are reads, writes, extends, hits, evictions, and reuses.
We can monitor the shared buffers efficiency by comparing the evictions-to-hits ratio. The buffer hit ratio is considered effective when hits for each context are much higher than evictions.
The bulk reads and bulk writes indicate sequential scans. The evictions, hits and reuses for these indicate the efficiency of ring buffers in this case.
We can also observe the amount of data read or written as part of the autovacuum or vacuum process. The metric data related to autovacuum are observed by io_context =’ vacuum’ and backend_type as ‘autovacuum worker’. A vacuum process goes by backend_type as ‘standalone backend’ with io_context as ‘vacuum’.
Here’s an image of the view:
Last sequential and index scans on tables and indexes
The views pg_stat_*_tables have two new columns
last_seq_scan
last_idx_scan
Want to know when the last time sequential scan or index scan happened on your tables? Check the newly introduced columns last_seq_scan and last_idx_scan in pg_stat_*_tables.
The timestamp of the last sequential or index scan on a table is indicated in these columns. This can be helpful for identifying any “read query” issues.
Similarly, the column last_idx_scan has been introduced to pg_stat_*_indexes. This column indicates the timestamp last time the index was used. If we were to drop an index, we can make an informed decision based on the value present in this column for the index.
Statistics on the occurrence of tuples moving to a new page for updates
The views pg_stat_*_tables now has a new column, n_tup_newpage_upd.
As we perform updates on a table and want to monitor how many of the rows end up in new heap pages, we can now view this in the column n_tup_newpage_upd.
This can reveal the factors contributing to the table’s growth over time. The value in this column also can be used to validate the ‘fillfactor’ set for the table. Especially for updates which are expected to be ‘HOT’, by observing the stats in this column we can establish if the ‘fillfactor’ is optimal or not.
Performance improvements
Performance is always a top priority for databases. Performance improvements are adopted much faster than other enhancements in a major version release. Here are some of the performance improvements in PostgreSQL 16.
Tables with only BRIN index on a table column are considered ‘HOT’
With PostgreSQL16, updates to a table with BRIN index are now considered as HOT considering the fillfactor for the table is optimal.’ ‘Fillfactor’ is an important setting for this update to be marked ‘HOT’. This improvement makes vacuuming such a table fast and resource-efficient.
Parallelization of FULL or OUTER joins
This performance improvement is very beneficial for selects involving very large tables joined by full or outer joins. In PostgreSQL16, this will result in a parallel hash after a parallel seq scan for each table, instead of a merge or hash after a full heap fetch. In our tests, it has shown quite a large improvement compared to PG15.
Example for full outer join
<ListValue: [StructValue([(‘code’, ‘postgres=> explain (analyze, buffers, verbose) select count(*) from object_store s full outer join object_store2 g on (s.project_id=g.project_id);rn QUERY PLAN rn————————————————————————————————————————————————————————-rn Finalize Aggregate (cost=145953.66..145953.67 rows=1 width=8) (actual time=6095.420..6236.950 rows=1 loops=1)rn Output: count(*)rn Buffers: shared hit=74980, temp read=34714 written=35020rn -> Gather (cost=145953.44..145953.65 rows=2 width=8) (actual time=6083.804..6236.922 rows=3 loops=1)rn Output: (PARTIAL count(*))rn Workers Planned: 2rn Workers Launched: 2rn Buffers: shared hit=74980, temp read=34714 written=35020rn -> Partial Aggregate (cost=144953.44..144953.45 rows=1 width=8) (actual time=6068.822..6069.193 rows=1 loops=3)rn Output: PARTIAL count(*)rn Buffers: shared hit=74980, temp read=34714 written=35020rn Worker 0: actual time=6053.795..6053.802 rows=1 loops=1rn Buffers: shared hit=23966, temp read=12066 written=11200rn Worker 1: actual time=6069.306..6069.313 rows=1 loops=1rn Buffers: shared hit=26385, temp read=10995 written=12292rn -> Parallel Hash Full Join (cost=83021.80..140786.80 rows=1666658 width=0) (actual time=3824.778..5852.278 rows=1333333 loops=3)rn Hash Cond: (g.project_id = s.project_id)rn Buffers: shared hit=74980, temp read=34714 written=35020rn Worker 0: actual time=3857.567..5832.558 rows=1361655 loops=1rn Buffers: shared hit=23966, temp read=12066 written=11200rn Worker 1: actual time=3851.661..5870.054 rows=1244012 loops=1rn Buffers: shared hit=26385, temp read=10995 written=12292rn -> Parallel Seq Scan on public.object_store2 g (cost=0.00..41652.00 rows=421200 width=16) (actual time=0.029..936.699 rows=1333333 loops=3)rn Output: g.project_idrn Buffers: shared hit=37440rn Worker 0: actual time=0.026..977.947 rows=1347470 loops=1rn Buffers: shared hit=12650rn Worker 1: actual time=0.043..1017.822 rows=1298124 loops=1rn Buffers: shared hit=12132rn -> Parallel Hash (cost=54050.57..54050.57 rows=1666658 width=16) (actual time=1456.617..1456.619 rows=1333333 loops=3)rn Output: s.project_idrn Buckets: 262144 Batches: 32 Memory Usage: 7968kBrn Buffers: shared hit=37384, temp written=17236rn Worker 0: actual time=1451.741..1451.743 rows=1202466 loops=1rn Buffers: shared hit=11238, temp written=5200rn Worker 1: actual time=1450.062..1450.064 rows=1516637 loops=1rn Buffers: shared hit=14175, temp written=6524rn -> Parallel Seq Scan on public.object_store s (cost=0.00..54050.57 rows=1666658 width=16) (actual time=0.023..530.669 rows=1333333 loops=3)rn Output: s.project_idrn Buffers: shared hit=37384rn Worker 0: actual time=0.018..506.262 rows=1202466 loops=1rn Buffers: shared hit=11238rn Worker 1: actual time=0.025..578.219 rows=1516637 loops=1rn Buffers: shared hit=14175rn Query Identifier: -5913048123863832940rn Planning Time: 0.211 msrn Execution Time: 6237.051 msrn(47 rows)’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e3b83dd3190>)])]>
Explain (generic_plan)
Prior to PostgreSQL 16, for parameterized SQLs the value of the parameter has to be passed to to obtain an execution plan. In PostgreSQL 16, with the option (generic_plan) we do not need to provide any additional values to the SQL to get the execution plan.
Example
<ListValue: [StructValue([(‘code’, “db=> CREATE TABLE measurement (rn city_id int not null,rn logdate date not null,rn peaktemp int,rn unitsales intrn) PARTITION BY RANGE (logdate);rnrnCREATE TABLE measurement_y2006m02 PARTITION OF measurementrn FOR VALUES FROM (‘2006-02-01’) TO (‘2006-03-01’);rnrnCREATE TABLE measurement_y2006m03 PARTITION OF measurementrn FOR VALUES FROM (‘2006-03-01’) TO (‘2006-04-01’);rnrnPrepare statementrnrndb=> PREPARE partitioned_selfjoin (int) ASrnSELECT *rn FROM measurement arn JOIN measurement brn ON a.peaktemp = b.peaktemprn WHERE a.city_id = $1;rnPREPARErnrnGet execution planrnrnPre PostgreSQL 16: Pass a value for the parameter $1 = 10rnrndb=> EXPLAIN EXECUTE partitioned_selfjoin(10);rnrnFor PG – 16rnrndb=> explain (generic_plan) SELECT *rn FROM measurement arn JOIN measurement brn ON a.peaktemp = b.peaktemprn WHERE a.city_id = $1;”), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e3b83dd3be0>)])]>
Vacuum improvements
Vacuum is a significant part of PostgreSQLMVCC. Vacuum releases space after deleting the dead tuples, minimizing table bloat. This prevents the database from ending up in transaction wrap-around problems. Here are some ways vacuum processes improved in PostgreSQL16.
Improved VACUUM operation performance for large tables
BUFFER_USAGE_LIMIT
PostgreSQL 16 introduces a new server variable ‘vacuum_buffer_usage_limit’ to set the ring buffers allocated for VACUUM and ANALYZE operations with a default value of 256K. Setting the ‘BUFFER_USAGE_LIMIT’ option during a VACUUM operation overrides the default value of ‘vacuum_buffer_usage_limit’ and allocates the specified ring buffer size. A larger ‘buffer_usage_limit’ can speed up vacuum operations but may displace buffers used by the main workload from ‘shared_buffers’, which may result in performance degradation. It is often advisable to limit the usage of ring buffers for VACUUM operations using ‘buffer_usage_limit’ when vacuuming very large tables. This option can be used judiciously when approaching Txid wraparound, at which point completing the VACUUM is critical. When ANALYZE is also part of the VACUUM operation, both operations together use the ring buffer size specified in ‘buffer_usage_limit’. A setting of 0 for ‘buffer_usage_limit’ results in disabling the buffer access strategy, which can result in evicting huge numbers of shared buffers, causing performance degradation. The limits for ‘buffer_usage_limit’ are between 128K and 16 GB.
VACUUM to only process TOAST tables
Now in PostgreSQL 16 we can vacuum only TOAST tables related to a relation. Historically, the option ‘process_toast’ was introduced to turn off vacuuming the TOAST table when set to FALSE. Otherwise, vacuum ran on both the main and TOAST table of a relation. In PostgreSQL 16, based on the requirement, we can either vacuum both the main and TOAST table or just do one of them that belongs to a relation. This allows better control to vacuum either main, TOAST, or both, depending on your need.
Here’s an example of how it can be applied:
<ListValue: [StructValue([(‘code’, ‘Vacuum only toast table for a relationrnrnpostgres=> vacuum (PROCESS_TOAST TRUE, PROCESS_MAIN FALSE) prodattribbig;rnrnVacuumdb only toast table for a relationrnrn$ vacuumdb -h <ipaddress> -U postgres -d testdb -t prodattribbig –no-process-main’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e3b84d71340>)])]>
vacuumdb option to process schema
Vacuumdb now has an option to vacuum or analyze all the tables belonging to a schema in the database. This is a very useful feature when we are targeting tables of only one schema.
<ListValue: [StructValue([(‘code’, ‘$ vacuumdb -h <host/ipaddress> -v -U postgres -d testdb -n testschema’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e3b84d71040>)])]>
Replication improvements
Replication is an important part of the database high availability feature. In PostgreSQL 16, the community has added several usability features to replication.
Initial table synchronization in logical replication to copy rows in binary format
In PostgreSQL 16, we can initialize the copy of the rows for logical replication in binary format. This can be much faster, especially with columns that have binary data. Here is an example on how to create a subscription where in the initial data copy is in binary format:
<ListValue: [StructValue([(‘code’, “testdb=> create subscription testtab connection ‘host=10.101.0.20 port=5432 dbname=testdb user=replication_user password=<<pwd>>’ PUBLICATION testtab WITH (copy_data=on, binary=true);”), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e3b84d71190>)])]>
Improved logical replication apply without a primary key
Traditionally, PostgreSQL logical replication relied on full table scans for tables that lacked primary keys, impacting performance. However, with PostgreSQL 16, any available B-tree index on the table is now leveraged, significantly enhancing logical apply efficiency. Index usage statistics are available in the pg_stat_*_indexes view.
Logical decoding on standby
In PostgreSQL 16, logical decoding is enabled on the read replica, allowing subscribers to connect to the read replicas instead of the primary db instance. By doing so, the workload is shared between the primary instance and the replica, reducing strain on the former. This offloads the logical replication workload off of the primary instance onto the replica. This represents a huge performance improvement for the primary node, especially with nodes having many logical replication slots. Another advantage is, in case of a promotion of the replica, subscribers are not affected by the change and continue to operate without any hindrance. Be aware that any delay on the read replica will subsequently affect the logical subscriber, unlike before.
Try PostgreSQL 16 today
It’s time to try out PostgreSQL16 on Cloud SQL with improved observability, improved logical replication, vacuuming and much more. Start your PostgreSQL16 journey on Cloud SQL from here.
Read More for the details.