Presented by:


Sai Srirampur

ClickHouse Inc

I was the CEO and co-founder of PeerDB, where we focused on building the world's fastest replication tool for Postgres. Recently, ClickHouse acquired PeerDB to offer a native Postgres CDC integration within ClickHouse. I now work on the product team at ClickHouse, leading all things Postgres, including Postgres CDC integration.

Before PeerDB, I was an early employee at Citus Data and saw it through the Microsoft acquisition. For the past decade, I have been an active member of the Postgres community, helping customers implement Postgres, Citus and PeerDB.

No video of the event yet, sorry!

pg_dump and pg_restore are reliable tools for backing up and restoring Postgres databases. They are widely used for real-world use cases including database migrations, disaster recovery, and more. They offer fine-grained control, allowing selection of specific database objects for backup and restore, a choice of dump formats (plain SQL, or the compressed custom and directory formats), and parallel processing of multiple tables. Furthermore, they guarantee dumping and restoring a consistent snapshot of the database.
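For example, a directory-format dump can back up selected tables with several parallel workers and be restored the same way. A minimal sketch; the table names, paths, and connection strings are illustrative:

    # Directory-format dump (-Fd) with 8 parallel workers (-j); pg_dump
    # parallelizes across tables while keeping a consistent snapshot.
    # Only the two listed tables are dumped (-t).
    pg_dump -Fd -j 8 -t public.events -t public.users \
        -f /backups/mydb "postgres://user@source-host/mydb"

    # Restore the same dump into the target, again with 8 workers.
    pg_restore -j 8 -d "postgres://user@target-host/mydb" /backups/mydb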

However, pg_dump and pg_restore are single-threaded at the level of an individual table: parallelism applies only across tables. This significantly slows down dumping and restoring databases with a star schema (common in real-world apps), where a few large tables sit alongside many small ones. For such databases with over 1 TB of data, pg_dump and pg_restore can take multiple days, resulting in longer downtime during migrations and higher Recovery Time Objectives (RTOs) in DR scenarios.

In this talk, we'll discuss an idea to parallelize pg_dump and pg_restore within individual tables, while still ensuring consistency. It involves creating a Postgres snapshot, using that snapshot to logically partition a large table into CTID ranges, and then dumping/restoring each partition in parallel. As enhancements, a) each partition can be read from the source and written to the target simultaneously, enabling a streaming pg_dump | pg_restore pipeline, and b) by tracking which partitions have been dumped and restored, pg_dump and pg_restore can resume from where they left off after a failure instead of restarting from the beginning.
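A minimal SQL sketch of that mechanism, assuming a hypothetical table named events; it relies on PostgreSQL 14+ for TID comparison operators and efficient TID range scans, and the exported snapshot ID shown is illustrative:

    -- Coordinator session: open a transaction and export its snapshot;
    -- keep this transaction open while the workers run.
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT pg_export_snapshot();  -- e.g. returns '00000004-00000002-1'

    -- Derive partition boundaries from the table's block count:
    SELECT pg_relation_size('events')
           / current_setting('block_size')::bigint AS total_blocks;

    -- Each worker session: attach to the same snapshot, then dump its
    -- own CTID (block) range with COPY; other workers take adjacent ranges.
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SET TRANSACTION SNAPSHOT '00000004-00000002-1';
    COPY (
        SELECT * FROM events
        WHERE ctid >= '(0,0)'::tid AND ctid < '(250000,0)'::tid
    ) TO STDOUT;
    COMMIT;

Restore is symmetric: each worker COPYs its range into the target, and recording completed ranges (e.g., in a small bookkeeping table) is what makes the run resumable after a failure.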

This idea has already been implemented in PeerDB, an open-source Postgres replication tool. The benefits have been significant: large production databases with over a TB of data can be replicated in just a few hours instead of days. We will share these insights and assess how they could be integrated into upstream Postgres!

Date:
2024 April 19 16:00 PDT
Duration:
20 min
Room:
San Pedro
Conference:
Postgres Conference 2024
Language:
English
Track:
Dev
Difficulty:
Intermediate