I implemented the first Vacuum Cleaner Daemon (the most "hated" thing about Postgres)
Presented by:
Curt Kolovson
I was a member of the original Postgres research project at UC Berkeley from 1986 to 1990, when I was a PhD student. My PhD advisor was Prof. Michael Stonebraker. My claim to fame is that I implemented the first version of the Postgres Vacuum Daemon. At that time, one key feature of Postgres was that it was a no-overwrite data store that retained past states of tuples (historical data) in order to support "time travel". The idea was that historical data would be maintained in Postgres up to some user-specified time in the past, and data older than that would be periodically "vacuumed" to write-once read-many (WORM) optical drives, which were new at the time. As everyone knows, the historical data feature of Postgres was eventually dropped, and vacuuming took on a different purpose: reducing data bloat through MVCC garbage collection. My PhD thesis topic was "Indexing Techniques for Multidimensional Spatial Data and Historical Data in Database Management Systems", and I had two papers published during my PhD years. They were:
- Curtis P. Kolovson and Michael Stonebraker, "Segment Indexes: Dynamic Indexing Techniques for Multi-Dimensional Interval Data". SIGMOD Conference 1991: 138-147.
- Curtis P. Kolovson and Michael Stonebraker, "Indexing Techniques for Historical Databases". ICDE 1989: 127-137.
My career in industry consisted of working at AT&T Bell Labs, HP, VMware, and MariaDB. I am currently an advisor to startups and a PostgreSQL Consultant / Contributor.
Title: Keynote “The Original Postgres Storage System”
Speaker: Curt Kolovson
https://www.linkedin.com/in/ckolovson/
This is a retrospective talk reflecting on my experience at UC Berkeley and my collaborations with Prof. Michael Stonebraker, both in 1980-81, when I earned my MS in Computer Science at UC Berkeley, and from 1986 to 1990, when I earned my PhD working under Prof. Stonebraker. The 1980s were an exciting time to be in the EECS Department at UC Berkeley, as many great people worked on ground-breaking research projects during that decade.
Most of the talk will cover the key design principles, decisions, and development of Postgres, in particular the Postgres Storage System. Most of the main ideas and concepts from the original UC Berkeley Postgres Project are still key features of present-day PostgreSQL, with some exceptions such as support for temporal/historical data. It may be time to revisit this worthy idea and improve on the original implementation.
The PostgreSQL Community has made enormous contributions to the current PostgreSQL code base, and it continues to evolve. The evolution of PostgreSQL has been and remains a remarkable achievement, and the PG Community deserves a great deal of credit and heartfelt thanks from those of us who worked on its humble beginnings.
References: Postgres papers… especially the Postgres Storage System
- Date: 2024 April 18 09:00 PDT
- Duration: 20 min
- Room: Almaden
- Conference: Postgres Conference 2024
- Language: English
- Track: Essentials
- Difficulty: Easy