- This article briefly exposes the basic concepts of how table partitioning and inheritance works in PostgreSQL.
- Some prior knowledge on basic queries and DDLs is necessary.
- There will be no deep dive in how the feature works under the hood, neither how to migrate your existing data to this solution.
Introduction
Since university, PostgreSQL has had a special place in my heart. I was heavily influenced back then by this incredible a professor I had. Until this day in my career, I've never seen someone with SQL capabilities like hers.
I can still recall this day, where after all her lectures about referential constraints and all sorts of integrity one can think about, she raises the question about what DB we used & liked the most. As we were a bunch of naïve young folks, breathing the LAMP (Linux, Apache, MySQL, PHP) era, the most common answer was MySQL.
Then she explained all the issues MySQL had on enforcing some constraints (I won't remember which ones since I never actually used MySQL professionally). Then, she took us to the lab and had made us test those constraints in both DBs. On that day, I feel in love with PostgreSQL.
However, it's a love that I neglected for a long period of time. But over the last months, I've had lots of fun playing with it again. That's why I decided to write a little about it.
Partitioning your data
According to Daniel Abadi, "Data Partitioning is the technique of distributing data across multiple tables, disks, or sites in order to improve query processing performance or increase database manageability".
If you want to dive deeper into how to handle data-intensive architectures and the problems around them, I would highly recommend reading Martin's Kleppman "Design Data-Intensive Applications". It's definitely my reference book for system design nowadays.
Now let's dig into how we can achieve partitioning at a table level using inheritance.
Using inheritance
Wait, ain't inheritance an Oriented Object programming thing like those Java
public class Dog extends Animal
type of stuff? Well, our folks working on PostgreSQL found a neat way to partition our data based on inheritance.
Let's start with the following table:
CREATE TABLE t_archive (id serial, year integer, data text );
But this table would grow too much (here we're not talking about indexes), and we want to optimize the search. A simple way is to partition the data into different tables. But then you would need to know which table to query from, right? Well, not exactly.
If we use inheritance, we can sort of avoid that trap. Let's create a few tables that will inherit from t_archive
:
CREATE TABLE t_archive_2021 () INHERITS (t_archive);
CREATE TABLE t_archive_2020 () INHERITS (t_archive);
CREATE TABLE t_archive_2019 () INHERITS (t_archive);
Now all the tables have the same structure as follows:
When you query the parent table, all child tables will also be scanned.
Let's try now with a filter based on the field we want?
Well, it wasn't as I expected. I would like to have a single table being looked at. We don't care about data in other years beside 2021
, right? What we want here is to scan a single table. How to achieve that?
In order to do so, we need to explicitly define checks on the table, so PostgreSQL can base itself on them when creating the query plan. Let's add a few checks to our tables:
ALTER TABLE t_archive_2021 ADD check (year >= 2021 AND year < 2022);
ALTER TABLE t_archive_2020 ADD check (year >= 2020 AND year < 2021);
ALTER TABLE t_archive_2019 ADD check (year < 2020);
Now, when we run the query, the data lookup will only check for the correct table.
Changing parent table
When altering the parent table, for instance, adding a column, all the changes will reflect in its child tables. By running the following command:
ALTER TABLE t_archive ADD COLUMN example text;
The result looks like this:
Conclusion
Hopefully with this brief exposure to the concept, you will be able to at least remember about it when the partition topic comes up at a design you're doing. Not that you would use it, but at least you would one more tool in your database Swiss army knife in case you need it.
See you later :)