A story about a problem with a partitioned table in PostgreSQL

Maksim Ryzhikov
3 min read · Nov 1, 2023

A story about a performance problem with a partitioned table in PostgreSQL.

Let’s consider a use case where we have a table “foo” and track the history of changes to it through a trigger that writes the changed rows to a history table, “foo_history”.

[app]
-1. insert->[foo]
-2. after insert->[foo trigger]
-3. insert->[foo_history]
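As a sketch, the setup might look like the following (the column names and trigger function here are illustrative assumptions, not from the original application):

```sql
-- Hypothetical sketch of the setup described above.
create table foo (
    id         bigserial primary key,
    payload    text,
    updated_at timestamptz not null default now()
);

-- History table, range-partitioned by timestamp.
create table foo_history (
    foo_id     bigint,
    payload    text,
    changed_at timestamptz not null default now()
) partition by range (changed_at);

-- Trigger function: copy every new row into the history table.
create function foo_track_history() returns trigger as $$
begin
    insert into foo_history (foo_id, payload) values (new.id, new.payload);
    return new;
end;
$$ language plpgsql;

create trigger foo_trigger
    after insert on foo
    for each row execute function foo_track_history();
```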

“foo_history” is partitioned because it could potentially grow large, and from time to time its data is moved to cold storage and deleted. The simplest way to create a partition is create .. partition of..:

create table foo_history_p2 partition of foo_history
    for values from (<date-from>) to (<date-to>);

But this approach requires an ACCESS EXCLUSIVE lock on the parent table, which means no other operation can run against that table while the partition is being created. Usually, creating a partition does not take much time. But what if our application also runs OLAP (Online Analytical Processing) queries over this history table? For example, say we periodically collect some statistics and send them to a statistics server:

[app]
-select->[foo_history]
-response->[app]
-send->[statistics]

These queries can take some time to return a result from the “foo_history” table. In our example, let’s say such a query takes 5 seconds.
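A statistics query of this kind might look like the following (purely illustrative; it assumes a timestamp column such as changed_at on the history table):

```sql
-- Illustrative long-running aggregation over the history table;
-- with enough partitions and rows this can easily take seconds.
select date_trunc('day', changed_at) as day,
       count(*)                      as changes
from   foo_history
where  changed_at >= now() - interval '30 days'
group  by 1
order  by 1;
```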

Now let’s return to our “foo” table. We use “foo” frequently in the application, mostly to insert data. Queries over “foo” should be small and fast. And all goes well until we get an alert that, from time to time, our inserts into “foo” are aborted due to a timeout. Looking at the database analytics, we see that those queries are blocked, spending most of their time waiting for a lock to be released. What could be going wrong? We use a simple INSERT statement, which should not take much time, and the load on the application is low enough that competing queries should not be a problem.

Time to remember our trigger. A trigger runs in the same transaction as the query that fired it. Every statement, even a simple INSERT INTO ..., is wrapped in a transaction, and the transaction cannot commit until all triggers have finished. Let’s combine the insert, the trigger, the select, and the create partition together in one chart:

[app]
-1. select (5s)->[foo_history]
[app]
-2. create partition->[foo_history]
[app]
-3. insert->[foo]
-4. after insert->[foo_trigger]
-5. insert->[foo_history]
  1. We run the statistics query over the “foo_history” table.
  2. In parallel, a script starts that tries to create a partition for “foo_history”. “Create partition” requests an ACCESS EXCLUSIVE lock and waits until query (1) finishes.
  3. Meanwhile, a request comes in to insert data into table “foo”. The insert opens a transaction.
  4. We insert the data, which fires “foo_trigger”.
  5. “foo_trigger” tries to insert into “foo_history”, but because “create partition” has already requested an exclusive lock, we queue up behind it and wait.
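While this is happening, the lock queue can be inspected from another session. A query like this over pg_stat_activity shows which sessions are blocked and by whom:

```sql
-- Show sessions that are currently blocked, and which PIDs block them.
select pid,
       pg_blocking_pids(pid) as blocked_by,
       wait_event_type,
       state,
       left(query, 60)       as query
from   pg_stat_activity
where  cardinality(pg_blocking_pids(pid)) > 0;
```

In the scenario above, the insert’s backend would show up blocked by the “create partition” backend, which in turn is blocked by the long statistics query.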

So every insert into “foo” waits until the long query finishes and the partition is created, even though the partition is not needed for these inserts at all.

Solution

We could move all read queries to a read-only node (if we have a master-follower setup). For detaching partitions, Postgres allows the CONCURRENTLY qualifier (ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY), but it still requires a SHARE UPDATE EXCLUSIVE lock on the parent table.

A much better approach is to create the partition table separately and only then attach it to the parent. Attaching requires only a SHARE UPDATE EXCLUSIVE lock on the partitioned table, which does not conflict with inserts or selects:

CREATE TABLE foo_history_p2 (LIKE foo_history INCLUDING ALL);

ALTER TABLE foo_history ATTACH PARTITION foo_history_p2
    FOR VALUES FROM ('2023-12-01 00:00:00.000') TO ('2023-12-30 00:00:00.000');
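One refinement worth knowing: if the table being attached already contains rows, ATTACH PARTITION scans it to verify that every row fits the partition bounds. Adding a matching CHECK constraint beforehand lets PostgreSQL skip that scan. A sketch, assuming the partition key column is called changed_at (the column and constraint names here are illustrative):

```sql
-- A CHECK constraint matching the bounds lets ATTACH PARTITION
-- skip the validation scan of existing rows.
ALTER TABLE foo_history_p2 ADD CONSTRAINT foo_history_p2_bounds
    CHECK (changed_at >= '2023-12-01 00:00:00.000'
       AND changed_at <  '2023-12-30 00:00:00.000');

ALTER TABLE foo_history ATTACH PARTITION foo_history_p2
    FOR VALUES FROM ('2023-12-01 00:00:00.000') TO ('2023-12-30 00:00:00.000');

-- The constraint is redundant once the partition is attached.
ALTER TABLE foo_history_p2 DROP CONSTRAINT foo_history_p2_bounds;
```

For a freshly created empty table the scan is trivial anyway, but this matters when attaching a table that was pre-loaded with data.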
