If you’ve ever wondered how a database can let you run a massive analytics query while simultaneously processing thousands of user updates, without the whole system grinding to a halt, the answer lies in a concept called MVCC (Multi-Version Concurrency Control).
In this deep dive, we’re going to step away from the abstract theory and look at the actual bytes on the disk. We will explore how PostgreSQL implements this "time travel" feature using hidden columns, how updates are actually inserts in disguise, and why your database needs a garbage collector.
In older database systems, ensuring data consistency often meant locking. If User A is editing a row, User B has to wait to read it. PostgreSQL takes a different approach.
Instead of updating data in place, Postgres treats data as immutable versions. When you "update" a row, you aren't overwriting the old data; you are creating a specific version of that row valid for a specific timeframe.
Both exist simultaneously. This is MVCC.
To make this work, PostgreSQL quietly appends several system columns to every single row you create. You usually don't see them, but they are the engine behind the magic.
Let's uncover them.
We'll use a fresh scenario: an Airline Reservation System. We need to track seats on a flight.
First, create the environment (assuming you have a Postgres instance running):
CREATE DATABASE flight_systems;
\\c flight_systems
CREATE EXTENSION pageinspect; -- We need this to see the raw bits later
Now, create a simple table for seat assignments.
CREATE TABLE seats (
seat_no TEXT PRIMARY KEY,
class TEXT,
price INT
);
INSERT INTO seats (seat_no, class, price) VALUES
('1A', 'First', 1000),
('1B', 'First', 1000),
('10C', 'Economy', 200);