Explaining MVCC in Postgres: system defined columns

I’m playing around with some diagrams for explaining MVCC that I’ll be posting here over the next few days. I’m not sure whether I’ll end up giving up on slides and just using a whiteboard for the talk. A while back I made an illustrated shared buffers deck to go along with Greg Smith’s excellent talk on shared buffers. This is the beginning of a talk that I hope will do something similar for MVCC.

Here are my first few slides, showing the system-defined columns. The next few slides will describe the optimizations PostgreSQL has for managing the side effects of our pessimistic rollback strategy and for reducing I/O during vacuuming and index updates.
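
As a rough illustration of why xmin and xmax matter for MVCC, here is a toy sketch in Python. It is not PostgreSQL's actual code: the RowVersion class, the is_visible function and the transaction IDs are all made up for illustration, and the rule deliberately ignores snapshots, your own in-progress transaction, cmin/cmax and hint bits.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class RowVersion:
    """Toy stand-in for one row version (hypothetical, not the real tuple layout)."""
    xmin: int  # ID of the transaction that inserted this version
    xmax: int  # ID of the transaction that deleted/updated it, 0 if none
    data: str


def is_visible(row: RowVersion, committed: Set[int]) -> bool:
    """Simplified rule: the inserter committed and no committed deleter exists."""
    if row.xmin not in committed:
        return False
    if row.xmax != 0 and row.xmax in committed:
        return False
    return True


# Transaction 100 inserted "v1"; transaction 101 then updated it to "v2".
# In this model an UPDATE stamps the old version's xmax and adds a new version.
old = RowVersion(xmin=100, xmax=101, data="v1")
new = RowVersion(xmin=101, xmax=0, data="v2")

# While 101 is still uncommitted, readers keep seeing the old version.
visible_before = [r.data for r in (old, new) if is_visible(r, {100})]

# Once 101 commits, the old version disappears and the new one shows up.
visible_after = [r.data for r in (old, new) if is_visible(r, {100, 101})]
```

The point of the sketch is only that both versions exist side by side and the xmin/xmax stamps decide which one a reader gets; the real visibility rules have many more cases.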

6 thoughts on Explaining MVCC in Postgres: system defined columns


Bruce Momjian · September 1, 2010 at 8:51 am ·

FYI, I am working on an MVCC talk as well. I have lots of queries but no slides yet.

Marc Cousin · September 1, 2010 at 10:39 pm ·

The slide makes it look as if “table oid” is part of a row. Won’t it confuse some readers?
Maybe I don’t get what you mean 🙂

Anyway, thanks for sharing.

selena · September 2, 2010 at 10:29 am ·

Marc: tableoid is actually part of the row. All those columns are system-defined, and if you execute SELECT tableoid FROM mytable; you’ll see what I mean.

Marc Cousin · September 3, 2010 at 10:11 am ·

Selena: yes, tableoid is selectable in a table. Still, it is not a field of the table. It is the t_tableOid field of the current tuple’s HeapTupleData.
heap_fetch, for instance, shows how it is built.

So what I meant isn’t that you can’t get tableoid from a select, just that it isn’t part of the row, whereas xmin and xmax, for instance, are (they are part of the row’s header).

That’s why I felt tableoid didn’t fit in the presentation of a row. It’s more of a shortcut, really, to get the table’s oid during a select.

selena · September 3, 2010 at 12:19 pm ·

Marc:

Thanks for the pointer! But from the perspective of a user, is that distinction important to point out? I see why one might want to note that we’re not wasting tons of disk space on storing the tableoid, but there’s no way of seeing the implementation detail without looking at source code, is there? (and our docs call it out as a column…)

-selena

Marc Cousin · September 3, 2010 at 9:52 pm ·

Selena:

I don’t know if it should be in this presentation. That’s why I asked. I just felt that someone like me might ask you a question about it when attending the presentation 🙂

I know that if I were a beginner attending, I’d be confused by this one column: it has no technical reason to be a part of a row, nor a part of the header.

It has no role in MVCC either. Neither does the oid field. But at least there is an oid for each row, if the table was created WITH OIDS.

I think that if these slides focus on MVCC, there may be no point in talking about the tableoid and oid columns: they aren’t used for it and may confuse people.

In the same vein, I don’t know if you should mention that ctid is from the page header, whereas xmin, xmax, cmin, cmax are from the row header? ctid is sort of a pointer to the record, so people may wonder why it would be in the record (and it isn’t).