I’ve got a post about Heikki’s visibility map talk in the queue, but first I’ll post the updated slides for the user groups talk — Leading without being in charge.
PostgreSQL has a developer’s room and Simon Riggs just wrapped up a talk about Replication. I sincerely hope that the video of the talk turned out well, because it was the most inspiring and technically interesting talk I have seen in a very long time. Unfortunately, I don’t have a copy of the slides at the moment, but word is that they will be posted on the BSD wiki soon.
Simon focused on new features in 8.4 that affect file-based replication. He also mentioned streaming, synchronous replication, which will not be included in 8.4 but is being actively worked on. He explained his rationale for objecting to the inclusion of the synchronous replication patches, which was based mostly, I think, on the complexity of the WAL archiving required by the implementation as it stood.
Then Simon launched into an in-depth tour of the issues and solutions that came up during his team’s work on Hot Standby. Hot Standby allows read-only queries to be made against a Postgres server that has file-based replication enabled. The Postgres documentation refers to this kind of replication as Point-in-time recovery and WAL Shipping.
Simon started work on PITR-related patches about five years ago, and continues that work with others today.
One fascinating aspect of the hot standby patches is that they ultimately led to performance improvements in sub-transactions across the board, likely up to 5% in that code path. There were other performance improvements, but I’ll wait for the slides to mention those. Several times during the talk, Simon pointed out features that Postgres has that no other database has, such as multiple options for dealing with conflicts in hot standby (freezing, conflict resolution and timeout).
At the end of the talk, Simon spent a few minutes talking about how Postgres is capable of being the best database, not just the best open source database, and how all the people in the room were capable of contributing as he had. He claimed that prioritization and aiming to work on the biggest, most interesting problem you can are all you need, and that all that made him different was that he was a little more persistent about solving problems.
Rock on, Simon.
If you haven’t heard, the Linux Plumbers Conference is happening September 17-19, 2008 in Portland, OR. It’s a gathering designed to attract Linux developers – kernel hackers, tool developers and problem solvers.
A few of us who met through the Portland PostgreSQL User Group (PDXPUG) pitched an idea for a talk on filesystem performance. We wanted to take the conventional wisdom about performance and put it to the test on some sweet new hardware, recently donated for performance testing Postgres. We’re asking questions like: Is RAID5 really the worst performing configuration for a database? How much does partition alignment really matter? Is there one Linux filesystem that a DBA should always choose for best performance under any load? Is adaptive readahead all that?
Our talk was accepted, so we’ve been furiously gathering data and drawing interesting conclusions ever since. Gabrielle Roth and I are presenting, using the results of extensive testing conducted by Mark Wong, a database benchmarking expert and author of pg_top. We’ll be sharing six different assumptions about filesystem performance, tested on five different filesystems, under five types of loads generated by fio, a benchmarking tool designed by kernel hacker Jens Axboe to test I/O.
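To give a feel for the size of that test matrix, here is a rough Python sketch that generates one small fio job file per filesystem and load pair. It is only an illustration, not Mark’s actual test setup: the filesystem names, the choice of these five load types, the mount path and the file size are all assumptions of mine. The fio settings themselves (directory, size, bs, rw) are real job file options.

```python
from itertools import product

# Hypothetical list: the filesystems actually tested are not named here.
FILESYSTEMS = ["ext2", "ext3", "xfs", "jfs", "reiserfs"]

# These are valid values for fio's rw option; choosing these five
# as "the" load types is an assumption for illustration.
LOADS = ["read", "write", "randread", "randwrite", "randrw"]


def fio_job(filesystem, load, mount_root="/mnt/test"):
    """Return the text of a minimal fio job file for one filesystem/load pair.

    mount_root is a hypothetical path where each filesystem is mounted.
    """
    lines = [
        "[global]",
        f"directory={mount_root}/{filesystem}",
        "size=1g",   # arbitrary file size for the sketch
        "bs=8k",     # 8 kB matches the default Postgres block size
        "",
        f"[{filesystem}-{load}]",
        f"rw={load}",
    ]
    return "\n".join(lines) + "\n"


def build_matrix():
    """Map every (filesystem, load) pair to its job file text."""
    return {
        (fs, load): fio_job(fs, load)
        for fs, load in product(FILESYSTEMS, LOADS)
    }


if __name__ == "__main__":
    matrix = build_matrix()
    print(len(matrix), "jobs")
    print(matrix[("xfs", "randread")])
```

Five filesystems times five loads gives 25 jobs, and that is before varying anything else, such as RAID level or readahead settings.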