OSCON: We’re at the end…

I’m finally getting to blog, and here are a few highlights:

* “Mistakes were made” was a great time. Thank you to everyone who shared stories. And those of you who attended, please connect with me – email or whatever – and let’s continue our discussions about failure.
* I have a little bit of editing left to do on the “Harder, better, faster, stronger” slides. Talk ratings have been very high (thank you, audience!) :) I should have the slides up tomorrow!
* Not having a booth at OSCON was a real bummer for Postgres. We need to figure out a way to make this happen for us every year.
* Great having the time to connect with old friends in the hallways this week.
* Thanks O’Reilly for supporting our open source community.
* Thanks Google Open Source Programs office for bringing together open source leaders yet again this year for some important conversations.

Thank you everyone from the Postgres community who contributed to the Postgres day just before OSCON. All the speakers and their talks are listed here.

We need to keep having adjunct events like this! I think LCA has it right, scheduling Mini-BoFs to provide networking opportunities for the distinct groups. I think OSCON should formalize this next year, and figure out a more structured way of facilitating those groups.

I have another blog post brewing about difficult conversations… but that’s going to have to wait until after I enjoy the brewers fest!

PgCon Pub Track: Learning more about Synchronous Replication

So, we’re at the Pub and doing “create a billion tables” time trials with Jan Urbanski using Python and Josh Berkus using Perl.

We’re also hacking on a test framework the Slony developers have, specifically hacking with Steve Singer. What we discovered is that sync rep doesn’t wait for WAL to be *replayed* on the standby before a commit returns. In the pg_stat_replication view, we see sent_location, write_location and flush_location synchronized, but not replay_location.

This makes sense from a database perspective, but may be surprising behavior for application developers. There are patches out there (according to what I just heard from Bernd) to make synchronous replication wait for replay on the slave, but it’s not certain when that will be committed. It definitely won’t be part of version 9.1.
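If you want to see this for yourself, here is a minimal sketch of the check we were doing by hand. It only reads the four columns named above from pg_stat_replication on the master. The variable and function names are my own, and it assumes psql can reach the master through the usual PG* environment variables, typically as a superuser.

```shell
# SQL that lists how far each standby has received, written, flushed and
# replayed WAL. The column names are the ones mentioned above for 9.1.
REPLICATION_SQL='SELECT sent_location, write_location, flush_location, replay_location
  FROM pg_stat_replication;'

# Hypothetical helper: run the query against the master.
# Extra arguments (for example -h or -p) are passed straight through to psql.
show_replication_positions() {
    psql -X -A -t -c "$REPLICATION_SQL" "$@"
}
```

Calling show_replication_positions right after a synchronous commit returns should show the first three columns in step, while replay_location may still be behind.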

I just wrote up configuration details from a database administrator’s perspective, and am planning on doing some additional work to make a highly condensed configuration tutorial for our main docs. We definitely need to explain this more clearly for users, who might be thinking of it more from an application perspective.

Announcing Postgres Open

On behalf of the Postgres Open organizing committee, I’m pleased to share this announcement:

Postgres Open 2011, a conference for data innovators focused on disruption of the database industry through PostgreSQL, will take place September 14-16, 2011 at Chicago’s Westin Michigan Avenue hotel.

“PostgreSQL’s consistent addition of new features and enhancements, while remaining focused on reliability and performance, has provided myYearbook a solid foundation to create new and innovative applications,” said Gavin Roy, CTO at myYearbook. “We are looking forward to the Postgres Open Conference as a venue to share, network, and learn innovative ways to leverage Postgres in our environment.”

Postgres Open, a community-organized, non-profit conference, addresses the breadth of PostgreSQL usage, from core database system design to enterprise database use. Inviting entrepreneurs and technologists on the leading edge of data management, the conference will focus on open source database innovation and changes in the database market. Postgres Open includes regular talks, keynotes and hands-on tutorials.

We’re pleased to announce that VMware and EnterpriseDB are joining the conference as founding sponsors.

The theme of the inaugural conference is “disruption of the database industry”. Topics will include new features in the latest version of PostgreSQL, use cases, product offerings and important announcements. Invited talks and presentations will cover many of the innovations in version 9.1, such as nearest-neighbor indexing, serializable snapshot isolation, and transaction-controlled synchronous replication. Vendors will also be announcing and demonstrating new products and services to enhance and extend PostgreSQL.

Postgres Open 2011’s main program (September 15-16) will be preceded by a day of intensive, half-day tutorials.

The Call For Papers for Postgres Open will open in late May.

Our program committee includes:
Robert Haas, Major Contributor, PostgreSQL committer,
Josh Berkus, Core Team member,
Greg Smith, Major Contributor to PostgreSQL and author of PostgreSQL 9.0 High Performance,
Gavin Roy, CTO of MyYearbook.com and
Selena Deckelmann, Major Contributor to PostgreSQL.

If you’d like to receive announcements as the conference progresses, please visit the website and add your email address to our list.

For information concerning sponsorship, please send email to sponsorship@postgresopen.org for a copy of our prospectus.

PgCon Day 1 – Cluster summit and catching up with folks

Yesterday, I spent my morning at the Clustering summit, catching up on what the cluster hackers have been up to for the last year. I was lucky enough to sit next to Jan Wieck and Kevin Grittner. You may remember Kevin from his work on serializable snapshot isolation.

There were some pretty awesome side conversations about where folks think work needs to be done next, and conflict resolution for multi- (or many-) master setups.

I gave a quick update on Bucardo 5, which had an alpha release last week, supports many-master and has experimental support for non-Postgres targets. The first two targets are text and MongoDB.

The Postgres project has given the generic name “binary replication” to all the features like WAL shipping, streaming replication and synchronous replication. Simon Riggs also gave his update on these features at the Clustering Summit. He observed that the 9.1 release is the culmination of 7 years of work on replication subsystems. Simon pointed out that synchronous replication is the best, and most obvious, use case for binary replication at the core of Postgres. He also said that he was quite pleased with the final design.

For the afternoon, I spent some time with folks on the infrastructure team, giving Magnus well-deserved congratulations for his induction into -core, and meeting up with folks from all over at the Royal Oak and at the Keg, a reasonable steakhouse in town.

Looking forward to the developers meeting today!

At PgCon 2011 – day 0

I wrote my review of synchronous replication over on Emma’s Tech blog (it’ll probably be published mid-day Tuesday). I’m visiting Ottawa this year on behalf of Emma, one of many great sponsors of Postgres’ yearly international developer conference, pgCon.

This week will be packed for me – attending the Clustering summit, the developers meeting, presenting about Emma’s database systems, leading the lightning talks, and of course attending the many parties this week.

Because we are spread so far around the globe, pgCon is often our one chance to get together and really dig into problems in-person.

And, I’m pulling together our first ever Procedural Language summit. With the new extension system, over 30 procedural languages implemented, and a ton of new features being added to existing PLs, I thought it was time for PL developers to come together and have a chat. I’ve still got a few details to work out before Saturday (sorry to all who RSVP’d – final agenda coming soon!).

I’m hoping to also have another, unrelated, announcement this Wednesday. Hopefully all the details come together!

Anyway, with that cliffhanger, I’m off to get a good night’s rest before the clustering summit tomorrow.

9.1 beta 1 is out! Help us test.

Postgres released version 9.1 beta 1 today! This is a preview of 9.1, predicted to be available in the next 2-3 months, not a bugfix release for earlier versions of Postgres.

PostgreSQL 9.1 contains a huge volume of new features, possibly more than any single release of PostgreSQL before. These features also include several innovations which PostgreSQL is the first database system to have. The most anticipated features in this version include:

  • Synchronous Replication
  • Per-column collations for multilingual databases
  • Unlogged Fast Tables
  • K-Nearest-Neighbor Indexing
  • Serializable Snapshot Isolation
  • Writeable Common Table Expressions
  • SE-Linux Integration
  • Extensions
  • SQL/MED attached tables

The PostgreSQL project now depends on you to test 9.1beta1 in order to have a rapid and bug-free 9.1 release. If you are able to help with testing version 9.1, please see the Beta Testing HOWTO.

Binary downloads are available, as is the source.

If you’d like to grab a copy of the latest from git, here is a quick set of instructions to compile 9.1beta1 from the git repo:

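# Run from the top of a clone of the PostgreSQL git repository.
git checkout REL9_1_BETA1
# Install under /opt so the beta stays separate from any existing Postgres.
./configure --prefix=/opt/pg9.1beta1
make
sudo make install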

And then to create a database:

/opt/pg9.1beta1/bin/initdb -D mytestdb
/opt/pg9.1beta1/bin/pg_ctl -D mytestdb start
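
Once the server is started, a quick sanity check is to ask it for its version. This is only a sketch, assuming the same install prefix and data directory as above. PGBIN and check_beta_server are names I made up for the example, and it connects to the postgres database that initdb creates by default.

```shell
# Same prefix as the configure line above.
PGBIN=/opt/pg9.1beta1/bin

# Hypothetical helper: confirm a server is running in mytestdb, then ask it
# which version it is. pg_ctl status exits non-zero when no server is
# running, so psql is only tried if the status check succeeds.
check_beta_server() {
    "$PGBIN/pg_ctl" -D mytestdb status &&
        "$PGBIN/psql" -X -d postgres -c 'SELECT version();'
}
```

If everything went well, check_beta_server should report a 9.1beta1 version string.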

For a preview of features coming this fall, check out Depesz’s blog.

Two talks at MySQL Conf done! Slides…

Just finished my last talk. Slides are downloadable here, and also embedded after the break.

MySQL Conf – Managing Terabytes

Own it: Working with a changing open source community

The floor show is closed, so no more booth work tomorrow. I’ve had a great time here talking with people and seeing my colleagues in the PostgreSQL and MySQL community.

Looking forward to getting some hacking time in tomorrow and enjoying an evening connecting with people instead of working on slides. :)


Where meritocracy fails

Robert wrote about patches and rejection today, and quoted me from some tweets I made about meritocracy. I think Robert made some good points in his post, and I’m going to make some suggestions about patch review.

But first, I want to address my irritation about meritocracy.

The first thing that I’ll say is that I’m not sure exactly what people mean when they mention meritocracy. A definition of it is “Meritocracy…is a system of government or other administration (such as business administration) wherein appointments are made and responsibilities assigned to individuals based upon their “merits”, namely intelligence, credentials, and education, determined through evaluations or examinations.”

My assumption was that Ed was saying, “Postgres is awesome because our community is meritocratic.” I don’t believe that’s our strongest value, or quality as a community. And, it’s not something that I think embodies what is awesome about Postgres.

Our strongest quality is our ability to create great code.

We consistently produce readable, reliable and robust code amongst geographically diverse people who have very strong, divergent opinions about a great many things. We find common ground in the production of database software between people who are rhetorically violent even in agreement.

The code quality arises from a commitment by Postgres hackers to discuss in public decisions that many developers prefer to make in private. We are committed to a kind of radical transparency about our code that, at least in our shared Postgres myth, is embodied in Tom Lane’s example. He overwhelmingly gifts to us his time and passion, in the form of methodical reviews of code. And that’s not to say that our reviews are perfect in tone or fact, but just that we consistently do them.

When I think about our review process as it has evolved through Commitfest, it seems so undeniably humane and personal. I know at the same time that it’s still frightening… Just last week a developer talked to me about how much he feared someone tearing into *him* and his code, picking apart decisions he’d made and the bits he knew needed more work. Anyone who shares a creative work knows how this feels – whether it’s a painting, poetry, music or code.

But I don’t think that Commitfest, or the direct reviews fellow hackers still provide to each other, has produced a meritocracy. And I don’t think that we should pursue meritocratic organization much more than we already have.

What we have is something that largely works, and produces a product we feel good about endorsing and improving. There are elements of “promotion through merit”. We pay closer attention now to giving commit access to people who it seems really ought to have it. And we recognize individual efforts where it is appropriate in our commit logs – something many projects fail to do.

At the same time, the operation of the project is dominated by people who fit into a very specific profile. And that’s something like:

  • are in the top 1% of the world in terms of salary,
  • are male,
  • had parents that were mostly successful (aren’t in jail for violent offenses for example), and
  • either don’t have kids, or have a partner or paid helper that does most of the childcare during the work day.

I count myself among you, with the exception that I’m not male, and I don’t have kids. But I guarantee you that if I did have kids, either my partner would provide the bulk of childcare during the work day, or we would pay someone to do it for us.

I bring this up because in a truly meritocratic organization, privilege wouldn’t matter. Anyone could join us. But the truth is, not everyone can join the Postgres project. And that’s why bringing up the myth, and applying it to an organization I contribute to, annoys me.

I try to think regularly about my own privilege, and the place of open source software like Postgres in the world. I consider how to contribute to an organization that is not only excellent in terms of what it produces, but is also something to be proud of because of the way that people treat and care for each other.

So, I don’t think more, or purer, meritocracy helps us have better relationships or treat people well.

We are still small enough at our core (somewhere around 300 people at any point in time), that we can operate like the best businesses do. We rely on good relationships between small groups who tend to appoint leaders to communicate between teams. Our teams seem to often be pairs, or small businesses, which fits our project’s need for deep understanding of each feature.

But apart from the practicality of avoiding further pursuit of meritocracy, I don’t believe that it helps us with talents that we need as a project now. What matters is not that someone is the best at something, but that they have the time to put some effort in, which will then motivate others. That someone out there has a few minutes to write a review, file a bug report or fix a typo on our websites.

What we have to do is create structures that invite people to give what they can, when they can give it. This is what we enable with our extensive comments and thorough documentation. We probably could use someone with Tom Lane’s singular attention and time devoted to our web site, but I think we could make better use of 10 people who could devote a fraction of that time, consistently and with good humor.

So, ending the pursuit of a mythical meritocracy doesn’t mean that we start accepting code which doesn’t meet high standards, or that all of a sudden we’re going to include more code from people in the bottom 1% of the world in terms of salary. It means that we take a look at different aspects of our project and see what is within our means to open up and make accessible to people who aren’t exactly like us.

Report from first day at PgEast and hoping for another tool to be opened up

I wrote up some quick notes from talks and conversations over at the Emma Tech blog.

The most exciting talk I sat in on today so far was about an Oracle PL/SQL to Postgres PL/PgSQL translation tool that I’m hoping the company that created it will open source. We’ll see. Fortunately, a fellow conference-goer had an inspirational story to share about open sourcing another tool for Postgres, which meant incredible adoption in just a few months in our community.

Not every project will see that kind of immediate benefit and growth from open sourcing, but a certain class of project does: the kind where most people can build 80% of a useful tool themselves, but don’t bother to put in the additional effort to get the remaining 20% of the features that they’d really like to have.

But, when someone does finally release a tool that provides that extra 20% of features, adopting the new tool is a no-brainer, particularly if it is open source. I think this PL/SQL conversion tool falls into this sweet spot.

Now I’m sitting in the Foreign Data Wrappers talk and very excited to see what Andrew is announcing. Great to see people creating things that make the crowd here clap, smile and celebrate.