Looking toward Chicago: Postgres Open, local user groups, parties and on to October!

I’ve been incredibly busy this past month, and not blogging – being a free agent has possibly made me busier than I was before!

Postgres Open’s schedule is in near-final state. We’ve started adding talks to our Demo room on Thursday, and are looking forward to a keynote from Charles Fan, SVP at VMware, about recent developments in VMware’s cloud offerings for Postgres.

We’ll also be getting a more in-depth look at Heroku’s new postgres.heroku.com on-demand database service, as well as an open source tool they wrote called WAL-E.

Thanks to Heroku, we’ll be streaming much of the content from the conference live, so you’ll be able to catch the keynotes and many of the talks, even if you’re not there. And we’ll be sharing the videos after.

I believe we’re the first Postgres conference to do this! Someone correct me if I’m wrong. 🙂

While I’m in Chicago, I’m planning to drop by the Windy City Perl Mongers for a reprise of my 9.1 talk from OSCON.

We’re also planning a couple parties for Postgres Open, and hopefully inviting a few of the local user groups to join us.

After that, I’m headed in October to PostgreSQL Conference EU, and will be giving a talk about terabyte Postgres databases (and the problems you run into with them), and a database-specific “Mistakes were Made” talk, about operations and the tools we need to use to help us make fewer mistakes.

The importance of doing things badly

Update: added “code review” to the things that we’re doing well below.

There were a couple themes for me from OSCON last week. One is transitions and change. I’ve got a whole slew of thoughts on this, particularly from my experience leaving the management team of Open Source Bridge.

But the other is the importance of doing things badly. In particular, the importance of doing things badly in open source.

Tim Anglade, at about 41:10, says that he thinks the reason why open source companies make money is because open source is kind of shitty (from an interview he did with Cliff Moon last fall). So, on one hand there’s a Money Making Opportunity. Probably not the one that we’d all prefer, but it is what it is.

When he said that, I immediately thought about the other things that we do badly (other than documentation) and the discussions I’d been having with people last week.

Basically, we had a problem in the Postgres community of experienced developers solving every small bug at nearly the moment it was reported. It’s sort of like a cat sitting at the entrance of the only mousehole.

The effect on the code is amazing – we have clearly documented, concise and consistent code. But the effect on the community is that we don’t have mid-level developers, and it is very difficult for inexperienced developers to build up a portfolio of small projects, based on bugs.

I don’t have a ready solution for this problem. And I do not mean this as a criticism of the thousands of hours our core teams have devoted to fixing bugs. We all benefit from the dedication. I am just pointing out that our system had a clear tradeoff – fewer contributors.

What we could do a bit worse (to address the point of this blog post) is lengthen our response time to solving bugs, and let some less experienced developers respond to the bugs queue. This probably involves creating a bug tracker and holding the tension a bit longer on fixes.

Our committers have made efforts toward spreading the load around more: with commitfest (meaning greater support for code review), with Tom’s recent presentations about the planner, and with our wiki-fied Todo list. And there are many more examples of our committers putting real effort into mentoring, tutoring and finding ways of bringing more people in.

The thing that’s missing from all of those efforts, however, is urgency. That’s what bug-fixing is great for. That’s why we have people who remain in operations work even if they hate being woken up at 3am. Urgent work is worthwhile work (mostly).

I’m sure there are other particular areas where we could do things worse, and thus invite more people to contribute. I’ll be thinking about this more in regard to our project event planning, as I think there’s a bit of a disconnect there, and a huge opportunity to involve more people.

I’m reminded again of David Eaves’ talks about how community management is the core competency of open source, not technology. I struggle with that thought every day, but it rings truer the more I try to work on the significant problems facing any particular open source project.

OSCON: Postgres represent! And my links for Harder, Better, Faster, Stronger talk

I’m giving a couple talks at OSCON this year. The first is on Tuesday, 10:40am room C123: Harder, Better, Faster, Stronger: Postgres 9.1. The other is Mistakes were Made, Wednesday at 1:40pm in room D136.

My colleague Robert Treat is giving a Pro PostgreSQL workshop Wednesday at 1:40pm too, room 204. He’s also giving a Scalability Patterns talk at 4:20pm Tuesday. I’m sure his talks will be awesome. 🙂

And here are the rest of the talks tagged with PostgreSQL.

Also remember — there’s a PgDay tomorrow at the Oregon Convention Center!

I’m pushing my examples for my 9.1 talk into a github repo. It should be populated with whatever I decide to use for the talk by Monday evening.

Building 9.1 for me on Mac OS X (Leopard!) involved the following:

git tag -l | grep REL9_1
git checkout REL9_1_BETA2
./configure --with-perl --with-python --prefix=/opt/pg91beta2 --with-readline
make install

Normal caveats apply – you need a reasonably recent version of Xcode, and a bunch of support libraries to make this happen. I haven’t rebuilt from scratch on OS X in a long time, but now I realize that maybe I ought to go through the pain and document this again.
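
A quick pre-flight check along these lines can save a failed configure run. This is only a sketch: the missing_tools helper is hypothetical, and the tool list is a guess based on the configure flags above, not a complete list of what the build needs.

```shell
#!/bin/sh
# Hypothetical helper: report any tools from the argument list that are
# not on PATH. Returns non-zero if at least one is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return $status
}

# Assumed tool list, inferred from --with-perl and --with-python above.
missing_tools git make cc perl python ||
    echo "install the tools listed above before running ./configure"
```

It only checks for executables, so missing headers or libraries (readline, for example) will still surface later, during configure itself.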

But I digress!

I have a long list of resources for this talk and wanted to share. Probably in the slides for the talk, I’ll provide shortlinks so that people can pull them up and read instead of listening to me 😀

Here are my links:

And if you’re wondering about the title, I took it from a great Daft Punk song that fans have created some epic videos of:

PgCon Pub Track: Learning more about Synchronous Replication

So, we’re at the Pub and doing “create a billion tables” time trials with Jan Urbanski using Python and Josh Berkus using Perl.

We’re also hacking on a test framework the Slony developers have, specifically hacking with Steve Singer. What we discovered is that sync rep doesn’t wait for a WAL segment to be *replayed* before it returns. In the pg_stat_replication view, we see sent_location, write_location and flush_location synchronized, but not replay_location.

This makes sense from a database perspective, but may be surprising behavior for application developers. There are patches out there (according to what I just heard from Bernd) to make synchronous replication wait for replay on the slave, but it’s not certain when that will be committed. It definitely won’t be part of version 9.1.

I just wrote up configuration details from a database administrator’s perspective, and am planning on doing some additional work to make a highly condensed configuration tutorial for our main docs. We definitely need to explain this more clearly for users, who might be thinking of it more from an application perspective.
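
To see the gap for yourself, something like the following sketch could compare the flush and replay positions for each standby. The helper names (wal_parts, wal_behind, report_replay_lag) are made up for illustration. It assumes psql can reach the primary through the usual PG* environment variables, and it uses the 9.1 column names from pg_stat_replication.

```shell
#!/bin/sh
# Sketch only: flag standbys whose replay position trails what they have flushed.

# Split a WAL location such as "0/3000148" into "HIGH LOW" decimal numbers.
# Both halves are hexadecimal; comparing them as a pair avoids byte arithmetic.
wal_parts() {
    high=${1%%/*}
    low=${1##*/}
    echo "$((0x$high)) $((0x$low))"
}

# Succeeds (exit 0) when location $1 is strictly behind location $2.
wal_behind() {
    set -- $(wal_parts "$1") $(wal_parts "$2")
    [ "$1" -lt "$3" ] || { [ "$1" -eq "$3" ] && [ "$2" -lt "$4" ]; }
}

# Print one line per connected standby, keyed by its walsender pid.
# Not called here, so the helpers above can be used without a server.
report_replay_lag() {
    psql -At -F ' ' -c "SELECT procpid, flush_location, replay_location FROM pg_stat_replication" |
    while read -r pid flushed replayed; do
        # Skip rows where a position has not been reported yet.
        [ -n "$flushed" ] && [ -n "$replayed" ] || continue
        if wal_behind "$replayed" "$flushed"; then
            echo "walsender $pid: replay ($replayed) is behind flush ($flushed)"
        else
            echo "walsender $pid: replay has caught up ($replayed)"
        fi
    done
}
```

Running report_replay_lag against a busy primary may show a standby whose replay position trails its flush position even with synchronous replication turned on, which is the behavior described above.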

Announcing Postgres Open

On behalf of the Postgres Open organizing committee, I’m pleased to share this announcement:

Postgres Open 2011, a conference for data innovators focused on disruption of the database industry through PostgreSQL, will take place September 14-16, 2011 at Chicago’s Westin Michigan Avenue hotel.

“PostgreSQL’s consistent addition of new features and enhancements, while remaining focused on reliability and performance, has provided myYearbook a solid foundation to create new and innovative applications,” said Gavin Roy, CTO at myYearbook. “We are looking forward to the Postgres Open Conference as a venue to share, network, and learn innovative ways to leverage Postgres in our environment.”

Postgres Open, a community-organized, non-profit conference, addresses the breadth of PostgreSQL usage, from core database system design to enterprise database use. Inviting entrepreneurs and technologists on the leading edge of data management, the conference will focus on open source database innovation and changes in the database market. Postgres Open includes regular talks, keynotes and hands-on tutorials.

We’re pleased to announce that VMware and EnterpriseDB are joining the conference as founding sponsors.

The theme of the inaugural conference is “disruption of the database industry”. Topics will include new features in the latest version of PostgreSQL, use cases, product offerings and important announcements. Invited talks and presentations will cover many of the innovations in version 9.1, such as nearest-neighbor indexing, serializable snapshot isolation, and transaction-controlled synchronous replication. Vendors will also be announcing and demonstrating new products and services to enhance and extend PostgreSQL.

Postgres Open 2011’s main program (September 15-16) will be preceded by a day of intensive, half-day tutorials.

The Call For Papers for Postgres Open will open in late May.

Our program committee includes:
Robert Haas, Major Contributor, PostgreSQL committer,
Josh Berkus, Core Team member,
Greg Smith, Major Contributor to PostgreSQL and author of PostgreSQL 9.0 High Performance,
Gavin Roy, CTO of myYearbook.com and
Selena Deckelmann, Major Contributor to PostgreSQL.

If you’d like to receive announcements as the conference progresses, please visit the website and add your email address to our list.

For information concerning sponsorship, please send email to sponsorship@postgresopen.org for a copy of our prospectus.

PgCon Day 1 – Cluster summit and catching up with folks

Yesterday, I spent my morning at the Clustering summit, catching up on what the cluster hackers have been up to for the last year. I was lucky enough to sit next to Jan Wieck and Kevin Grittner. You may remember Kevin from his work on serializable snapshot isolation.

There were some pretty awesome side conversations about where folks think work needs to be done next, and conflict resolution for multi- (or many-) master setups.

I gave a quick update on Bucardo 5, which had an alpha release last week, supports many-master and has experimental support for non-Postgres targets. The first two targets are text and MongoDB.

The Postgres project has given the generic name “binary replication” to all the features like WAL shipping, streaming replication and synchronous replication. Simon Riggs also gave his update on these features at the Clustering Summit today. He observed that the 9.1 release is the culmination of 7 years of work on replication subsystems. Simon pointed out that synchronous replication is the best, and most obvious, use case for the binary replication at the core of Postgres. He also said that he was quite pleased with the ultimate design.

For the afternoon, I spent some time with folks on the infrastructure team, giving Magnus well-deserved congratulations for his induction into -core, and meeting up with folks from all over at the Royal Oak and Keg, a reasonable steakhouse in town.

Looking forward to the developers meeting today!

At PgCon 2011 – day 0

I wrote my review of synchronous replication over on Emma’s Tech blog (It’ll probably be published mid-day Tuesday). I’m visiting Ottawa this year on behalf of Emma, one of many great sponsors of Postgres’ yearly international developer conference, pgCon.

This week will be packed for me – attending the Clustering summit, the developers meeting, presenting about Emma’s database systems, leading the lightning talks, and of course attending the many parties this week.

Because we are spread so far around the globe, pgCon is often our one chance to get together and really dig into problems in-person.

And, I’m pulling together our first ever Procedural Language summit. With the new extension system, over 30 procedural languages implemented, and a ton of new features being added to existing PLs, I thought it was time PL developers should come together and have a chat. I’ve still got a few details to work out before Saturday (sorry all that RSVP’d – final agenda coming soon!).

I’m hoping to also have another, unrelated, announcement this Wednesday. Hopefully all the details come together!

Anyway, with that cliffhanger, I’m off to get a good night’s rest before the clustering summit tomorrow.

9.1 beta 1 is out! Help us test.

Postgres released version 9.1 beta 1 today! This is a preview of 9.1, predicted to be available in the next 2-3 months, not a bugfix release for earlier versions of Postgres.

PostgreSQL 9.1 contains a huge volume of new features, possibly more than any single release of PostgreSQL before. These features also include several innovations which PostgreSQL is the first database system to have. The most anticipated features in this version include:

  • Synchronous Replication
  • Per-column collations for multilingual databases
  • Unlogged Fast Tables
  • K-Nearest-Neighbor Indexing
  • Serializable Snapshot Isolation
  • Writeable Common Table Expressions
  • SE-Linux Integration
  • Extensions
  • SQL/MED attached tables

The PostgreSQL project now depends on you to test 9.1beta1 in order to have a rapid and bug-free 9.1 release. If you are able to help with testing version 9.1, please see the Beta Testing HOWTO.

Binary downloads are available, as is the source.

If you’d like to grab a copy of the latest from git, here is a quick set of instructions to compile 9.1beta1 from the git repo:

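# run these from inside a clone of the PostgreSQL git repository
git checkout REL9_1_BETA1
./configure --prefix=/opt/pg9.1beta1
# build as your own user, then install with elevated privileges
make
sudo make install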

And then to create a database:

/opt/pg9.1beta1/bin/initdb -D mytestdb
/opt/pg9.1beta1/bin/pg_ctl -D mytestdb start

For a preview of features coming this fall, check out Depesz’s blog.

Where PostgreSQL succeeds and what to do next

Response to my earlier post about meritocracy was overwhelming.

Also, Robert posted a response focusing on code, and how the PostgreSQL project works around Commitfest.

Addressing some criticisms

I talked to Bruce Momjian about a few things that I said toward the end of my earlier post. Things that may have offended people in our community.

We focused mainly on the fact that I brought a discussion of the outside world into the microcosm of PostgreSQL, and that I brought two things together: the intrinsic ability of an individual to succeed, and the value of an individual’s contribution to Postgres itself.

I talked about a world that is filled with people who are poor, uneducated or disenfranchised who we, as a project, probably just can’t reach. And that by mentioning these facts, which Bruce and I agreed were facts, I was confusing and insulting people who contribute so much time and hard work to our project.

What PostgreSQL does well

To clarify, PostgreSQL does an admirable job of promoting and encouraging the work of the people who step up and contribute code. Robert’s post about Commitfest shows how much effort goes into finding and encouraging the type of people that we’d like to contribute more code, review our code and document it.

As a project, we also do pretty well with encouraging non-code contributions. In particular, I think we do very well with conferences: finding creative ways of sponsoring them, seeking out and developing new speakers, and helping start user groups. The focus has always been on finding the right people, in the communities that we see growing, and encouraging them. Today, we see conferences and great Postgres representation in Japan, Russia, Canada, the US, China, Cuba, the European Union and Brazil. And there are more.

So, I think that we (Postgres) are succeeding, and growing.

I brought up my criticisms in the context of Robert’s original post, and a request that I lay out my concerns about invoking meritocracy. The concerns I expressed are more about the outside world, how that world impacts Postgres, and how Postgres can impact the rest of the world.

I do think we can do more to create structures that encourage participation, the Commitfest being a great example of how to implement and succeed in the future. I’ve seen a few people step up and offer help in the last couple weeks, and I’ll encourage them in their work. And hopefully talk about their successes here.

What do we do next?

What I wanted to do was provoke a larger discussion about what we could be doing. I didn’t offer any particular solutions. I just asked that we think for a moment about what we might be able to do.

And that, magically, happened.

David Fetter asked: “Which of those barriers do you see as important to address first?”

I’d like to connect Postgres more with the people in regions that our community doesn’t yet reach.

So, I’ve put up a survey for people who live in high-population regions that our community doesn’t really serve at all – most of Africa and the Middle East.

Please take a moment and let us know how you use Postgres, and what ways the Postgres community can connect with you.

The plan over the next six months is to both find ways of getting Postgres experts to give talks in those regions, and to find ways of supporting more people who want to be advocates for Postgres.