May 31, 2013 postgresql jsconf, json, plv8, schema liberation 3 Comments

JSON, PLV8 and Postgres 9.3

I’m presenting a talk about JSON and PLV8 (available on speakerdeck or on dropbox) at the JSConf 5th year Family Reunion, in Amelia Island, FL today. My husband and I took a couple extra days vacation between the conference and the Memorial Day holiday to go birding. So, in the slide deck, I included pictures we took of several of the 40+ species we’ve seen.

The Great Florida Birding Trail starts on Amelia Island, and wow — there are so many amazing species, several of which are endangered, that we’ve seen in the short time we’ve been here. It’s hard to do justice to how beautiful the area is. There’s a wikipedia entry, and some strange politics that have preserved the natural areas nearby, like Cumberland Island.

Regardless, Amelia Island is a stunning backdrop for a conference. I’m a bit of a fish out of water here, but I’ve met so many people excited about the JSON datatype and what it means for their development environments. And, tons of people who wish I would have told them about this a year ago when Postgres 9.2 first started supporting it!

My talk is at 10:30am, right after my fellow Mozillian and creator of JavaScript Brendan Eich presents a talk about the future of browser VMs.

May 24, 2013 postgresql 2 Comments

The People of Postgres: Tom Lane

This post was originally posted on Medium, a new blogging platform made up mostly of people who aren’t necessarily subscribed to Planet. So, please forgive the obvious statements, as the target audience are people who don’t know very much about Postgres. Tom Lane, taken by Oleg Bartunov

Wednesday May 23, with no fanfare, Tom Lane’s move to Salesforce.com was made public on the Postgres developer wiki.

For 15 years, Tom has contributed code to Postgres, an advanced open source relational database that started development around the same time as MySQL but has lagged behind it in adoption amongst web developers. Tom’s move is part of a significant pattern of investment by large corporations in the future of Postgres.

For the past few years, Postgres development has accelerated. Built with developer addons in mind, things like PLV8 and an extensible replication system have held the interest of companies like NTT and captured the imagination of Heroku.

Tom has acted as a tireless sentry for this community. His role for many years, in addition to hacking on the most important core bits, was to defend quality and a “policy of least surprise” when implementing new features.

Development for this community is done primarily on a mailing list. Tom responds to so many contributor discussions that he’s been the top overall poster on those mailing lists since 2000, with over 85k messages.

Really, he’s a cultural touchstone for a community of developers that loves beautiful, correct code.

Someone asked: “What does [Tom’s move] mean for Postgres?”

You probably don’t remember this:

Salesforce.com bases its entire cloud on Oracle database,” Ellison said, “but its database platform offering is PostgreSQL. I find that interesting.

When I read that last October, I was filled with glee, quickly followed by terror. I love my small database community, my friends and my job. What if Oracle shifted its attention to our community and attacked it, directly? So far, that hasn’t happened.

Instead, Salesforce advertised they were hiring “5 new engineers…and 40 to 50 more people next year” for a “huge PostgreSQL project.”

Tom’s move probably won’t change much for the day-to-day operation of Postgres itself. Hopefully, things are about to get real at Salesforce.

I’m a major contributor to Postgres. I started in 2006, learning about relational databases through work at a small bike parts manufacturer and ERP. My contributions include code, starting conferences, encouraging user group leaders and introducing Postgres to communities that otherwise would never hear from us. I’m a data architect at Mozilla.

May 20, 2013 postgresql

Distributed databases: a series of posts including 2-phase commit in Postgres

There’s a fantastic set of blog posts about distributed databases and network partitioning, starting with this post explaining the perils of trying to “communicate with someone who doesn’t know you’re alive.”

The next post is about Postgres and 2-phase commit. And there are four additional posts in the series.

The whole series worth reading for anyone interested in data stores, consistency and Postgres! 🙂

May 17, 2013 postgresql 4 Comments

Migrations with Alembic: a lightspeed tour

I’ve got a Beer & Tell to give about alembic. Alembic is a migration tool that works with SQLAlchemy. I’m using it for database migrations with PostgreSQL.

So, here’s what I want to say today:

Written by SQLAlchemy wiz Mike Bayer
Here’s the tutorial. Socorro is now using alembic in production with SQLAlchemy 0.6.x. I’m hoping to get us upgraded to 0.8.x soon.
Here’s what running an upgrade in production for Socorro looks like. Awesome right?
Here’s what a migration looks like.
Here’s a configuration file.
Generating a migration from the command line might look something like:
alembic revision -m "bug XXXXXX Add a new table" --autogenerate

The most difficult thing to deal with so far are the many User Defined Functions that we use in Socorro. This isn’t something that any migration tools I tested deal well with.

Happy to answer questions! And I’ll see about making a longer talk about this transition soon.

May 10, 2013 python 5 Comments

TIL: Formatting, search_path and colorcolumn

The last six months have involved a lot more writing of code than the previous couple of years.

I’ve been tweeting little things I learn on a daily basis and thought I’d look back on this week.

format()

A reocurring problem with report writing is getting numbers formatted properly for the occassion. I discovered ‘format’ in Python this week:

print "{0:.2f}%".format(float(1)/3 * 100)

That prints out a float to 2 decimal places. I looked around and Dive Into Python has similar syntax, but without the format() function. So, the equivalent would be:

print "blah %.2f" % (float(1) / 3 * 100)

So, why use one over the other? A user on StackOverflow suggested that compatibility with 2.5 might drive a person to use ‘%’ over ‘format()’, but otherwise, the poster suggested that format() is the cleaner looking and more flexible choice.

set search_path = bixie

I’m working on a new schema for a project. We’re rolling out a prototype quickly, so we’re going to house it in our existing production database for now. To keep things easy to clean up, Laura suggested that we put things into a separate schema. For managing our database models, I’ve switched to using SQLAlchemy, and also alembic for migrations. This made it super easy to specify that I wanted all the Bixie related tables in their own schema:

class BixieCrash(DeclarativeBase):                                              
    __table_args__ = {'schema': 'bixie'}                                        
    __tablename__ = 'crashes'

And that was it.

Then, to avoid having to add ‘bixie.’ to all the table paths in test queries, I put this command into the tests:

 cursor.execute(""" SET search_path TO bixie """)

I imagine there are some other ways to handle this. We’re not really using the ORM for anything other than schema loading, so I’ll probably add that to our connection initialization code for the new app. Then developers can write their queries as without any concerns about being in the correct schema.

And I’ll glow just a little bit about deploying alembic on stage!

set colorcolumn=80

I’ve been trying to write prettier Python. Today’s micro-effort was figuring out how display a vertical line to tell me when I exceed the 80 character width. The proper command to add to .vimrc is:

:set colorcolumn=80

Which looks something like:

Selena Deckelmann's blog about open source and working at Mozilla.

Monthly Archives: May 2013

More about JavaScript and PostgreSQL