Migrations with Alembic: a lightspeed tour

I’ve got a Beer & Tell to give about Alembic, a migration tool that works with SQLAlchemy. I’m using it for database migrations with PostgreSQL.

So, here’s what I want to say today:

- Written by SQLAlchemy wiz Mike Bayer
- Here’s the tutorial. Socorro is now using alembic in production with SQLAlchemy 0.6.x. I’m hoping to get us upgraded to 0.8.x soon.
- Here’s what running an upgrade in production for Socorro looks like. Awesome right?
- Here’s what a migration looks like.
- Here’s a configuration file.
- Generating a migration from the command line might look something like:
  alembic revision -m "bug XXXXXX Add a new table" --autogenerate

The most difficult thing to deal with so far is the many User Defined Functions (UDFs) that we use in Socorro. None of the migration tools I tested handle these well.

Happy to answer questions! And I’ll see about making a longer talk about this transition soon.

4 thoughts on Migrations with Alembic: a lightspeed tour


Adrian Klaver · May 18, 2013 at 12:07 pm ·

I would be interested in a longer talk/post on Alembic. I have started reading up on it, like what I see, and could use a primer. If I follow the above correctly, UDFs are not currently covered by Alembic, so they need to be dealt with separately?

selena · May 19, 2013 at 10:51 am ·

That’s correct. My choice was to create a subdirectory that has one file per UDF, so when we change a UDF we get useful diffs. I then write a simple loop in a migration to load the changed files. The unfortunate thing is that rollbacks are slightly more complicated: to get the old version of a UDF, you’d need to check out an older version of the repo. Our deployment system could actually accommodate this; I’d just need a symlink to the previously deployed version. There’s some devil in the details there… Someone suggested that we just rename old UDFs based on the hash of the revision. I like that and may try it out. I still need to figure out how and when to drop the old functions once we’ve confirmed everything is working.
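A minimal sketch of the "one file per UDF, loaded by a simple loop" idea, not the actual Socorro code. The directory layout, the file names, and the `load_udfs` helper are all hypothetical; the only Alembic piece assumed is `op.execute`, which accepts a SQL string and would be passed in as the `execute` callable.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def load_udfs(udf_dir: Path, filenames: Iterable[str],
              execute: Callable[[str], object]) -> List[str]:
    """Run each UDF definition file through `execute`, in the order given.

    Each file is assumed to hold one CREATE OR REPLACE FUNCTION statement.
    Returns the file names that were loaded.
    """
    loaded = []
    for filename in filenames:
        sql = (udf_dir / filename).read_text(encoding="utf-8")
        execute(sql)
        loaded.append(filename)
    return loaded


# Inside an Alembic migration this might be used as follows
# (hypothetical path and file name):
#
#     from alembic import op
#
#     def upgrade():
#         load_udfs(Path("sql/udf"), ["update_adu.sql"], op.execute)
```

Passing the executor in as an argument keeps the loop independent of Alembic, so it can be exercised against a plain list or a test database before it goes into a migration.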
- Adrian Klaver · May 20, 2013 at 5:53 pm ·
  
  Interesting. One idea, one question
  
  Idea:
  Your description of wrestling with UDF diffs got me thinking about something I ran across while digging into Mercurial: MQ, which in turn is built on Quilt. At this point I have only surveyed it from the 10,000 ft level. As I understand it, both allow one to maintain a queue of patches in parallel with the commit history and to apply, change, and roll back patches as needed, with the option of applying the patches as commits and making them a permanent part of the history at a time of your choosing. Not sure whether this would work in your case, but I am now motivated to check it out for my own use.
  
  Question:
  I am not sure about the “drop the old functions” statement. Where are you dropping the function? Old and new in the version-controlled code is a point-of-view issue, i.e., it depends on what revision you are on. In the database, the function would be the new version if the migrations had been applied. Now, it is entirely possible I am being dense and am missing something obvious.
  - selena · May 21, 2013 at 1:43 pm ·
    
    So, regarding rollback of an upgraded UDF: the reason this is an issue is that once you check out a new version of the UDF in git, the old version is no longer visible. And you might say “well, just check out the old git version to revert!”
    
    But that’s not possible.
    
    Because… we are talking about a deployed version of code that is shipped to the database — the same version of the code that is shipped to every other system in our 50+ node environment.
    
    In order to get access to the old UDF versions that are in revision control, I’d need to find a way to get a previously deployed version of the application.
    
    The workaround is to keep an old version of the UDF stashed in the database — under an assumed name, basically 🙂
    
    Anyway, clearly I should write more about this problem and explain exactly what I mean. It is a complicated issue and one that not a lot of people ever even run into at this point.
    
    But, as more people choose to use things like PLV8, I think it will be more important to have a good workflow documented.
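The "assumed name" workaround from this thread (keeping the old UDF in the database under a name derived from the revision hash) could be sketched roughly as below. This is an illustration under assumptions, not a tested workflow: the helper names, the example function `update_adu`, and the revision id are hypothetical, and the sketch assumes unqualified function names and an argument signature that does not change between versions. The generated statements use PostgreSQL's `ALTER FUNCTION ... RENAME TO` and `DROP FUNCTION IF EXISTS`.

```python
from typing import List


def stashed_name(function_name: str, revision: str) -> str:
    """Name under which the old UDF is kept, e.g. 'update_adu_3e872d1a'."""
    return f"{function_name}_{revision}"


def stash_sql(function_name: str, arg_types: str, revision: str) -> str:
    """Upgrade step: move the installed UDF aside before loading the new one."""
    old = stashed_name(function_name, revision)
    return f"ALTER FUNCTION {function_name}({arg_types}) RENAME TO {old};"


def rollback_sql(function_name: str, arg_types: str, revision: str) -> List[str]:
    """Downgrade step: drop the new UDF, then restore the stashed copy.

    Assumes the new version has the same argument types as the old one.
    """
    old = stashed_name(function_name, revision)
    return [
        f"DROP FUNCTION IF EXISTS {function_name}({arg_types});",
        f"ALTER FUNCTION {old}({arg_types}) RENAME TO {function_name};",
    ]


def cleanup_sql(function_name: str, arg_types: str, revision: str) -> str:
    """Later step: remove the stashed copy once the upgrade is confirmed."""
    old = stashed_name(function_name, revision)
    return f"DROP FUNCTION IF EXISTS {old}({arg_types});"


if __name__ == "__main__":
    # Hypothetical function, signature, and revision id.
    print(stash_sql("update_adu", "date", "3e872d1a"))
    for statement in rollback_sql("update_adu", "date", "3e872d1a"):
        print(statement)
```

Each string could be handed to `op.execute` in a migration's `upgrade()` or `downgrade()`. The sketch does not settle when the cleanup step should run, which the thread leaves open.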