Migrations with Alembic: a lightspeed tour

I’ve got a Beer & Tell to give about alembic. Alembic is a migration tool that works with SQLAlchemy. I’m using it for database migrations with PostgreSQL.

So, here’s what I want to say today:

The most difficult thing to deal with so far are the many User Defined Functions that we use in Socorro. This isn’t something that any migration tools I tested deal well with.

Happy to answer questions! And I’ll see about making a longer talk about this transition soon.

4 thoughts on Migrations with Alembic: a lightspeed tour

Comments are closed.

  1. I would be interested in a longer talk/post on Alembic. I have started reading up on it , like what I see and could use a primer. If I follow correctly the above, UDFs are not currently covered by Alembic, so they need to be dealt with separately?

    • That’s correct. My choice was to create a subdirectory that has one file per UDF. So, when we change UDFs, we get diffs that are useful. And I write a simple loop in a migration to capture loading the change. The unfortunate thing here is that rollbacks are slightly more complicated. To get the old version of the UDF, you’d need to check out an older version of the repo. Our deployment system actually could accommodate this — I’d just need a symlink to the previously deployed version. There’s some devil-in-details there… Someone suggested that we just rename old UDFs based on the hash of the revision. I like that and may try it out. Still need to figure out how and when to drop the old functions once we’ve confirmed everything is working.

      • Interesting. One idea, one question

        Your description of wrestling with UDF diffs got me thinking about something I ran across while digging into Mercurial, Mq, which in turn is built on Quilt. At this point I have only surveyed from the 10,000 ft level. As I understand it they both allow one to maintain a queue of patches in parallel with the commit history and apply, change and rollback patches as needed. With the option of applying the patches as commits and making them a permanent part of the history at the time of your choosing. Not sure whether this would work in your case, but I am now motivated to check it out for my own use.

        I am not sure about the drop the old function statement.. Where are you dropping the function? Old and new in the version controlled code is a point of view issue, i.e what revision you are on. In the database the function would be the new version if the migrations had been applied. Now it is entirely possible I am being dense and am missing something obvious.

        • So regarding rollback of an upgraded UDF — the reason this is an issue is that once you check out a new version of the UDF in git the old version is no longer visible. And you might say “well, just checkout the old git version to revert!”

          But that’s not possible.

          Because… we are talking about a deployed version of code that is shipped to the database — the same version of the code that is shipped to every other system in our 50+ node environment.

          In order to get access to the old UDF versions that are in revision control, I’d need to find a way to get a perviously deployed version of the application.

          The workaround is to keep an old version of the UDF stashed in the database — under an assumed name, basically :)

          Anyway, clearly I should write more about this problem and explain exactly what I mean. It is a complicated issue and one that not a lot of people ever even run into at this point.

          But, as more people chose to use things like PLV8, I think it will be more important to have a good workflow documented.