Migrations with Alembic: a lightspeed tour

I’ve got a Beer & Tell to give about alembic. Alembic is a migration tool that works with SQLAlchemy. I’m using it for database migrations with PostgreSQL.

So, here’s what I want to say today:

The most difficult thing to deal with so far are the many User Defined Functions that we use in Socorro. This isn’t something that any migration tools I tested deal well with.

Happy to answer questions! And I’ll see about making a longer talk about this transition soon.

TIL: Formatting, search_path and colorcolumn

The last six months have involved a lot more writing of code than the previous couple of years.

I’ve been tweeting little things I learn on a daily basis and thought I’d look back on this week.

format()

A reocurring problem with report writing is getting numbers formatted properly for the occassion. I discovered ‘format’ in Python this week:

print "{0:.2f}%".format(float(1)/3 * 100)

That prints out a float to 2 decimal places. I looked around and Dive Into Python has similar syntax, but without the format() function. So, the equivalent would be:

print "blah %.2f" % (float(1) / 3 * 100)

So, why use one over the other? A user on StackOverflow suggested that compatibility with 2.5 might drive a person to use ‘%’ over ‘format()’, but otherwise, the poster suggested that format() is the cleaner looking and more flexible choice.

set search_path = bixie

I’m working on a new schema for a project. We’re rolling out a prototype quickly, so we’re going to house it in our existing production database for now. To keep things easy to clean up, Laura suggested that we put things into a separate schema. For managing our database models, I’ve switched to using SQLAlchemy, and also alembic for migrations. This made it super easy to specify that I wanted all the Bixie related tables in their own schema:

class BixieCrash(DeclarativeBase):                                              
    __table_args__ = {'schema': 'bixie'}                                        
    __tablename__ = 'crashes'

And that was it.

Then, to avoid having to add ‘bixie.’ to all the table paths in test queries, I put this command into the tests:

 cursor.execute(""" SET search_path TO bixie """)

I imagine there are some other ways to handle this. We’re not really using the ORM for anything other than schema loading, so I’ll probably add that to our connection initialization code for the new app. Then developers can write their queries as without any concerns about being in the correct schema.

And I’ll glow just a little bit about deploying alembic on stage!

set colorcolumn=80

I’ve been trying to write prettier Python. Today’s micro-effort was figuring out how display a vertical line to tell me when I exceed the 80 character width. The proper command to add to .vimrc is:

:set colorcolumn=80

Which looks something like:

colorcolumn in action

Keynote at OSDC.tw: What Beginners Teach Us

I returned from Taiwan on Monday after a long weekend at OSDC.tw.

I gave a keynote called “What beginners teach us“. Audrey Tang graciously translated the slides for me.

The talk was a bit short – about 30 minutes, which left nearly 15 minutes for questions. So many people asked questions about resources for teaching their families and children Python. My first recommendation was Python Programming: An Introduction to Computer Science. The reason I suggested this was the clearly articulated learning objectives at the start of each chapter.

What we really need, though, is a short list of books oriented toward the different kinds of relationships programmers have (parents, grandparents, colleagues, lovers, children, etc.). Each relationship feels a bit different in terms of what will motivate the person, why the programmer is seeking to educate them and what ultimately will be useful about programming or computational thinking in their lives. I’d appreciate any books or resources you’ve used!

Anyway, the questions were wonderful, and I’m looking forward to the video being posted.

UPDATE: Video is now posted!

WoFOSS

I met members of WoFOSS and chatted about starting a PyLadies Taiwan chapter! WoFOSS has been around at least since 2010, and they have monthly meetings in Taipei. About 70 Taiwanese women are involved, and they are hackers from all kinds of different FOSS communities.

Great hacker talks

I also saw some amazing talks, including one by first-time speaker Andrew Cole, who also publishes a zine in Seattle about pinball. His presentation was a tour of Rosetta Code, which translates bits of code between lots of different languages. He linked to the fabulous “chef” language (not the configuration management tool).

The conference was full of amazing hacks, like this Open Office piano-player (and many more audio/pictograph hacks) from imacat. This hack used OO Calc to create a playable piano, and another sheet plays a Christmas tune that reveals an image of Santa flying reindeer.

More fun hacks came from Yusuke Kawasaki. His talk was a very funny set of demos using iPhones and a very simple IR transmitter, made from extremely cheap parts and Sguru. He later demo’d a two-iPhone camera and remote control system for a remote control dump trunk at the hackathon.

On day two, Audrey Tang gave an amazing demo of making an open source dictionary using Postgres and Node out of documents available only in Excel from the government. I was inspired.

I highly recommend OSDC.tw, which typically happens in April every year.

Tech literacy and learning to code for girls in middle and high school in Portland, OR

A friend asked about programs suitable for a 10 year old and a 14 year old girl in the Portland area.

Here’s what I came up with:

As far as things that are already underway: http://www.chicktech.org/ has the most stuff for the 14-year-old

Looks like it happened in January: http://www.chicktech.org/participants/workshops/

Next there’s FreeGeek: http://www.freegeek.org/

http://www.freegeek.org/volunteer/

Their “adoption” program is interesting because you learn how to put together your own computer. Highly recommended.

There’s online programming courses offered through ORVED: http://www.orved.org/

What did I miss?

From twitter:

PostgreSQL security releases now available: versions 9.2.4, 9.1.9, 9.0.13 and 8.4.17

PostgreSQL Global Development Group has just released updates for all currently supported versions of PostgreSQL.

From the release announcement:

The PostgreSQL Global Development Group has released a security update to all current versions of the PostgreSQL database system, including versions 9.2.4, 9.1.9, 9.0.13, and 8.4.17. This update fixes a high-exposure security vulnerability in versions 9.0 and later. All users of the affected versions are strongly urged to apply the update immediately.

A major security issue fixed in this release, CVE-2013-1899, makes it possible for a connection request containing a database name that begins with “-” to be crafted that can damage or destroy files within a server’s data directory. Anyone with access to the port the PostgreSQL server listens on can initiate this request. This issue was discovered by Mitsumasa Kondo and Kyotaro Horiguchi of NTT Open Source Software Center.

I wanted to highlight a couple things from the FAQ we developed for this release.

  1. There are no known exploits for the major security issue fixed by this release. The vulnerability was discovered through security testing conducted by NTT.
  2. Only users of 9.0 PostgreSQL and higher are affected by the major vulnerability.
  3. Affected users are those who allow unrestricted access to the network port PostgreSQL listens on. If you allow anyone, without IP address whitelisting, firewalling or some other kind of network-based access control, to connect to your network port, you are especially vulnerable.

Upgrading from minor version (9.2.3 to 9.2.4, for example) only requires that you install the new binaries and then restart PostgreSQL.

Additionally, if you are using GiST indexes, read the detailed notes in the release announcement to see if you are using features that require you to REINDEX your GiST indexes.

Please update as soon as possible!

Many thanks to our volunteer packagers who worked hard for the past several weeks to make this release possible. All PostgreSQL software releases are managed by volunteers.

About high school computer science teachers

I’m giving a talk at PyCon next Saturday about teachers. The title is “What teachers really need from us“.

The first thing I should admit is that when I started thinking about this talk, I was sure that the list of what teachers needed from us was really long.

Then, I started actually talking with teachers.

So, here’s what some of them have said:

  • Reading comprehension is the biggest barrier to completion of AP Computer Science (Page 8 of this AP CS course description)
  • Fighting for continued existence is the biggest battle for a computer science teacher every year. “The number of secondary schools offering introductory computer science courses dropped 17 percent from 2005 to 2009 and the number offering Advanced Placement (AP) computer science courses dropped 35 percent in that time period.” December 2010 report
  • Writing personal letters from a teacher to students and parents increased the number of girls in one teacher’s class (in Virginia) from nearly zero to 50%. Research into increasing the number of women and minorities in CS classrooms is available in Stuck in the Shallow End.
  • Students at a high school learned three languages in three years. (C++, Java and Python) This busted so many notions I had about how long learning to program takes or what languages are most appropriate for beginners.
  • Kids don’t need algebra to learn to program. Algebra is a weeder course, often a prerequisite to CS and one that strongly indicates whether or not a student will graduate high school. What if kids could take an “algebra on computers” course instead of failing out of school? Please note, learning to program is not the same thing as being a professional programmer.
  • School counselors who help kids choose classes still send students to CS class believing that they’re going there to learn to type. Find out more about the wildly varying understanding state-by-state of what a computer science class really is in the Running on Empty report.
  • What teachers wanted from me was for me to come to their classes and give a short talk to their kids about myself and my work.
  • Teachers were super excited to hear about PyLadies. They struggle to get girls into their classes and are all looking for ways to increase the diversity of their classes.
  • The CS teachers I’ve met want to share their lessons – with me and with other teachers.
  • The CS teachers I’ve met don’t know other CS teachers.
  • Teachers were only mentioned once in the 84 initial statements of support for code.org

I think we’re all really missing out when we don’t talk to teachers.

I’ve talked directly with nine computer science teachers. Most of them are in Oregon, but I also was introduced to a couple teachers who came to Python-related conferences, or were married to Python programmers. I’m hoping to meet more. If you know someone, please put them in touch with me. I’m happy to chat over the phone or email, and love to meet folks in person.

Why PyLadies?

pyladies_blue

Hey Hacker News! If you’re coming here for the first time, you may also be interested in What I mean when I say I would like more women in the software industry

PyLadies is a group of women working on welcoming, encouraging and directly inviting women to join the Python community and to learn from each other.

I started a PyLadies chapter in Portland, OR last September (2012). We started out with weekly meetings to do homework from a Coursera class to make games with Python. That turned into weekly meetings — plus homework meetups on Saturdays at a local coffee shop, and IRC hangout time to test homework. And that turned into me giving mini-lessons at each Coursera meetup about the material from the class.

People seemed really excited.

Stats - PyLadies PDX (Portland, OR) - Meetup

Before we knew it, it was December, we had over 60 women subscribed to the Meetup, 30 of which had attended a meeting. Today, we’ve got 96 subscribers, 50 people have attended a meeting, and more have signed up to attend events in the future than ever before. And, it’s done by women. Using open source. Teaching classes. Learning developer tools. And writing software.

Since September, I’ve met even more women involved in running PyLadies chapters across the country. Much like the way the PostgreSQL community is organized, we’ve got a loosely connected group of people working independently. We offer support to each other, but don’t have hard and fast rules about what each chapter does. We encourage teaching and workshops, but don’t require them. We share our resources and are quick to put git repos out there of our materials. We send lots of pull requests. And we’re constantly looking for ways for women to get more involved in open source and Python.

All Group Reviews - PyLadies PDX (Portland, OR) - Meetup

I’m completely energized by the positive feedback we’ve gotten for every meeting. More recently, I’ve heard from people that they feel confident and sure of their knowledge because they’ve spent time in our meetups talking and learning from other women.

My goal is to make every get together like that – by having great lessons, a shared understanding of coaching and peer-based education and presentations from our members. Building these groups takes time, and I’m impatient to get to the part where I feel like every interaction with the group is rewarding for every member.

And I can’t do it alone. We’ve got four meetup organizers (although one is about to relocate to the Bay Area!). I work closely with Flora Worley, a kickass developer who chose programming as a career path after working on a PhD, on topic details and planning for the meetings. I’m so looking forward to meeting in person with the many members of the PyLadies community at PyCon next month.

Recent talks: How to get a job like mine, Command-line essentials, Restore FTW

Here are a few of the talks I’ve given recently here in Portland. I’m trying to give more talks locally, and happy to speak at your Portland User Group. Just drop me an email.

  • How to get a job like mine. This talk was given to PSU students as an encouragement for them to get involved in free and open source software. Toward the end, we did a brainstorming session on the reasons why they didn’t contribute, and tried to come up with projects each person in the audience might be interested in learning more about
  • Command-line II. I’m writing up my notes from this talk, hopefully to turn it into a real tutorial that others could copy. My goal this year is to give a tutorial every other week, and I’m hoping to have at least 10 lessons come out of that work. It seems like I need to give each lesson twice to really get the hang of it. Which means I aught to get out of this experience with 26 lessons… but trying to stay realistic about my time.
  • Restores FTW at PDXPUG. This talk is about backups for PostgreSQL and how to get your teams to come up with restore plans that exercise databases as part of normal operations. I’m trying to switch talks about Backups to being talks about Restores. The next time I give this, I think I’ll change the order of the “restore patterns” to be at the start of the talk, and the discussion about planning for backups/restores to the end. I plan to do a Mozilla brownbag that covers these topics and also goes through a live demo of backing up, restoring and testing PostgreSQL with the new 9.2 tools.

WebTools workweek, start of a symbols database, Kasturba Ghandi

I came across a comment from Sumana saying that she’d like to hear more about the day-to-day life of our fellow FLOSS women. So here’s a run-down of my past week:

Mozilla WebTools team workweek

Mozilla teams hold work weeks from time to time – to get the team together, to experiment with new ideas and in our case, to meet up with a couple other teams (Marketplace and AMO, plus a couple extra folks we work a lot online with, but don’t see very often). I did my normal nerd-out things like making a spreadsheet of all the names and silly intro comments people made on the first day, and I setup and deployed backup scripts to a new 5TB backup server that’s just for crash-stats.mozilla.com’s PostgreSQL database.

There were a few projects on the table to deep-dive into: support for JSON datatypes, creating a symbols database-backed system to replace our filesystem-based one, and work a bit on replacing the SQL-file migration system in Socorro with a SQLAlchemy one.

Symbols database and Range Types

I ended up focusing on the symbols database because Ted, one of our breakpad experts, was around and very generously walked me through what we needed. I have a rough schema in place, and a plan for setting up a few systems to house what will likely be a 1TB database.

In working on this, I spent some time learning more about how to apply range types. The queries for finding symbols are mostly “show me the functions that contain the memory address I have”. Functions all have start addresses and a size, so running “contains” queries makes a lot of sense. In my initial tests, queries using the range types were about 60% faster than queries using plain integer types.

When we’ve got a larger data set to work with imported, I will post some detailed numbers about the in-database comparisons, as well as any performance improvements we’ll get from querying a database instead of loading the plain-text symbols files

Getting JSON files to describe builds and releases

A small project I’ve been working on is getting JSON files produced to describe our builds. Before I go on — please know that this is pretty obscure. The people who are concerned about this information are mostly people who identify crashes and track down which releases are affected by particular bugs. What we keep are things like what platform (Linux, Mac OS X, Windows), what day a release occurred on, whether the release was a beta or not and a few other things.

The way that we got this information in the past was by deriving it from filenames and directory names in our release FTP server. The code to pull this information out is kind of a pain, and if anyone changes a directory name (for a good reason, or on accident..), this code breaks.

It would be much better if we had a way of getting this information in a standardized format. I recently talked to B2G about putting this information into a JSON file (they already were publishing release information via the manifest directory on our FTP server in XML, so it wasn’t too big of a leap). I thought it would be nice to spread this practice to our other software releases.

As luck would have it, a person familiar for Firefox builds is in Mountain View and was giving Ted a ride to the airport! So, just as they were about to leave, we chatted about the problem, created a bug and now I’m going to get build and release information from a JSON file. :)

It’s a tiny change, and hopefully won’t take very long to make, but is going to make getting this information much more pleasant and reliable.

Reading about Kasturba Gandhi

I decided to read a real paper book on my flights last week, and picked up a copy of “The Forgotten Woman”, a biography of Kasturba Gandhi, wife of Mahatma Gandhi. Arun Gandhi visited the University of Oregon in the 90s, and my husband had picked up a signed copy.

I’m having a hard time summing up the book. There were a number of things that surprised me. I hadn’t realized that illiteracy for women was so prevalent at the turn of the 20th century in India. I also wasn’t aware of the focus Mahatma Gandhi had on women’s role in political transformation, or how much he had attributed the origin of Satyagraha to Kasturba. Also, this biography attributed Gandhi’s vow of celibacy to Kasturba’s near death after the birth of her fifth child. Kasturba also led an important self-reliance movement, urging women in India to learn to spin and weave their own cloth, rather than buying foreign goods. She also led an effort to teach hygiene to Indigo farming families.

I had a look at the wikipedia page for her, which had no citations and not very well written. I’ve started some work on it, but need to think a bit more about how it should be structured.