Inheritance and sharding with Postgres

A friend told me about their sharding scheme last night, and it made me very curious about how others are handling this problem. This question about database design turns into a devops issue, so it’s really something the entire development group, devops and DBAs need to be aware of and concerned about. And it’s not a problem exclusive to Postgres.

Going from Vagrant and Puppet into EC2: A short survey of 5 tools (and two I didn’t bother trying)

I thought this would be easy.

I started using Vagrant, and was productive with it in about a day. Really a couple hours. Most of my time was spent downloading the correct version of VirtualBox, looking for starter images and then a small amount of time experimenting with the Vagrantfile scripting language (for multiple VMs).

And we made some Puppet configs.

Broken windows, broken code, broken systems

A few days ago, I asked:

I spend a lot of time thinking about the little details in systems – like the number of ephemeral ports consumed, number of open file descriptors and per-process memory utilization over time. Small changes across 50 machines can add up to a large overall change in performance.

And then, today, I saw this article:

One of the more telling comments I received was the idea that since the advent of virtualization, there’s no point in trying to fix anything anymore. If a weird error pops up, just redeploy the original template and toss the old VM on the scrap heap. Similar ideas revolved around re-imaging laptops and desktops rather than fixing the problem. OK. Full stop. A laptop or desktop is most certainly not a server, and servers should not be treated that way. But even that’s not the full reality of the situation.

I’m starting to think that current server virtualization technologies are contributing to the decline of real server administration skills.

There definitely has been a shift – “real server administration skills” are now more about packaging, software selection and managing dramatic shifts in utilization. It’s less important to know exactly how to manage M4 with sendmail, and more important that you know you should probably use postfix instead. I don’t spend much time convincing clients that they need connection pooling; I debug the connection pooler that was chosen.

The available software for web development and operations is quite broad – the version of Linux you select, whether you are vendor supported or not, and the volume of open source tools to support applications.

Inevitably, the industry has shifted to configuration management, rather than configuration. And, honestly, the shift started about 15 years ago with cfengine.

Now we call this DevOps, the idea that systems management should be programmable. Burgess called this “Computer Immunology”. DevOps is a much better marketing term, but I think the core ideas remain the same: Make programmatic interfaces to manage systems and automate.

But, back to the broken window thing! I did some searching for development and broken windows and found that in 2007, a developer talked about Broken Window Theory:

People are reluctant to break something that works, but not so much when it doesn’t. If the build is already broken, then people won’t spend much time making sure their change doesn’t break it (well, break it further). But if the build is pristine green, then they will be very careful about it.

In 2005, Jeff Atwood mentioned the original source, and said “Maybe we should be sweating the small stuff.”

That stuck with me because I admit that I focus on the little details first. I try to fix and automate where I can, but for political or practical reasons, I often am unable to make the comprehensive system changes I’d like to see.

So, given that most of us live in the real world where some things are just left undone, where do we draw the line? What do we consider a bit of acceptable street litter, and what do we consider a broken window? When is it ok to just reboot the system, and when do you really need to figure out exactly what went wrong?

This decision-making process is often the difference between a productive work day and one filled with frustration.

The strategies that we use to make this choice are probably the most important aspects of system administration and devops today. There, of course, is never a single right answer for every business. But I’m sure there are some themes.

For example:

James posted “Rules for Infrastructure” just the other day, which is a repost of the original gist. What I like about this is that they are phrased philosophically: here are the lines in the sand, and the definitions that we’re all going to agree to.

Where do you draw the line? And how do you communicate to your colleagues where the line is?

Forgetting: Logging as an ethical choice

I have kind of a weird idea for a database person.

Forgetting should be built into our applications by default. I just spent the weekend at FooCamp, and I held a session to discuss this idea, and some of the possible consequences if it were implemented.

To explain why I think this, I’m going to take an extreme stance for a moment and argue a position that I’d like to see rebutted. So, please have at it! 🙂

For too long we have allowed decisions made by developers – default application settings – to determine what ultimately become surveillance levels.

There are notable counter examples: 4chan intentionally expires postings every few days. Riseup keeps no logs. The EFF documents what we do and do not legally need to keep. These, however, are the efforts of a tiny minority when considered against the rest of the web.
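To make “forgetting by default” a bit more concrete, here is a minimal sketch of what a default-expire store could look like, assuming a simple in-memory application. The names `ForgettingStore`, `ttl_seconds`, `keep` and the injectable `clock` are all hypothetical, made up for illustration; this is not how 4chan or Riseup actually implement anything.

```python
import time


class ForgettingStore:
    """In-memory store where records expire unless the owner opts in to keeping them."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._records = {}  # key -> (value, expires_at or None)

    def put(self, key, value, keep=False):
        # Expiry is the default; keeping a record forever is the explicit choice.
        expires_at = None if keep else self.clock() + self.ttl_seconds
        self._records[key] = (value, expires_at)

    def get(self, key):
        self.purge()
        record = self._records.get(key)
        return record[0] if record else None

    def delete(self, key):
        # The owner can always destroy a record, kept or not.
        self._records.pop(key, None)

    def purge(self):
        # Drop every record whose expiry time has passed; return how many were dropped.
        now = self.clock()
        expired = [
            key
            for key, (_, expires_at) in self._records.items()
            if expires_at is not None and expires_at <= now
        ]
        for key in expired:
            del self._records[key]
        return len(expired)
```

The design choice worth noticing is which argument is the default: `put` forgets unless someone passes `keep=True`, so retention has to be asked for rather than assumed. A real system would also have to deal with backups, replicas and logs, which this sketch ignores.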

Over time, our conception of what is reasonable has changed around logging and accounting for vast periods of our activities. Never before would a silly recording taken by a 15-year-old be stored indefinitely, and then be documented as a watershed event because of how many times it was viewed in a vast global network, rather than for the content of the cultural artifact itself. The log of views itself was the cultural artifact, and it is celebrated.

Fading away isn’t evil. But we act like it is when we pipe what once was ephemeral into indefinite storage.

Why have we decided to participate in this social experiment? It really wasn’t a collective decision. Some software developers and investors decided that archival on a massive scale was important or profitable. We started calling these things “part of history” and just storing them without thinking about it. Saving became default.

I’m not saying that archiving the internet, search robots or “opting in” are bad things. But those who least understand archiving’s effect on personal privacy may be the ones most likely to suffer in the future.

The ripple effects of the decision to move from “default expire” to “default save” are vast. Consider for a moment if we were to call the ability to intentionally forget on the internet a human right.

Instead, what we’ve done is to say to millions of people – you do not have the right to forget. Companies will take your locations and status updates, and never delete them. And privacy is rapidly becoming a privilege of those who can afford to buy it.

For the sake of argument, consider the difference between narrative historical documentation and collections of “facts.” The narrative is an aggregation, full of embellishments and forgetting and kernels of truth. Facts are collected, supposedly objectively. Both approaches to capturing historical thought suffer from the fallacy that historical “fact” is fixed and doesn’t evolve based on the viewer and reteller over time. How much worse is this effect when our collections of facts are now ballooning to include every blog post, photo, tweet and web access log you’ve ever made?

The point is not that individuals wish to change history or even obscure events which may reflect poorly on them. (Even though we all do!)

We need to give people a real choice – not a set of ACLs and rules. Choice about what is archived about them, control over that process and a clear delineation between personal artifact and public property.

Kathy Sierra deleted her Twitter stream and was accused of removing a piece of history, and possibly the worst internet offense – taking away conversations. Taken at face value, isn’t that the point of conversation? That it is ephemeral?

Conversations leave echoes in changed thoughts and light or deep impressions in the minds of the participants. Just because Twitter has by default chosen to retain these conversations indefinitely doesn’t change the nature of conversation itself. No one would argue that just because we share our thoughts, we are obligated to share every thought.

In the same way, we are not obligated to maintain a record of our sharing. And if we do maintain and share a record of our own end of a conversation, we still have the right to ultimately destroy it.

Once shared, of course, an artifact of a conversation can’t be taken away from those that have copies. But authors and owners of the original work must always retain the right to destroy.

So, that brings me to what is ethical in our applications. When we say: “we’re keeping your data forever” and “delete means your account will still be here when you come back”, application developers and companies are making an ethical choice. They are saying, “your shared thoughts aren’t your own – to remember or forget. We are going to remember all of this for you, and you no longer have the right to remove them.”

Connectedness is not the same as openness. Storing vast logs of data related to individuals which connect thousands of facts over the course of their lives should be presented as the ethical choice it is, rather than a technical choice about “defaults”. Picking what we decide to log and store is an ethical and political decision. And it should also be possible for it to be a personal decision.