Everyday Postgres: Tuning a brand-new server (the 10-minute edition)

Server tuning is a topic that consumes many books, blog posts and wiki pages.

Below is some practical advice for getting the low-hanging fruit out of the way if you’re new to tuning Postgres and just want something that will likely work well enough on low-volume systems. Working through this list and making the changes on a new system should take 10 minutes or less.

Run pgtune

Greg Smith open sourced pgtune, a utility for making a first pass at tuning Postgres on a local system. The tool is easy to run: copy it to the target system and point it at your existing Postgres config. It writes out a new config file with its suggested changes appended at the very bottom.
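An invocation looks roughly like this (a sketch only; the exact flags and paths depend on your pgtune version and where your config lives):

    # generate a tuned copy of the existing config, then review it and swap it in
    ./pgtune -i /var/lib/pgsql/data/postgresql.conf \
             -o /var/lib/pgsql/data/postgresql.conf.pgtune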

Use XFS

Filesystem choice matters. Greg Smith goes into some detail on why ext3 is a terrible performance choice for a database filesystem in his talk Righting Your Writes. At this point, XFS should be your default choice. ext4, or ZFS if that’s an option for you, may be worth exploring, but XFS is the “safe” pick. Depending on your disk situation, recreating your filesystem might take a bit longer than 10 minutes, but hopefully this will save you time and bad performance in the future!
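If you’re building the data volume from scratch anyway, creating and mounting XFS is quick (a sketch; the device name and mount point below are placeholders for your own layout):

    mkfs.xfs /dev/sdb1
    mount -o noatime /dev/sdb1 /var/lib/pgsql/data
    # matching /etc/fstab entry:
    # /dev/sdb1   /var/lib/pgsql/data   xfs   noatime   0 0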

Increase your readahead buffer

On Linux, the readahead buffer (brief explanation) is set way too small for most database systems. Increase it to about 1 MB with blockdev --setra 2048 [device]; the value is in 512-byte sectors, so 2048 sectors is 1 MB.
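To check and change it (a sketch; /dev/sda is a placeholder for your data device):

    blockdev --getra /dev/sda        # current readahead, in 512-byte sectors
    blockdev --setra 2048 /dev/sda   # 2048 sectors * 512 bytes = 1 MB
    # the setting does not survive a reboot; re-apply it from an init or rc script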

For further performance analysis

I wrote this performance checklist a while back for assessing a system’s health. A review of all the items on that list would probably take half a day, and following up and making the changes could take a day or more. This kind of analysis is worth repeating periodically to make sure you haven’t missed important changes in your environment or your application over time.

My Postgres Performance Checklist

I am asked fairly frequently to give a health assessment of Postgres databases. Below is the process I’ve used and continue to refine.

The list isn’t exhaustive, but it covers the main issues a DBA needs to address.

  1. Run boxinfo.pl on a system
    Fetch the script from http://bucardo.org/wiki/Boxinfo. Run as the postgres user on the system (or a user that has access to the postgres config).
  2. Check network.
    What is the network configuration of the system? What is the network topology between database and application servers? Any errors?
  3. Check hardware.
    How many disks? What is the RAID level? What is the SLA for disk replacement? How many spares? What is the SLA for providing data to the application? Can we meet that with the hardware we have?
  4. Check operating system.
    IO scheduler set to ‘noop’ or ‘deadline’, swappiness set to 0 (http://www.pythian.com/news/1913/what-exactly-is-swappiness/). A quick way to check these, along with the filesystem settings in the next item, is sketched just after this list.
  5. Check filesystems.
    Which filesystem is being used? What parameters are used with the filesystem? Typical things: noatime, ‘tune2fs -m 0 /dev/sdXY’ (get rid of root-reserved space on the database partition), readahead set to at least 1 MB (8 MB might be better).
  6. Check partitions.
    What are the partition sizes? Are the /, pg_xlog and pgdata directories separated? Are they of sufficient size for production, SLAs, error management, backups?
  7. Check Postgres.
    What is the read/write mix of the application? What is our available memory? What are the anticipated transactions per second? Where are stats being written (tmpfs)?
  8. Check connection pooler.
    Which connection pooler is being used? Which system is it running on? Where will clients connect from? Which connection style (single statement, single transaction, multi-transaction)?
  9. Backups, disaster recovery, HA
    Big issues. Must be tailored to each situation.
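For items 4 and 5 above, a quick inspection looks something like this (a sketch; device names and mount points are placeholders for your own system):

    cat /sys/block/sda/queue/scheduler   # active IO scheduler is shown in brackets
    sysctl vm.swappiness                 # should be 0 on a dedicated database host
    sysctl -w vm.swappiness=0            # persist the change in /etc/sysctl.conf
    mount | grep /var/lib/pgsql          # confirm noatime is among the mount options
    tune2fs -m 0 /dev/sdXY               # ext filesystems only: reclaim root-reserved space
    blockdev --getra /dev/sdXY           # readahead, in 512-byte sectors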

What’s your checklist for analyzing a system?

FSM, visibility map and new VACUUM awesomeness


Heikki Linnakangas, listening as Simon Riggs sketches on the chalkboard.

Update: Heikki’s slides are here!

Heikki Linnakangas gave a presentation this past Sunday at FOSDEM about the improved free space map (FSM), which tracks unused space inside the database, and the new visibility map, a bitmap that indicates which data pages can be skipped during a partial VACUUM. This performance enhancement will affect all users of the upcoming 8.4 release. You can see what the new FSM implementation looked like back in October on depesz’s blog.

Despite Heikki’s modest claim during the talk that the performance tests were inconclusive, the consensus among Postgres contributors is that this feature will result in a substantial improvement in the performance of VACUUM for tables that are large but have few UPDATEs.

The new free space map and visibility map (in 8.4) and autovacuum (enabled by default starting in version 8.3) are huge administrative usability improvements to version 8 of Postgres. Prior to version 8.1, VACUUM had to be scheduled outside of the database system. Autovacuum has been part of the core Postgres distribution for over two years, and is tunable via several global configuration parameters.
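As an illustration, the global autovacuum knobs live in postgresql.conf; the values shown here are just the stock defaults in recent 8.x releases, not recommendations:

    autovacuum = on
    autovacuum_naptime = 1min                # time between autovacuum runs on a database
    autovacuum_vacuum_threshold = 50         # minimum row changes before a table is vacuumed
    autovacuum_vacuum_scale_factor = 0.2     # plus this fraction of the table's rows
    autovacuum_analyze_scale_factor = 0.1
    autovacuum_max_workers = 3               # 8.3 and later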

The visibility map enables partial VACUUMs — meaning that VACUUM no longer has to examine every tuple to update the FSM. The new FSM implementation eliminates two configuration parameters, effectively automating a formerly manual configuration process.

The new FSM is stored on disk in separate files inside of $PGDATA/base/, and is cached in shared_buffers. The result is that the max_fsm_* configuration parameters are gone in 8.4 — Postgres is able to track and adjust this data structure without user intervention.
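If you want to see it for yourself, the FSM forks sit on disk next to the relation files they describe (a sketch; the directory and file names are OIDs and will differ on your system):

    ls $PGDATA/base/16384/ | grep _fsm
    # 16410_fsm
    # 16413_fsm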

A few critical features of the new FSM are:

* Now a binary tree structure
* Constructed using 1 byte per heap page
* The top level shows the maximum amount of contiguous space available
* The data structure is auto-repairing and can be reconstructed from the bottom

Previously, every time that VACUUM was run, the free space map had to be reconstructed from scratch. Now, individual nodes in the map may be updated (aka “retail” updates).

The visibility map is a bitmap of heap pages that tracks which pages contain only tuples visible to all transactions, and which therefore have nothing that needs VACUUMing.

Previously, when VACUUM ran, it *had* to look at every tuple in a table, because there was no information about which pages had changed since the last VACUUM. With the visibility map, VACUUM can now perform partial scans of table data, skipping pages which are marked as fully visible. Partial scans mean fewer I/O operations for VACUUM, and happier database administrators.
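If you want to watch this from psql, VACUUM VERBOSE reports what each run actually did to a table; how much detail you get about scanned versus skipped pages varies by release (a sketch; “mytable” is a placeholder):

    VACUUM VERBOSE mytable;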