- michael glaesemann on visualizing data about postgres. 7 places for data: analyze, host, bloat, dtrace/systemtap, logs, contrib, stats #pgcon09j #
- @Mettadore yes, definitely. in reply to Mettadore #
- listening to @stefankaltenb talk about postgres project infrastructure team and projects #pgcon09j #
- listening to Greg Stark talk about forensic analysis of corrupted postgresql databases #pgcon09j #
- most common failure reported on the postgres mailing lists is bad memory #pgcon09j #
- common errors: fsync=off, full_page_writes=off, hot backups done wrong, recovery to a different architecture, IMMUTABLE misuse, collation #pgcon09j #
- .@janl we can just blame hardware 😉 but really, some config issues should be better documented, and maybe doing it wrong should be harder. in reply to janl #
- @janl maybe config examples should be on a continuum: cheap/fast/OOC <-> absolute consistency in reply to janl #
- @stewartsmith no 🙂 there's a data structure that gets verified – then you get errors/warnings, but no checksum. in reply to stewartsmith #
- @janl LOL i could make a custom config parameter that executes a function…. in reply to janl #
- greg stark demonstrating zeroing out corrupted 8k pages. yeah, we've all done that before. #pgcon09j #
- @snaga oh awesome! nice to know that, and thank you! in reply to snaga #
- worried about memory errors? use ECC memory, and monitor for corrected errors. DIMMs usually have 0 or lots of errors. #pgcon09j #
- DRAM Errors in the Wild: A Large-Scale Field Study (by Bianca Schroeder, Eduardo Pinheiro, and Wolf-Dietrich Weber) http://tr.im/FpsO #pgcon09j #
- workaround for lack of checksum: use a FS checksum (ZFS); this only catches a subset of errors & does not catch in-memory corruption. #pgcon09j #
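
Two of the common errors listed above (fsync=off and full_page_writes=off) are plain configuration settings, so they are easy to check for mechanically. Below is a minimal sketch, not anything shown in the talk: the function name `find_risky_settings` and the sample config text are hypothetical, and the parser is deliberately simplified (it does not handle `#` inside quoted values, include files, or abbreviated boolean spellings).

```python
import re

# Durability settings the talk named as common causes of corruption when off.
RISKY_WHEN_OFF = ("fsync", "full_page_writes")
# Simplified set of "false" spellings; real configs also accept prefixes.
OFF_VALUES = {"off", "false", "no", "0"}

# name, optional "=", optionally single-quoted value
SETTING_RE = re.compile(r"^(\w[\w.]*)\s*=?\s*'?([^'\s]*)'?")


def find_risky_settings(conf_text):
    """Return names of durability settings that are explicitly turned off.

    Hypothetical helper: takes the text of a postgresql.conf-style file.
    The last occurrence of a setting wins, as in the real config file.
    """
    state = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments (simplified)
        if not line:
            continue
        match = SETTING_RE.match(line)
        if match:
            name, value = match.group(1).lower(), match.group(2).lower()
            state[name] = value
    return [name for name in RISKY_WHEN_OFF if state.get(name) in OFF_VALUES]


if __name__ == "__main__":
    sample = (
        "fsync = off            # someone wanted faster bulk loads\n"
        "full_page_writes = on\n"
        "#full_page_writes = off\n"
        "shared_buffers = 128MB\n"
    )
    print(find_risky_settings(sample))
```

A check like this only reads the file; on a running server the effective values could differ, so treat it as a first pass rather than proof that a cluster is configured safely.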