As with my notes and comments from OSCON 2011, here are my notes and comments from Postgres Open 2011. Some of it is narrative, and some is just barely-formatted notes. The target is my own use of what was most interesting and useful for me at work, but I'm making them public for anyone who's interested.

Mastering PostgreSQL Administration

Bruce Momjian

http://postgresopen.org/2011/schedule/presentations/89/

http://momjian.us/presentation

http://momjian.us/main/writings/pgsql/administration.pdf

Most of this stuff I knew already, so the notes are short.

Connections

  • local -- Unix sockets
    • Significantly faster than going through host
  • host -- TCP/IP, both SSL and non-SSL
  • hostssl -- only SSL
    • Can delay connection startup by 25-40%
  • hostnossl -- never SSL
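
For reference, these connection types are chosen per-line in pg_hba.conf. A minimal sketch, assuming md5 passwords and a made-up internal network:

# TYPE    DATABASE  USER  ADDRESS      METHOD
local     all       all                peer
host      all       all   10.0.0.0/8   md5
hostssl   all       all   0.0.0.0/0    md5
hostnossl all       all   10.0.0.0/8   md5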

Template databases

  • You can use template databases to make a standard DB for when you create new ones. For example, if you always want a certain function or table available, put it in template1. This works with extensions and contrib modules like pgcrypto; a sketch follows.
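
A minimal sketch of the idea; the extension and database names are just for illustration:

-- Connect to template1 first (e.g. psql template1)
CREATE EXTENSION pgcrypto;

-- Databases created afterwards inherit it
CREATE DATABASE myapp TEMPLATE template1;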

Data directory

  • xxx_fsm files are the free space map
  • pg_xlog is the WAL (write-ahead log) directory
  • pg_clog holds transaction commit-status data

Config file settings

  • shared_buffers should be about 25% of total RAM on a dedicated DB server. Don't go over 40-50% or the machine will starve; the overhead of managing that many buffers is also huge.
  • If you can get five minutes of your working set into shared_buffers, you're golden.
  • Going over a couple hundred connections, it's worth it to look at a pooler.
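
As a concrete sketch of the 25% guideline, for a hypothetical dedicated server with 32GB of RAM, in postgresql.conf:

shared_buffers = 8GB    # ~25% of RAM on a dedicated server
                        # past 40-50%, the OS cache starves and
                        # buffer-management overhead climbs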

Analyzing activity

  • Heavily-used tables
  • Unnecessary indexes
  • Additional indexes
  • Index usage
  • TOAST usage
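
Most of these can be pulled from the statistics views; two sketches using the standard pg_stat catalogs (the LIMIT and the zero-scan threshold are just illustrative):

-- Heavily-used tables: most-scanned first
SELECT relname, seq_scan, idx_scan
FROM pg_stat_user_tables
ORDER BY seq_scan + idx_scan DESC
LIMIT 10;

-- Candidate unnecessary indexes: never scanned since the last stats reset
SELECT relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0;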

Identifying slow queries and fixing them

Stephen Frost

http://postgresopen.org/2011/schedule/presentations/71/

Fixing

  • MergeJoin for small data sets?
    • Check work_mem
  • Nested Loop with a large data set? Could be bad row estimates.
  • DELETEs are slow?
    • Make sure you have indexes on foreign keys (sketch below)
  • Harder items
    • Check over your long-running queries
    • Use stored procedures/triggers
    • Partitioning larger tables
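
For the slow-DELETE item above, the point is that Postgres creates no index on the referencing column automatically; a sketch with a hypothetical parent/child pair:

CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child (
    id        int PRIMARY KEY,
    parent_id int REFERENCES parent (id)
);

-- Without this, every DELETE on parent scans all of child
-- to check the foreign key
CREATE INDEX child_parent_id_idx ON child (parent_id);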

Prepared queries

  • Plan once, run many
  • Not as much info to plan with, plans may be more stable
    • No constraint exclusion, though
  • How to EXPLAIN / EXPLAIN ANALYZE a prepared query (sketch below)
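
A sketch of the mechanics; the table and parameter are hypothetical:

PREPARE recent_orders (int) AS
    SELECT * FROM orders WHERE customer_id = $1;

EXECUTE recent_orders(42);

-- EXPLAIN works on the prepared statement too
EXPLAIN ANALYZE EXECUTE recent_orders(42);

DEALLOCATE recent_orders;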

Query Review

  • Don't do SELECT count(*) on big tables
    • Look at pg_class.reltuples for an estimate
    • Or write a trigger that keeps a running count in a side table (sketch after this list)
  • ORDER BY and LIMIT can help Pg optimize queries
  • select * can be wasteful by invoking TOAST
  • Use JOIN syntax to make sure you don't forget the join conditions
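
Two sketches of the count(*) advice; the table names are made up, and reltuples is only an estimate that ANALYZE keeps fresh:

-- Cheap estimate instead of a full count(*)
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE relname = 'big_table';

-- Or keep an exact count in a side table
CREATE TABLE big_table_count (n bigint NOT NULL);
INSERT INTO big_table_count VALUES (0);

CREATE FUNCTION track_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE big_table_count SET n = n + 1;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE big_table_count SET n = n - 1;
    END IF;
    RETURN NULL;  -- AFTER trigger, so the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER big_table_counter
    AFTER INSERT OR DELETE ON big_table
    FOR EACH ROW EXECUTE PROCEDURE track_count();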

CTEs: Common Table Expressions

WITH
    my_view AS (SELECT * FROM my_expensive_view),
    my_sums AS (SELECT sum(x) AS sum FROM my_view)
SELECT my_view.*, my_sums.sum FROM my_view, my_sums;

PostgreSQL 9.1 Grand Tour

Josh Berkus

http://www.pgexperts.com/document.html?id=52

Overview

  • Synchronous replication
  • Replication tools
  • Per-Column collation
  • wCTEs
  • Serializable Snapshot Isolation
  • Unlogged tables
  • SE-Postgres
  • K-Nearest Neighbor
  • SQL/MED
  • Extensions
  • Other Features

Land of Surreal Queries: Writable CTEs

-- Plain (read-only) CTEs date to 8.4; a DELETE inside WITH is new in 9.1
WITH deleted_posts AS (
    DELETE FROM posts
    WHERE created < now() - '6 months'::INTERVAL
    RETURNING *
)
SELECT user_id, count(*)
FROM deleted_posts
GROUP BY 1

In 9.1 you can also use UPDATE or INSERT inside the CTE; a sketch follows.
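
A sketch of the UPDATE form; the archived column is hypothetical:

WITH archived AS (
    UPDATE posts
    SET archived = true
    WHERE created < now() - '6 months'::INTERVAL
    RETURNING user_id
)
SELECT user_id, count(*)
FROM archived
GROUP BY 1;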

Unlogged tables

Sometimes you have data you can afford to lose. Unlogged tables skip the WAL, so they're much faster, but you risk losing their contents after a crash.
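
The syntax is just a modifier on CREATE TABLE; a sketch with a made-up cache table:

-- No WAL: fast writes, but the table is truncated after a crash
-- and is not replicated
CREATE UNLOGGED TABLE session_cache (
    session_id text PRIMARY KEY,
    payload    text
);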

Extensions

CREATE EXTENSION IF NOT EXISTS citext WITH SCHEMA ext;

SQL/MED

Support for FDWs (Foreign Data Wrappers), which let you query external data sources as if they were local tables.
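
A sketch using file_fdw, which ships in 9.1 contrib; the file path and column are made up:

CREATE EXTENSION file_fdw;

CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE words (word text)
    SERVER file_server
    OPTIONS (filename '/usr/share/dict/words', format 'text');

SELECT count(*) FROM words;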

Others

  • Valid-on-creation FKs (example below)
  • Extensible ENUMs
  • Triggers on Views
  • Reduced NUMERIC size
  • ALTER TYPE without rewrite
  • pg_dump directory format as a precursor for parallel pg_dump
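
For example, a valid-on-creation FK skips the immediate full-table scan and can be validated later; the names here are hypothetical:

ALTER TABLE child
    ADD CONSTRAINT child_parent_fk
    FOREIGN KEY (parent_id) REFERENCES parent (id)
    NOT VALID;

-- Later, during a quiet period: scan once, then it's fully enforced
ALTER TABLE child VALIDATE CONSTRAINT child_parent_fk;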

Monitoring the heck out of your database

Josh Williams, End Point

http://joshwilliams.name/talks/monitoring/

What are we looking for?

  • Performance of the system
  • Application throughput
  • Is it dead or about to die?

"They don't care if the system's on fire so long as it's making money."

Monitoring Pg

  • Log monitoring for errors
  • Log monitoring for query performance
  • Control files / External commands
  • Statistics from the DB itself

Monitoring error conditions

  • ERROR: Division by zero
  • FATAL: password authentication
  • PANIC: could not write to file pg_xlog...

Quick discussion of tail_n_mail

Log monitoring for query performance

check_postgres

Most of the rest of the talk was about check_postgres, which I already know all about. A few cool to-do items came out of it.

  • Look at tracking --dbstats in cacti
  • Add --noidle to --action=backends to get a better sense of the counts.

Honey, I Shrunk the Database

Vanessa Hurst

http://postgresopen.org/2011/speaker/profile/36/

http://www.slideshare.net/DBNess/honey-i-shrunk-the-database-9273383

Why shrink?

  • Accuracy
    • You don't know how your app will behave in production unless you use real data.
  • Freshness
    • New data should be available regularly
    • Full database refreshes should be timely
  • Resource Limitation
    • Staging and developer machines cannot handle production load
  • Data protection
    • Limit spread of sensitive data

Case study: Paperless Post

  • Requirements

    • Freshness - Daily on command for non-developers
    • Shrinkage - slices & mutations
  • Resources

    • Source -- extra disk space, RAM, and CPUs
    • Destination -- limited, often entirely unoptimized
    • Development -- constrained DBA resources

Shrunk strategies

  • Copies
    • Restored backups or live replicas
  • Slices
    • Select portions of live data
  • Mutations
    • Sanitized or anonymized data
  • Assumptions
    • Usually for testing

Slices

  • Vertical slice

    • Difficult to obtain a valid, useful subset of data
    • Example: Include some tables, exclude others
  • Horizontal slice

    • Difficult to write & maintain
    • Example: SQL or application code to determine subset of data
  • Pg tools -- vertical slice (example below)

    • pg_dump
      • Include data only
      • Include table schema only
      • Select tables
      • Select schemas
      • Exclude schemas
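
A sketch of a vertical slice with stock pg_dump flags; the database, table, and schema names are made up:

# Schema for everything, data only for the tables you care about
pg_dump --schema-only proddb > schema.sql
pg_dump --data-only -t users -t accounts proddb > slice.sql

# Include one schema, exclude another
pg_dump -n public -N audit proddb > public_only.sql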

Postgres Tuning

Greg Smith

Tuning is a lifecycle.

Deploy / Monitor / Tune / Design

You may have a great design up front, but then after a while you have more data than you did before, so you have to redesign.

Survival basics

  • Monitor before there's a problem
  • Document healthy activity
  • Watch performance trends
    • "The site is bad. Is it just today, or has it been getting worse over time?"
  • Good change control: Minimize changes, document heavily
    • Keep your config files in version control like any other part of your app.
  • Log bad activity
  • Capture details during a crisis

Monitoring and trending

  • Alerting and trending
  • Alerts: Nagios + check_postgres

Trending

  • Watch database and operating system on the same timeline
  • Munin: Easy, complete, heavy
    • Generates more traffic, may not scale up to hundreds of nodes
  • Cacti: Lighter, but missing key views
    • Not Greg's first choice
    • Harder to get started with the Postgres plugins
    • Missing key views, which he'll cover later
  • Various open-source and proprietary solutions

Munin: Load average

  • Load average = how many processes are active and trying to do something.
  • Load average is sensitive to sample rate. Short-term spikes may disappear when seen at a long-term scale.

Munin: CPU usage

  • Best view of CPU usage of the monitoring tools.
  • If you're seeing a lot of system CPU activity, often from connection startup costs, look at a pooler like PgBouncer.

Munin: Connection distribution

  • Greg wrote this in Cacti because it's so useful.
  • The graph shows a Tomcat app that has a built-in connection pool.
  • The graph shown isn't actually a problem.
  • Better to have a bunch of idle connections because of a pooler, rather than getting hammered by a thousand unpooled connections.

Munin: Database shared_buffers usage

  • If shared_buffers goes up without the same spike in disk IO, it must be in the OS's cache.
  • If shared_buffers is bigger than 8GB, it can be a negative, rather than letting the OS do the buffering. derby's is at 5GB.
  • There is some overlap between Pg's buffers and the OS's, but Pg tries to minimize this. Seq scan and VACUUM won't clear out shared_buffers, for example.
  • There's nothing wrong with using the OS cache.
  • SSDs are great for random-read workloads. But if the drive doesn't actually sync data when asked, and isn't honest with the OS about it, you can end up with corrupted data.
  • SSDs' best use is for indexes.

Munin: Workload distribution

  • Shows what kind of ops are done on tuples.
  • Sequential scans may not necessarily be bad. Small fact tables that get cached are sequentially scanned, but that's OK because they're all in RAM.

Munin: Long queries/transactions

  • Watch for oldest transaction. Open transactions block cleanup activities like VACUUM.
  • Open transaction longer than X amount of time is Nagios-worthy.

Using pgbench

  • pgbench can do more than just run against the pgbench database: it can simulate any workload, using the little scripting language built into it (sketch below).
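
A sketch of a custom workload file, using the 9.1-era \setrandom syntax against the standard pgbench_accounts table; run it with something like pgbench -f custom.sql -c 8 -T 60:

\setrandom aid 1 100000
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;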

OS monitoring

  • top -c
  • htop
  • vmstat 1
  • iostat -mx 5
  • watch

Long queries

What are the five longest-running queries?

psql -x -c 'select now() - query_start as runtime, current_query from pg_stat_activity order by 1 desc limit 5'

It's safe to kill query processes, but not to kill -9 them.
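
That maps to the built-in functions; a sketch (note: in 9.0/9.1 the pg_stat_activity column is procpid, renamed to pid in 9.2):

-- Politely cancel queries that have run for over an hour
SELECT pg_cancel_backend(procpid)
FROM pg_stat_activity
WHERE now() - query_start > interval '1 hour';

-- Or end a whole backend; still safe, unlike kill -9
SELECT pg_terminate_backend(12345);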

Argument tuning

  • Start monitoring your long-running queries.
  • Run an EXPLAIN ANALYZE on slow queries showing up in the logs.
  • The sort to disk was using 2700kB, so we bump work_mem to 4MB. That still doesn't fix it, because data takes more memory in RAM than it does on disk (see the sketch after this list).
  • If you're reading more than 20% of the rows, Pg will switch to a sequential scan, because random I/O is so slow.
  • Indexing a boolean rarely makes sense.
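
A sketch of the work_mem experiment; the table and sizes are made up, and SET only affects the current session:

EXPLAIN ANALYZE SELECT * FROM events ORDER BY created_at;
-- look for: Sort Method: external merge  Disk: 2700kB

SET work_mem = '4MB';

EXPLAIN ANALYZE SELECT * FROM events ORDER BY created_at;
-- may still spill: data takes more space in RAM than on disk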

The dashboard report

  • Sometimes you want to cache your results and not even worry about the query speed.
  • Use window functions for ranking.
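
A ranking sketch with a window function; the schema is hypothetical:

SELECT user_id,
       sum(amount) AS total,
       rank() OVER (ORDER BY sum(amount) DESC) AS position
FROM sales
GROUP BY user_id;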

The OFFSET 0 hack

  • Adding OFFSET 0 to a subquery forced a particular JOIN order: it acts as an optimization fence, telling the planner the subquery can't be flattened into the outer query. A sketch follows.
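
A sketch of the hack; the tables are hypothetical. Without the OFFSET 0, the planner is free to flatten the subquery into the outer query:

SELECT o.*
FROM orders o
JOIN (
    SELECT customer_id
    FROM customers
    WHERE region = 'EU'
    OFFSET 0  -- optimization fence: keeps this subquery intact
) c ON c.customer_id = o.customer_id;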