Today's PostgreSQL indexing gotcha
At work, I have a big 14M-row production table with a bunch of indexes on it. One of the indexes was bloated, so I built a new copy of it and dropped the old bloated one. Got back a gig of space on the filesystem. Excellent.
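The mechanics were the usual concurrent swap, roughly like this (generic names here; the real index's definition turns out to be the interesting part, more on that below):

    -- build a fresh copy of the index without blocking writes
    CREATE INDEX CONCURRENTLY big_table_idx_new ON big_table (customer_id);
    -- once the new index is valid, drop the bloated original and take over its name
    DROP INDEX big_table_idx;
    ALTER INDEX big_table_idx_new RENAME TO big_table_idx;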
Now, from what I understand, that should be all I have to do. Postgres doesn’t need an ANALYZE on the table to use the new index. All the column stats for the table in pg_stats are still there, so the query planner can use the index, and it should all Just Work.
Except that all of a sudden slow queries started showing up in the server log, and we were doing sequential scans. The planner wasn’t using the newly built index.
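(We catch those because anything slow gets logged; in postgresql.conf that's one line, with whatever threshold suits you — the 500 here is just an example.)

    # log any statement that runs longer than 500 ms
    log_min_duration_statement = 500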
So I did an ANALYZE on the table, and suddenly the planner started using the index. Why was this?
This goes against what I knew. On this page, Robert Treat, Pg guru, says:
When adding indexes, it is not necessary to re-analyze the table so that postgres will "know" about the index; simply creating the index is enough for postgres to know about it.
So why didn’t it work for me? Turns out it was because the index I rebuilt was a functional index.
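For those who haven't used one: a functional (or expression) index indexes the result of an expression rather than a raw column. A sketch, sticking with hypothetical names:

    -- index the lowercased email rather than the column itself
    CREATE INDEX big_table_lower_email ON big_table (lower(email));
    -- the planner can use it only for queries written against the same expression
    SELECT * FROM big_table WHERE lower(email) = 'alice@example.com';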
Apparently the planner knows a functional index exists as soon as you create it, but it has no statistics for the indexed expression until an ANALYZE gathers them, so its row estimates go wrong and it falls back to sequential scans. I'm guessing that somewhere there's a pg_stats equivalent that holds histograms for the indexed expression, and that it sits empty until an ANALYZE fills it in.
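If you want to poke at that guess, pg_stats itself is the place to look: statistics for an expression index show up under the index's own name once the table has been analyzed. Using the hypothetical names from above:

    -- after an ANALYZE, the expression's stats appear under the index's name
    ANALYZE big_table;
    SELECT tablename, attname, n_distinct, null_frac
      FROM pg_stats
     WHERE tablename = 'big_table_lower_email';

Run that query against a freshly built expression index before the ANALYZE and it returns no rows, which lines up with the sequential scans we were seeing.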
If you have further insight on this, please let me know in the comments.