← all writing
5 min read

The Postgres index I should have added eighteen months earlier

A recruiter search that sat at nine seconds. The fix was a partial index — and the lesson was about where the data for that decision actually comes from.

The search that took nine seconds and pretended it was fine

The talent search at Remolution had a problem that didn't look like a problem at first. The page loaded. The filters rendered. You typed two words, picked a location, hit Search — and the spinner sat there for what felt like a long time. Long enough that the recruiter would alt-tab to Slack and forget what they were doing.

Internally we called it the nine-second query, which was generous. Some tenants saw two, some saw fifteen. It depended on how many candidates they had and which filters they leaned on. An average enterprise tenant carried somewhere north of a hundred thousand candidate rows by the time I picked up the search work in 2024.

The first instinct — and I'll own this, because it was mine — was to throw an index at it.

Why the obvious index wasn't the right one

The search query joined candidates against a few related tables and filtered on roughly a dozen columns: location, current role title, years of experience, last-contacted date, tags, source. Some filters came in on most queries. Most filters showed up almost never.

I added a composite index on the columns I thought were hot — location, role, experience. Ran EXPLAIN. The planner laughed at me. Most queries still did a sequential scan, because the recruiters' actual filter combinations didn't match my composite's leading columns often enough for the planner to bother.

This is the part nobody tells you about adding indexes. An index isn't a fix for a slow query — it's a bet about which rows your users actually ask for. I had bet on a clean filter combination that came from reading the schema. The recruiters were betting on a messy one that came from doing their job.

So I went and pulled the query logs. Not the slow-query log — the all-queries log, for a week. Then I bucketed by filter combination. Two patterns held roughly half of the search traffic. Neither one matched the composite I'd written.

The partial index, and the column I didn't know I needed

Both hot patterns shared one thing: they filtered on archived_at IS NULL. Recruiters almost never searched archived candidates. That column was on essentially every search, and it was the one I'd ignored because it didn't feel like a filter — it felt like a sanitary default.

The fix was a partial index. Not on the filter columns I'd guessed at, but on the candidates table restricted to active rows, with a smaller composite for the columns the hot patterns actually shared.

CREATE INDEX CONCURRENTLY idx_candidates_active_search
  ON candidates (tenant_id, location_id, last_contacted_at DESC)
  WHERE archived_at IS NULL;

The partial index was a fraction of the size of the equivalent full index, because archived rows were a real portion of the table on long-lived tenants. Smaller index meant more of it stayed hot in memory. The planner picked it up immediately for the hot patterns, and the median search dropped from seconds to something that felt instant in the browser.

The composite ordering matters too. tenant_id first, because every query is multi-tenant scoped — that's the most selective column we have. location_id second, because it filters down further on the queries that use it. last_contacted_at DESC last, because the UI sorts by it by default, and ordering inside the index avoids a separate sort step.

I also added a GIN index on the tags array column, separately, because the tag filter — when used — hit different rows than the location filter. One composite couldn't serve both, and trying to make it serve both would have made it serve neither.

What changed in how I read query plans

The lesson that stuck with me wasn't about partial indexes. I knew partial indexes existed; I'd read the Postgres docs on them before. The lesson was about where the data for the index decision comes from.

Before this, I'd been reading the schema and the slow-query log and writing indexes from intuition. That works for hobby projects, where the workload is whatever you happen to do in the next ten minutes. It doesn't work for a SaaS with a hundred enterprise tenants whose recruiters all use the product slightly differently.

The thing that actually worked:

  • Log every search query for a representative window. A week is usually enough. A day is not.
  • Bucket by filter combination, not by query text. Two recruiters typing different search terms with the same filters are doing the same query from the planner's perspective.
  • Look at which filters appear in most queries — including the ones that don't feel like filters. Soft-delete flags, status enums, tenant scoping. Those are the columns that belong in your partial-index predicate or your leading composite position.
  • Run EXPLAIN ANALYZE on the actual top-N filter combinations after you build the index. Not on a query you wrote from the schema.

The Postgres docs on partial indexes give you the syntax. The "which rows do my users actually look at" question is yours to answer, and the only way to answer it honestly is to look.

What I'd tell past me

What I'd tell past me

The first index you write from intuition will be wrong. Not catastrophically wrong — just wrong enough that the query is still slow, and you'll start thinking the problem must be somewhere else.

Read the query log before you read the schema. The schema tells you what's possible. The log tells you what's happening.

A few months later a different slow page came in — a candidate timeline view. Same shape of problem: an index that looked obvious from the schema, an actual workload that wanted something different. I went straight to the logs this time and the index landed on the first try.

That's the small win that doesn't make it into anyone's résumé. Not I made search 10× faster. Just: I stopped writing indexes from intuition. The cumulative effect on a year of database work is bigger than any single nine-second query.