I have an index and my query is still slow

To prevent query performance degradation, we generally add indexes to frequently accessed columns. For the most part, this simple strategy will fine, but what if you have an index and your query is still slow? How do we debug these issues and are there any solutions?

Background

In this post, I will be using PostgreSQL but most relational databases follow very similar principles. I will also reduce the scope of this post to B-tree indexes, which will cover the vast majority of your application's access patterns.

Structure of the B-Tree

Marcus Winand has a very comprehensive guide to database indexes in his ebook, which I highly recommend. But for convenience, I will quickly summarize the main structure of a B-tree to give you enough of an overview to proceed with the rest of the post.

The B-Tree index is made up of two parts:

  1. The search tree
  2. A double linked list connecting the leaf nodes

The search tree is relatively straightforward, its a simple balanced tree with a key corresponding to the column(s) you created the index on. Since the tree is balanced, a tree traversal will always have a search complexity of O(logn). In practice, a table with millions of records might have a tree with a depth of 4 or 5, making the tree traversal very efficient. For more specifics on the scalability of a B-tree, I recommend reading this.

The leaf nodes contain the values of the column(s) you created your index on and a pointer to the location of the record in the database. Each leaf node is ordered based on the key of your index, and leaf nodes are connected as a doubly-linked list.

The Index Lookup

When your database performs your query and decides to use your index, the database has to perform three steps

  1. The tree traversal
  2. The Leaf node chain traversal
  3. Access the table to fetch db records

We already explained that (1) will always be fast. That means that if your index lookup is slow then that means that either your leaf node chain is very long, or your database needs to perform many table access operations.

Why is my query slow?

Now that you have a good understanding of how the B-Tree index is actually structured, we can now look into why your query might be slow. There are three main reasons why your query might be slow

  1. The linked list traversal is very long
  2. Your database needs to perform lots of table accesses
  3. The query planner is not performing the optimal operations to perform your query

But how do we know which of these reasons is causing our slow index?

SQL Explain

You can use the SQL method EXPLAIN to get a comprehensive description of what your database's query planner is actually doing. Here are some examples directly from Postgres's documentation

EXPLAIN SELECT * FROM foo;

                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on foo  (cost=0.00..155.00 rows=10000 width=4)
(1 row)
EXPLAIN SELECT * FROM foo WHERE i = 4;

                         QUERY PLAN
--------------------------------------------------------------
 Index Scan using fi on foo  (cost=0.00..5.98 rows=1 width=4)
   Index Cond: (i = 4)
(2 rows)

You can also use EXPLAIN ANALYZE. EXPLAIN returns estimates whereas EXPLAIN ANALYZE returns those estimates as well as the real time your database is taking on each operation by running your query.

Query Planner Operations

For more detailed information I recommend reading this blog post, but I will summarize the four most frequent operations that you will encounter.

Sequential scan

The sequential scan loops through all records in your database and checks the query condition on each individual record. Generally this is the least efficient query operation, but for small tables and for some special cases a sequential scan can actually be faster than an index scan.

Index scan

The index scan performs all three operations in the index lookup we explain above, (1) the tree traversal, (2) the leaf node chain traversal, (3) table accesses.

Index only scan

This operation performs the tree traversal and the leaf node chain traversal, but does not perform any table accesses. If your select statement is only selecting columns that are saved in you index, you can skip having to perform any table accesses.

Bitmap index scan

This operation is specific to Postgres, but all you really need to know about it is that it is an optimization of the index scan. It uses a bitmap to keep track of database records that are stored close together in the database, and then grabs records that are close together in batches. This allows Postgres to avoid accessing the same heap page more than once.

Why is the DB performing sub-optimal operations?

You might find that after running an SQL explain that it is performing the wrong type of scan, or even not using your index at all. There are a multitude of reasons why this might be the case, and the topic probably deserves its own post but I will briefly go over one of the possible reasons.

It may be due to Postgres's statistics being out of date. Postgres's query planner uses statistics gathered about your table to determine the best course of action to perform a given query. To prevent increasing latency due to write amplification, statistics are only updated periodically whenever ANALYZE is run.

By default, Postgres has an AUTOANALYZE task that runs whenever there 10% of your table is new writes and/or taken up by dead tuples (an artifact of MVCC). Since this is determined by a percentage of the table, the larger your table is the less frequently autoanalyze is triggered. For that reason, it is advisable to tweak your autoanalyze params so that it is run more frequently for large tables. While you're at it, it is probably also advisable to tweak your autovacuum params. You can read more about this here and here.

Solutions

Create indexes with high selectivity

Selectivity = distinct values in the column / total rows in the table

For example, your primary key is very selective because it uniquely identifies every record in your database. The more selective your index is, the fewer rows there will be that match your key in an index. As a result, you have a smaller leaf node chain to traverse.

For example, let's say you have an index on type column which has 4 values.

SELECT id FROM table WHERE type = 1 and ...(conditions on some unindexed columns)

Postgres may attempt to use the index on type because it is the only indexed column in the WHERE clause of the query. If a large percentage of the table match type = 1 then the index lookup will require traversing a very long leaf node chain. Instead, we should look to index more selective column(s) used in that query to improve performance.

Composite indexes to increase selectivity

Sometimes you may have a table where no single column in your query is very selective. But you may be able to combine columns to create a very selective composite index, reducing the length of the lead node chain traversal.

For example, let's assume you have a table called video_views that tracks all the videos a user has viewed. This table has a user_id column and a video_id column. Neither columns by themselves are very selective because a user can potentially have watched thousands of videos and a video can be watched by millions of users. But the combination of (user_id, video_id) is very selective.

Assume you want to know if a user has watched a specific video. You might perform a query like

SELECT 1 as one FROM video_views WHERE user_id = 1 AND video_id = 10 LIMIT 1

Querying an index on either user_id or video_id alone would not be efficient, but an querying index on (user_id, video_id) would.

Caveat

This can get quite tricky when you have a composite index on 3 or more columns. If you are strictly using equality conditions on columns in your index, you will be fine. However, if you are using inequality conditions on the right-most columns, these conditions may not reduce the search space for your query. I have only ever faced this issue once, and I found an alternative way to get the data I wanted that was very specific to the use case. Personally, I am not aware of more general solutions to problems like these, but I'm sure you'll figure it out 😄.

Cluster data in your index

The second reason why your query might be slow is because of the number of table accesses the database needs to perform. Postgres does a lot of optimizations for us, but depending on the correlation statistic of your column and other factors, even a bitmap heap scan can result in a very large number of table accesses. To solve this, we can use our index as storage to prevent the need to perform any table lookups at all.

For example, let's say that I want to query for all videos the user has viewed after some day.

SELECT video_id from video_views where user_id = 1 and viewed_at > '2020-01-01'

This query can leverage the index on (user_id, video_id) but still has to perform many table accesses to check the condition on viewed_at. We can instead create a composite index on (user_id, video_id, viewed_at) to prevent the need from performing any table accesses at all. Now instead of performing an index scan or a bitmap index scan, Postgres will perform an index-only scan because it has all the data it needs without accessing the table.

TLDR

This post covers how to debug slow queries that are using your B-Tree index and proposes some solutions to some of these problems. Here is a quick summary of what was covered

  • A B-tree index is composed of two parts
    • The search tree
    • the leaf nodes connected connected as a doubly linked list
  • The Index lookup consists of
    • The tree traversal
    • leaf node chain traversal
    • table accesses
  • If you have an index and your query is still slow, it is probably caused by the following
    • A very long leaf node chain traversal
    • lots of table accesses
    • query is not using your index
  • Solutions
    • create highly selective indexes to avoid large leaf node chain traversals
    • cluster data using composite indexes
    • tweak autovacuum and autoanalyze params so they are run frequently on your tables

Questions, thoughts or concerns

I don't permit any comments on my posts to prevent spam but if you have any questions, thoughts or concerns, feel free to email me from my contact form on my about page. I am a wasteman but I do my best to respond as soon as I can.

Previous
Previous

Snowflake to Postgres ETL

Next
Next

3 Best Practices for your Sidekiq Jobs