Active Record Query Optimization Tips

Simple But Effective Ways To Improve Active Record Query Performance

Steven Li
3 min readDec 17, 2017

Eager Loading With `includes` to Avoid n+1 Queries

Use includes to avoid n+1 queries. For example, if there are n rows in the users table, the following ActiveRecord code would generate n+1 queries:

However, using includes will only run 2 queries. One to load all the users, and another to load all posts associated with those users.

Use `find_each` When Loading a Large Amount of Records

This next tip will actually increase the number of queries made, but it’s meant to reduce memory use when loading a huge amount of records.

Rather than loading a huge amount of records into memory for processing (and possibly crashing the server if there are too many records) find_each loads one batch at a time, and garbage collects them after use.

`Select`ing And `Pluck`-ing Only Attributes We Need

Sometimes we don’t need all the model attributes, but only a few. Use pluck to query the needed attribute(s) and return them as an array of strings instead of an ActiveRecord_Relation. However, if an ActiveRecord_Relation is needed , use select instead.

Check existence of record with `exists?` instead of `present?`

Sometimes, we only care to know if a record exists, but don’t need to actually do anything with that record. Use exists? instead of present? .

Prefer `ActiveRecord::Relation#size` over `ActiveRecord::Calculations#count`

Assuming users is an ActiveRecord::Relation, both users.count and users.size return the number of users. However, users.count will always run the query SELECT COUNT(*) FROM users WHERE ... even if the relation has already been loaded. On the other hand, size is smarter in that it will call length on the relation if it already has been loaded (saving us a database query) but calls count if the relation has not been loaded.

From the ActiveRecord source code:

# File activerecord/lib/active_record/relation.rb, line 210
def size
loaded? ? @records.length : count(:all)
end

Bulk Delete With `delete_all`

When destroying a large number of records that do not require Active Record callbacks (which are almost considered an anti-pattern now) to be invoked, use delete_all or instead of invoking destroy on individual objects.

Bulk Creation

For those who didn’t realize that ActiveRecord::Base#create can accept an array of hashes, it can and it will run one query instead of n , but only if the underlying database engine supports batch INSERT .

Bulk Update

Similarly, there is also a bulk_update command that can update multiple records in one query, assuming the underlying database supports this.

Sometimes In-memory Operations Are Better Than Querying

Sometimes, it’s better to actually use in-memory selection instead of querying. For example, suppose we want to select from an array of emails those which aren’t in the database yet.

Note this example is essentially the opposite of the find_each example earlier. Here, we are reducing the number of queries, but increasing the memory use. In the find_each example, we are increasing the number of queries but reducing the memory use. As such, there is no absolute rule for whether it’s more important to reduce the number of queries or reduce memory use. It depends on the situation, so exercise good judgment.

Bullet Gem

The bullet gem is a helpful tool for determining whether or not you need to eager-load but also if you are unnecessarily eager-loading. To use, simply enable it in the test environment

Then run all your specs: bundle exec rspec and watch for warnings on n+1 query issues.

Benchmarking Your Changes

To see the results of your query optimizations, use the Ruby’s Benchmark.realtime to time them.

But note that databases like MySQL will often cache queries in memory so if you run the same query twice, the second time will always be faster. As such, be sure to clear the cache query between benchmarks. In MySQL, you can do this by running this in the MySQL console:

RESET QUERY CACHE;

Don’t Forget to Create Indexes

Indexes on table columns will increase retrieval speed at the expense of write speed as well as using more disk space, so use them wisely.

Feel free to leave comments, questions, suggestions, corrections below.

--

--

Steven Li

Writing About Rails, React, Web Application Technology, Databases, and Software Engineering