Active Record Query Optimization Tips
Simple But Effective Ways To Improve Active Record Query Performance
Eager Loading With `includes` to Avoid n+1 Queries
Use includes
to avoid n+1
queries. For example, if there are n
rows in the users
table, the following ActiveRecord code would generate n+1
queries:
However, using includes
will only run 2 queries. One to load all the users
, and another to load all posts
associated with those users
.
Use `find_each` When Loading a Large Amount of Records
This next tip will actually increase the number of queries made, but it’s meant to reduce memory use when loading a huge amount of records.
Rather than loading a huge amount of records into memory for processing (and possibly crashing the server if there are too many records) find_each
loads one batch at a time, and garbage collects them after use.
`Select`ing And `Pluck`-ing Only Attributes We Need
Sometimes we don’t need all the model attributes, but only a few. Use pluck
to query the needed attribute(s) and return them as an array of strings instead of an ActiveRecord_Relation
. However, if an ActiveRecord_Relation
is needed , use select
instead.
Check existence of record with `exists?` instead of `present?`
Sometimes, we only care to know if a record exists, but don’t need to actually do anything with that record. Use exists?
instead of present?
.
Prefer `ActiveRecord::Relation#size` over `ActiveRecord::Calculations#count`
Assuming users
is an ActiveRecord::Relation
, both users.count
and users.size
return the number of users. However, users.count
will always run the query SELECT COUNT(*) FROM users WHERE ...
even if the relation has already been loaded. On the other hand, size
is smarter in that it will call length
on the relation if it already has been loaded (saving us a database query) but calls count
if the relation has not been loaded.
From the ActiveRecord source code:
# File activerecord/lib/active_record/relation.rb, line 210
def size
loaded? ? @records.length : count(:all)
end
Bulk Delete With `delete_all`
When destroying a large number of records that do not require Active Record callbacks (which are almost considered an anti-pattern now) to be invoked, use delete_all
or instead of invoking destroy
on individual objects.
Bulk Creation
For those who didn’t realize that ActiveRecord::Base#create
can accept an array of hashes, it can and it will run one query instead of n
, but only if the underlying database engine supports batch INSERT
.
Bulk Update
Similarly, there is also a bulk_update
command that can update multiple records in one query, assuming the underlying database supports this.
Sometimes In-memory Operations Are Better Than Querying
Sometimes, it’s better to actually use in-memory selection instead of querying. For example, suppose we want to select from an array of emails those which aren’t in the database yet.
Note this example is essentially the opposite of the find_each
example earlier. Here, we are reducing the number of queries, but increasing the memory use. In the find_each
example, we are increasing the number of queries but reducing the memory use. As such, there is no absolute rule for whether it’s more important to reduce the number of queries or reduce memory use. It depends on the situation, so exercise good judgment.
Bullet Gem
The bullet gem is a helpful tool for determining whether or not you need to eager-load but also if you are unnecessarily eager-loading. To use, simply enable it in the test
environment
Then run all your specs: bundle exec rspec
and watch for warnings on n+1
query issues.
Benchmarking Your Changes
To see the results of your query optimizations, use the Ruby’s Benchmark.realtime
to time them.
But note that databases like MySQL will often cache queries in memory so if you run the same query twice, the second time will always be faster. As such, be sure to clear the cache query between benchmarks. In MySQL, you can do this by running this in the MySQL console:
RESET QUERY CACHE;
Don’t Forget to Create Indexes
Indexes on table columns will increase retrieval speed at the expense of write speed as well as using more disk space, so use them wisely.
Feel free to leave comments, questions, suggestions, corrections below.