Active Record Query Optimization Tips
Eager Loading With `includes` to Avoid n+1 Queries
includes to avoid
n+1 queries. For example, if there are
n rows in the
users table, the following ActiveRecord code would generate
includes will only run 2 queries. One to load all the
users, and another to load all
posts associated with those
Use `find_each` When Loading a Large Amount of Records
This next tip will actually increase the number of queries made, but it’s meant to reduce memory use when loading a huge amount of records.
Rather than loading a huge amount of records into memory for processing (and possibly crashing the server if there are too many records)
find_each loads one batch at a time, and garbage collects them after use.
`Select`ing And `Pluck`-ing Only Attributes We Need
Sometimes we don’t need all the model attributes, but only a few. Use
pluck to query the needed attribute(s) and return them as an array of strings instead of an
ActiveRecord_Relation. However, if an
ActiveRecord_Relation is needed , use
Check existence of record with `exists?` instead of `present?`
Sometimes, we only care to know if a record exists, but don’t need to actually do anything with that record. Use
exists? instead of
Prefer `ActiveRecord::Relation#size` over `ActiveRecord::Calculations#count`
users is an
users.size return the number of users. However,
users.count will always run the query
SELECT COUNT(*) FROM users WHERE ... even if the relation has already been loaded. On the other hand,
size is smarter in that it will call
length on the relation if it already has been loaded (saving us a database query) but calls
count if the relation has not been loaded.
From the ActiveRecord source code:
# File activerecord/lib/active_record/relation.rb, line 210
loaded? ? @records.length : count(:all)
Bulk Delete With `delete_all`
When destroying a large number of records that do not require Active Record callbacks (which are almost considered an anti-pattern now) to be invoked, use
delete_all or instead of invoking
destroy on individual objects.
For those who didn’t realize that
ActiveRecord::Base#create can accept an array of hashes, it can and it will run one query instead of
n , but only if the underlying database engine supports batch
Similarly, there is also a
bulk_update command that can update multiple records in one query, assuming the underlying database supports this.
Sometimes In-memory Operations Are Better Than Querying
Sometimes, it’s better to actually use in-memory selection instead of querying. For example, suppose we want to select from an array of emails those which aren’t in the database yet.
Note this example is essentially the opposite of the
find_each example earlier. Here, we are reducing the number of queries, but increasing the memory use. In the
find_each example, we are increasing the number of queries but reducing the memory use. As such, there is no absolute rule for whether it’s more important to reduce the number of queries or reduce memory use. It depends on the situation, so exercise good judgment.
The bullet gem is a helpful tool for determining whether or not you need to eager-load but also if you are unnecessarily eager-loading. To use, simply enable it in the
Then run all your specs:
bundle exec rspec and watch for warnings on
n+1 query issues.
Benchmarking Your Changes
To see the results of your query optimizations, use the Ruby’s
Benchmark.realtime to time them.
But note that databases like MySQL will often cache queries in memory so if you run the same query twice, the second time will always be faster. As such, be sure to clear the cache query between benchmarks. In MySQL, you can do this by running this in the MySQL console:
RESET QUERY CACHE;
Don’t Forget to Create Indexes
Indexes on table columns will increase retrieval speed at the expense of write speed as well as using more disk space, so use them wisely.
Feel free to leave comments, questions, suggestions, corrections below.