Composable SQL Queries in Rails Using Arel
Ryan Stenberg, Former Developer
Article Category:
Posted on
If you've spent any time working with Rails, you've been using Arel. Arel is a SQL abstraction that ActiveRecord uses to build SQL queries. Arel wraps each component of the SQL query language with Ruby objects and provides an expressive DSL for composing SQL queries. When using Arel, you're mainly interacting with tables (Arel::Table) and nodes (Arel::Nodes::Node subclasses).
In a nutshell, Arel gives you the tools to build SQL queries in Ruby-land. Sounds great, right? Mostly! Documentation for Arel is pretty sparse, so it takes some digging around to figure out what all you can do with it. In this post, I'd like to save you some of that time and trouble! 
How ActiveRecord Uses Arel
The query interface for ActiveRecord is built on top of Arel. Anytime you call things like Model.find_by, Model.where, Model.joins, ActiveRecord is using Arel to construct the SQL query string. Scope chains are basically chained Arel nodes. At any point in the scope chain, you can check the SQL query string that Arel will build by tossing a .to_sql at the end.
Advantages of Arel
The core advantage of Arel is that working in a Ruby abstraction of SQL allows developers to tap into language features of Ruby -- method chaining, inheritance, metaprogramming -- without sacrificing any aspect of the SQL language. If you can express something in SQL, you can do it with Arel.
With Arel, we're able to abstract segments of a SQL query behind expressive, reusable Ruby methods. This makes Arel an excellent tool for situations where you're working with long, complex SQL queries, which can be difficult to read and reason about.
Demonstrating Arel
I could keep talking about why Arel is great, but I'd rather just show you! The remainder of this post will lay out examples to further demonstrate the concepts and advantages discussed.
A Simple Example
Using ActiveRecord, we frequently write query statements like:
Employee.where(first_name: 'Ryan', last_name: 'Stenberg')
If we were to drop down a level and generate the same SQL query from scratch using Arel, here's what it would look like:
table = Employee.arel_table
table
.project(table[Arel.star])
.where(table[:first_name].eq('Ryan').and(table[:last_name].eq('Stenberg')))
.to_sql
Using Arel inside a Ruby object might look like this:
class EmployeeFullNameSearch < Struct.new(:first, :last)
def sql
arel_table
.project(arel_table[Arel.star])
.where(where_clause(first, last))
.to_sql
end
private
def arel_table
@arel_table ||= Employee.arel_table
end
def where_clause
arel_table[:first_name].eq(first).and arel_table[:last_name].eq(last)
end
end
Arel, when laid out in a similar fashion, is easier to read through and reason about. This becomes exponentially more useful as the complexity of the SQL queries increases.
While Arel seems to make a lot of sense and is pretty intuitive, the project(arel_table[Arel.star]) portion of the sql method might be confusing. Calling project is how you tell Arel what to return in the SELECT statement. The Arel.star bit generates a *, so putting those two together will result in a SQL query fragment like SELECT * FROM employees. It knows the table from arel_table and it knows to select all the things with project(arel_table[Arel.star]).
The above example demonstrated how to build the full SQL query string via the sql method, which you'd have to use with a .find_by_sql:
Employee.find_by_sql(EmployeeFullNameSearch.new('Ryan', 'Stenberg').sql)
This isn't really desirable since .find_by_sql returns an array, so we wouldn't be able to scope chain off of it.
In practice, we'd make the sql method focus on the WHERE conditions and let ActiveRecord do the rest:
class EmployeeFullNameSearch < Struct.new(:first, :last)
def sql
arel_table[:first_name].eq(first).and(arel_table[:last_name].eq(last)).to_sql
end
def results
Employee.where(full_name_matches('Ryan', 'Stenberg'))
end
# ...
end
..and in use:
EmployeeFullNameSearch.new('Ryan', 'Stenberg').results
With the above code, we'd get an ActiveRecord::Relation back. By default, Employee.where will select all columns and we get something we can scope chain.
While this example is very simple and doesn't necessarily warrant the use of Arel, it illustrates the function and purpose of Arel.
Dynamic SQL Generation
You can dynamically build SQL statements through Arel node chains using Enumerable methods like reduce:
class Conference < ActiveRecord::Base
DEADLINES = %i(
speaker_submission_deadline
early_bird_registration_deadline
standard_registration_deadline
)
scope :with_deadlines_on, -> (date, deadlines = DEADLINES) { where(deadlines_on(date, deadlines)) }
def self.deadlines_on(date, deadlines = DEADLINES)
Array(deadlines).reduce(nil) do |query, deadline|
deadline_check = arel_table[deadline].eq(date)
if query.nil?
deadline_check
else
query.or(deadline_check)
end
end.to_sql
end
# Given the default deadlines, we're basically doing this:
# arel_table[:speaker_submission_deadline].eq(date)
# .or(arel_table[:early_bird_registration_deadline].eq(date))
# .or(arel_table[:standard_registration_deadline].eq(date))
# .to_sql
end
In this example, using Arel frees us from having to maintain the with_deadlines_on scope if we end up adding additional deadlines. Since we're dynamically building the Arel query, we're able to make deadlines an optional parameter so our scope can be used to find a given subset of deadlines as well:
# With given deadlines: Conference.with_deadlines_on( 1.month.from_now, %i(early_bird_registration_deadline standard_registration_deadline) ) # Without (matching any deadline): Conference.with_deadlines_on(1.month.from_now)
Complex Example Demonstrating Composability, Re-usability, and Expressiveness
In this example, let's say we're working on an auditing platform for a client that tracks employment records between companies and employees. The application is built in Rails and uses MySQL as its database.
class Employee < ActiveRecord::Base has_many :employments has_many :companies, through: :employments end class Company < ActiveRecord::Base has_many :employments has_many :employees, through: :employments end class Employment < ActiveRecord::Base belongs_to :employee belongs_to :company end
The Problem
Our app needs to allow its users to find employees who've worked at a specific set of companies within a given time period.
There's trickiness anytime you need to search against a record's associated things -- in this case, that's an employee's companies. That information isn't captured in any single field. It's described through database relationships. This complexity results in a complex SQL query:
SELECT employees.*
FROM employees
INNER JOIN (
SELECT employee_id, GROUP_CONCAT(
DISTINCT company_id
ORDER BY company_id
) AS search_values
FROM employments
WHERE employments.start_date <= ? AND employments.end_date >= ?
GROUP BY employee_id
) AS associated
ON associated.employee_id = employees.id
WHERE associated.search_values REGEXP '(2,){1}([[:digit:]]+,)*(4,|4$){1}'

Stay with me! Basically, we're attaching a serialized list of company IDs to each employee. For each employee, the list contains IDs for companies the employee has worked at within the given time frame.
It's worth noting that MySQL doesn't have array-types like Postgres, which is why we need to utilize serialized lists (created via the MySQLGROUP_CONCAT() function).
At the end of our SQL, we're matching that serialized list of company IDs against a regular expression that catches -- in this case -- employees having worked at companies 2 and 4 within the given time frame.
To implement this, we'd need to build that regex from user input and then stick the whole SQL string inside a scope or class method.. or we could refactor with Arel!
The Solution
Arel lets us do some cool stuff. Let's build a search object that builds the same SQL query using Arel:
class EmploymentHistorySearch
attr_reader :company_ids, :start_date, :end_date
def initialize(company_ids, options = {})
@company_ids = Array(company_ids).uniq
@start_date = options[:start_date]
@end_date = options[:end_date]
end
def results
Employee.where(sql)
end
def sql
query.to_sql
end
private
def query
employees[:id].in(matching_result_set)
end
def matching_result_set
employees
.project(employees[:id])
.join(join_clause)
.on(join_condition)
.where(where_condition)
end
def employees
@employees ||= Employee.arel_table
end
def join_clause
Arel.sql("(#{associated_values.to_sql})").as(derived_table)
end
# INNER JOIN (#{associated_values.to_sql}) AS associated
def associated_values
employments
.project(foreign_key, grouped_values)
.where(within_time_period)
.group(foreign_key)
end
# SELECT employee_id, #{grouped_values}
# FROM employments
# WHERE employments.start_date <= ? AND employments.end_date >= ?
# GROUP BY employee_id
def employments
@employments ||= Employment.arel_table
end
def foreign_key
'employee_id'
end
def grouped_values
Arel.sql(values_list).as(values_alias)
end
def values_list
%Q(
GROUP_CONCAT(
DISTINCT #{search_column}
ORDER BY #{search_column}
)
)
end
def search_column
'company_id'
end
def values_alias
Arel.sql('search_values')
end
def within_time_period
employments[:start_date].lteq(end_date).and(employments[:end_date].gteq(start_date))
end
def join_condition
fk_with_table.eq employees[:id]
end
# ON associated.employee_id = employees.id
def fk_with_table
Arel.sql [
derived_table_name,
foreign_key
].join('.')
end
def derived_table_name
'associated'
end
def where_condition
Arel.sql [
derived_values,
values
].join(' REGEXP ')
end
def derived_values
Arel.sql [
derived_table_name,
values_alias
].join('.')
end
def derived_table
Arel.sql derived_table_name
end
def values
@values ||= main_class.sanitize input.sort.map { |value| value_regex(value) }.join(filler_regex)
end
# Given [4,2], #values would return '(2,|2$){1}([[:digit:]]+,|[[:digit:]]+$)*(4,|4$){1}'
def regex_partial(value, frequency = '*')
"(#{value},|#{value}$)#{frequency}"
end
def filler_regex
regex_partial '[[:digit:]]+'
end
def value_regex(value)
regex_partial value, '{1}'
end
end
Using Arel, we've created an EmploymentHistorySearch object with a simple API (#sql and #results) that abstracts each component of our query into descriptive, concise, and re-usable methods.
Here's our search object in action:
# Standalone EmploymentHistorySearch.new(company_ids, 2.months.ago, 1.month.ago).results # Class Method def self.with_employment_history(ids, start_date, end_date) where(EmploymentHistorySearch.new(ids, start_date, end_date).sql) end

This approach really shines when there are a number of complex, mostly similar SQL queries where we can utilize inheritance to override only what's necessary. If, later on, we had to support an employment history search that returned all employees who had terminated employments during a given range and in a particular manner -- whether they quit or were fired, we'd be able to re-use almost all of our previous search object:
class EndedEmploymentHistorySearch < EmploymentHistorySearch
attr_reader :termination_type
def initialize(company_ids, options = {})
@termination_type = options[:termination_type]
super
end
def associated_values
employments
.project(foreign_key, grouped_values)
.where(within_time_period.and(matches_termination_type))
.group(foreign_key)
end
def within_time_period
employments[:end_date].gteq(start_date).and(employments[:end_date].lteq(end_date))
end
def matches_termination_type
employments[:termination_type].eq(termination_type)
end
end

In Closing
Ruby tends to be pretty expressive and readable -- to which Arel is no exception. Arel is smart -- taking care of SQL subtleties like precedence and correct ordering of SQL clauses for you. For example, when generating a SQL query containing an ORDER BY and a WHERE clause, Arel ensures the WHERE clause comes first.
Simply put, Arel allows Rails developers to focus on the domain in their language. So, next time you find yourself dealing with complex SQL queries, look to Arel for great justice!