Implementing Full-Text Search in Rails with Postgres

Need full-text search in your Rails app? Running Postgres as your DB? PgSearch is just the thing for you!

With very little effort, you can tap into Postgres' native full-text search functionality with PgSearch. PgSearch also exposes a number of options/configurations that allow you to tweak how full-text search happens inside Postgres. Alright, let's do it!

Step 1: Gemfile

Add it for great justice!

gem 'pg_search'

Step 2: Single-Model Search or Multi-Model

PgSearch offers two distinct search strategies, depending on whether you need to search a single model or several. Most of the configuration and options are specific to one strategy or the other.

Let's start out with Single-Model.

Step 3 (Single-Model): Configure the Model

PgSearch gives you the class-level pg_search_scope method for configuration. Feed it a name for the search scope, the columns to search against, and any additional fine-tuning via the using option. Given a basic BlogPost model, you'd probably have something like this:

class BlogPost < ActiveRecord::Base
  include PgSearch

  pg_search_scope :search_for, against: %i(title body)
end

When I set it up on a recent project, I was searching against a single JSON-type column where I wanted to match against multiple words. Here's what that looked like:

pg_search_scope :search_content_for, against: :content, using: { tsearch: { any_word: true } }

That's pretty much all you need. You're sporting Postgres-powered full-text search in two lines of code.
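To make the any_word option concrete: by default, tsearch only returns rows that contain every word in the query, while any_word: true returns rows that contain at least one of them. Here's a toy, pure-Ruby sketch of that difference. The matches? helper is hypothetical and only downcases and splits text; the real matching happens inside Postgres, which also stems and normalizes words.

```ruby
# Toy model of tsearch's word matching (hypothetical helper, not PgSearch's API).
# Real tsearch stems and normalizes words; this only downcases and splits.
def matches?(text, query, any_word: false)
  words = text.downcase.scan(/\w+/)
  terms = query.downcase.scan(/\w+/)
  return false if terms.empty?

  if any_word
    terms.any? { |term| words.include?(term) } # OR: one matching word is enough
  else
    terms.all? { |term| words.include?(term) } # AND: every word must be present
  end
end

content = 'Postgres makes search easy'

puts matches?(content, 'postgres rails')                 # default: all words required
puts matches?(content, 'postgres rails', any_word: true) # any_word: one hit suffices
```

The first call fails because "rails" never appears in the content; the second succeeds because "postgres" does.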

Digging a Little Deeper: pg_search_scope

PgSearch allows you to customize a handful of things through the pg_search_scope method and the options it takes. I wanted to briefly touch on some of these to give an idea of what you can do with PgSearch.

The against option, as we saw, takes a single column or an array of columns. It also supports weighting! To weight columns, pass either a hash whose values are the weights or an array of [column, weight] pairs, where each weight is A, B, C, or D (A ranks highest):

pg_search_scope :search_full_text, against: {
  title:   'A',
  content: 'B'
}

pg_search_scope :search_full_text, against: [
  [:title, 'A'],
  [:content, 'B']
]
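To get a feel for what those labels do, here's a toy sketch of weighted scoring in plain Ruby. The numbers are Postgres' default ts_rank weights for A through D, but the toy_rank helper is hypothetical and much simpler than the real ranking function.

```ruby
# Toy scorer (hypothetical helper, not part of PgSearch): adds up the weight of
# every column that contains the term. Postgres' real ts_rank also accounts for
# term frequency and normalization.
def toy_rank(record, term, weighting)
  # Postgres' default ts_rank weights for the four labels.
  weights = { 'A' => 1.0, 'B' => 0.4, 'C' => 0.2, 'D' => 0.1 }

  weighting.sum(0.0) do |column, label|
    record.fetch(column).downcase.include?(term.downcase) ? weights.fetch(label) : 0.0
  end
end

weighting = { title: 'A', content: 'B' }

in_title   = { title: 'Search with Postgres', content: 'So easy.' }
in_content = { title: 'Weekend notes',        content: 'I tried Postgres search.' }

puts toy_rank(in_title, 'postgres', weighting)   # hit in the 'A' column
puts toy_rank(in_content, 'postgres', weighting) # hit in the 'B' column
```

A match in the title scores higher than the same match in the content, which is the whole point of weighting.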

The using option lets you choose which Postgres search features to tap into:

  • tsearch: PostgreSQL's built-in full text search supports weighting, prefix searches, and stemming in multiple languages.
  • dmetaphone: Double Metaphone is an algorithm for matching words that sound alike even if they are spelled very differently. For example, "Geoff" and "Jeff" sound identical and thus match. Currently, this is not a true double-metaphone, as only the first metaphone is used for searching.
  • trigram: Trigram search works by counting how many three-letter substrings (or "trigrams") match between the query and the text.

PostgreSQL ships with everything you need for tsearch, but the other two rely on additional PostgreSQL extensions that you'll need to install and enable: fuzzystrmatch for dmetaphone and pg_trgm for trigram.
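If the trigram idea feels abstract, here's a rough pure-Ruby sketch of it. It assumes a simplified version of what pg_trgm does (lowercase the text, pad each word with spaces, compare the sets of three-character chunks); the trigrams and similarity helpers are hypothetical and only meant to show the counting.

```ruby
require 'set'

# Simplified take on pg_trgm: lowercase, split into words, pad each word with
# two leading spaces and one trailing space, then collect 3-character windows.
def trigrams(text)
  chunks = text.downcase.scan(/[[:alnum:]]+/).flat_map do |word|
    padded = "  #{word} "
    (0..padded.length - 3).map { |i| padded[i, 3] }
  end
  chunks.to_set
end

# Shared trigrams divided by all distinct trigrams (0.0 = nothing in common).
def similarity(a, b)
  ta = trigrams(a)
  tb = trigrams(b)
  return 0.0 if ta.empty? || tb.empty?

  (ta & tb).size.to_f / (ta | tb).size
end

puts similarity('postgres', 'postgras') # a typo still shares many trigrams
puts similarity('postgres', 'rails')    # unrelated words share none
```

This is why trigram search is forgiving of typos: a one-letter slip only changes a few of the chunks.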

In the example from my own project above, I used tsearch for its any_word option. It accepts other options as well, such as prefix and dictionary.

Step 3 (Multi-Model): Configuration!

Now that we've touched on how to set up PgSearch for Single-Model, let's take a look at Multi-Model.

Step 3.1: Run PgSearch's Multi-Model Generator

To support multi-model search, PgSearch uses a PgSearch::Document model backed by its own database table. To generate and run the migration for that table, run the following from your Rails app's project root:

$ rails g pg_search:migration:multisearch
$ bundle exec rake db:migrate

Step 3.2: Specify the Models to Include in Multi-Search

Here's our BlogPost model from before (demonstrating conditional inclusion in multi-search results based on a published flag):

class BlogPost < ActiveRecord::Base
  include PgSearch

  multisearchable against: %i(title body), if: :published?
end

Step 3.3: Optional Initializer

We can optionally configure multi-search in an initializer. In my case, I still wanted to return results where any word matched:

# config/initializers/pg_search.rb
PgSearch.multisearch_options = {
  using: { tsearch: { any_word: true } }
}

Digging a Little Deeper: PgSearch::Document

Going back to the PgSearch::Document model: it has a polymorphic association that points to a record from one of the models being searched, plus a text column that aggregates the contents of that record's searchable columns. When you search across multiple models, you're really just searching PgSearch::Document, which holds the aggregated text from all of them.

If we had a Comment model alongside our BlogPost model where we want to search against a comment's body along with the title and body of any blog posts, PgSearch would build a PgSearch::Document record for each BlogPost and Comment. Let's look at some mock data to demonstrate how it works.

Given the following records:

post1   = BlogPost.create(title: 'Single-Model Search', body: 'So easy.')
post2   = BlogPost.create(title: 'Multi-Model Search', body: 'Surprisingly easy.')
comment = Comment.create(body: 'PgSearch makes search easy!')

We'd end up with PgSearch::Document records like this:

[
  #<PgSearch::Document:0x007fab39232af8
    id: 1,
    content: "Single-Model Search So easy.",
    searchable_id: 1,
    searchable_type: "BlogPost"
  >,
  #<PgSearch::Document:0x007fab39232af8
    id: 2,
    content: "Multi-Model Search Surprisingly easy.",
    searchable_id: 2,
    searchable_type: "BlogPost"
  >,
  #<PgSearch::Document:0x007fab39232af8
    id: 3,
    content: "PgSearch makes search easy!",
    searchable_id: 1,
    searchable_type: "Comment"
  >
]

The actual full-text search functions the same as it did in the single-model strategy now that everything's contained in our PgSearch::Document records.
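Here's a database-free sketch of that aggregation, in case it helps to see it as code. Everything in it is a stand-in: records are plain hashes, build_document is a hypothetical helper, and the published check only imitates what if: :published? does in the model.

```ruby
# Mirrors the idea behind `multisearchable against: [...]`: read each listed
# column and join the values into one searchable string. (Hypothetical helper;
# the records are plain hashes, not ActiveRecord models.)
def build_document(record, type:, against:)
  {
    content: against.map { |column| record.fetch(column) }.join(' '),
    searchable_type: type,
    searchable_id: record.fetch(:id)
  }
end

post    = { id: 1, title: 'Single-Model Search', body: 'So easy.', published: true }
draft   = { id: 2, title: 'Unfinished draft', body: 'Not ready.', published: false }
comment = { id: 1, body: 'PgSearch makes search easy!' }

documents = []
# Like `if: :published?`, skip records that should not be searchable.
[post, draft].each do |blog_post|
  next unless blog_post[:published]

  documents << build_document(blog_post, type: 'BlogPost', against: %i[title body])
end
documents << build_document(comment, type: 'Comment', against: %i[body])

# A multi-model search is then just a search over the documents' content.
hits = documents.select { |doc| doc[:content].downcase.include?('easy') }
hits.each { |doc| puts "#{doc[:searchable_type]} #{doc[:searchable_id]}: #{doc[:content]}" }
```

The unpublished draft never gets a document, so it can't show up in results, and both the post and the comment are found through the same content column.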

Step 4: Use It!

With our single-model example, search is as simple as:

BlogPost.search_for('postgres 5ever')

With multi-model:

PgSearch.multisearch('easy')

And for extra fun, tack .with_pg_search_rank onto either of those searches to expose pg_search_rank on the returned records. It shows the numeric relevance ranking from Postgres. I found pg_search_rank helpful when validating search in my tests and also when sorting on multiple conditions.

Parting Words

I have two "gotchas" to share before we part ways.

  1. I couldn't use distinct with PgSearch (a known issue), so I had to fall back to calling .to_a.uniq on the final result set. This was necessary because you'll get multiple instances of the same record if it matches against multiple keywords.
  2. Results using full text search are automatically ordered by relevance (pg_search_rank). To override the ordering, you have to apply the .reorder scope.
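Both gotchas boil down to plain Ruby once the results are in an array. Here's a small sketch with made-up data: the hashes stand in for records returned with .with_pg_search_rank, and the rank values are invented for illustration.

```ruby
# Hypothetical result rows: the same post shows up twice, as can happen when
# it matches more than one keyword.
results = [
  { id: 1, title: 'Single-Model Search', rank: 0.6 },
  { id: 2, title: 'Multi-Model Search',  rank: 0.6 },
  { id: 1, title: 'Single-Model Search', rank: 0.6 },
  { id: 3, title: 'Parting Words',       rank: 0.1 }
]

# Same idea as calling .to_a.uniq on the relation: drop repeated records.
unique = results.uniq { |row| row[:id] }

# Multi-condition sort: highest rank first, then title as a tie-breaker.
sorted = unique.sort_by { |row| [-row[:rank], row[:title]] }

sorted.each { |row| puts "#{row[:rank]}  #{row[:title]}" }
```

Keep in mind that deduplicating in Ruby loads the whole result set into memory, so it's best suited to modest result sizes.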

Overall, PgSearch was a really pleasant surprise that made me love Postgres and the Ruby/Rails community even more. It's powerful, simple, and will most likely cover most use cases around search.

Ryan is a developer in Viget's Falls Church, VA, HQ, where he believes in being a liaison for both the technical and non-technical. He builds elegant tools for clients such as Bozzuto and Millitello Capital, as well as internal tools that we use at Viget every day.
