Progressive Caching In-Depth

Ben Scofield, Former Viget

Recently, I've been presenting on a technique that takes advantage of Rack support in Rails to revitalize page caching; I've taken to calling it "progressive caching."  There are a couple of places around the web where you can find an introduction to the topic, but this post will go into significantly more depth.

Note: This technique isn't particularly new (I've seen at least one blog post from 2007 about it), though some of the advances in Rails make it more effective. As far as I can tell, it hasn't really been explored in depth, though I welcome disagreement on that point!

Background

When you implement page caching in Rails, an action is only processed once; after that, the generated markup is saved to a static file in /public. Subsequent requests to the same URL are then processed by the web server (Apache, nginx, etc.), which makes them orders of magnitude faster than if they were processed dynamically.

This performance boost is wonderful in some cases, but comes with two major challenges: cached pages are both public and static. If you present a page only to authenticated users, you can't use page caching: after the first valid presentation, the content would be cached and anyone could see it, because the web server doesn't respect your application's authentication and authorization code. Also, cached pages are (by definition) static, which means that page caching is unsuitable for pages that change frequently, or that change based on who is viewing them.
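To make the mechanism (and the "public" problem) concrete, here is a minimal sketch in plain Ruby. It is not how Rails implements page caching; TinyPageCache, its get method, and the temporary directory standing in for /public are all assumptions made for illustration.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for page caching: the first request runs the
# (expensive) action and writes the markup to a file under a public directory;
# later requests are answered straight from that file.
class TinyPageCache
  attr_reader :renders

  def initialize(public_dir)
    @public_dir = public_dir
    @renders = 0
  end

  # `user` is ignored on a cache hit, just as a web server serving a static
  # file never consults the application's authentication code.
  def get(path, user:)
    file = File.join(@public_dir, "#{path}.html")
    return File.read(file) if File.exist?(file)

    @renders += 1
    html = "<h1>Channel page rendered for #{user}</h1>"
    FileUtils.mkdir_p(File.dirname(file))
    File.write(file, html)
    html
  end
end

Dir.mktmpdir do |dir|
  cache  = TinyPageCache.new(dir)
  first  = cache.get("/channels/1", user: "alice")
  second = cache.get("/channels/1", user: "bob")
  puts first == second  # bob receives the page that was rendered for alice
  puts cache.renders    # the action itself ran only once
end
```

The second request never reaches the rendering code, which is where the speedup comes from, and also why anything user-specific baked into the first render leaks to everyone else.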

Progressive caching is intended to overcome both of these challenges by combining page caching with highly efficient AJAX requests and client-side JavaScript.

Example

To see how this works, let's start with a sample page:

[Image: Ask a Ninja on Odeo]

This is a channel page from odeo.com; it shows, among other things, the most recent episodes for the "Ask a Ninja" series. Notice that each episode has a small, pink plus icon. When you add an episode to your quicklist on Odeo, the pink changes to gray and the icon is disabled — and this state persists across all pages where the episode appears.

This practice means that every time a page on Odeo is rendered, the system has to check whether the episodes on the page are in the current user's quicklist, and it has to show the correct icon for each. Let's show some (entirely faked) code for that:

class ChannelsController < ApplicationController
  def show
    @quicklist = current_user.quicklist
    @channel = Channel.find(params[:id])
    @episodes = @channel.episodes.recent
    # ...
  end
end

<% @episodes.each do |episode| %>
  <%= image_tag episode.thumbnail_url, :alt => h(episode.title) %>
  <%= link_to image_tag('quicklist_icon.png', :alt => "Add this episode to your quicklist"),
              foo_url, :class => (@quicklist.include?(episode) ? 'listed' : '') %>
<% end %>

In the view, we're changing a class on the link based on whether the episode has been added to the quicklist or not. Through CSS, this changes the icon displayed to the user.

Changing this fairly costly action to use progressive caching is easy. First, we remove the conditional from the view, which leaves us with content suitable for page caching (since it is no longer dependent on the logged-in user):

<% @episodes.each do |episode| %>
  <%= image_tag episode.thumbnail_url, :alt => h(episode.title) %>
  <%= link_to image_tag('quicklist_icon.png', :alt => "Add this episode to your quicklist"),
              foo_url, :id => "quicklist_#{episode.id}" %>
<% end %>

Next, we cache the page and remove unnecessary code (note: at this point you'd also have to add the appropriate cache-expiration code — say, when new episodes are uploaded):

class ChannelsController < ApplicationController
  caches_page :show

  def show
    @channel = Channel.find(params[:id])
    @episodes = @channel.episodes.recent
    # ...
  end
end
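The cache-expiration step mentioned above boils down to removing the static file so that the next request reaches the application again; in Rails that job belongs to expire_page. The sketch below shows the idea in plain Ruby, assuming a channels/<id>.html layout under the public directory. The helper names expire_channel_page and upload_episode are hypothetical.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: removes the static file for a channel page so the next
# request falls through to the application and the page is rendered afresh.
def expire_channel_page(public_dir, channel_id)
  file = File.join(public_dir, "channels", "#{channel_id}.html")
  FileUtils.rm_f(file)
end

# Hypothetical hook standing in for "a new episode was uploaded".
def upload_episode(public_dir, channel_id, episodes, title)
  episodes << title
  expire_channel_page(public_dir, channel_id)
end

Dir.mktmpdir do |dir|
  cached = File.join(dir, "channels", "1.html")
  FileUtils.mkdir_p(File.dirname(cached))
  File.write(cached, "<ul><li>Episode 1</li></ul>")

  episodes = ["Episode 1"]
  upload_episode(dir, 1, episodes, "Episode 2")
  puts File.exist?(cached)  # the stale page has been removed
end
```

Tying expiration to the event that changes the shared content (here, an upload) keeps the cached page valid for as long as possible.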

At this point, the bulk of the original page's content is being processed through the web server, but we're obviously missing the bits that depend on the logged-in user. To add those, we'll add some Metal:

./script/generate metal Personalizer

# Allow the metal piece to run in isolation
require(File.dirname(__FILE__) + "/../../config/environment") unless defined?(Rails)

class Personalizer
  def self.call(env)
    if env["PATH_INFO"] =~ /^\/personalize/
      [
        200,
        {"Content-Type" => "application/javascript"},
        [User.find(env['rack.session'][:user]).quicklist.episode_ids.to_json]
      ]
    else
      [404, {"Content-Type" => "text/html"}, ["Not Found"]]
    end
  end
end
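One thing the Metal example takes for granted is a logged-in user: for an anonymous visitor the session would hold no :user, and the lookup would fail. A hedged sketch of one way to handle that follows. SafePersonalizer and the in-memory QUICKLISTS hash are hypothetical stand-ins (for the Metal class and for User.find(...).quicklist.episode_ids, respectively), so the example runs without Rails.

```ruby
require "json"

# Hypothetical in-memory stand-in for User.find(id).quicklist.episode_ids.
QUICKLISTS = { 1 => [10, 12], 2 => [] }.freeze

class SafePersonalizer
  def self.call(env)
    unless env["PATH_INFO"] =~ %r{\A/personalize}
      return [404, { "Content-Type" => "text/html" }, ["Not Found"]]
    end

    session = env["rack.session"] || {}
    # Anonymous visitors (no session user) get an empty quicklist, not an error.
    episode_ids = QUICKLISTS.fetch(session[:user], [])
    [200, { "Content-Type" => "application/javascript" }, [episode_ids.to_json]]
  end
end

# The endpoint is just an object that responds to call(env), so it can be
# exercised with a hand-built env hash.
logged_in = SafePersonalizer.call({ "PATH_INFO" => "/personalize", "rack.session" => { user: 1 } })
anonymous = SafePersonalizer.call({ "PATH_INFO" => "/personalize", "rack.session" => {} })
puts logged_in[2].first
puts anonymous[2].first
```

With an empty list coming back, the page-load script simply has nothing to mark, and anonymous visitors keep the default icons.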

And finally, a bit of jQuery that runs when the page loads:

$(document).ready(function() {
  $.getJSON('/personalize', function(data) {
    $.each(data, function() {
      $('#quicklist_' + this).addClass('listed');
    });
  });
});

After that, we're pretty much done. When someone visits the page, they see all the episodes as if none were in the quicklist. After a brief delay (during which the AJAX call is sent to the Metal endpoint and the response comes back), the page updates and the episodes that are in the quicklist change. If users are interrupted or confused by the delay, the original state might instead have activity indicators (e.g., spinners) in place of the quicklist icons, but that all comes down to a UX decision.

By now (if not well before), you may have asked yourself about the payoff. Well, on the sample app I describe in my presentation, using progressive caching took a locally hosted page from 617ms (including database and rendering time) to 135ms (39ms for the cached, static content + 96ms for the AJAX call), which is a decrease of about 78%. A second version (which avoided ActiveRecord in the Metal action) took a total of 66ms (43ms for the page, 23ms for the AJAX). These might seem like small numbers in the absolute sense, but when added up over a day, week, or month of an application, they can become extremely significant.
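For anyone who wants to check the arithmetic, the percentages follow directly from the timings quoted above; the percent_decrease helper is just an illustrative name.

```ruby
# Percentage decrease from a baseline timing to an improved one (milliseconds).
def percent_decrease(before_ms, after_ms)
  ((before_ms - after_ms) / before_ms.to_f * 100).round(1)
end

baseline       = 617      # original, fully dynamic page
first_version  = 39 + 96  # cached page + AJAX call
second_version = 43 + 23  # cached page + AJAX call without ActiveRecord

puts percent_decrease(baseline, first_version)
puts percent_decrease(baseline, second_version)
```

That works out to roughly a 78% reduction for the first version and roughly 89% for the second.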

Problems

Of course, there are no perfect techniques, and progressive caching is no exception. We've already mentioned one issue (the delay before the page updates), but there are also problems with accessibility and testing.

Clearly, this approach requires JavaScript on the client. Some people browse without JavaScript (for a variety of reasons), and there are known accessibility issues with dynamically updating pages. Progressive caching runs into these headlong, so if your application is one that must degrade gracefully, it may not be the correct approach for you.

Also, the state of testing for Rails Metal is ... primitive, to say the least. As far as I know, the best approach is somewhere between an integration test and a homebrewed unit test. That's far from ideal, which means that using Metal can tempt you into testing less. If you do decide to use this strategy, remain vigilant and keep testing!
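As a rough illustration of the "homebrewed unit test" end of that spectrum: because a Metal endpoint only has to respond to call(env), it can be driven with a hand-built env hash. The sketch below is an assumption-laden example, not an established pattern. PersonalizerEndpoint, the injected lookup lambda, and the check helper are all hypothetical, chosen so the test needs neither Rails nor a database.

```ruby
require "json"

# Hypothetical endpoint with the same shape as the Metal class above; the
# quicklist lookup is injected so it can be replaced with a stub in tests.
class PersonalizerEndpoint
  def initialize(lookup)
    @lookup = lookup
  end

  def call(env)
    if env["PATH_INFO"] =~ %r{\A/personalize}
      ids = @lookup.call(env["rack.session"][:user])
      [200, { "Content-Type" => "application/javascript" }, [ids.to_json]]
    else
      [404, { "Content-Type" => "text/html" }, ["Not Found"]]
    end
  end
end

# Homebrewed assertion helper; a real suite would more likely use a test framework.
def check(description, condition)
  raise "FAILED: #{description}" unless condition
  puts "ok: #{description}"
end

# Stubbed lookup: user 7 has two episodes in the quicklist, everyone else has none.
endpoint = PersonalizerEndpoint.new(->(user_id) { user_id == 7 ? [3, 5] : [] })

status, headers, body = endpoint.call({ "PATH_INFO" => "/personalize", "rack.session" => { user: 7 } })
check("responds with 200", status == 200)
check("declares a JavaScript content type", headers["Content-Type"] == "application/javascript")
check("returns the quicklist ids as JSON", JSON.parse(body.first) == [3, 5])

status, _headers, _body = endpoint.call({ "PATH_INFO" => "/channels/1", "rack.session" => {} })
check("ignores other paths", status == 404)
```

Injecting the lookup is a design choice made for testability here; the Metal class in this post calls User.find directly, which is why testing it in isolation is awkward.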

So, that's it. Progressive caching in one form or another has been around for a while, but with the recent changes in Rails (Rack, Metal, and the like), it's poised to come into its own. If you're working on applications that present mostly similar content to everyone, with pages that vary only in discrete pieces, it may be a viable option for boosting performance by a noticeable amount.
