Making Rails Wicked Fast: Pagecaching Highly Personalized Web Pages

November 4, 2007 Pivotal Labs

Consider the following snippet for a page showing blog articles. Notice how content on the page differs based on who is viewing it:

<% if current_user.nil? %>
  You are logged out
<% elsif current_user.admin? %>
  You are an admin
<% elsif @article.author == current_user %>
  You are the author of this blog article
<% end %>

Pagecaching such a page is difficult because all of this conditional logic would need to be translated to Javascript. The appropriate data (whether the user is logged in, etc.) needs to be available to the client; usually it is stored in a cookie or comes from an Ajax request (presumably the Ajax request is much faster than having Rails generate the entire page).

While we can translate this conditional logic to Javascript, a much simpler approach is to use CSS:

<style>
  .logged_out, .admin, .author { display: none; }
  body.logged_out .logged_out { display: block; }
  body.admin .admin { display: block; }
  body.author .author { display: block; }
</style>
<body class="">
  <div class="logged_out">
    You are logged out
  </div>
  <div class="admin">
    You are an admin
  </div>
  <div class="author">
    You are the author...
  </div>
</body>

By default, anything of class admin, logged_out, etc. is invisible. But simply by adding a class to the body tag, we can “unlock” these hidden parts of the page:

<body class="admin author">
</body>

And voilà! Both the admin and author sections are visible to the end user.

Implementation Details

So how do we add classes to the page? And where do we get the appropriate data for the end user? Use Javascript to add classes to the body tag:

// classes is an array of class names gathered from the sources described below
for(var i = 0; i < classes.length; i++) {
  $$('body').first().addClassName(classes[i]);
}

Data then comes from one of three places.

Constant Data about the Current User

Set constant data about the current user at the start of a session (for example, we know whether the person is logged in and we know whether she is an admin):

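class ApplicationController < ActionController::Base
  after_filter :set_classes

  # Store the user's constant classes in a cookie the client can read.
  # Cookie values are strings, so the list is joined with spaces.
  def set_classes
    cookies[:user_classes] = current_user.classes.join(' ') if current_user
  end
end

class User
  def classes
    [admin? ? 'admin' : 'not_admin', ...]
  end
end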
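
As a rough illustration of what ends up in that cookie, here is a minimal plain-Ruby sketch with no Rails involved. SketchUser, cookie_value_for, and the 'logged_in' class name are hypothetical names invented for this example; the sketch also assumes the class list is stored as a space-separated string and that a missing user means "logged out".

```ruby
# Hypothetical stand-in for the User model; only the admin flag matters here.
SketchUser = Struct.new(:name, :admin) do
  def admin?
    admin ? true : false
  end

  # Constant facts about the user, expressed as CSS class names.
  # 'logged_in' is an assumed class name, not one from the article.
  def classes
    ['logged_in', admin? ? 'admin' : 'not_admin']
  end
end

# Returns the string to store in the cookie; a nil user means logged out.
def cookie_value_for(user)
  (user ? user.classes : ['logged_out']).join(' ')
end

puts cookie_value_for(SketchUser.new('nick', true))  # prints logged_in admin
puts cookie_value_for(nil)                           # prints logged_out
```

On the client, the cookie value would then be split on spaces to rebuild the list of classes to add to the body tag.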

Personalized Data

Set data about the current user’s relationship to the presently displayed content (for example, whether she is the author of the article she is currently looking at) using an Ajax request.

new Ajax.Request('#{url_for(:format => :js)}', {
  method: 'post',
  onComplete: function(transport, json) {
    if (json.classes) {
      for(var i = 0; i < json.classes.length; i++) {
        $$('body').first().addClassName(json.classes[i]);
      }
    }
  }
});

class ArticlesController < ApplicationController
  caches_page :show
  def show
    @article = Article.find(params[:id])
    respond_to do |format|
      format.html {}
      format.js do
        headers['X-JSON'] = @article.to_json_for(current_user)
      end
    end
  end
end

class Article < ActiveRecord::Base
  def to_json_for(user)
    {
      :classes => [
        user == author ? :is_author : :is_not_author,
        ...
      ]
    }.to_json
  end
end

A couple of gotchas here. First, you must override Rails’ pagecaching functionality to ensure it doesn’t cache requests for JSON. Put this in ApplicationController:

def self.caches_page(*actions)
  return unless perform_caching
  actions.each do |action|
    class_eval "after_filter { |c| c.cache_page if c.action_name == '#{action}' && c.params[:format] != 'js' }"
  end
end

(Some details are missing in the above implementation since respond_to seems to delete the :format parameter from the params hash.)
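
One possible way around the missing :format parameter is to decide from the request path instead. The following is only a sketch under that assumption: page_cacheable? is a hypothetical helper, not part of Rails, and it assumes the Ajax URL generated with :format => :js ends in ".js".

```ruby
# Hypothetical predicate: should the response for this request be page-cached?
# It inspects the request path rather than params[:format].
def page_cacheable?(action_name, request_path, cached_actions)
  # Only actions registered for caching are candidates.
  return false unless cached_actions.include?(action_name)
  # Skip the Ajax variant, assumed here to carry a .js extension.
  File.extname(request_path) != '.js'
end

cached = ['show']
puts page_cacheable?('show', '/articles/1', cached)       # prints true
puts page_cacheable?('show', '/articles/1.js', cached)    # prints false
puts page_cacheable?('edit', '/articles/1/edit', cached)  # prints false
```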

Second, since we’re making an Ajax request, we are hitting the Rails stack. Nevertheless, this is still wicked fast because the Ajax request returns only the essential data, so very few objects need be instantiated. Also, you can almost always avoid making this Ajax request for logged-out users, which should take an enormous load off the server.

Stateful Session Data

Some data related to the current user is not constant: it lasts only for some finite part of the session, yet it should persist longer than just the current page. An example is flash[:error] content, but many sophisticated web sites use this kind of personalization extensively (think wizards and contextual help). The easiest way to populate this data is as part of the Ajax request, but rather than return it in the JSON, return it in a cookie.

class ArticlesController < ApplicationController
  caches_page :show
  def show
    @article = Article.find(params[:id])
    respond_to do |format|
      format.html {}
      format.js do
        cookies[:temporary_classes] += ...
      end
    end
  end
end

You may want to remove these “temporary” classes from the cookie as you use them on the client side.
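
To make the bookkeeping concrete, here is a small plain-Ruby sketch of how such a cookie value might be maintained. Both helpers are hypothetical and assume the cookie holds a space-separated list of class names; the "consume" step models in Ruby what the client-side script would do after applying the classes.

```ruby
# Adds new class names to an existing space-separated cookie value,
# skipping any that are already present.
def add_temporary_classes(cookie_value, new_classes)
  existing = cookie_value.to_s.split(' ')
  (existing + new_classes.map(&:to_s)).uniq.join(' ')
end

# Models the client-side step: hand back the classes to apply,
# together with the emptied cookie value to store afterwards.
def consume_temporary_classes(cookie_value)
  [cookie_value.to_s.split(' '), '']
end

value = add_temporary_classes(nil, [:flash_error])
value = add_temporary_classes(value, ['wizard_step_2', 'flash_error'])
puts value  # prints flash_error wizard_step_2

classes, value = consume_temporary_classes(value)
puts classes.inspect  # prints ["flash_error", "wizard_step_2"]
```

The class names flash_error and wizard_step_2 are made up for the example.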

CSS is Easy to Use

The reason this technique works is that CSS selectors can express complex Boolean expressions. Though far from Turing complete, CSS is powerful enough to express “and” and “or”, and it does so elegantly. It’s far easier than translating your conditional logic to inlined Javascript.
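
For instance, a selector like body.admin.author .edit_link applies only when the body has both classes ("and"), while the selector list body.admin .edit_link, body.author .edit_link applies when it has either ("or"). The edit_link class is a made-up name. The plain-Ruby sketch below models just that matching rule; body_matches? is a hypothetical helper, and it ignores everything else about how browsers evaluate selectors.

```ruby
require 'set'

# selector_list is an array of compound selectors; each compound selector is
# an array of class names that must all be on the body (AND). The list as a
# whole matches when any one compound selector matches (OR).
def body_matches?(body_classes, selector_list)
  present = body_classes.to_set
  selector_list.any? { |required| required.all? { |c| present.include?(c) } }
end

admin_and_author = [%w[admin author]]       # body.admin.author .edit_link
admin_or_author  = [%w[admin], %w[author]]  # body.admin .edit_link, body.author .edit_link

puts body_matches?(%w[admin], admin_and_author)         # prints false
puts body_matches?(%w[admin], admin_or_author)          # prints true
puts body_matches?(%w[admin author], admin_and_author)  # prints true
```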

Disadvantages

There are a few downsides to this approach. One is security. Because we render content for all possible viewers into the page, there is a potential security violation: though content meant for other users is not displayed in the browser, it is visible in the page source. For most applications, these security concerns are unimportant because the “real” security rules are enforced during write operations. But your mileage may vary.

Some Cache Design Principles

There are a few principles to bear in mind when implementing a caching strategy.

  • Distinguish data a) independent of the current user (e.g., the title of an article) and data b) dependent on the current user.
  • In the latter category, distinguish data b1) concerning the current user only (e.g., whether the user is an admin) and data b2) concerning the relationship between the current user and some other object (e.g., whether the user is the author of an article).
  • Data concerning the relationship between the current user and some other object (b2) can usually be segmented into axes of variability; that is, we can reduce the “space” of the data from the number of users, to some smaller set of criteria. For example, we can usually categorize the kinds of relationships into is_author, is_not_author, is_friend, is_not_friend, etc.

Data of type (a) and (b2) are usually pre-rendered into the pagecached page. Data of type (b2) is then shown and hidden using the CSS technique. Data of type (b1) is usually set into the cookie when the user logs in or, alternatively, any time the user hits the Rails app.

Designing Cachable URLs

It is very important to know when to use and when to avoid putting a user id in your url. For example, if you model the current user’s profile page with the following url:

http://www.mysite.com/profile

You’ve effectively made that page uncachable, since all of the content on that page depends upon the current user. A better URL would be:

http://www.mysite.com/profiles/nick

If my profile looks different to me than it does to others, model that as data of type (b2), that is, data concerning the relationship between the current user and some other object.

That said, never put the current user into the url:

http://www.mysite.com/users/nick/articles/1

If this isn’t an article written by Nick (data of type a), but is rather Article 1 as seen by Nick (data of type b2), you’ve just pagecached a page with a cache hit ratio of zero. So consider carefully how you model a resource like the following:

http://www.mysite.com/account/password vs. http://www.mysite.com/users/nick/account/password

The latter format ensures that the password page has a zero cache-hit ratio, so screw that. Furthermore, a page that links to the latter url (for example, an “Account Settings” link in the site’s header) that happens to be cached must now generate that link using Javascript since the url differs based upon the current user. To reiterate this point:

Ensure that any pages that concern only the current user do not have the user identifier in the url. Examples include: the logged-in home page, account settings page, edit my profile page, my message inbox page, etc.
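
The reasoning is easier to see once you remember that a page cache is keyed by the request path alone. The sketch below is a simplified model of that idea, not Rails’ actual implementation: page_cache_file is a hypothetical helper, and it assumes cached pages are written as .html files under a public directory.

```ruby
# Simplified model of a page cache: one file per request path,
# with no knowledge of who made the request.
def page_cache_file(request_path, cache_dir = 'public')
  path = request_path == '/' ? '/index' : request_path.chomp('/')
  File.join(cache_dir, "#{path}.html")
end

# One file for every viewer, although the content depends on the viewer:
puts page_cache_file('/profile')                # prints public/profile.html
# One file per profile, shared by everyone who looks at it:
puts page_cache_file('/profiles/nick')          # prints public/profiles/nick.html
# One file per viewer for the same article, so nobody shares a cache entry:
puts page_cache_file('/users/nick/articles/1')  # prints public/users/nick/articles/1.html
```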

Hope you guys find these techniques helpful. Along with a little bit of scriptaculous templates, this technique will make it easy to make your highly personalized Rails apps scale massively.
