Ten Things I Hate about Proxy Objects, Part I

August 26, 2007 Pivotal Labs

Has Many Through Has Many Through Has Many Through …

In which the author relates several things he hates about Rails’ association Proxies, along with workarounds to fix them. Part I of VIII

From time to time, you want to join through more than one join table. Consider the following example:

class Essay < ActiveRecord::Base
  has_many :chapters
  has_many :pages, :through => :chapters
  has_many :paragraphs, :through => :pages
  has_many :words, :through => :pages
end

This kind of scenario has arisen in many of the Rails Apps that I’ve worked on. In the current version of ActiveRecord, has_many :through cannot go through another has_many :through. There are various ways to work around this, like mapping through the associations, but most workarounds are either inefficient or are difficult to extend with pagination and such.

In theory, it’s easy to make ActiveRecord support this; we need merely to walk down a has_many :through chain, joining tables as we go. This took me a long time to implement though, as I had to decipher the opaque Reflection object model, and that’s where all the magic happens.

What is a Reflection?

Every particular association declaration, like has_many :chapters, etc., is represented as an Object by means of the Reflection class. From an instance of a Reflection, you can access all of the details of the has_many declaration as well as information about the database schema that ActiveRecord was able to infer. Reflections look different for each kind of association, and (unsurprisingly) the has_many :through Reflection is the scariest. Consider this example:

has_many :words, :through => :paragraphs

The Reflection representing this declaration is composed of two sub-reflections. The first, called the :source, represents the Word class. From it we know all about the words table and its foreign keys. The second, is called the :through; it represents the Paragraph class. Now let’s dive into some code.

Mucking through ActiveRecord

So our goal is to join a bunch of tables together. For convenience ActiveRecord joins using the INNER JOIN ... ON ... form rather than the more traditional FROM ... WHERE ... form. The method that currently does the work is:

class ActiveRecord::Associations::HasManyThroughAssociation
  def construct_joins(custom_joins = nil)

This method handles a number of complex cases dealing with the directionality of the foreign key relation and polymorphic relations. In the simplest case, the source code looks like this (I’ve added comments indicating the values of the perplexing expressions):

reflection_primary_key = @reflection.source_reflection.primary_key_name # paragraphs_id
source_primary_key     = @reflection.klass.primary_key # id
"INNER JOIN %s ON %s.%s = %s.%s" % [
  @reflection.through_reflection.table_name, # paragraphs
  @reflection.table_name, reflection_primary_key, # words, paragraphs_id
  @reflection.through_reflection.table_name, source_primary_key, # paragraphs, id
]

The current algorithm is hard-coded to deal with exactly one join. Note the use of @reflection! Imagine taking the same source code, but parameterizing it to deal with an arbitrary reflection. Let’s call this new function construct_one_join:

def construct_one_join(reflection)
  reflection_primary_key = reflection.klass.primary_key
  ...

As you can see, all we really need to do is remove the @ characters and we’re good to go. Next we need to walk the line, iterating down the through chain until the end. Let’s overwrite the old function to do this:

def construct_joins(custom_joins = nil)
  reflection = @reflection
  joins = []
  while reflection.through_reflection
    joins << construct_one_join(reflection)
    reflection = reflection.through_reflection
  end
  "#{joins.join(' ')} #{custom_joins}"
end

There’s still a little more work to do. Ultimately all these joins need to terminate at some particular record’s primary key. That is, when we say:

my_essay.words

Ultimately there should be some clause in the query saying

AND essay_id = '#{my_essay.id}

The old code to effect this is:

def construct_conditions
  table_name = @reflection.through_reflection.table_name
  conditions = construct_quoted_owner_attributes(@reflection.through_reflection).map do |attr, value|
    "#{table_name}.#{attr} = #{value}" # paragraphs.essay_id = #{my_essay.id}
  end
  conditions << sql_conditions if sql_conditions
  "(" + conditions.join(') AND (') + ")"
end

As you can see from the comment in the above code, this is incorrect since paragraphs doesn’t have an essay_id column; rather, chapters does. We really want to say chapters.essay_id = #{my_essay.id}

If we had a function that could get us chapters (i.e., the last through reflection):

def last_through_reflection
  reflection = @reflection
  while reflection.through_reflection
    reflection = reflection.through_reflection
  end
  reflection
end

Then we could replace construct_conditions with the following:

def construct_conditions
  table_name = last_through_reflection.table_name
  conditions = construct_quoted_owner_attributes(last_through_reflection).map do |attr, value|
    "#{table_name}.#{attr} = #{value}"
  end
  conditions << sql_conditions if sql_conditions
  "(" + conditions.join(') AND (') + ")"
end

That’s almost it. There’s just a little more work to handle the :conditions on queries in the chain. See the attached source code for this final detail.

Whew. The idea is of the algorithm is almost trivial: walking down a through chain, joining tables as we go, but the implementation is complex because of the impenetrable interface to Reflection. But it took very few modifications to the Rails source to make this happen… And voila! Now we can has_many :through a has_many :through.

I’ll conclude this article with an exercise for the reader (I’d do it myself but I’m a bit lazy). The clumsy iteration patterns I’ve used (while reflection = reflection.through_reflection) would look much nicer if we implemented Enumerable on Reflection. Then, without any lack of clarity we can rewrite #construct_joins using #inject; similarly last_through_reflection becomes a trivial call to #last. Anyone up for it?

Download all of the source code:
http://www.pivotalblabs.com/files/associations_on_steroids2.rb

About the Author

Biography

Previous
HasFinder — It’s Now Easier than ever to create complex, re-usable SQL queries
HasFinder — It’s Now Easier than ever to create complex, re-usable SQL queries

HasFinder is an extension to ActiveRecord that makes it easier than ever to create custom find and count qu...

Next
Treetop: Bringing the Elegance of Ruby to Syntactic Analysis
Treetop: Bringing the Elegance of Ruby to Syntactic Analysis

(This is my proposal for this year's RubyConf. Fingers crossed!) Treetop is a parsing framework that bring...