Introducing ActiveApi – A sane way to translate your data to xml

July 6, 2009 Pivotal Labs

ActiveApi allows you to define a schema in Ruby, and use that schema to convert Ruby objects to xml. An example looks like this:

Schema.version(:v1) do |schema|
  schema.define :article do |t|
    t.attribute :id
    t.string :title
    t.date :published_on
    t.has_many :comments
  end
end

On one of the apps I’m working on now, we have to expose our data as xml. This xml will be used as the datasource for reports, and as a way for external clients to import data into a data warehouse. The clients who will be importing the data are enterprisey – they have lots of tools that work with DTDs and XSDs, and they’ve never heard of ActiveResource. (Many barely know what Rails is.)

On this app the data model is quite large and complex, and changes will be inevitable. We may want to generate reports using the latest and greatest xml, but some clients may take longer to update their data import code, so it’s very likely that we’ll have to maintain overlapping versions for short periods of time.

Through this process I’ve come to think that exposing your data via XML is not the job of the model. Instead, it’s the job of a separate class that is specifically designed to translate your model schema into a schema appropriate for public consumption.

A good api tool will have built-in support for:

  • XSD or DTD generation
  • Versioning
  • The ability to represent your model in a way that is not tightly coupled to the model itself – so your models can change at a different rate than your xml schema

ActiveApi attempts to provide this functionality.

Installation

sudo gem install zilkey-active_api

Usage

You define a schema like so:

Schema.version(:v1) do |schema|
  schema.define :article do |t|
    t.attribute :id
    t.string :title
    t.date :published_on
    t.has_many :comments
  end

  schema.define :comment do |t|
    t.belongs_to :user
    t.string :article_title, :value => proc { |element|
      element.object.article.title
    }
  end

  schema.define :user do |t|
    t.string :username, :value => :user_name
  end
end

To create xml from this schema, you could write code like this:

ActiveApi::Schema.find(:v1).build_xml(@articles, :node => :article).to_xml

Which will give you xml output that looks like this:

<?xml version="1.0"?>
<articles>
  <article id="1">
    <title>target efficient applications</title>
    <published_on>2004-08-22</published_on>
    <comments>
      <comment>
        <article_title>target efficient applications</article_title>
        <user>
          <username>foo</username>
        </user>
      </comment>
    </comments>
  </article>
  <article id="2">
    <title>recontextualize viral e-services</title>
    <published_on>2004-12-05</published_on>
    <comments>
      <comment>
        <article_title>recontextualize viral e-services</article_title>
        <user>
          <username>foo</username>
        </user>
      </comment>
    </comments>
  </article>
</articles>

Extending

ActiveApi is highly extensible. The general pattern used to extend ActiveApi is to subclass the default ActiveApi implementation, and specify that you’d like to use your subclass instead.

For example, if you are working with a database that has audit columns such as timestamps, you might want to do this:

class MyDefinitionClass < ActiveApi::Definition
  def timestamps
    date_time :created_at
    date_time :updated_at
  end
end

@schema = Schema.version(:v1, :definition_class => MyDefinitionClass) do |xsl|
  xsl.define :article do |t|
    t.timestamps
  end
end

Schema.find(:v1).build_xml([@article], :node => :article).to_xml

Which will produce the following xml:

<?xml version="1.0"?>
<articles>
  <article>
    <created_at>1945-12-21T00:00:00+00:00</created_at>
    <updated_at>1992-04-05T00:00:00+00:00</updated_at>
  </article>
</articles>

NOTE: when specifying custom definition classes, those classes must be loaded before calling Schema.version

You can also create custom classes for any element you define, like so:

@schema = Schema.version(:v1) do |xsl|
  xsl.define :article, :builder_class => "MyCustomClass"
end

class MyCustomClass < ActiveApi::ComplexType
  def build(builder)
    builder.send :foo, :bar => "baz" do |xml|
      xml.send :woot, "lol"
    end
  end
end

Schema.find(:v1).build_xml([@article], :node => :article).to_xml

Which will produce the following xml:

<?xml version="1.0"?>
<articles>
  <foo bar="baz">
    <woot>lol</woot>
  </foo>
</articles>

NOTE: since the builder classes are evaluated at runtime, you can specify a string name for the class, and the class does not have to be loaded before calling Schema.version
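
As a rough illustration of why a string name can be resolved late, here is a plain-Ruby sketch. It does not use ActiveApi itself: ElementDefinition and MyLateBuilder are hypothetical names, and the gem's real lookup may work differently. The idea is simply that the string is stored at definition time and only turned into a class when something is built.

```ruby
# Hypothetical stand-in for a schema entry; not ActiveApi's real class.
class ElementDefinition
  def initialize(name, builder_class_name)
    @name = name
    @builder_class_name = builder_class_name # kept as a String for now
  end

  # The constant lookup is deferred until something is actually built.
  def builder_class
    Object.const_get(@builder_class_name)
  end
end

# The definition is created before MyLateBuilder exists...
definition = ElementDefinition.new(:article, "MyLateBuilder")

# ...and the class is defined afterwards.
class MyLateBuilder
  def build
    "<foo/>"
  end
end

result = definition.builder_class.new.build
```

A definition class, by contrast, is needed as soon as the schema block runs, which is presumably why it has to be loaded up front.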

Features

You define your schema completely separately from your data. So you could in theory render multiple types of objects with the same schema, provided that they have the same interface. You could also render a single object in any number of ways.

You can choose to have the builder send methods on your object, or provide more complex values by using the :value => proc{} syntax. Since you can define the value of the elements separately from the names, aliasing your object’s field names is built in.
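
The three ways of getting a value (field name, alias, proc) can be sketched in plain Ruby. This is only an approximation of the behavior described above, not the gem's code: Element, Account and resolve_value are hypothetical names chosen for the example.

```ruby
# Hypothetical element wrapper: exposes the object being rendered.
Element = Struct.new(:object)

# Resolve the value for a field, given an optional :value setting.
def resolve_value(element, name, value = nil)
  case value
  when Proc   then value.call(element)               # :value => proc { |element| ... }
  when Symbol then element.object.public_send(value) # :value => :user_name (an alias)
  else             element.object.public_send(name)  # no :value: call the field name
  end
end

Account = Struct.new(:user_name, :email)
element = Element.new(Account.new("foo", "foo@example.com"))

plain   = resolve_value(element, :email)
aliased = resolve_value(element, :username, :user_name)
custom  = resolve_value(element, :shout, proc { |e| e.object.user_name.upcase })
```

Because only method names are sent to the object, any object that responds to the same methods can be rendered with the same definition.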

The element keeps track of all of its ancestors, so you can access objects that were rendered as ancestors, even if those objects aren’t ancestors in your object graph.
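
One way to picture ancestor tracking is a chain of parent references that follows the rendering order rather than the object graph. The sketch below is an assumption about the general shape, with a hypothetical Node class; it is not taken from the ActiveApi source.

```ruby
# Hypothetical node type: each element remembers the element it was rendered under.
class Node
  attr_reader :object, :parent

  def initialize(object, parent = nil)
    @object = object
    @parent = parent
  end

  # Walk up the rendering chain, nearest ancestor first.
  def ancestors
    chain = []
    node = parent
    while node
      chain << node
      node = node.parent
    end
    chain
  end
end

article = Node.new("article: target efficient applications")
comment = Node.new("comment", article)
user    = Node.new("user: foo", comment)

ancestor_objects = user.ancestors.map(&:object)
```

Here the user node can reach the article it was rendered under, even though a user object would not normally hold a reference to that article.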

The Schema definition just creates an array of Ruby objects, which you could use to create documentation or XSD files. One of my major goals for this gem is to create valid XSD from the schema itself. Given how extensible it is, I’m not sure how that will work yet, but I’m psyched to give it a shot.

You define your schema versions using whatever versioning scheme you want – could be a string, symbol or any other object. You can render the same objects with different schemas easily. This library is totally agnostic as to how or if you version your schemas – but if you decide to version, it makes it easy.
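
To show why the version key can be any object, here is a minimal registry sketch backed by a Hash. SchemaRegistry and the shape of the stored definitions are hypothetical; ActiveApi's own Schema.version and Schema.find may store things differently.

```ruby
# Hypothetical registry, not ActiveApi's real Schema class.
class SchemaRegistry
  def initialize
    @versions = {}
  end

  # The version key can be any object usable as a Hash key.
  def version(key, definitions)
    @versions[key] = definitions
  end

  def find(key)
    @versions.fetch(key) { raise ArgumentError, "unknown schema version: #{key.inspect}" }
  end
end

registry = SchemaRegistry.new
registry.version(:v1, { article: [:id, :title] })
registry.version("2009-07-06", { article: [:id, :title, :published_on] })
registry.version([2, 0], { article: [:id, :headline] })
```

Overlapping versions then amount to keeping several entries in the registry and choosing one per request.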

The schema-defining DSL allows you to define any valid XSD data type – it formats dates, times and datetimes in XSD-compliant formats, it URI-escapes any element tagged as anyURL, and it has helper methods for all XSD data types in both their native form (anyURL) and a more Ruby-friendly form (any_url) – so you can write:

  schema.define :article do |t|
    t.anyURL :foo
    t.any_url :foo
  end

And they will be emitted identically.
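
For a feel of what XSD-compliant formatting and the two naming styles involve, here is a small sketch using only Ruby's standard date library. The ruby_friendly helper is hypothetical and only illustrates how a snake-cased name could be derived from a native type name; it is not how ActiveApi is known to do it.

```ruby
require "date"

# Dates: xs:date is written as YYYY-MM-DD.
xs_date = Date.new(2004, 8, 22).iso8601

# Date-times: xs:dateTime is ISO 8601 with a UTC offset.
xs_date_time = DateTime.new(1945, 12, 21).xmlschema

# Hypothetical helper: derive a Ruby-friendly name from a camel-cased type name.
def ruby_friendly(type_name)
  type_name.gsub(/([a-z])([A-Z])/, '\1_\2').downcase
end

friendly_names = ["anyURL", "dateTime", "nonNegativeInteger"].map { |n| ruby_friendly(n) }
```

The formatted values match the shape of the published_on and created_at values in the sample output earlier in this post.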

ActiveApi also allows you to define polymorphic relationships via the XSD “choice” element:

      xsl.define :comment do |t|
        t.belongs_to :commentable, :choice => {
          "Article" => :article,
          "User" => :user
        }
      end

This allows your API to determine which elements to render at runtime, while still giving you complete control over the schema to use for that element. This allows you to have more than one representation of your objects – one that gets rendered if the object is the root, and a different one that gets rendered if it’s being rendered as a sub-element of another object.
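
The runtime decision can be pictured as a lookup from the object's class name to a definition name. The sketch below is a guess at the general mechanism, written in plain Ruby with a hypothetical definition_for helper; the Article and User structs stand in for real models.

```ruby
Article = Struct.new(:title)
User    = Struct.new(:user_name)

# Same shape as the :choice option above: class name => definition name.
CHOICE = { "Article" => :article, "User" => :user }.freeze

# Pick the definition for an object based on its class name at runtime.
def definition_for(object, choice)
  choice.fetch(object.class.name) do |class_name|
    raise ArgumentError, "no definition registered for #{class_name}"
  end
end

for_article = definition_for(Article.new("target efficient applications"), CHOICE)
for_user    = definition_for(User.new("foo"), CHOICE)
```

Mapping to a definition name rather than straight to a class is what leaves room for a different representation when the object is rendered as a sub-element.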

Integration with Rails

ActiveApi is framework agnostic. While the belongs_to / has_many syntax is Rails-like in name, it does not depend on ActiveRecord. It also does not modify or interfere with AR in any way. To use xml generated from ActiveApi in a controller, you can do this:

def index
  @articles = Article.all

  respond_to do |format|
    format.html # index.html.erb
    format.xml  do
      render :xml => ActiveApi::Schema.find(:v1).build_xml(@articles, :node => :article)
    end
  end
end

Implementation

ActiveApi uses Nokogiri::XML::Builder to create the xml nodes. As such, the creation of the xml is fast. The rest of the code likely needs major refactoring to be performant and have a small footprint with large datasets. In particular, it creates a new object for every collection, element and value. That’s a lot of objects. I’d love to hear any comments about how to improve this aspect of the design.

Authors

While I wrote all of the code in this repo, the code was inspired by pairing on a similar project with:

  • Mike Dalessio
  • Peter Jaros
  • Ben Woosely

Development

http://github.com/zilkey/active_api/tree/master

See the issue tracker on github for the list of features and bugs I know about.
