Embedding Mongoid documents and data migrations

January 7, 2011 Pivotal Labs

When you are first starting out with MongoDB, it's easy to make the wrong decision about whether to embed a document. Even if you made the right call at the time, changing requirements may force you into a migration. So how do you migrate existing data when moving from a standalone document to an embedded one? This is what I came up with.

Initial Data Structure

class User
  include Mongoid::Document
  field :name
  references_many :sales
end

class Sale
  include Mongoid::Document
  field :price, :type => Integer
  referenced_in :user
end

Now with Sale embedded in User

class User
  include Mongoid::Document
  field :name
  embeds_many :sales
end

class Sale
  include Mongoid::Document
  field :price, :type => Integer
  embedded_in :user, :inverse_of => :sales
end
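
To make the difference concrete, here is a rough plain-Ruby sketch of how the stored data changes shape. It does not use Mongoid at all: the hashes stand in for stored documents, the ids and values are made up, and `embed` is a hypothetical helper written only for this illustration.

```ruby
# Plain-Ruby stand-ins for stored documents; the ids and values are made up.

# Before: two collections, and each sale points at its user through user_id.
users_before = [{ "_id" => "u1", "name" => "Ann" }]
sales_before = [
  { "_id" => "s1", "price" => 10, "user_id" => "u1" },
  { "_id" => "s2", "price" => 25, "user_id" => "u1" }
]

# After: one collection, with each user's sales nested under a "sales" key.
# The user_id reference is dropped because the nesting already expresses it.
def embed(users, sales)
  by_user = sales.group_by { |sale| sale["user_id"] }
  users.map do |user|
    nested = by_user.fetch(user["_id"], []).map do |sale|
      sale.reject { |key, _| key == "user_id" }
    end
    user.merge("sales" => nested)
  end
end

users_after = embed(users_before, sales_before)
```

The point is only that, once embedded, a sale no longer lives in its own collection and no longer needs a reference back to its user.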

Migrating Sales Data

class EmbedSalesInUsers < Mongoid::Migration
  def self.up

    # pull your existing data into memory
    # (consider batching for large data sets)
    # Note that you must run the query on the class you are migrating
    # (Sale here) for this to work; you cannot go through User#sales

    sales_attributes = while_stand_alone_doc(Sale) do
      Sale.all.map(&:attributes)
    end

    # now when you save your data, your fields will be embedded

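    # string keys are used here because they work whether or not the
    # attributes hash allows indifferent (symbol or string) access
    sales_attributes.each do |attributes|
      user = User.find(attributes["user_id"])
      user.sales << Sale.new(:price => attributes["price"])
    end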

    # remove all the documents from the original collection

    while_stand_alone_doc(Sale) do
      Sale.destroy_all
    end
  end

  def self.while_stand_alone_doc(klass)
    # by flipping the model's embedded flag you can temporarily change
    # which collection Mongoid reads from and writes to for that model

    begin
      klass.embedded = false

      yield
    ensure
      klass.embedded = true
    end
  end

end
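
The flag-toggling helper is really just a begin/ensure pattern, so it can be tried out without Mongoid. The sketch below is a small variation, not the migration's actual code: it restores whatever value the flag had before instead of always setting it back to true. `FakeModel` is a hypothetical stand-in that only mimics a class-level `embedded` accessor.

```ruby
# FakeModel is a hypothetical stand-in for a Mongoid model class; it only
# mimics a class-level `embedded` accessor so the sketch runs without Mongoid.
class FakeModel
  class << self
    attr_accessor :embedded
  end
  self.embedded = true
end

# Temporarily set klass.embedded to false, then put back whatever value was
# there before, even if the block raises.
def while_stand_alone_doc(klass)
  previous = klass.embedded
  klass.embedded = false
  yield
ensure
  klass.embedded = previous
end

# The block sees the flag switched off, and its value is returned.
seen_inside = while_stand_alone_doc(FakeModel) { FakeModel.embedded }

# An error inside the block still leaves the flag restored.
begin
  while_stand_alone_doc(FakeModel) { raise "query failed" }
rescue RuntimeError
  # The ensure clause has already put the flag back by this point.
end
```

Remembering the previous value is a judgment call; it mostly matters if the helper is ever called on a model that was not embedded to begin with.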

There are a couple of things to note here.

  • The embedded flag in Mongoid::Document is not documented, so it could easily change. This was working as of 2.0.0.beta.20.
  • When you create the new embedded document, make sure you pass only the attributes you care about. Passing all attributes will add things that you no longer need, like user_id in this case. (For clarity: any attributes you assign will be persisted, though you will only have setters and getters for the fields you explicitly define in your document.)
  • I am using mongoid_rails_migrations in this example.
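
If the model has more than one field worth keeping, the "pass only what you care about" advice can be written as a small filter. This is a plain-Ruby sketch under assumptions: `KEPT_KEYS` and `embeddable_attributes` are hypothetical names, and the sample hash is made up.

```ruby
# Only these keys are copied onto the new embedded document. The list is an
# assumption for this example; adjust it to the fields your model defines.
KEPT_KEYS = %w[price].freeze

# Return a copy of the attributes hash containing only the kept keys.
# Keys are compared as strings, so symbol keys are accepted as well.
def embeddable_attributes(attributes)
  attributes.select { |key, _| KEPT_KEYS.include?(key.to_s) }
end

old_sale = { "_id" => "s1", "price" => 10, "user_id" => "u1" }
new_sale_attributes = embeddable_attributes(old_sale)
```

In the migration, the filtered hash would be what gets passed to Sale.new in place of the hand-written :price => ... argument.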
