Converting Rails application data from MySQL to PostgreSQL

November 23, 2009 Pivotal Labs

One of our projects had a pending chore in Tracker to move its backend to PostgreSQL from MySQL. This project has about a quarter of a million rows of production data and around a hundred tables in its schema which needed to be exactly migrated into PostgreSQL.

Forklifting the data proved more complicated than expected due to incompatibilities in the two DBMS’ syntax such as in the way string escaping worked, how booleans were represented and a bunch of other small but painful differences. Despite MySQL’s mysqldump utility including a command-line option to write statements in PostgeSQL format, it became clear that it wasn’t going to be simple to create a repeatable procedure to do this work across our environments.

There’s a bunch of information out there about how to approach this problem but none felt right. Most are multi-step manual procedures that require altering a dump file using sed or perl and others require the data to be loaded into an intermediary database and massaged prior to import. After testing some of these approaches, Todd and I decided to timebox ourselves to an hour to test the viability of a Ruby script using the DBI gem to move the data. We came up with:

require 'dbi'
require 'dbd/mysql'
require 'dbd/pg'

begin
  mysql = DBI.connect("DBI:Mysql:source:localhost", "username", "password")
  postgres = DBI.connect("DBI:Pg:destination:localhost", "username", "password")

  mysql.select_all("SHOW TABLES") do |table|
    next if ['schema_migrations', 'sessions'].include?(table.to_s)
    select = mysql.execute("SELECT * FROM #{table}")
    columns = select.column_names.map { |key| ""#{key}"" }.join(', ')
    placeholders = (['?'] * select.column_names.size).join(', ')
    insert = postgres.prepare("INSERT INTO #{table} (#{columns}) VALUES(#{placeholders})")
    select.each { |row| insert.execute(*row) }
    insert.finish
  end
rescue DBI::DatabaseError => e
  puts "Error #{e.err}: #{e.errstr}"
ensure
  mysql.disconnect if mysql
  postgres.disconnect if postgres
end

Our antiquely Perl-like script worked better than we expected — our application started right up with all of its data intact.

Has anybody out there encountered this need before? What kinds of solutions did you come up with?

About the Author

Biography

More Content by Pivotal Labs
Previous
Thanks Startup Crawlers!
Thanks Startup Crawlers!

Thanks to all who came out for the Startup Crawl last Friday night. It was a great crowd and by all account...

Next
National Lab Day website launches
National Lab Day website launches

President Obama today announced the establishment of an annual National Lab Day, a nationwide initiative to...

Enter curious. Exit smarter.

Learn More