Moving from Subversion to Git

October 16, 2008 Pivotal Labs

We recently moved our project from subversion to git, and so far the move has gone very smoothly. This post details what we did to make the move.

Our setup

For this project there are two pairs working, and we have six machines and one hosted service:

  • Two Mac OSX workstations with IDEA
  • One continuous integration server running Cruise Control
  • A staging server running a 2-year-old version of Ubuntu
  • A subversion server
  • A production server hosted on Engine Yard
  • A Github account with the ability to create private repositories

The goal was to have one pair continue to work while the migration from svn to git was happening.

Getting the right software

Every machine must have git installed for this to work, and at least one workstation must have git svn installed. On Mac OSX we were able to install git reliably via macports with sudo port install git-core.

For the workstation that needed git svn it was a little more difficult. On a fresh install of OSX with macports, you should just be able to run sudo port install git-core +svn, but this requires perl bindings for svn and there are a number of dependencies that might be wrong if you’ve installed subversion from source or have conflicting macports.

On one machine, we were able to get past it by uninstalling and reinstalling subversion before running sudo port install git-core +svn.

On a modern Ubuntu machine you should be able to run sudo apt-get install git-core, but since our Ubuntu machine is a few versions behind we had to install from source from http://git.or.cz/.

We planned on converting our svn externals to piston once we moved to git, so we also installed piston. We followed the instructions from http://technicalpickles.com/posts/piston-and-git-for-the-win to install it locally, since it’s not available as a standard gem.

Cloning the subversion repository

To clone the repository we used git svn. We only had a trunk on our project, so it was a bit easier for us – to start, all we had to do was:

git svn clone svn://our/repository/trunk

If you have branches and tags you may also want to add the --trunk, --branches and --tags flags; see the git svn docs for more info.
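We did not need this ourselves, so the following is only a sketch of what such a clone might look like for a repository with the conventional trunk/branches/tags layout. The directory names and the target directory our-project are assumptions. The script prints the command by default and only runs it when RUN=1 is set.

```shell
# Sketch: clone a Subversion repository that has trunk, branches and tags.
# The URL points at the repository root (not .../trunk); all names are placeholders.
svn_url="svn://our/repository"
target_dir="our-project"   # hypothetical local directory name

# --trunk, --branches and --tags take paths relative to the repository root.
# For this conventional layout, `git svn clone --stdlayout` is a shorthand.
clone_cmd="git svn clone --trunk=trunk --branches=branches --tags=tags $svn_url $target_dir"

echo "$clone_cmd"

# Only touch the network when explicitly asked to.
if [ "${RUN:-0}" = "1" ]; then
  $clone_cmd
fi
```

With these flags, git svn imports branches and tags as remote-tracking refs, which should show up under git branch -a in the same way our single git-svn branch did.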

For us the cloning took almost half an hour. When it was done we had a branch named git-svn (visible only with git branch -a), which we converted to master with:

git checkout -b master git-svn

We noticed when we looked at git log that we were missing several months from the commit history, even though the git svn command exited without error. To grab the rest of the revisions we ran git svn rebase and it finished the import. I’m not sure if this is a documented behavior or a bug, but it was surprising to us even though it was easily fixed.

Attributing checkins to pairs

We used to add the pair’s initials to the beginning of our svn commit comments, but with git there is a more elegant way to do it by changing the name of the pair.

In its simplest form, you can just change the git config user.name every time a new pair sits down. Brian Takita suggested listing all the pairs in the .git/config file, and just uncommenting the right one in the morning, which saves some time.
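As a minimal sketch of the simple form, the following sets a pair name in a throwaway repository and confirms that the commit is attributed to it. The names "Alice & Bob" and the email address are placeholders, not our real pairs.

```shell
set -e

# Work in a scratch repository so nothing real is touched.
repo_dir=$(mktemp -d)
cd "$repo_dir"
git init -q .

# Per-repository settings; these end up in .git/config.
git config user.email "pair@example.com"   # placeholder address
git config user.name "Alice & Bob"         # hypothetical pair

echo "hello" > README
git add README
git commit -q -m "First commit as a pair"

# Show the author name recorded on the latest commit.
author=$(git log -1 --format=%an)
echo "$author"
```

When the next pair sits down, running git config user.name again with the new names is all that changes.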

There are much more complex solutions as well, like the one described at http://www.brynary.com/2008/9/1/setting-the-git-commit-author-to-pair-programmers-names (well worth the read even if you don’t end up using it).

Re-ignoring files

Any files that were ignored in svn are no longer ignored in git. We added several standard entries to our ignore list:

  • log/*
  • IDEA project files
  • .DS_Store

Ignoring the log entries seemed to remove the log directory entirely, since git does not track empty directories. This meant we had to manually create the log directory on our different servers, but ignoring the logs still seemed like the right thing to do.
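One way to avoid recreating the directory by hand, which we did not try at the time, is to keep a placeholder file inside it and exclude that file from the ignore rule. The sketch below assumes a placeholder named log/.gitkeep (a convention, not a git feature) and guesses at the IDEA project file patterns.

```shell
set -e

# Scratch repository with a log file, a placeholder and some Finder debris.
repo_dir=$(mktemp -d)
cd "$repo_dir"
git init -q .
mkdir log
touch log/development.log log/.gitkeep .DS_Store

# Ignore everything in log/ except the placeholder, plus OS and IDE files.
cat > .gitignore <<'EOF'
log/*
!log/.gitkeep
.DS_Store
*.iml
*.ipr
*.iws
EOF

# List the untracked files git would still pick up with `git add`.
untracked=$(git status --porcelain --untracked-files=all)
echo "$untracked"
```

Committing log/.gitkeep means a fresh clone already contains the log directory, while the log files themselves stay out of the repository.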

Grabbing externals

Our git svn clone did not pull in any of our externals. We added these back using piston, although it only worked for a few of the externals. The latest version of piston has support for adding svn repositories to git projects. The syntax is the same as it normally is:

cd your-project
piston import svn://some/svn/repo vendor/plugins/some_plugin

For us this worked about half the time, and the other half it hung endlessly and never finished the svn checkout. We still don’t know why that happened. For the plugins that we couldn’t piston we just svn exported those to our project, and once we figure out how to fix piston we’ll piston those as well.

Brian Takita mentioned that when there is a big pending changelist, piston becomes slower on git. In our case, we tried when there were no pending changes, and it still hung.

Adding git support to IDEA

Since we use IDEA for development, we decided to install a git plugin for IDEA to make it easier to view. We decided on git4idea because the project has been more active on github recently and seemed to have a decent feature set.

git clone git://github.com/markscott/git4idea.git
cd git4idea
cp Git4Idea.jar /Library/Applications/<your IDEA directory>/plugins

Then restart IDEA to make sure it takes effect. We also added the .git directory to the list of ignored modules so that it wouldn’t appear in search results. A few things to note with this plugin:

  • when you commit, it does not push – the git workflow typically involves committing locally and then pushing in two separate steps
  • when you push it will ask you for a target – the default is default/origin or something like that – in our case typing “origin” works
  • the project explorer seems to get out of date easily. We run Version Control > Refresh File status whenever it looks funky (or type “Alt + C, E” if you’ve got the alt keys working)
  • when you commit, there are annoying comments in the commit message textarea – it’s focused by default, so you can easily delete it, but it’s still annoying

We haven’t used the plugin extensively, but so far it looks decent.

When switching IDEA projects, you might find it helpful to copy your old project files into the new project to save some time re-setting your settings.

Pushing to github

Once you are set up, you can push to github. In our case we did the following:

  • added our local machine’s public ssh key to the account’s ssh keys (on github’s main account page)
  • created a new repository and marked it as private
  • followed the github instructions to add the remote repo and push to origin master

We decided to only push the master branch to github, since we wouldn’t be using git svn for more than a few hours as we migrated all of the machines. We didn’t test the continued use of git svn after cloning the repo, so if you need to continue with svn you may need to push the original branch as well, or change your .git/config file to make sure that dcommits still work etc…

Once we had it on github, we renamed our original project directory and did a fresh git clone of the github project, then:

  • updated our .git/config again with our pair names (alternately, you can set the config variables globally, like git config --global user.name 'foo & bar')
  • added the log directory (mkdir log)

Updating CI

To update the CI box, we:

  • Added a deploy key to github
  • Stopped all running cruise processes
  • Renamed our current project to -SVN
  • Updated to the latest version of CCRB
  • Added our new project with the standard cruise command
  • Updated our “scratch pad” checkout from the svn repo to the git repo
  • Rebooted CI to make sure everything worked (you probably don’t need to reboot)

Github provides the ability to create read-only accounts for specific repositories. These users are identified by public keys and they must be unique across github. For us, we had to:

  • Log into our CI box
  • Create a public/private key pair with ssh-keygen
  • Copy the public key to github’s deploy key area in the Project Name > Admin section

Then we updated to the latest version of Cruise Control by going to the CCRB directory and running git pull. If you’ve added Cruise Control by some other method, you’ll have to update to the latest source from http://github.com/thoughtworks/cruisecontrol.rb/tree/master.

In our setup, our projects are in ~/.cruise/projects. Cruise Control loops through every directory in /projects and loads its cruise_config.rb file, so you can have multiple builds running at the same time. After renaming our original project, we added the new project by going to ~/.cruise and typing:

./cruise add your-projectname --repository git@github.com:projectname.git --source-control git

This checked out the repository and added the necessary config file. We then went in and manually created the log directory.

On our CI box we also have a scratch-pad checkout of our code, useful for debugging, located at ~/workspace/project-name. We blew that away and git cloned our repo, then added the log directory.

Once the CI build is green, you can delete the old project that was based on svn.

Git on capistrano

To deploy our app to our staging server, we:

  • Created a deploy key for the staging server and added it to github
  • Deleted all files from the remote cache directory
  • Updated deploy.rb to use git
  • Deployed twice (the second time to ensure that it worked from the remote cache)

We followed the excellent guide here to get our deploy settings correct.

The only stumbling block we found was that we use deploy_via :remote_cache, which stores a checkout of the repository on the server to make deployments faster. Since the remote cache had a subversion checkout, it was necessary to delete all files in the cache before deploying.
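As a rough sketch of that cleanup, the function below removes a cached copy only if it still looks like a subversion checkout. The shared/cached-copy path is an assumption about where the remote cache lives on your server, and the demonstration runs against a throwaway directory rather than a real deploy target.

```shell
set -e

# Remove a cached copy only if it is still a Subversion checkout.
clear_svn_cache() {
  cache_dir="$1"
  if [ -d "$cache_dir/.svn" ]; then
    rm -rf "$cache_dir"
    echo "removed $cache_dir"
  else
    echo "left $cache_dir alone"
  fi
}

# Demonstration: a temp directory that mimics the server layout.
deploy_to=$(mktemp -d)
mkdir -p "$deploy_to/shared/cached-copy/.svn"

result=$(clear_svn_cache "$deploy_to/shared/cached-copy")
echo "$result"
```

Checking for the .svn directory makes the function safe to run again after the first git deploy has repopulated the cache.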

After the cache was cleared and deploy.rb was updated we deployed twice, both times without incident.

Deploying git on Engine Yard

We haven’t yet had the opportunity to deploy from git to EY, but given how easy it was to deploy to our staging server we anticipate that deploying to EY will be easy as well.

Josh Susser pointed out that because Github itself is hosted on Engine Yard, git deploys from git repos on Github to slices on Engine Yard servers are blazingly fast.

Managing the transition

While one pair was updating these servers, the other pair was checking into the existing subversion repository. Once all the workstations were set up, CI was up and deployments were working, we ran a final git svn rebase and pushed to github. It looked something like this:

cd svn-project-dir
svn commit -m "made some well tested changes"
cd git-project-dir
git pull        # => gets all of the changes that the other pair made, i.e. pistoned directories etc...
git svn rebase  # => gets latest changes from svn
git push        # => sends changes to github

So by the end of the day, there was almost no interruption for one pair.

Timing

All in all it took about a day for a single pair, but if we could do it again knowing what we know now we could probably do it in about half that time.
