Adventures in Subversion: Repository Migration

I recently wanted to migrate our Subversion repositories at work from a local machine to a remote file server, mostly because our new remote systems run in a cloud-based environment powered by VMware. Along with our normal backups, this would give us high availability and a greatly reduced risk of data loss.
The first step was to figure out how to move from one server to the other. This is a fairly simple task, made a little more tedious by the fact that we have a collection of multiple repositories. I decided to use a dump/load procedure to move from the local machine to the remote host. To make it as painless as possible, I wrote a bash script that dumps the entire collection into a single directory, which can then be archived and copied across the WAN. The following is a copy of the backup script used:

#!/bin/bash
# Subversion backup script

SRC=/var/www/svn/projects
STAMP=$(date +"%m-%d-%Y")      # computed once so every dump shares the same date
DEST=./BAK_$STAMP

mkdir -p "$DEST"               # -p: no error if the directory already exists

for path in "$SRC"/*/; do
    dir=$(basename "$path")
    echo "Creating SVN dumpfile ($dir)"
    svnadmin dump "$SRC/$dir" | gzip > "$DEST/svn.$dir-backup-$STAMP.gz"
    echo "Backup Complete ($dir)"
done

Essentially, this script loops through the repositories in the parent folder and creates a gzipped dump of each one in a folder named with the current date.
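Before copying the archives across the WAN, it may be worth confirming that each one is a readable gzip file. The following is a minimal sketch and not part of the original scripts: the function name verify_dumps is hypothetical, and it only checks gzip integrity with gzip -t, not whether the Subversion dump inside is complete.

```bash
#!/bin/bash
# Hypothetical helper: test every .gz archive in a backup directory.
# Returns non-zero if at least one archive fails the gzip integrity check.
verify_dumps() {
    local dir=$1 status=0 f
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue          # the glob matched nothing
        if gzip -t "$f" 2>/dev/null; then
            echo "OK      $f"
        else
            echo "CORRUPT $f"
            status=1
        fi
    done
    return "$status"
}
```

It would be called with the backup folder as its only argument, for example verify_dumps ./BAK_<date>.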
After creating the dump files, it was a matter of copying them to the remote server and reloading them. To do this efficiently, another bash script was required. This one takes the dump files from the backup directory, gunzips them, and loads them into the new location. Since this was a new installation of Subversion, the repositories needed to be created before the load. I used some string manipulation to get the original repository name from each backup file name, create the repo, and then load the individual dump file. There may be an easier way to do this, but a bash script did the job just as well. The following is the repository restore script:

#!/bin/bash
# Subversion restore script
echo -n "Provide the backup directory path to restore from: "
read -e -r DIRECTORY

for path in "$DIRECTORY"/*.gz; do
   f=$(basename "$path")
   file=${f%-backup*}      # strip the trailing "-backup-<date>.gz"
   repo=${file:4}          # strip the leading "svn."
   echo "Creating repo for ($repo)"
   svnadmin create "/var/www/svn/projects/$repo"
   echo "Loading repo ($repo)"
   gunzip -c "$path" | svnadmin load --force-uuid "/var/www/svn/projects/$repo"
   echo "Load Complete ($repo)"
done
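
The string manipulation in the restore script can be tried in isolation. This is a small sketch that wraps the same two parameter expansions in a function; the function name and the sample file names are made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: derive the repository name from a dump file name
# of the form svn.<repo>-backup-<date>.gz, as produced by the backup script.
repo_name_from_dump() {
    local f=$1
    local file=${f%-backup*}   # remove the shortest suffix that starts with "-backup"
    echo "${file:4}"           # drop the first four characters ("svn.")
}

repo_name_from_dump "svn.myproject-backup-01-31-2024.gz"        # prints "myproject"
repo_name_from_dump "svn.my-backup-tool-backup-01-31-2024.gz"   # prints "my-backup-tool"
```

Because % removes the shortest matching suffix, a repository whose own name contains "-backup" still comes out intact.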

OK, now all the repositories are loaded. Next I wanted to make sure that I could keep them in sync. This requires creating some hooks. Again I turned to bash scripting, which is easier than touching each individual repository by hand.
The first thing I ran into was that svnsync needs an executable hook called pre-revprop-change that exits successfully before it will work. The hook doesn't have to do anything else, so I created it on the fly. I wrote a quick bash script to create this file on both servers so I could sync back and forth. Running the script a second time removes the hook again:

#!/bin/bash
# pre-revprop-change script used for syncs
# Run once to add the hook to every repository; run again to remove it.
for path in /var/www/svn/projects/*/; do
    file=${path}hooks/pre-revprop-change
    if [ ! -e "$file" ]
       then
           # Minimal hook: permit every revision property change
           printf '#!/bin/bash\nexit 0\n' > "$file"
           echo "Adding hook owner and permissions ($file)"
           chown apache:apache "$file"
           chmod +x "$file"
       else
           echo "Removing hook ($file)"
           rm -f "$file"
     fi
done
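
For a single repository, the same hook can be installed without the add/remove toggle. The sketch below is one possible variant, not the script I actually ran: the function name install_revprop_hook is hypothetical, it overwrites any existing hook, and it leaves file ownership alone (the chown to apache would still be needed on a server like ours).

```bash
#!/bin/bash
# Hypothetical helper: write a permissive pre-revprop-change hook into one repository.
# $1 is the repository path; its hooks/ directory must already exist.
install_revprop_hook() {
    local hook="$1/hooks/pre-revprop-change"
    printf '#!/bin/sh\nexit 0\n' > "$hook"   # exit 0 allows every revision property change
    chmod +x "$hook"                         # Subversion only runs executable hooks
}
```

Calling it as install_revprop_hook /var/www/svn/projects/<repo> is safe to repeat, since the result is the same every time.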

Once the pre-revprop-change file was in place on both sides, I could call svnsync on a repo-by-repo basis, or just write another bash script to do it for me. I chose the latter, but it required some setup to avoid being prompted for passwords when connecting to the remote host. The easiest thing to do was to create an SSH key on both sides and copy the public key to the remote Subversion host.

If you already have an SSH key, you can skip this step. Otherwise, just hit Enter at the key file prompt and at both passphrase prompts:

$ ssh-keygen -t rsa -b 2048
Generating public/private rsa key pair.
Enter file in which to save the key (/home/username/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/username/.ssh/id_rsa.
Your public key has been saved in /home/username/.ssh/id_rsa.pub.

Copy your public key to the target server:

$ ssh-copy-id id@server
id@server's password:
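
If this setup needs to be repeated on several machines, the key generation can be made non-interactive. The following is a rough sketch under a few assumptions: the function name ensure_ssh_key is hypothetical, the default key path is the one shown in the prompt above, and an empty passphrase is used to match hitting Enter at both passphrase prompts.

```bash
#!/bin/bash
# Hypothetical helper: create an RSA key pair only if one is not already present.
# $1 is the private key path (defaults to ~/.ssh/id_rsa).
ensure_ssh_key() {
    local key=${1:-$HOME/.ssh/id_rsa}
    if [ -f "$key" ]; then
        echo "Key already exists: $key"
    else
        # -N "" sets an empty passphrase, -f names the output file
        ssh-keygen -t rsa -b 2048 -N "" -f "$key"
    fi
}
```

After that, ssh-copy-id still has to be run once per target server, since it asks for the remote password.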

For the command-line sync process I just used root to connect, but as I understood it, each side needs an svn user to perform the sync. I created a user called svnsync on both ends to perform any actions needed, following the normal procedure for adding a user:

htpasswd -m /etc/subversion/users svnsync

Once this was done, I wrote my syncing bash script to loop through the repositories. It first initializes each repository for syncing and then performs a synchronization. Since I was doing this on a live Subversion server, I would likely have to run it several times until I knew everyone had relocated their projects.

#!/bin/bash
# Remote sync script
SERVER=10.160.10.100
for path in /var/www/svn/projects/*/; do
    dir=$(basename "$path")
    echo "Synchronizing $dir to BTI repository"
    # svnsync takes the destination (mirror) URL first, then the source URL
    svnsync init --sync-username svnsync --sync-password SVNPASS "svn+ssh://$SERVER/var/www/svn/projects/$dir" \
     "file:///var/www/svn/projects/$dir" --allow-non-empty
    svnsync synchronize --sync-username svnsync --sync-password SVNPASS "svn+ssh://$SERVER/var/www/svn/projects/$dir" \
     "file:///var/www/svn/projects/$dir"
done
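
One way to tell whether another sync pass is needed is to compare the youngest revision on each side. This is only a sketch, not something from my original setup: the function name check_sync is hypothetical, and it assumes svnlook is installed on both machines and that the SSH key login described earlier works for the connecting user.

```bash
#!/bin/bash
# Sketch: compare the youngest revision of a repository on both servers.
SERVER=10.160.10.100            # same host as in the sync script
BASE=/var/www/svn/projects

# Hypothetical helper: $1 is the repository name under $BASE.
check_sync() {
    local repo=$1 local_rev remote_rev
    local_rev=$(svnlook youngest "$BASE/$repo")
    remote_rev=$(ssh "$SERVER" svnlook youngest "$BASE/$repo")
    if [ "$local_rev" = "$remote_rev" ]; then
        echo "$repo: in sync at r$local_rev"
    else
        echo "$repo: local r$local_rev, remote r$remote_rev"
        return 1
    fi
}
```

Running check_sync for each repository after the sync script gives a quick list of the ones that still have new commits on the old server.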

Everything went rather smoothly and only took an hour or two to complete. This might not be the optimal way to perform a migration, but I felt it was straightforward and got the job done with minimal effort. I began working on a mirrored repository, making the local server a mirror of the remote one, but ran into issues performing a post-commit from the remote box. I’ll keep tinkering with the mirroring, but for now at least our versioning system is in a safer environment.