Rsync Backup System

Last February I started looking at using my old PowerMac G4 (dual 500MHz, gigabit ethernet model) as a live backup server since I’d moved to the PowerBook full time for development. Nine months later, I’ve finally done it! I know several people who have recently had drives fail on them; it was time to move beyond a weekly CD backup of the essentials. It’s hard to automate backups to CDs and DVDs, but automating a backup to another server or hard drive is quite doable.

Here is the set-up:

  • Two 200GB external FireWire drives hooked up to the PowerMac
  • Shell script that rsyncs my PowerBook and web server to each drive on alternating nights
  • CRON job on the PowerMac to run the shell script
  • Energy Saver setting on the PowerMac and PowerBook to wake in the early morning to run everything
  • PowerMac set up headless, with VNC access from the PowerBook

Now that it’s all set up it looks pretty simple, but it took a little doing. Let’s start at the top. To do the backups, I needed some disk space.

I’m still doing archival/snapshot backups on CDs and DVDs, but it would be really nice to have everything backed up. I decided I wanted to mirror my entire PowerBook hard drive and my server (including my IMAP e-mail) onto two drives. Everything would go to one backup hard drive or the other on alternating days (even days go to one disk, odd days to the other) just in case something bad happened to a source (PowerBook or server) and I didn’t catch it before it updated that day’s backup.

I found a good deal ($89 after rebates) on 200GB Western Digital drives at Circuit City, so I decided to go pick up a pair of them. Unfortunately, it wasn’t that simple. There was a limit of one rebate per household, so the sales guy recommended he ring the drives up separately and I could have one sent to my PO box and one to my home. Not ideal, but not terrible.

I then went next door to CompUSA to see if they had any decent FireWire enclosures. When I walked in the door, I saw a display of 200GB Seagate Barracuda drives for $79.99 (after rebate, with no rebate restrictions – I asked). I’ve got more faith in Seagate drives than I do in Western Digital drives (plus there were no hassles with the rebate), so I decided to get a pair of these instead. They had a FireWire enclosure there that didn’t look too bad; I got one of those too so I’d be ready to rock ‘n roll. Then I went back to Circuit City and returned the other drives.

My intention was to pull the 60GB drive currently in the second drive bay in the PowerMac and replace it with one of the new 200GB drives, then put the other drive in the FireWire enclosure. Having both in FireWire enclosures would be kind of nice, but having one on the internal bus is kind of nice too. I put one of the 200GB drives into the FireWire enclosure and connected it to the PowerMac – bingo! It mounted right up and I partitioned the drive with Disk Utility.

Next, I disconnected all the cables and pulled the PowerMac out from under the desk to put in the internal drive. I opened it up [1], pulled the 60GB drive and put the 200GB drive in [2], closed up the machine, reconnected all the cables and booted it up.

This is where I first got unexpected results. Disk Utility would only format the disk as 128GB. Hmm. OK, I guess my PowerMac is so old that it only supports up to 128GB on the internal bus. Bugger. Back to the store to get another FireWire enclosure. From here, I ran into other problems, and finally gave up (CompUSA took the enclosures back) and ordered a pair of new enclosures [3].

That took care of the disk capacity issue. The Bytecc enclosures are much nicer [6] and were a little cheaper including shipping. They worked like a charm the first time and even stack more nicely. As Scott said: “We solve those problems… y’know, with money.”

The next step was to set up the rsync commands to transfer the data. Rsync is great because it just works: run over SSH with compression turned on, it transfers the data compressed over a secure connection, and it only transfers the data that has changed since the last sync (incremental backup).

The first step to setting up a scheduled rsync is setting up your SSH keys so the machines can connect without you needing to enter your password first. This article walks you through it nicely.
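In outline, the key setup usually looks something like the sketch below, run on the machine that starts the backups (the PowerMac here). This is a minimal sketch rather than the exact steps from the article: the function names `make_key` and `install_key`, the `KEY` variable and the `user@host` argument are placeholders for illustration.

```shell
# Sketch only: make_key, install_key and KEY are hypothetical names.
KEY="${KEY:-$HOME/.ssh/id_rsa}"

# Create a key pair with an empty passphrase so cron can use it unattended.
# Does nothing if the key file already exists.
make_key() {
    mkdir -p "$(dirname "$KEY")"
    [ -f "$KEY" ] || ssh-keygen -t rsa -N "" -f "$KEY"
}

# Append the public key to the remote account's authorized_keys file.
# $1 is the remote login, e.g. user@host (you are asked for the password once).
install_key() {
    cat "$KEY.pub" | ssh "$1" 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'
}
```

After running `make_key` once and `install_key user@host` for each source machine, `ssh user@host` should log in without a password prompt. A key with no passphrase is only as safe as the machine it lives on, so keep that in mind.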

I’d read several positive things about RsyncX so I downloaded it to give it a go. It’s basically a GUI front end for a Mac OS X specific version of rsync – sounds perfect, right? Perhaps this is a great tool if you already know the ins and outs of rsync, but I couldn’t make heads or tails of the interface.

Frustrated by the lack of included docs with RsyncX, I did a search that turned up the very thing I was looking for. Unfortunately, it was little help. I wrestled with a number of tutorials and the output from RsyncX for several hours before deciding to throw in the towel.

The cavalry was coming into town this week anyway, so I decided to put things on hold for a couple days. Once Scott arrived, we went straight to a plain rsync solution (dropping RsyncX) and had it working within an hour!

Command line rsync isn’t hard when you have example syntax to work from. Here are the commands I use to back up a directory on the web server:

rsync -rtlzv --delete --ignore-errors --exclude dir_name -e ssh "user@server:/path/to/dir/" "/Volumes/$disk/path/to/dir"

and the entire PowerBook [4]:

rsync -rtlzv --delete --ignore-errors --exclude Network -e ssh "root@Computer-Name.local:/Volumes/Drive\ Name/*" "/Volumes/$disk/path/to/dir"
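To make the flags in the commands above easier to read, here is a rough sketch of the same option set wrapped in a small shell function, with each option annotated. The function name `backup_one` and the `RSYNC` override variable are placeholders for illustration, not part of the actual script.

```shell
# Sketch only: backup_one and RSYNC are hypothetical names.
# Point RSYNC at another command (e.g. a logging stub) to preview calls.
RSYNC="${RSYNC:-rsync}"

# Mirror one source (a local path or user@host:path) to a destination directory.
backup_one() {
    src="$1"
    dest="$2"
    # -r  recurse into directories
    # -t  preserve modification times
    # -l  copy symlinks as symlinks
    # -z  compress data during the transfer
    # -v  verbose output
    # --delete         remove destination files that no longer exist in the source
    # --ignore-errors  keep deleting even if there were I/O errors
    # -e ssh           use ssh as the remote shell
    "$RSYNC" -rtlzv --delete --ignore-errors -e ssh "$src" "$dest"
}
```

A call would then look like `backup_one "user@server:/path/to/dir/" "/Volumes/$disk/path/to/dir"`. Since `--delete` removes files from the backup, it can be worth adding rsync's `-n` (dry run) flag the first time to see what would change without touching anything.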

Then I created a shell script (bash) with each of the rsync commands [5] that I wanted to run. I also added a few lines to set the $disk variable in the commands above to the name of the drive I want to back up to that day:

# day of the year (001-366)
i=`date +%j`
# 0 on even days, 1 on odd days
j=`expr $i % 2`
k=`expr $j + 1`
disk="Drive $k"
if [ -e "/Volumes/$disk" ]
then
    echo "backing up to $disk"
    # rsync commands go here
else
    echo "could not find backup device: $disk"
fi

That code will alternate between ‘Drive 1’ and ‘Drive 2’. For testing, you can hard code the $disk value to a particular drive:

disk="Drive 1"
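If you want to check the alternating logic without waiting for tomorrow, one option is to factor it into small functions that take the day as an argument. This is just a sketch under that assumption: `pick_disk`, `disk_is_mounted` and `VOLUMES_ROOT` are made-up names, not part of the script described here.

```shell
# Sketch only: pick_disk, disk_is_mounted and VOLUMES_ROOT are hypothetical names.

# Map a day of the year (as printed by `date +%j`, e.g. 009) to a drive name.
pick_disk() {
    # odd days leave remainder 1 -> "Drive 2"; even days leave 0 -> "Drive 1"
    echo "Drive $(expr "$1" % 2 + 1)"
}

# Succeed only if the named drive is present under the volumes directory.
disk_is_mounted() {
    [ -e "${VOLUMES_ROOT:-/Volumes}/$1" ]
}
```

In the real script this would be used as `disk=$(pick_disk "$(date +%j)")` followed by `if disk_is_mounted "$disk"`. One quirk of keying off the day of the year: in a non-leap year, day 365 and day 001 are both odd, so the same drive gets two nights in a row at New Year.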

Then set up a CRON job to run the shell script. Unlike RsyncX, CronniX is a nice, simple interface for setting up CRON jobs.
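If you would rather edit the crontab by hand (with `crontab -e`), an entry along these lines should do the same job. The 3:00 am schedule, the script path and the log path are all placeholders, not the values actually used here.

```
# minute hour day-of-month month day-of-week  command
0 3 * * * /path/to/backup.sh >> /path/to/backup.log 2>&1
```

Redirecting the output to a log file keeps the "backing up to…" and "could not find backup device" messages around, so you can check in the morning which drive was used.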

Now I don’t want the PowerMac running all the time, so I went into Energy Saver in System Preferences and told it to wake up at a specific time, 2 minutes before the CRON job is scheduled to run. I did the same thing on the PowerBook. Since both machines are set to sync with the network time server, they should be waking up at just about the same time.

Now there was just one more trick. I only have one monitor which I normally hook up to my PowerBook. I wanted to use VNC to run the PowerMac from the PowerBook when needed, but when I was looking into setting this up at the beginning of the year, I’d read a number of things that all said that Mac OS X wouldn’t run headless. The best solution seemed to be to use a monitor adapter to trick the Mac into thinking there was a display attached. Yuck, not very elegant.

OS X Server is supposed to run headless, so I was looking into getting that installed when I found out that (at least with Mac OS X 10.3.6) a Mac will run like a champ in a headless configuration. I installed OSXvnc on the PowerMac and Chicken of the VNC on the PowerBook – gravy! If I need to wake up the PowerMac during the day, I just plug the mouse into the USB port and it fires right up.

I’m quite pleased and much less likely to lose all of my digital photos, source code, documents, music, databases, etc.

  1. I still like the way Apple designed the old PowerMacs with the logic board on the side that folds down.
  2. Remember to set the jumpers.
  3. These were recommended (and already tested and in use) by Adam, who got them on recommendation from Eric.
  4. To back up everything on the PowerBook, I needed to enable the root user on the PowerBook and connect as root.
  5. Duplicate the server line and change the paths for each directory you need to back up.
  6. The casing is 1/8" thick aluminum, and they stack beautifully.