Last February I started looking at using my old PowerMac G4 (dual 500MHz, gigabit ethernet model) as a live backup server since I’d moved to the PowerBook full time for development. Nine months later, I’ve finally done it! I know several people who have recently had drives fail on them; it was time to move beyond a weekly CD backup of the essentials. It’s hard to automate backups to CDs and DVDs, but automating a backup to another server or hard drive is quite doable.
Here is the set-up:
- Two 200GB external FireWire drives hooked up to the PowerMac
- Shell script that rsyncs my PowerBook and web server to each drive on alternating nights
- CRON job on the PowerMac to run the shell script
- Energy Saver setting on the PowerMac and PowerBook to wake both machines in the early morning to run everything
- PowerMac set up headless, with VNC access from the PowerBook
Now that it’s all set up it looks pretty simple, but it took a little doing. Let’s start at the top. To do the backups, I needed some disk space.
I’m still doing archival/snapshot backups on CDs and DVDs, but it would be really nice to have everything backed up. I decided I wanted to mirror my entire PowerBook hard drive and my server (including my IMAP e-mail) onto two drives. Everything would go to one backup hard drive or the other on alternating days (even days go to one disk, odd days to the other) just in case something bad happened to a source (PowerBook or server) and I didn’t catch it before it updated that day’s backup.
I found a good deal ($89 after rebates) on 200GB Western Digital drives at Circuit City, so I decided to go pick up a pair of them. Unfortunately, it wasn’t that simple. There was a limit of one rebate per household, so the sales guy recommended he ring the drives up separately and I could have one sent to my PO box and one to my home. Not ideal, but not terrible.
I then went next door to CompUSA to see if they had any decent FireWire enclosures. When I walked in the door, I saw a display of 200GB Seagate Barracuda drives for $79.99 (after rebate, with no rebate restrictions – I asked). I’ve got more faith in Seagate drives than I do in Western Digital drives (plus there were no hassles with the rebate), so I decided to get a pair of these instead. They had a FireWire enclosure there that didn’t look too bad; I got one of those too so I’d be ready to rock ‘n roll. Then I went back to Circuit City and returned the other drives.
My intention was to pull the 60GB drive currently in the second drive bay in the PowerMac and replace it with one of the new 200GB drives, then put the other drive in the FireWire enclosure. Having both in FireWire enclosures would be kind of nice, but having one on the internal bus is kind of nice too. I put one of the 200GB drives into the FireWire enclosure and connected it to the PowerMac – bingo! It mounted right up and I partitioned the drive with Disk Utility.
Next, I disconnected all the cables and pulled the PowerMac out from under the desk to put in the internal drive. I opened it up [1], pulled the 60GB drive and put the 200GB drive in [2], closed up the machine, reconnected all the cables and booted it up.
This is where I first got unexpected results. Disk Utility would only format the disk as 128GB. Hmm. Ok, I guess my PowerMac is so old that it only supports up to 128GB on the internal bus. Bugger. Back to the store to get another FireWire enclosure. From here, I ran into other problems, and finally gave up (CompUSA took the enclosures back) and ordered a pair of new enclosures [3].
That took care of the disk capacity issue. The Bytecc enclosures are much nicer [6] and were a little cheaper including shipping. They worked like a charm the first time and even stack more nicely. As Scott said: “We solve those problems… y’know, with money.”
The next step was to set up the rsync commands to transfer the data. Rsync is great because it just works: it transfers the data compressed over a secure connection, and it transfers only the data that has changed since the last sync (incremental backup).
The first step to setting up a scheduled rsync is setting up your SSH keys so the machines can connect without you needing to enter your password first. This article walks you through it nicely.
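As a rough sketch of what that key setup usually involves (this is a generic OpenSSH recipe, not necessarily the steps in the linked article), the two functions below generate a key pair with an empty passphrase and append the public half to the remote account's authorized_keys file. The function names, the key path and the host are made up for illustration, and an empty passphrase is a convenience trade-off: anyone who gets the private key file can log in.

```shell
# Generate a passphrase-less RSA key pair at KEYFILE.
# The public half is written to KEYFILE.pub.
make_backup_key() {
  ssh-keygen -q -t rsa -N "" -f "$1"
}

# Append the public key to the remote account's authorized_keys.
# REMOTE is user@host; you are asked for the password this one time.
install_backup_key() {
  cat "$1.pub" | ssh "$2" 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'
}

# Example (hypothetical key path and host):
#   make_backup_key "$HOME/.ssh/id_backup"
#   install_backup_key "$HOME/.ssh/id_backup" user@example.com
```

If the key lives somewhere other than the default location, rsync can be pointed at it with -e "ssh -i /path/to/key".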
I’d read several positive things about RsyncX so I downloaded it to give it a go. It’s basically a GUI front end for a Mac OS X specific version of rsync – sounds perfect, right? Perhaps this is a great tool if you already know the ins and outs of rsync, but I couldn’t make heads or tails of the interface.
Frustrated by the lack of included docs with RsyncX, I ran a search that turned up the very thing I was looking for. Unfortunately, it was little help. I wrestled with a number of tutorials and the output from RsyncX for several hours before deciding to throw in the towel.
The cavalry was coming into town this week anyway, so I decided to put things on hold for a couple days. With Scott, we went straight to an rsync solution (dropping RsyncX) and had it working within an hour!
Command line rsync isn’t hard when you have example syntax to work from. Here are the commands I use to back up a directory on the web server:
rsync -rtlzv --delete --ignore-errors --exclude dir_name -e ssh email@example.com:/path/to/dir/ "/Volumes/$disk/path/to/dir"
and the entire PowerBook [4]:
rsync -rtlzv --delete --ignore-errors --exclude Network -e ssh "root@Computer-Name.local:/Volumes/Drive\ Name/*" "/Volumes/$disk/path/to/dir"
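Because --delete removes files from the backup that no longer exist on the source, it can be worth trying the flags on a pair of local scratch directories first. The sketch below is an illustration, not part of the original script: the function names are made up, and it drops the -z and -e ssh options since nothing crosses the network. The -n flag is rsync's dry run, which lists changes without making them.

```shell
# Mirror SRC into DST: copy new or changed files and delete anything
# in DST that is no longer present in SRC.
mirror_dir() {
  rsync -rtl --delete "$1/" "$2"
}

# Same flags plus -v and -n: only list what would be copied or deleted.
preview_mirror() {
  rsync -rtlvn --delete "$1/" "$2"
}

# Example (hypothetical paths):
#   preview_mirror /tmp/source /tmp/backup
#   mirror_dir /tmp/source /tmp/backup
```

The trailing slash added to the source means "the contents of this directory", which matches the trailing slash in the server command above.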
Then I created a shell script (bash) with each of the rsync commands [5] that I wanted to run. I also added a few lines to set the $disk variable in the commands above to the name of the drive I want to back up to that day:
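i=`date +%d`        # assumed: day of the month (this line is missing from the listing)
j=`expr $i % 2`
k=`expr $j + 1`
disk="Drive $k"     # assumed from the drive names 'Drive 1' and 'Drive 2'
if [ -e "/Volumes/$disk" ]; then
  echo "backing up to $disk"
else
  echo "could not find backup device: $disk"
  exit 1            # assumed: stop rather than rsync to a missing volume
fi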
That code will alternate between ‘Drive 1’ and ‘Drive 2’. For testing, you can hard code the $disk value to a particular drive, for example disk="Drive 1".
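One way to make the alternation easier to test without waiting for the calendar is to wrap it in a small function that takes the day as an argument. This is a sketch rather than the script I actually run: the function name pick_disk is made up, and it assumes the drives are called ‘Drive 1’ and ‘Drive 2’ as described above.

```shell
# Print the backup drive name for a given day of the month.
# Even days map to "Drive 1" and odd days to "Drive 2".
pick_disk() {
  day=${1#0}                       # strip one leading zero ("08" becomes "8")
  echo "Drive $(( day % 2 + 1 ))"
}

# Normal use:   disk=$(pick_disk "$(date +%d)")
# For testing:  disk=$(pick_disk 7)    # pretend it is the 7th
```

Stripping the leading zero keeps shell arithmetic from reading days like 08 and 09 as invalid octal numbers.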
Then set up a CRON job to run the shell script. Unlike RsyncX, CronniX is a nice, simple interface for setting up CRON jobs.
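If you would rather skip the GUI, the same job can be added as a single crontab line. The helper below only prints such a line; the function name, the script path and the log path are placeholders, not the values I use.

```shell
# Print a crontab entry that runs SCRIPT every day at HOUR:MINUTE and
# appends everything it prints (stdout and stderr) to LOG.
# Usage: cron_entry HOUR MINUTE SCRIPT LOG
cron_entry() {
  printf '%s %s * * * %s >> %s 2>&1\n' "$2" "$1" "$3" "$4"
}

# Example (hypothetical paths), a 3:02 AM run:
#   cron_entry 3 2 /path/to/backup.sh /path/to/backup.log
# To install it alongside any existing entries:
#   (crontab -l 2>/dev/null; cron_entry 3 2 /path/to/backup.sh /path/to/backup.log) | crontab -
```

The five leading fields of a crontab line are minute, hour, day of month, month and day of week, which is why the minute is printed before the hour.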
Now I don’t want the PowerMac running all the time, so I went into Energy Saver in System Preferences and told it to wake up at a specific time, 2 minutes before the CRON job is scheduled to run. I did the same thing on the PowerBook. Since both machines are set to sync with the network time server, they should be waking up at just about the same time.
Now there was just one more trick. I only have one monitor which I normally hook up to my PowerBook. I wanted to use VNC to run the PowerMac from the PowerBook when needed, but when I was looking into setting this up at the beginning of the year, I’d read a number of things that all said that Mac OS X wouldn’t run headless. The best solution seemed to be to use a monitor adapter to trick the Mac into thinking there was a display attached. Yuck, not very elegant.
OS X Server is supposed to run headless, so I was looking into getting that installed when I found out that (at least with Mac OS X 10.3.6) a Mac will run like a champ in a headless configuration. I installed OSXvnc on the PowerMac and Chicken of the VNC on the PowerBook – gravy! If I need to wake up the PowerMac during the day, I just plug the mouse into the USB port and it fires right up.
I’m quite pleased and much less likely to lose all of my digital photos, source code, documents, music, databases, etc.
1. I still like the way Apple designed the old PowerMacs with the logic board on the side that folds down.
2. Remember to set the jumpers.
3. These were recommended (and already tested and in use) by Adam who got them on recommendation from Eric.
4. To back up everything on the PowerBook, I needed to enable the root user on the PowerBook and connect as root.
5. Duplicate the server line and change the paths for each directory you need to back up.
6. 1/8th” thick aluminum casing and they stack beautifully.
I have a configuration similar to yours: using my PowerBook to back up to my G4/400 using RsyncX nightly. One experience I’ve had is that because rsync runs autonomously in the background, it is possible that something conks out without you being aware of it until too late. I had a situation where I performed a system update on my server and it overwrote RsyncX, so my PowerBook’s RsyncX was… well… not rsyncing. I caught it about 3 weeks after it happened, and thankfully it was just a check I was doing, not a need I had. But it underscored the importance of having the system alert you when something doesn’t work.
I still have to work on a more elegant solution, but basically I log all my rsyncs to an rsync log file that I can view in Console. The rsync command looks something like this:
date >> ~/Library/Logs/rsync.log; time /usr/local/bin/rsync -av -z --eahfs --exclude-from="/~/backupExclude.txt" -e ssh "~/Documents" "~/Pictures/iPhoto Library" "~/Library" "user@serverIP:path/to/backup" >> ~/Library/Logs/rsync.log
The --exclude-from= switch points to a text file containing patterns not to back up (e.g. cache files, iTunes stuff, etc.).
I still want it to email me on failure or something… but I will fiddle around with it.
Also — I understand what you said about the rsyncx front end… why bother — the command line with the man pages is much less confusing.
The other thing I like about RsyncX is that it does this through ssh. This means that you can back up your PowerBook from anywhere in the world, so long as you have the domain name/IP of your server (and ssh is turned on).
Scott, I just send the output of my shell script to a log file – same deal basically.
Regular rsync uses SSH too; the only difference (as far as I can tell) between RsyncX and rsync is that RsyncX preserves Mac OS 9 file metadata. Plus normal rsync is standard on the OS so you don’t need to worry about it being overwritten. 😉
I think you can do something like this to have it mail you the log file:
/usr/bin/uuencode /path/to/filename.log filename.log > /path/to/filename.log.uu
/usr/bin/mail -s "CRON log" firstname.lastname@example.org < /path/to/filename.log.uu
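To mail the log only when something goes wrong, one option is a small wrapper that records the exit status of the backup command and sends the log if it is non-zero. This is an untested-in-production sketch: the function name run_and_report and the MAILER variable are made up for illustration, and it assumes a working local mail command.

```shell
# Command used to send mail; override MAILER to substitute another one.
MAILER=${MAILER:-mail}

# Run a command, append its output to LOG, and mail LOG to ADDRESS
# if the command exits with a non-zero status.
# Usage: run_and_report LOG ADDRESS COMMAND [ARGS...]
run_and_report() {
  log=$1
  address=$2
  shift 2
  date >> "$log"
  status=0
  "$@" >> "$log" 2>&1 || status=$?
  if [ "$status" -ne 0 ]; then
    echo "exit status: $status" >> "$log"
    "$MAILER" -s "Backup failed" "$address" < "$log"
  fi
  return "$status"
}

# Example (hypothetical paths and address):
#   run_and_report /path/to/filename.log firstname.lastname@example.org /path/to/backup.sh
```

Note that this only catches failures the command reports through its exit status; a job that never starts at all would still go unnoticed.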
Have you actually tried to boot from one of your backups?
One crucial difference between rsync and RsyncX is that rsync is _not_ aware of the differences between the HFS+ filesystem and the typical unix filesystems (UFS, ext2, etc.). RsyncX is, although it has other problems. As you mentioned, rsync therefore doesn’t copy some of the metadata which Mac users expect. More importantly, however, it does not copy the resource fork of any file, only the data fork.
The last time I checked, OS X would not boot from a disc whose files had lost all resource forks. This may have changed, and Apple is definitely moving away from them. But this doesn’t solve the problem of old files. For example, in older Microsoft Word documents created on the Mac, the document was entirely stored in the resource fork and it was the data fork that was superfluous. Turbotax apparently still depends on it. (see this link)
There are two “solutions” that I know of with which one can retain both forks and use rsync for backing up. One is to use the disk image facilities to create an image, which does not require a resource fork. The other is rather ugly: back up to an SMB server (which puts the resource fork data in files that are named dot, underscore, the normal filename) and then rsync _that_ file tree somewhere else. The disk images can be restored to a bootable volume, but the SMB server method does not easily result in a bootable volume; it merely preserves all of the data.
One of my hopes for Tiger is that the rumors pan out that Apple will support a version of rsync that transparently handles multiple-fork files.
I don’t need (or ever intended) to boot from the backup… I just need to backup all the data files.
Use the command-line version of rsyncX. It’ll sync the resource forks and finder info, but it’s the exact same command line interface. It’s easy enough to use, and you don’t have to use the rsyncX GUI. Otherwise, you’ll be very surprised when a backed-up application or data file doesn’t work when you really need it to…
And you don’t have to learn anything new, just use the path to rsyncX’s version of rsync.
Since the current version (still installed) of RsyncX doesn’t clash with the standard rsync anymore, I’ll give that a try.
Honestly, I can’t think of any document types that I really care about that would be affected… text files and JPEGs for the most part. Perhaps I’m forgetting something important?
Now is there a way for rsync to zip the files before moving them over?
The -z option compresses the files before transferring them.
I may already be running RsyncX because I ran the installer before getting everything working and I think the installer moves rsync and replaces it with RsyncX. I’ve done rsync --version at /usr/bin/rsync and /usr/local/bin/rsync but both show the same general version information (no mention of RsyncX).
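Since the version string does not settle it, another thing to try is checking whether each binary lists the --eahfs option used in the command earlier in this thread. This is a guess at a workable check, not a documented method: it assumes the RsyncX build mentions --eahfs in its --help output, which I have not verified, and the function name has_eahfs is made up.

```shell
# Succeed if the rsync binary at BIN mentions --eahfs in its help text.
has_eahfs() {
  [ -x "$1" ] && "$1" --help 2>&1 | grep -q -- '--eahfs'
}

# Check both install locations mentioned above.
for bin in /usr/bin/rsync /usr/local/bin/rsync; do
  if has_eahfs "$bin"; then
    echo "$bin: looks like the RsyncX build"
  elif [ -x "$bin" ]; then
    echo "$bin: no --eahfs option listed"
  fi
done
```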
While it looks like you’ve already created a working system, you may be interested to learn about psync. It’s a Perl utility that can provide incremental backup with HFS+ support, so it keeps Mac OS resource forks intact.
Here’s the Mac OS X Hints story on it.
Dodged a Bullet
Looks like the time I put into setting up my backup system was time well spent. Today I couldn’t commit code to my Tasks Subversion repository, and ‘svnadmin recover’ didn’t work. Um – yeah.
The good news is that the backup that ran earl…
You should look into rdiff-backup. It’s librsync-based and does a couple of things better than your current setup:
1. It has incremental backups, so you can go back in time in a much more sophisticated way than picking the older of drives 1 or 2. Handy if you don’t discover for a week that you corrupted some essential file.
2. It handles resource forks, starting with the 0.13 series. I just got bitten by Quicken – it stores significant data in the resource fork. If you restore Quicken files from an rsynced copy, it will say “Unable to load file”. I had to do some voodoo to recover any data from them at all. I was lucky that I used the backup only a few days after starting to use Quicken. If I had more data, that would have really sucked.
Synchronizing Data on Multiple Macs
Now that my new Mac Mini arrived, I am facing the age-old problem of synchronizing information across multiple machines. I have a lot of transient information (addresses, mail, bookmarks, etc.) on my PowerBook that will need to be synchronized with…
For bootability I use the Carbon Copy Cloner command line scheme to create a disk image that I can restore from. I do this for local/Applications as well.
Otherwise I have not experienced any problems with rsync and resource forks.
I have the automated rsync backups take care of my Users/ every hour and cpio -pdl every day, week, month and year.
The boot volume I snap shot after major upgrades or application installs. (a few times a year.)
This way I get a bootable image that may only need a few fink packages, or applications installed, and my data is backed up every hour with regular periodic snapshots.
[…] I first got my Quad, I had an automated backup system in place so I set my drives up as RAID 0 (striped) to get the best performance. After this, I’m […]