This is the first in what is going to be (fingers crossed) a long line of tutorials, tips and tricks that users of this site find and use throughout the course of their days. Anyway - let's get on with the rest of the story!
rsync is a Unix command-line tool and remote update protocol. Basically what this means is that it is an intelligent file transfer tool - the intelligent part being that it can transfer just the differences between two sets of files. Sounds technical... but it's not really that complicated and can be very useful. Here are a couple of potential use cases:
- You would like to run nightly backups of your data from your PC to another PC (say a server in your department/company). But, you don't really want to be backing up ALL of your data every night - that would take way too long and waste far too much space on your server. Why can't you just have a copy of your documents on your server and only transfer the files that have changed? With rsync - you can.
- You're the manager of a web server hosting quite a complex web application or a rather large group of web pages and images. You have a testing server where you do all of your work and make sure everything is okay, but it always takes too long to transfer the new build of your website to your production server, as you don't really keep track of the changes and always end up re-uploading everything! With rsync, you could transfer only the files that have been altered.
So the basic usage of rsync is to mirror one set of documents/files from one location to another - and to save bandwidth, because after the initial mirroring only altered files are copied. Sounding useful yet?
As mentioned earlier, rsync is a command-line tool for Unix systems (Unix/Linux/Mac OS X). It can be used on Windows PCs via Cygwin, but the set-up and running of that is something I've never done before, so we won't talk about that any more today. If you are interested in trying, this page might be of some use.
Now, let's get onto the basics of the syntax! For more complete documentation than my rough overview, read the rsync man page. The following is going to be more of a quick "how to". In general, an rsync command takes the form: rsync [options] source destination
So, the "source" from above is the location of the files that we would like to mirror/copy, and the "destination" is where we would like the backup to go... Here's a quick example:
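Going by the description that follows, the command would look something like the sketch below. The wrapper function name copy_documents_locally is made up for illustration, and /home/username stands in for my home directory; the rsync line on its own is the command to type.

```shell
# Hypothetical wrapper so the command can be pasted into a script and reused.
# The paths are placeholders taken from this tutorial's example.
copy_documents_locally() {
    # -r  recurse into sub-folders
    # -v  verbose: list each file as it is transferred
    rsync -rv /home/username/Documents /home/public
}
```

Calling copy_documents_locally afterwards runs the copy.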
The above would copy the folder "Documents" from my home directory into "/home/public" (creating the folder "Documents" too). The "-r" option tells rsync to copy the files recursively - so it includes everything within the "Documents" folder (without it, rsync skips directories altogether), and the "-v" option gives us verbose output - so that we get told what's going on.
So, that's a backup of my files into another location on my hard drive. Useful, but it's not really securing my data against the possibility of a hard drive failure. What would be more useful would be to back up all of my data to another PC or server. We can do this by making rsync use the ssh remote login protocol...
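A minimal sketch of such a command, assuming the same placeholder user name, server name and paths that are used in the rest of this tutorial (the function name backup_documents_over_ssh is hypothetical, and again the rsync line is the part that matters):

```shell
# Hypothetical wrapper; "username" and "dept-server" are placeholders.
backup_documents_over_ssh() {
    # -e ssh  use ssh as the remote shell for the transfer
    # -r      recurse into sub-folders
    # -v      verbose output
    rsync -e ssh -rv /home/username/Documents username@dept-server:/home/username/Backups/
}
```

Unless you have SSH keys set up, expect to be asked for your password on the server when this runs.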
So once again, we're backing up the "Documents" folder in "/home/username", but this time we're copying it to another PC in the group called "dept-server" (which I have a user account on, and which is also a Unix based system), into the folder "/home/username/Backups". There are two differences from the previous command. Firstly, the "-e ssh" option tells rsync to use SSH as the remote shell for the transfer. Secondly, the destination now names the server that we're transferring data to and the user account to use. The syntax for this is: username@server:/path/to/folder
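To see how the pieces of a remote destination fit together, here is a small sketch that builds one from its parts. The variable names are my own and the values are just the placeholders from this example:

```shell
# Placeholder values taken from the example in this tutorial.
remote_user="username"
remote_host="dept-server"
remote_path="/home/username/Backups/"

# rsync expects a remote location in the form user@host:path
destination="${remote_user}@${remote_host}:${remote_path}"
echo "$destination"
```

This prints username@dept-server:/home/username/Backups/, which is exactly what goes in the "destination" slot of the rsync command.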
It's as simple as that! Now let's add some refinements...
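The refined command, matching the one used in the script further down, would look something like this (the function name backup_documents_refined is hypothetical; the user, server and paths are the same placeholders as before):

```shell
# Hypothetical wrapper around the refined backup command.
backup_documents_refined() {
    # -z        compress the data while it is being transferred
    # -P        show progress and keep partially transferred files
    # --delete  remove files from the backup that are gone from the source
    rsync -e ssh -rvzP --delete /home/username/Documents username@dept-server:/home/username/Backups/
}
```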
What we've added to the above are a couple of extra options that some might find useful, some might not. First was the "-z" option - this compresses the file data as it is sent (reducing bandwidth usage, which speeds things up on slower connections); the "-P" option gives us a progress indicator of the file transfers as they are happening, and keeps partially transferred files if a transfer is interrupted; finally the "--delete" option makes rsync delete any files on the backup server that are no longer within our (original) "Documents" folder.
However, be warned when using the "--delete" option above - this WILL delete everything in the folders being synchronised on the destination server that is not in the original folder, even including hidden files! It is most dangerous when the source path ends in a trailing slash, as rsync then copies the folder's contents straight into the destination folder and clears out anything else it finds there. So, I would not recommend making backups into your main "/home/username/" directory using this option, as it could remove all of your config files and possibly render your user account on the server unusable! Create a folder specifically to transfer your backups into. REMEMBER - YOU HAVE BEEN WARNED...
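One way to check what "--delete" is about to do is rsync's "-n" (dry run) option, which reports what would be copied or deleted without changing anything. A minimal sketch, reusing the placeholder names from this tutorial (the function name preview_backup is made up for illustration):

```shell
# Hypothetical wrapper: a rehearsal of the backup that changes nothing.
preview_backup() {
    # -n (--dry-run)  only report what would be transferred or deleted
    # -v              needed so the files affected are actually listed
    rsync -e ssh -rvn --delete /home/username/Documents username@dept-server:/home/username/Backups/
}
```

If the list of deletions looks right, run the real command without the "n".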
Finally - to put this together nicely, so that it becomes slightly easier to perform our backup nightly, why not put the rsync command into a bash script? This avoids mistakes caused by typos and saves you having to remember all of the above - once your script is ready, you just call it whenever you want to perform a backup! So, our script would look something like this:
#!/bin/bash
rsync -e ssh -rvzP --delete /home/username/Documents username@dept-server:/home/username/Backups/
I would recommend creating a "/home/username/bin" directory to save small scripts like this in, and calling the script something meaningful like "rsync-desktop-to-server.sh". Finally, make the file executable with chmod +x.
Now all I have to do to back up my "Documents" folder nightly is run /home/username/bin/rsync-desktop-to-server.sh before I leave.
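As a rough sketch of that last step, assuming the script was saved under the directory and name suggested above (the helper name make_executable is hypothetical; typing chmod +x followed by the script's path does the same job):

```shell
# Hypothetical helper: marks a script as executable and confirms the result.
# With no argument it falls back to the path suggested in this tutorial.
make_executable() {
    local script="${1:-$HOME/bin/rsync-desktop-to-server.sh}"
    # chmod +x adds execute permission; [ -x ] checks that it took effect
    chmod +x "$script" && [ -x "$script" ] && echo "$script is now executable"
}
```

Run make_executable once after saving the script; it does not need repeating unless the file is recreated.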
I hope some of you will find this as useful as I do!




Very nice tutorial Daz, will come in handy for many people I am sure.
For those of you who are not on a Unix based system, here is a link to a nice little run-through for using rsync along with Cygwin:
Lifehacker windows rsync tutorial
Also, someone has created a GUI for using rsync in Windows to make it a little more user friendly; this can be found here.