Linux backup: the complete guide

6th Mar 2011 | 10:00

Data loss is inevitable. Backing yours up is prudent

Introduction

There are two kinds of data in computing: the sort that's already lost and the stuff that isn't - yet.

You can spend a fortune on a storage medium that's anti-scratch, dust-resistant, heat-proof and contains no moving parts, but it'll all come to naught eventually if you haven't also invested effort in backing your data up.

Although it isn't particularly time consuming, backing up data requires careful thought and preparation, and involves more than just zipping files into a tarball. This means it's often neglected.

Note that an archive isn't a backup and it's important to know the difference between the two. An archive is a primary copy of data that's put away for future use. A backup, on the other hand, is a secondary copy that you call upon to recover your important files and information from data loss disasters.

So no matter what kind of user you are, or how you use your Linux distribution, this article has got something for you. Most of the backup tools discussed here only require a bit of thought and a little time to set up. Best of all, unless you've got terabytes of data, you can safely file it for little or no cost both on and offline.

We'll also discuss ways to organise and store your data more efficiently so that it's easily accessible and simple to back up. You need never lose data again.

A primer to the thought process behind making your data safe

Preparing for a backup involves careful consideration. For starters, where do you store your data? Keeping it on another partition of the same disk isn't advisable - what if the whole disk fails? A copy on another disk is one solution.

To protect your data against physical disasters, such as fires, floods and theft, keep the backup as far away from the original as possible, perhaps in the cloud.

Each method has its advantages: hard disks offer the best price-to-space ratio and are a convenient, readily available option; flash drives offer portability; optical media is easily distributable; and online storage is globally accessible.

The kind of data also influences the choice of storage medium. A DVD might be useful for holiday snapshots, but is of limited use to a pro photographer. If you'll be backing up large quantities of data, it's advisable to get multiple, high-capacity hard disks. Or you might want to invest in a NAS (network attached storage) box.

Another option would be to create your own cloud by attaching USB disks to network-accessible devices such as the PogoPlug or TonidoPlug. Figure out which of these options best suits your needs.

What to back up?

Depending on the size of your home directory, backing it up completely could be overkill. Here are the essentials:

Your documents and files
~/Documents, ~/Downloads, ~/Desktop

Most modern distros keep the files you've created or downloaded under these directories. Don't forget to check /home for any important documents.

Your email data (Evolution/Thunderbird/Kmail)
~/.evolution, ~/.thunderbird, ~/.kde/share/apps/kmail

Depending on your client, one of these should contain your emails, plus their attachments, your address book and so on.

Other apps' data

Other apps create their own data repositories to store files. Most prompt you for the location, while some create their own. Check under their Preferences to search these out.

Installed software
/var/cache/apt, /var/cache/yum

If there's a piece of software that's crucial to you and you don't want to spend time downloading it again, back it up.

Personal settings
.bashrc, .profile, .gnupg/, .local/, .openoffice/, .mozilla/

These are some of the essential hidden files and directories that store user settings. Back them up for every user in your installation. Be vigilant, though. Some contain cache directories, such as Firefox's (under ~/.mozilla/firefox/whvmajqx.default/Cache for us), which needlessly add to the backup's size.

System settings
/etc, /var/spool/cron/, /var/spool/mail, /boot

Pay close attention to these directories if you're backing up your entire installation. You'll find system settings in /etc. Although it's got a large number of files, it isn't very bulky. This is unlike /var, which contains cache directories for several apps that you can leave out. It also holds /var/spool/mail, which houses the user mail files, and /var/spool/cron, which has the settings for cron, both of which you should back up.

If you've made changes elsewhere in the system, consider backing up those files under /usr/ and /usr/local/.
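
As a rough illustration of the checklist above, here's a minimal Python sketch that bundles a hand-picked list of files and directories into one compressed tarball while leaving cache directories out. The function names and the set of excluded directory names are our own assumptions, so adjust them to suit your system.

```python
import tarfile
from pathlib import Path

# Directory names to leave out of the archive (assumed examples).
EXCLUDED_NAMES = {"Cache", ".cache"}


def skip_caches(member):
    """Return None for anything inside an excluded directory, else keep it."""
    if EXCLUDED_NAMES.intersection(Path(member.name).parts):
        return None
    return member


def archive_paths(paths, archive_file):
    """Write the given files and directories into one gzip-compressed tarball."""
    with tarfile.open(archive_file, "w:gz") as tar:
        for path in paths:
            path = Path(path).expanduser()
            if path.exists():  # skip anything this machine doesn't have
                tar.add(str(path), filter=skip_caches)
    return archive_file
```

Calling archive_paths(["~/Documents", "~/.mozilla", "~/.bashrc"], "home-backup.tar.gz") would then produce a single archive. Reading system directories such as /etc may need root privileges.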

Data considerations

Now we know what to back up, so let's consider how to go about it. Do you want to back up manually or automatically based on a schedule? The correct frequency varies based on the kind and value of data being safeguarded.

Depending on the size of the files, it might not be a good idea to back them up completely every day either. Many backup tools enable you to do incremental backups - only creating copies of files that have changed since the last backup.
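
To make the idea of an incremental backup concrete, here's a minimal Python sketch that copies only new or changed files. It's a simplification of what real tools do: it treats a file as changed when the backup copy is missing, differs in size, or is older than the source (compared in whole seconds), and the function name is our own.

```python
import shutil
from pathlib import Path


def incremental_copy(source, destination):
    """Copy only new or modified files; return the relative paths copied."""
    source, destination = Path(source), Path(destination)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = destination / rel
        if dst.exists():
            s, d = src.stat(), dst.stat()
            # Unchanged: same size and the source is not newer than the copy.
            if s.st_size == d.st_size and int(s.st_mtime) <= int(d.st_mtime):
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(rel)
    return copied
```

The first run copies everything; later runs copy only what has changed since. Note that this sketch never removes files from the backup, and it won't notice an edit that keeps both the size and the timestamp the same.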

Will you manipulate the data before safeguarding it? If you're backing up large quantities of data, it's advisable to compress it. If the data's sensitive, you can encrypt it too. Remember that both add to backup overheads.

Finally, to ensure the data's integrity, checksum and validate it regularly.
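
One simple way to do this, sketched here in Python, is to record a SHA-256 checksum for every file when the backup is made and compare against that record later. The function names are our own and the manifest is just a dictionary, which you'd want to save somewhere safe (as JSON, for instance).

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(backup_dir):
    """Map each file's path (relative to backup_dir) to its checksum."""
    backup_dir = Path(backup_dir)
    return {
        str(path.relative_to(backup_dir)): sha256_of(path)
        for path in sorted(backup_dir.rglob("*"))
        if path.is_file()
    }


def verify(backup_dir, manifest):
    """Return the files that are missing or no longer match the manifest."""
    current = build_manifest(backup_dir)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

An empty list from verify() means every file recorded in the manifest is still present and unchanged.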

Step-by-step: Crontab entries from a GUI

1. Create your crontab

Despite its simplicity, automating tasks with Cron can be tricky if you aren't used to it. Corntab (www.corntab.com) is a browser-based visual front-end that helps you cook up an appropriate crontab entry.

2. Email it

The Corntab interface has sliders and check boxes to help you pick both the time (in minutes, hours, days of the month, months and days of the week) and command that you wish to schedule with Cron.

3. Paste into crontab

When you're done, copy or email the crontab entry, and paste it into crontab from the command line with the crontab -e command. When you save and exit the crontab editor, the new entry will be activated.
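
If you'd like to see what such an entry is made of, this minimal Python sketch assembles one from the same five time fields followed by the command. It only accepts a single number or * per field, which is a deliberate simplification of the full crontab syntax, and the function name and the script path in the usage note are hypothetical.

```python
# The five time fields of a crontab entry, in order, with their valid ranges.
FIELD_RANGES = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 7),  # both 0 and 7 mean Sunday
]


def cron_entry(command, minute="*", hour="*", day_of_month="*",
               month="*", day_of_week="*"):
    """Build one crontab line; each field is '*' or a single integer."""
    values = [minute, hour, day_of_month, month, day_of_week]
    fields = []
    for (name, low, high), value in zip(FIELD_RANGES, values):
        if value != "*" and not (isinstance(value, int) and low <= value <= high):
            raise ValueError(f"{name} must be '*' or an integer from {low} to {high}")
        fields.append(str(value))
    return " ".join(fields + [command])
```

For example, cron_entry("/usr/local/bin/backup.sh", minute=30, hour=2) gives the line 30 2 * * * /usr/local/bin/backup.sh, which runs the script at 2.30am every day.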

Protect your data easily with these no-fuss tools for beginners

Déjà Dup

Aren't yet used to the ways of a backup tool? Then Déjà Dup is for you. It has a minimal interface so as to not overwhelm new users, yet it's based on the powerful command line-based Duplicity and integrates nicely with Gnome.

Pulled from the repositories, Déjà Dup installs under Applications > System Tools. Before you use it, you'll need to set its Preferences. Start by pointing it towards the location where you want to house your backups. This can be a local hard disk, a remote location via SSH, or Amazon's S3 web storage.

Then specify the list of directories you want to include in and exclude from the backup. By separating these two, Déjà Dup gives you the flexibility to include a large directory - for instance, /home - in your backup, while specifying parts to leave out, such as .cache/.

By default, Déjà Dup encrypts your backups, but you can ask it not to do so by unchecking the Encrypt Backup Files box. Next to it is a pull-down menu that enables you to schedule regular backups.

When you're done, click the Backup icon to invoke the process. If you've opted to encrypt the data, Déjà Dup now prompts you for a password. It then provides a summary list of the directories involved and begins.

This initial backup may take some time, but subsequent backups are incremental - dealing only with what's changed - and thus much faster.

When restoring backups, Déjà Dup enables you to restore them to their original location or under a specific directory. Since the backup's directory contains encrypted material, you'll be prompted for your password again.

Finally, you're presented with a time-stamped list of backups to restore. That's all there is to it.

Déjà Dup is ideal for backing up files under a user's /home directory, but you might run into authorisation issues with system files. Also, Déjà Dup doesn't allow you to create backup sets. So if you wish to back up a different directory, you'll have to modify the Preferences.

Similarly, in order to restore from different locations, you'll have to change the location first under Preferences.

LuckyBackup

While Déjà Dup is suitable for most users, if you want something that's able to handle multiple backup schemes, then use LuckyBackup.

Among its strong points is that it supports multiple profiles, enabling you to manage different backup sets. A default profile is created when you first launch the app and, like all profiles, must have a task attached - either to perform a backup or restore data from one.

Tasks can be one of three types: you can select to back up just the contents of a directory, replicate the entire source directory as is, or you can synchronise the source and destination, which is handy when you need to keep files found under two directories in sync.

When the synchronisation task is executed, LuckyBackup checks for the newest version of each file under both the source and destination directories and copies it to the other. So newly created files in one location are replicated in the other. The only drawback is that if you have deliberately deleted a file or folder in one location but not its counterpart, it will be automatically recreated.
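
The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the "newest copy wins" idea, not how LuckyBackup itself is implemented, and the function names are our own.

```python
import shutil
from pathlib import Path


def _files(root):
    """Return the relative paths of all regular files under root."""
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def sync_newest(dir_a, dir_b):
    """Two-way sync: for every file, the newest copy overwrites the other."""
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    for rel in sorted(_files(dir_a) | _files(dir_b)):
        a, b = dir_a / rel, dir_b / rel
        if not b.exists():
            src, dst = a, b
        elif not a.exists():
            src, dst = b, a  # a file deleted on one side comes back here
        elif a.stat().st_mtime > b.stat().st_mtime:
            src, dst = a, b
        elif b.stat().st_mtime > a.stat().st_mtime:
            src, dst = b, a
        else:
            continue  # same timestamp: assume the copies are identical
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
```

The elif branch marked with a comment is where the drawback shows up: the function can't tell a file that was deleted on one side from a file that was newly created on the other, so it always copies.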

Elsewhere, the Advanced button expands the New Task dialogue to give you fine control over the files to include in, and exclude from, the backup. If you'll be backing up to a remote directory, specify your connection details under the Remote tab.

Power users will appreciate the convenience of the Also Execute tab, which enables you to specify a list of commands to execute before and after the backup.

When you're done creating a backup, click the Validate button to ensure your settings are good to go. With all your tasks for multiple locations set up, it's time to schedule them. Head over to Profile > Schedule, and click Add. Now select the profile to schedule and customise its run time.

Finally, click the cronIT button, which automatically creates a Cron job for the backup. To manually run a backup, select the task to execute and click Start. You might also want to check the Simulation box first to do a dry run of the backup and ensure it will work properly.

The process of restoring a backup in LuckyBackup is just a backup task with the directories reversed. Also remember to uncheck the Skip Newer Destination Files box under the Command Options tab in the Advanced view.

Finally, execute the restore task as usual and your backed up data will be reinstated in its original place.

Enterprise solutions

BackupPC

If you manage a computer lab or work in an enterprise setting, backing up individual computers using the tools we've covered so far would be a chore. When you have a bunch of machines to take care of, it's best to rely on BackupPC. Be warned, however, that it's not for the faint of heart, despite its web-based interface and extensive documentation.

While it can be used on individual machines, it's best called upon when you want to safeguard data on multiple computers. Not only that, but it will work across Linux, Mac, or Windows, and is well suited for environments that have a mix of different OSes.

It has impressive features too, including pooling. This reduces backup sizes by saving only one copy of identical files that exist on many computers. For example, if you have the same distro running on all computers, BackupPC will only keep one copy of the system files.
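
To illustrate the principle behind pooling, here's a minimal Python sketch that stores each unique file once, named after a hash of its contents, and hard-links it into each machine's backup tree. It's a sketch of the general technique under our own assumptions (SHA-256, a flat pool directory, made-up function names) rather than BackupPC's actual code, and the pool and backup trees must sit on the same filesystem for hard links to work.

```python
import hashlib
import os
import shutil
from pathlib import Path


def store_in_pool(file_path, pool_dir, backup_path):
    """Keep one pooled copy per unique content and hard-link it into place."""
    digest = hashlib.sha256(Path(file_path).read_bytes()).hexdigest()
    pooled = Path(pool_dir) / digest
    if not pooled.exists():
        # First time we've seen this content: copy it into the pool.
        pooled.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(file_path, pooled)
    backup_path = Path(backup_path)
    backup_path.parent.mkdir(parents=True, exist_ok=True)
    os.link(pooled, backup_path)  # a second name for the data, not a second copy
    return digest
```

Backing up the same file from two machines therefore adds only one file's worth of data to the disk, however many backup trees refer to it.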

Install and configure

You can install BackupPC from your distro's repository, or get the latest version via the tarball.

Before you extract and install it, make sure you have the following Perl modules: Compress::Zlib, Archive::Zip, XML::RSS, Net::FTP and File::RsyncP.

You can install them using CPAN a la: perl -MCPAN -e 'install Compress::Zlib'

With the various libraries in place, you should download the tarball, untar it and then enter the following: perl configure.pl

When you run configure.pl, you'll be prompted for the full paths of various executables and for configuration information such as the BackupPC user, the data directory and so on. By default, the configuration files will be stored in /etc/backuppc.

Once it's set up, you can start the program with /etc/init.d/backuppc start

The basic BackupPC configuration can be edited via the app's web interface, which you'll find by pointing your browser towards localhost/backuppc. Use the username and password you specified when configuring BackupPC to log in.

The interface also lets you browse the various hosts as well as initiate backup and restore operations. You can edit basic configuration settings from the Edit Config menu. Use the Add button under the Edit Hosts section to include a client to back up.

In order to set up individual clients, you'll have to manually edit their configuration files, and provide details depending on the method used for backing up (BackupPC supports SMB, TAR, Rsync and FTP).

An /etc example

For example, the following backs up the /etc directory on localhost using TAR:

$Conf{XferMethod} = 'tar';
$Conf{TarShareName} = ['/etc'];
$Conf{TarClientCmd} = '/usr/bin/env LC_ALL=C $tarPath -c -v -f - -C $shareName';

To begin the backup, head to the web interface, select a host and then click Start Full Backup. The Status page will show you which backups are running. Alternatively, you could also perform an incremental backup if you have previously archived files to add to.

With backup data in place, BackupPC enables you to view and restore individual files, or complete filesystems. You can either download the backed up files as zipped archives, or directly restore them to their original computer.

There's far more to BackupPC than we can touch on here; it's the most comprehensive program in this feature. As such, you'll need to spend time browsing its documentation and adapting it to your network to make full use of it. Our in-depth tutorial in LXF125 may also help if you have access to it.

Tools to restore or clone a working system

MondoRescue

MondoRescue isn't your everyday backup program, but rather specialises in recovery after catastrophic data loss. It's ideal for backing up the core filesystem, say once a month. It can also be used to clone an installation on larger partitions.

While your distro might include MondoRescue in its repositories, it's best to grab packages for the app from ftp://ftp.mondorescue.org. You'll also need Mindi, Mondo's companion tool that packages backups into bootable distros, and mindi-busybox, which contains the tools Mindi needs.

When you're all set, launch MondoRescue as root with sudo mondoarchive. You'll see the tool's crude-but-effective Ncurses-based interface. You're asked for your choice of backup medium, how much compression you'd like to use, and whether it should divide the backups to fit CDs or DVDs.

Then you'll be asked what to back up. By default, the app backs up everything under the root directory. MondoRescue can also back up Windows partitions if it detects them on your disk. You should let MondoRescue verify the archives it creates too - this takes time but is well worth it.

When it's ready to copy data, MondoRescue creates a catalogue of files, divides them into sets, then calls Mindi and finally begins backing up, which can take several hours. If you've asked MondoRescue to back up to a hard disk, when it's done you'll find one or several ISO images inside the directory you specified. Boot from the first image and enter compare at the boot prompt to check the archived copies against your filesystem.

At the end of the process, this prints the non-matching files. There might be some immediately after backing up, but these are often just cache files, which can be safely ignored. To restore, type nuke at the boot prompt to format the disk and restore all files automatically, or type interactive to make the choices yourself.

If you're restoring to a blank hard disk, MondoRescue will also partition it and adjust the backed up partitions to suit. It'll also regenerate the bootloader, which you can then fine tune.

Tonido

Back in LXF 122, we looked at a piece of software called Tonido to help you create your own personal cloud server. It's a wonderful tool for sharing your files over an internal network as well as the internet. It might not be open source, but it gets the job done without you having to mess with your router and firewall settings.

Tonido is available as a binary for both Deb and RPM-based distros, or you can download it from www.tonido.com.

The only bit of setting up it requires is a username, which becomes part of your Tonido web address. So if you choose Fluffy as your username, you can access your files from anywhere by pointing your browser at fluffy.tonidoid.com.

Note that your data is still stored on your computer, not external servers, and is simply served over the internet, which may help quell any fears you have about the security of what you store.

Tonido also includes an application to back up data to a local disk or remote computer. To perform a backup, log into Tonido's web interface and click the Backup app. This then opens another interface that enables you to add and schedule backups. Click on the New button to add a new backup record.

The process involves selecting the device and the backup source and destination folders if you want to back up to a local disk. If you want to back up to a remote computer, you'll be presented with a list of peers. You can only back up to remote machines that are in your group.

Tonido identifies machines with their globally addressable peer ID. So you can back up to any machine on the internet, as long as it's in your group.

Once the backup is good to go, you can schedule it to run at periodic intervals, or run it manually. If you're particularly paranoid, you'll also be glad to know that Tonido encrypts data using AES encryption and transfers it directly from the source computer to the remote computer.

Tonido has many other features too. It enables you to collaborate, share, and sync files with others on the internet via Group Workspaces. To sync content through Tonido Groups, other users will need to have Tonido installed.

Since the software runs and functions the same way on both Windows and Mac OS X, however, you can share your data with other users regardless of their chosen operating system.

How to make crash-proof discs

DVDisaster

Optical discs are a commonly preferred medium for keeping backups. However, even when stored carefully, they'll go bad over time. One option is to make new copies of the backup discs periodically. Depending on the size of your backup catalogue, this could be an exhausting and expensive exercise.

A better option is to use DVDisaster. The tool creates an error correction code (ECC) file from a healthy disc, which can be used later to recover data when the media is damaged.

DVDisaster works on ISO images. To create one, insert the disc into the drive and launch DVDisaster after it's spun down. Now click on the Image File Selection icon, type in a name for the ISO image and select a directory for it to be stored in, then click the Read button. The app will read the disc sector by sector, then create the image as per the name and location you specified earlier.

Correction corner

Now it's time to create an ECC file. DVDisaster supports two types: RS01 and RS02. The former writes the error correction data to a separate ECC file, while the latter embeds it in the ISO image itself.

To make your selection, head over to Preferences > Error Correction, and select the storage method from the drop-down menu. We'd advise you to stick to the default RS01 method and store the ECC file using a separate medium.

Using the default settings, the ECC file is about 15% the size of the ISO file. For better protection, head back over to Preferences > Error Correction and select the High option. This balloons the ECC file to about 35% the size of the image, but gives you a better chance of restoring badly damaged media.

With an ECC in place, it's now a good idea to regularly check backup media with DVDisaster. Just insert the media in the drive, and click on the Scan button. If the scan detects bad sectors in the media, it's time to recover the lost data.

For that, first create an ISO image of the damaged media using the same procedure as before. Then find the ECC file you created earlier for the damaged media and point to it using the button for ECC file selection. With the image and ECC file in place, click on the Fix button, which reads and repairs the damaged image.

The success rate of the recovery depends on the state of the damaged disc, which is why it is necessary to scan the media regularly and repair it as soon as bad blocks show up.

Step-by-step: Back up a disk or partitions

1. Where to save?

With a Clonezilla Live CD you can back up your entire disk. After booting the CD and opting to create a clone, select where the images are saved, which can be on a local device or over the network.

2. Disk or partitions

Now you'll need to choose your mode. The savedisk option clones whole disks, and will later prompt you to select a disk on the computer. To save individual partitions, select the saveparts option instead.

3. Back up selection

Depending on your previous selection, you're shown a list of disks or partitions. Use the Spacebar to mark multiple partitions to back up. Once done, follow the onscreen instructions to complete the process.

Store your files online

SpiderOak

The most convenient place to back up is online. There are plenty of services that enable you to store files online and access them from anywhere you want. In fact, newer versions of Ubuntu bundle clients for the Ubuntu One service, but this isn't as cross-platform as Dropbox.

In turn, Dropbox has the drawback of restricting you to a single directory for backups and synchronisation. SpiderOak, on the other hand, has a consistent interface across Linux, Windows, and Mac, and enables you to back up any file or folder.

The service offers 2GB of free space, or 100GB for $10 per month. When you install the client and register for the service, the installer generates encryption keys that it then uses to encrypt the data before transmission.

The app's interface is divided into tabs. To back up files, simply head to the Backup tab and select your files or directories. Switch to the Advanced view to fine tune your file selection. When you're done, click on the Save Settings button. That's it.

Now SpiderOak compares the contents of the local folder with the one it keeps online. Whenever there's a change, it automatically starts the backup. Moreover, the service keeps track of changes to the files using version control with a date stamp, which lets you roll back to previous versions of a file.

This makes SpiderOak ideal for keeping copies of important documents you're working on, or photos you've transferred from your camera. Your files are kept on the server unless you explicitly ask SpiderOak to remove them.

In addition to its backup features, the service can help you share files with others via virtual isolated silos. Others can subscribe to these silos via RSS, which keeps them updated of any new additions.

JungleDisk

Although it's proprietary, JungleDisk works across platforms, and enables you to keep data in Amazon's S3 service or the Rackspace storage equivalent. The Desktop Edition costs $3 per month with 5GB of free storage. You can get additional storage for $0.15 per GB per month - find out more at https://www.jungledisk.com.

What sets JungleDisk apart from other online solutions is that it lets you mount your online storage as a network device in your filesystem, so you can directly save files to the cloud. To restore the files, just mount your drive and copy them onto your desktop.

Besides the network drive, JungleDisk also enables you to schedule automatic backups, which are kept separate from the network drive. The data is encrypted and compressed using data de-duplication. So although it keeps multiple timestamped copies of your data, it minimises online disk space usage by avoiding backing up redundant data.

What's more, when you upload a file, JungleDisk automatically creates a public URL with an expiry date one week in the future in order to help you share this file with anyone.

Once installed, the JungleDisk client sits in your taskbar. Use it to configure backup settings, such as selecting files and folders to back up. You can also use it to change the schedule of an automatic backup or run one manually.

What's more, you can set up JungleDisk to keep certain files and folders on your local disk in sync with the online disk. Any changes to files locally will be automatically copied to your online storage.

Step-by-step: Back up browser data

1. Download

Head to www.xmarks.com to get hold of Xmarks. It works with Firefox, Opera, Chrome, Internet Explorer and Safari; is cross-platform; and even works on mobile devices.

2. Configure

From the Add-ons window, click Preferences and then run the Setup Wizard to configure Xmarks to back up your browser's collection of bookmarks and passwords.

3. Restore

Now when you install Xmarks on a new computer, you can download and sync your bookmarks from the server. You may also manually restore them.

This is the real world of backing up - here's how to deal with it

With your hard disk's contents now more secure than a locked box in a reinforced vault that's buried in concrete at the bottom of the Mariana Trench, you might imagine you're done, but think beyond your hard disks for a moment.

Do you blog? Run a website? Use a web-based email service that also holds your calendars and contacts? Then you'll want to keep that safe too.

Back up blogs

Most blogging software and content management systems, such as WordPress and Drupal, have plugins or modules to help you download and save your content offline, which you can then file away with your favourite backup tool. If your web host runs phpMyAdmin, you can also use its Export feature to download entire databases - or selected tables inside them - in a variety of formats.

Alternatively, if you have shell or telnet access to your database server, you can back up the database from the command line with mysqldump, as in the following example: mysqldump -u [username] -p [databasename] > [backupfile.sql]

The -p option on its own makes mysqldump prompt you for the password. Don't put a space between -p and the password itself, or mysqldump will treat the password as the name of the database.

The backupfile.sql file will contain all the SQL statements needed to create and populate the tables in a new database server.
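
If you'd rather script the dump, for example to run it from Cron alongside your other backups, a minimal Python sketch might look like this. The function names are our own, and -p is passed without a value so mysqldump prompts for the password, which means an unattended job would need another way of supplying credentials.

```python
import subprocess


def mysqldump_command(username, database):
    """Build the argument list; -p with no value makes mysqldump prompt."""
    return ["mysqldump", "-u", username, "-p", database]


def dump_database(username, database, backup_file):
    """Run mysqldump and write its SQL output to backup_file."""
    with open(backup_file, "wb") as output:
        subprocess.run(
            mysqldump_command(username, database),
            stdout=output,
            check=True,  # raise an error if mysqldump exits with a failure
        )
```

Passing the arguments as a list, rather than as one string through a shell, avoids quoting problems with unusual usernames or database names.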

Some web hosting control panels, such as cPanel, also enable you to back up your entire website with a single click.

Back up online email

Then there are web-based email services such as Yahoo and Gmail. Yahoo lets you archive messages via POP, but you'll have to sign up for Yahoo Mail Plus, which costs $19.99 a year. Once subscribed, however, you can configure offline email clients such as Evolution and Thunderbird to fetch messages from the Yahoo servers, and keep them on your hard disk.

Gmail uses the IMAP protocol to synchronise your online mailbox with the one on your disk. In your Gmail account, make sure IMAP access is enabled under Settings > Forwarding And POP/IMAP.

Thunderbird will automatically configure itself for sending and receiving emails once you've pointed it towards your Gmail account, and the setup procedure isn't much different with Evolution. Once it's been prepared, right-click on a folder and select the Copy Folder Content Locally For Offline Operation option. Then head to File > Download Messages For Offline Usage to download messages.

Evolution also enables you to save individual messages with the File > Save As Mbox option. To make your emails easy to back up, Evolution will also compress them into a single tarball. Head to File > Backup Settings and specify the location where you want to keep this.

To restore your email, head to File > Restore Settings, and point it towards the compressed tarball.
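
Messages saved in the mbox format are plain text, so it's easy to check that an export is readable before you rely on it. This minimal sketch uses the mailbox module from Python's standard library to list the subject of every message in a file; the function name is our own.

```python
import mailbox


def list_subjects(mbox_path):
    """Return the Subject header of every message in an mbox file."""
    # create=False raises an error instead of creating a missing file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [message["subject"] for message in box]
    finally:
        box.close()
```

If the list matches what you expect to find in the folder, the export is at least structurally sound.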

Backupify

There's a lot of other data you already have online on Facebook, Twitter and other such services. Like your blog and email, it's a good idea to take occasional snapshots of this data and back it up locally, which is where Backupify comes in.

It's a web-based service that backs up data on other internet services and enables you to download it all to your local disk. It can even handle your blog and email if you want an all-in-one solution. It requires no installation either; just register on its website and authorise the service to back up your accounts.

It currently works with over a dozen different services, including the ever-popular Facebook, Twitter, Flickr, Google Docs, Gmail, Blogger and Hotmail, but check the website for a full list.

The basic service is free, offers 2GB of free storage, and backs up data from your online accounts weekly. There are also paid-for plans that offer more storage and let you adjust the backup frequency.

Backupify backs up data it receives from the services as is, which is generally in XML. However, for some services, such as Twitter, it can also generate a PDF.

Currently, the service doesn't enable you to download emails in bulk and the ability to search backed up messages is under beta testing. You do have the option to download individual messages in the EML format, though, and Backupify can also restore backed up messages to Gmail directly.

------------------------------------------------------------------------------------------------------

First published in Linux Format Issue 142
