Anatomy of a hard drive: what really goes on inside your PC's storage

9th Dec 2012 | 10:00


The most valuable component inside your PC?

Here's a thought: what's the most valuable component inside your PC? Valuable to you, that is - not in terms of resale value.

PC kit doesn't really make for good family heirlooms, unless your grandfather bought an AdLib sound card in his final days and it was passed on to you.

But for most of us, the most valuable component is definitely the hard drive.

If a CPU blows up or a graphics card buys the farm, we can simply buy new ones. But if a hard drive says "goodbye cruel world", taking all of your vital files with it (and you don't have recent backups), well, no amount of money can fix that.

And yet, despite its importance, the humble hard drive doesn't get much attention.

We all have a tendency to focus on flashy things such as new distros and desktop environments, but there's a wealth of useful information to discover and learn about these devices.

For instance, there are many different strategies for splitting up the disk into different chunks (partitions), affecting security and performance. There are different types of filesystem you can use, and tricks you can employ to recover data if something goes wrong.


New technologies, such as SSDs, are changing the role of hard drives. If you've accidentally deleted a file, there's still a chance that you can recover it thanks to some cunning tools.

So far from being a boring box of bytes stuffed into a random space in your PC, the hard drive is actually a world of technology, with many options for customisation.

Our aim in this feature is to teach you everything that's worth knowing about hard drives - and a little bit more as well. We've also included a few bits you can cut out and stick on the wall next to your PC, in case you have an emergency.

Just to be on the safe side (for us and you!), a quick disclaimer: this guide covers making modifications to the structure of hard drive data. We absolutely recommend trying out commands and options for yourself, as it's the best way to learn, but only on a test machine (or in VirtualBox).

Don't experiment on your main PC, unless you want to risk losing data!

What are partitions?

A blank hard drive isn't much use to anybody; it needs some structure before it can start storing files. From a low-level perspective, drives are made up of sectors, which are very small units of data storage at fixed locations on the disk.

There can be many millions of sectors in a drive, and they are organised into meaningful groups at multiple levels. First off, at the foundation level, we have partitions (we'll look at filesystems later).
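To put a number on 'many millions', here's a rough calculation in Python. Both figures are assumptions for illustration: a 512-byte sector (the traditional size, though many newer drives use 4,096 bytes) and a drive sold as 1TB.

```python
# Rough sector arithmetic. The sizes below are illustrative assumptions,
# not values taken from any particular drive.
SECTOR_SIZE = 512            # bytes per sector (traditional size)
DRIVE_BYTES = 1000 ** 4      # a "1TB" drive as marketed: 10^12 bytes


def sector_count(drive_bytes, sector_size=SECTOR_SIZE):
    """Return how many whole sectors fit into drive_bytes."""
    return drive_bytes // sector_size


print(f"{sector_count(DRIVE_BYTES):,} sectors of 512 bytes")
print(f"{sector_count(DRIVE_BYTES, 4096):,} sectors of 4,096 bytes")
```

Under these assumptions, the count runs into the billions rather than the millions.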

Essentially, a partition is a collection of sectors assigned to a specific storage task. Most brand new PCs ship with only Windows (sadly), so in their hard drives there is just a single, large partition that occupies almost the entire disk.

This appears as the C: drive when Windows boots up. Some machines have a second 'rescue' partition, containing a backup of the OS for when it needs to be reinstalled.

The purpose of partitions is to keep data areas separate from one another.

When you install Linux on a Windows PC, for instance, the Linux installer typically shrinks down the Windows partition to make room for Linux ones. At the end, you have a drive with multiple partitions, as in the diagram.

Windows knows it shouldn't mess around with unrelated Linux partitions, and vice-versa. The sizes of these partitions vary from system to system, depending on how much you allocate to each OS.

Cut out and keep: emergency partitioning

The fdisk program is much like the Vi text editor, but for partitioning: it's terse, minimal and available in virtually every distro. Start it (as root) by giving it the path to the drive's device node, like this:

fdisk /dev/sda

In a typical Linux installation, /dev/sda refers to the first hard drive, while /dev/sdb refers to the second, and so forth. Enter p and you'll see a list of partitions on the drive, as in the screenshot.

Note here the Start and End columns, which show sectors in use. Each partition has a number, so sda1 is the first partition on the first drive, and sdb3 is the third partition on the second drive.

To delete a partition, enter d and you'll be prompted for the number.

To add a new partition, enter n. You'll be asked whether to make it primary (maximum 4) or extended; go for the former if you have room, for simplicity's sake.

Then enter a start sector number (taking into account the list earlier) and size. Back at the main prompt, enter p and you'll see the new partition in the list.

To set its type ID, enter t, then the partition number, and then L (Shift+L) to list available types. Enter 83 for a Linux partition, 82 for a swap partition, or 7 for a Windows (NTFS) partition. Now enter w to write the changes to disk, or q to quit without writing.


>An fdisk session, showing the Linux and Windows partitions on a hard drive.
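If you're curious where fdisk gets its list from, the sketch below reads a classic MBR partition table in Python. It assumes the traditional MBR scheme (the one with four primary slots), not the newer GPT layout, and it works on a made-up 512-byte sector built in memory rather than a real drive. The helper names and the partition sizes are ours, purely for illustration. Note that the type IDs you enter in fdisk are hexadecimal, so 83 is 0x83.

```python
import struct

# Partition type IDs mentioned in the text (hexadecimal, as fdisk shows them).
TYPE_NAMES = {0x07: "Windows (NTFS)", 0x82: "Linux swap", 0x83: "Linux"}

# One 16-byte entry: status, CHS start, type, CHS end, first sector, sector count.
ENTRY = struct.Struct("<B3sB3sII")


def make_entry(type_id, start, sectors, bootable=False):
    """Pack one partition entry (the old CHS fields are left as zeros)."""
    status = 0x80 if bootable else 0
    return ENTRY.pack(status, b"\0" * 3, type_id, b"\0" * 3, start, sectors)


def parse_mbr(sector):
    """Return (number, type_id, start, end) for each used slot in an MBR sector."""
    if len(sector) != 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    found = []
    for slot in range(4):                      # four primary slots, hence "maximum 4"
        offset = 446 + slot * 16               # the table starts at byte 446
        _, _, type_id, _, start, sectors = ENTRY.unpack_from(sector, offset)
        if type_id != 0:                       # type 0 means the slot is empty
            found.append((slot + 1, type_id, start, start + sectors - 1))
    return found


# A made-up disk layout, built in memory; no real drive is touched.
fake_mbr = (b"\0" * 446
            + make_entry(0x07, 2048, 1000000, bootable=True)
            + make_entry(0x83, 1002048, 500000)
            + make_entry(0x82, 1502048, 100000)
            + b"\0" * 16
            + b"\x55\xaa")

for number, type_id, start, end in parse_mbr(fake_mbr):
    print(number, hex(type_id), TYPE_NAMES.get(type_id, "other"), start, end)
```

The start and end values correspond to the Start and End columns in fdisk's listing.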

A separate /home: yes or no?

One of the biggest choices you face when installing Linux and partitioning a hard drive is this: do you put the /home directory on a separate partition?

This is where user files live - that is, personal documents and settings for user accounts, as opposed to operating system files, which live in separate directories.

Some Linux distributions recommend using a separate partition, whereas others default to dropping everything into the same partition. So, what do you do? The answer depends on how you want to use your machine.

If you plan to try many different distros, and you're often installing new ones over the top of old ones, then it makes sense to have a separate /home partition.

In this way, you can do what you like with the operating system - upgrade it, downgrade it, or wipe it all and try some random new distro from the Faroe Islands.

Whichever Linux flavour you happen to be running, your personal files will always be there, stored safely on a separate part of the disk.

If you're careful, you can even have multiple Linux distributions on the same machine, all using the same partition for the /home directory after booting.

But why do we say you have to be careful? Well, think about settings and configuration files. If you do an ls -a in your home directory, for instance, you'll see a large number of hidden files and directories starting with full stops - these store settings for programs. If you share the same settings between different versions of a program, it can really confuse that program.
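As a quick illustration, here's a small Python sketch that picks out those hidden entries. It runs against a throwaway directory so nothing in your real home directory is touched; the name .exampleapp is made up for the demonstration.

```python
import os
import tempfile


def hidden_entries(directory):
    """Return the sorted names in directory that start with a full stop."""
    return sorted(name for name in os.listdir(directory) if name.startswith("."))


# Demonstrate on a temporary directory; ".exampleapp" is a made-up name.
with tempfile.TemporaryDirectory() as demo:
    os.mkdir(os.path.join(demo, ".exampleapp"))
    open(os.path.join(demo, "notes.txt"), "w").close()
    demo_hidden = hidden_entries(demo)

print(demo_hidden)
```

To try it on your own account, call hidden_entries(os.path.expanduser("~")). Unlike ls -a, os.listdir() leaves out the special . and .. entries.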

For instance, let's say you have Distro A and Distro B on your machine. You boot Distro A and run FooProgram 2.0 for the first time, which creates a .fooprogram/ settings folder in your home directory.

Then you boot Distro B with the same home directory, and start FooProgram - but in this case, it's version 1.0. It'll get confused by differences in the configuration files, and could crash or corrupt data.

Another potential problem with separate /home partitions is the size constraint.

If you put everything on one partition, then the OS and home directories both have access to free space.

If you put /home on a separate partition and run out of room, you can't easily take space from the OS partition. (LVM, the Logical Volume Manager offered during installation by many distros, gets around this, because it lets you resize volumes after the fact.)
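To see how your own machine is set up, a short Python sketch like this one reports whether a directory sits on its own partition and how much room is left. The function name and the dictionary keys are our own choices; the standard library calls do the real work.

```python
import os
import shutil


def space_report(path):
    """Return whether path is its own mount point and how much space is free."""
    usage = shutil.disk_usage(path)          # total, used and free, in bytes
    return {
        "path": path,
        "separate_mount": os.path.ismount(path),
        "total_gib": usage.total / 2**30,
        "free_gib": usage.free / 2**30,
    }


for candidate in ("/", "/home"):
    if os.path.exists(candidate):
        print(space_report(candidate))
```

If /home reports separate_mount as False, it shares its free space with the rest of the system.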

There are plus points to the separate-partition approach, though, especially now that SSD (solid state) drives are becoming more affordable and popular.

Because they're screamingly fast in comparison to spinning hard drives, you could put the OS files on an SSD for fast system and app startup times, and then /home on a traditional hard drive (after all, you're not too bothered how long it takes your LibreOffice documents and photos to load).

For general spinning hard-drive installations for home desktops, where you're not going to be trying a new distro every other day, though, we recommend the 'putting everything in one partition' approach.

The most important directories

Have a look in the root (/) directory, and you'll see lots of directories that may be unfamiliar to you.


>The root (/) directory might look like a jumbled mess of random words to Unix newcomers, but it actually makes sense. Everything has its own place.

While most users rarely need to venture into these directories, it's worth knowing what they do:

>/bin Binary files, or more specifically executables that are used by the base system. However, this doesn't include larger desktop applications, such as Firefox (they are kept in /usr).

>/boot Files used for booting, such as the Linux kernel.

>/dev Device nodes. Here are files which can be used to access hardware devices.

>/etc Configuration files for the system (per-user settings are stored in the /home directories).

>/media Removable media, such as USB keys, are often mounted here.

>/mnt Another place for mounting drives (rather confusingly), but usually hard drives or network shares.

>/opt Optional application packages. In some distros huge beasts of software, such as KDE or LibreOffice, live here.

>/proc Process information. Only really useful for admins wanting to monitor a program's resource consumption.

>/sbin Critical executables for running the system, but which should only be executed by the super user (root).

>/usr This contains non-critical files, such as applications. Inside is /usr/lib, which contains most of the libraries used by apps.

>/var Variable files - ie, data which changes a lot, such as databases, mail spools and system logs.
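Not every system has every one of these directories. As a minimal sketch, this Python snippet checks which of the directories listed above exist on the machine it runs on.

```python
import os

# The directories described above, in the same order.
STANDARD_DIRS = ["/bin", "/boot", "/dev", "/etc", "/media", "/mnt",
                 "/opt", "/proc", "/sbin", "/usr", "/var"]


def present_dirs(paths):
    """Map each path to True if it exists as a directory on this system."""
    return {path: os.path.isdir(path) for path in paths}


for path, exists in present_dirs(STANDARD_DIRS).items():
    print(f"{path:8} {'present' if exists else 'missing'}")
```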

Cut out and keep: sync your disk

Here's something that might surprise you: when you save a file in a program, it doesn't actually get written to the disk straight away - at least not for small files (eg, less than a megabyte).

For performance reasons, operating systems don't write data to the hard drive at every request, but wait until there's a lot of data from multiple write requests.

So, the OS stores all of these write operations in a RAM buffer and then commits them all to disk in one fell swoop. If you've ever been unlucky enough to have a power cut a few seconds after hitting Ctrl+S in a program, for instance, you'll have seen this in action.

Fortunately, there's a solution. At any time, you can enter sync in a terminal window to guarantee that all write operations are written to the physical disk.
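Programs can ask for the same thing themselves. The Python sketch below writes a file and then asks the OS to push it out to the disk before carrying on; the file name and the save_durably helper are made up for the example. The final os.sync() call is the programmatic equivalent of the sync command, and is only available on Unix-like systems.

```python
import os
import tempfile


def save_durably(path, text):
    """Write text to path and ask the OS to push it to the disk before returning."""
    with open(path, "w") as handle:
        handle.write(text)
        handle.flush()                 # move data from Python's buffer to the OS
        os.fsync(handle.fileno())      # ask the OS to write this file to disk


# A hypothetical file in the system's temporary directory.
demo_path = os.path.join(tempfile.gettempdir(), "sync_demo.txt")
save_durably(demo_path, "saved\n")

if hasattr(os, "sync"):                # Unix only; flushes all pending writes
    os.sync()
```

Calling os.fsync() on every save costs performance, which is exactly why the OS buffers writes in the first place.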

And then there's a special key sequence you can use if the X Window System freezes, ie the graphical layer has totally locked up, but you want to sync everything to the drives and reboot safely. It's called the Magic SysRq key, it's enabled in most distros, and it's used as follows:

Hold down Alt+SysRq (usually top-right on the keyboard) and then press the following keys in order:

R (get keyboard control back), E (terminate processes), I (kill errant processes), S (sync data to disks), U (remount drives read-only), and B (reboot).
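Whether the key combination works depends on a kernel setting, which Linux exposes in the file /proc/sys/kernel/sysrq. Here's a minimal Python sketch for checking it. The two function names are ours, and the snippet reports "unknown" if the file isn't there (on a non-Linux system, say).

```python
def sysrq_setting(path="/proc/sys/kernel/sysrq"):
    """Return the kernel's SysRq setting as an int, or None if it can't be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def describe_sysrq(value):
    """Turn the raw setting into a short description."""
    if value is None:
        return "unknown (not Linux, or the file is unreadable)"
    if value == 0:
        return "disabled"
    if value == 1:
        return "all functions enabled"
    return f"partially enabled (bitmask {value})"


print(describe_sysrq(sysrq_setting()))
```

A 'partially enabled' result means your distro allows only some of the key functions, so it's worth checking before you need them in anger.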

What's in a filesystem?

A hard drive without a filesystem is just a jumble of data. The filesystem helps the operating system to make sense of the disk - finding out where files start, where they end, and which directories they belong to.

In a simple filesystem, such as DOS FAT, you have a table in the first few sectors, describing where the files are located. Every file has an entry in this table (which is why most filesystems have a limit on the number of files), containing its name, the time it was created, how big it is in bytes, which sector it starts at, and so forth.
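The idea can be sketched as a toy model in Python. This is an illustration only: it doesn't reproduce the real on-disk FAT layout, and the file names, sizes and block numbers are all made up. Each file's entry records where it starts, and a second table says which block comes next.

```python
# A toy model of a table-based filesystem. It illustrates the idea only and
# does not reproduce the real on-disk FAT layout.
END = -1                                   # marks the last block of a file

# next_block[i] gives the block that follows block i in a file.
next_block = {2: 3, 3: 7, 7: END, 4: END}

# One entry per file: name, size in bytes and first block (all made up).
file_table = [
    {"name": "LETTER.TXT", "size": 1300, "start": 2},
    {"name": "TODO.TXT", "size": 90, "start": 4},
]


def blocks_of(name):
    """Follow the chain of blocks belonging to the named file."""
    entry = next(e for e in file_table if e["name"] == name)
    chain, block = [], entry["start"]
    while block != END:
        chain.append(block)
        block = next_block[block]
    return chain


print(blocks_of("LETTER.TXT"))
print(blocks_of("TODO.TXT"))
```

Notice that LETTER.TXT's blocks aren't next to each other on the 'disk': that's fragmentation in miniature.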

By far the most common filesystem in the Linux world is ext4, which is an excellent, reliable, general purpose filesystem for hard drives.

For a while, there was some competition in the Linux world in the form of ReiserFS, an innovative filesystem whose development suffered a setback when the lead developer was charged with and later convicted of the murder of his wife…

There are other filesystems worth being aware of, typically suited to more specialised tasks than a regular desktop PC.

ZFS, for instance, provides great performance and reliability across multiple disks, as covered in our FreeNAS tutorial.

Then there's LogFS, designed to be used on flash drives (which internally work completely differently to spinning hard drives, and therefore can benefit from a dedicated filesystem).

It's interesting to note that with all the advanced filesystems in use today, on both Linux and Windows, typical USB flash keys come pre-formatted with FAT32 (and its limitations). It feels a little bit strange to use such backward technology today, but it does mean that these keys are compatible with virtually everything.
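The best-known of those limitations is file size: a single file on FAT32 can't exceed 4GiB minus one byte. A quick Python check makes the point; the two file sizes are hypothetical examples.

```python
FAT32_MAX_FILE_BYTES = 2**32 - 1      # largest single file: 4GiB minus one byte


def fits_on_fat32(size_bytes):
    """Return True if a single file of this size can be stored on FAT32."""
    return 0 <= size_bytes <= FAT32_MAX_FILE_BYTES


dvd_image = 4_700_000_000             # hypothetical 4.7GB disc image
cd_image = 700 * 1024**2              # hypothetical 700MiB disc image

print(fits_on_fat32(dvd_image))
print(fits_on_fat32(cd_image))
```

So a CD image copies across happily, but a full-size DVD image won't fit on the key as one file, however much free space it has.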

Cut out and keep: recover lost files

Making backups is the single most important thing you can do for your data. And the second thing is making even more backups.

But despite good intentions, we can all make mistakes and accidentally delete files. Due to the way modern filesystems work - shuffling files around on the drive to avoid fragmentation issues - file recovery software isn't always 100% successful. But there's hope.

First, get hold of a rescue distro with Photorec installed. One of the best is Recovery Is Possible, aka RIP, a mini distro designed for fixing damaged Linux installations.

Burn it to a CD-R and keep it handy near your PC for emergencies.

When you want to recover an accidentally deleted file, shut down the machine and boot RIP. In a terminal, enter 'photorec' and you'll be prompted to select the drive and partition that contained the file.

Then you'll be asked for the filesystem type, and whether to scan the whole drive or just space marked as empty (the latter is quicker). Finally, you'll be asked for a location to store recovered files.

Afterwards, recoverable files will be placed in folders named recup_dir followed by a number. You won't have the original filenames, so an image such as kitten.jpg could become f0015362.jpg.

If you have lots of files, you'll have to look at them and rename them manually. But at least you have the data back...


> During the recovery process, Photorec shows which file types it has identified in searching through the raw disk data.
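Tools of this kind can find files without a working filesystem because many file formats begin and end with recognisable byte patterns. The Python sketch below shows the general idea for JPEGs, using a made-up chunk of 'disk' data held in memory. It's a heavily simplified illustration, not a description of how Photorec itself is implemented, and the output names merely imitate the style of the example above.

```python
JPEG_START = b"\xff\xd8\xff"   # bytes that open a JPEG file
JPEG_END = b"\xff\xd9"         # bytes that close it


def carve_jpegs(raw):
    """Return byte strings that look like JPEG files inside raw data."""
    found, position = [], 0
    while True:
        start = raw.find(JPEG_START, position)
        if start == -1:
            break
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        found.append(raw[start:end + len(JPEG_END)])
        position = end + len(JPEG_END)
    return found


# Made-up "disk" contents: junk bytes with two tiny fake JPEGs embedded.
fake_disk = (b"\x00" * 20 + JPEG_START + b"kitten" + JPEG_END
             + b"\x11" * 8 + JPEG_START + b"puppy" + JPEG_END + b"\x00" * 5)

for index, data in enumerate(carve_jpegs(fake_disk), start=1):
    print(f"f{index:07d}.jpg", len(data), "bytes")
```

Because the search works on raw bytes alone, it has no way of knowing what the files were originally called, which is why recovered files come back with generic names.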
