How Linux works
5th Jun 2010 | 09:00
Have you ever wondered what exactly DCOP is, or where your drivers are hidden?
The main problem you face when you're attempting to lift the lid on what makes Linux tick is knowing where to start. It's a complicated stack of software that's been developed by thousands of people.
Following the boot sequence would be a reasonable approach, explaining what Grub actually does, before jumping into the initiation of a RAM disk and the loading of the kernel. But the problem with this is obvious. Mention Grub too early in any article and you're likely to scare many readers away. We'd have the same problem explaining the kernel if we took a chronological approach.
Instead, we've opted for a top-down view, tackling each stratum of Linux technology from the desktop to the kernel as it appears to the average user. This way, you can descend from your desktop comfort zone into the underworld of Linux archaeology, where we'll find plenty of relics from the bygone era of multi-user systems, dumb terminals, remote connections and geeks gone by.
This is one of the things that makes Linux so interesting: you can see exactly what has happened, why and when. This enables us to dissect the operating system in a way we couldn't attempt with some alternatives, while at the same time, you learn something about why things work the way they do on the surface.
Level 1: Userspace
Before we delve into the Linux underworld, there's one idea that's important to understand. It's a concept that links userspace, privileges and groups, and it governs how the whole Linux system works and how you, as a user, interact with it.
It's based on the premise that a normal desktop user shouldn't be able to make important system changes without proving that they have the correct administrator's privileges to do so. This is why you're asked for a password when you install new packages or open your distribution's configuration panels, and it's why a normal user can't see the contents of the /root directory or make changes to specific files.
Your distribution will use either sudo or an administrator account to grant access to the system-wide configurable parts of your system. The former will typically work only for a single session or command, and is used as an ad-hoc solution for normal day-to-day use, much like the way both Windows 7 and OS X handle privileges.
USER CONTROL: Groups make it possible to enable and disable certain services on a per-user basis
With a full-blown system administrator's account, on the other hand, it's sometimes far too easy to stay logged in for too long (and thus more likely that you'll make an irreversible mistake or change). But the reason for both methods is security.
Linux uses a system of users, groups and privileges to keep your system as secure as possible. The idea is that you can mess around with your own files as much as you like, but you can't mess about with the integrity of the whole system without at least entering a password. It might seem slightly redundant when you are the only user of your system, but as we'll see with many other parts of Linux, this concept is a throwback to a time when the average system had many users and only a single administrator or two.
Linux is a variant of the Unix operating system, which has been one of the most common multi-user systems for decades. This means that multi-user functionality is difficult to avoid in Linux, but it's also one of the reasons why Linux is so popular – multi-user systems have to be secure, and Linux has inherited many of the advantages of these early systems.
A user account on Linux is still self-contained, for example. All of your personal files are held within your own home directory, and it's the same for other users of the system. You can usually see their names by looking at the contents of /home with your file manager, and depending on their permissions, even look inside other people's home folders.
But who can and can't read their contents is governed by the user who owns the files, and that's down to permissions.
Every file and directory on the Linux filesystem has nine attributes that are used to define how they can be accessed. These attributes correspond to whether a user, a group or anyone can read, write and execute the file.
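To make those nine attributes concrete, here's a minimal Python sketch that decodes them from a file's mode. The helper name describe_permissions is our own invention; os.stat and the stat module are part of Python's standard library.

```python
import os
import stat
import tempfile


def describe_permissions(mode):
    """Return the nine read/write/execute flags for user, group and other."""
    classes = {
        "user": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        "other": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    }
    result = {}
    for who, (read_bit, write_bit, exec_bit) in classes.items():
        result[who] = {
            "read": bool(mode & read_bit),
            "write": bool(mode & write_bit),
            "execute": bool(mode & exec_bit),
        }
    return result


if __name__ == "__main__":
    # Use a throwaway file so nothing on the real system is changed.
    with tempfile.NamedTemporaryFile() as tmp:
        os.chmod(tmp.name, 0o640)
        mode = os.stat(tmp.name).st_mode
        print(stat.filemode(mode))          # the notation `ls -l` uses
        print(describe_permissions(mode))
```

A mode of 0o640 means the owner can read and write, the group can only read, and everyone else gets nothing.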
You might want to share a collection of photos with other users of your system, for example, and if you create a group called 'photos', add all the users who you'd like access to the group and set the group permissions for the photos folder, you'll be able to limit who has access to your images.
Any modern file manager will be able to perform this task, usually by selecting a file and choosing its properties to change its permissions. This is also how your desktop will store configuration information for your applications, tools and utilities.
Hidden directories (those that start with a full stop) are often created within your home directory, and within these you'll find text files that your desktop and applications will use to store your setup.
No one else can see them, and it's one of the reasons why porting your current home directory to a new distribution can be such a good idea – you'll keep all your settings, despite the entire operating system changing.
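If you'd like to see how many of these hidden files and directories your own account has collected, a short Python sketch will do it. The function name hidden_entries is ours, and the only rule it applies is the one described above: the name starts with a full stop.

```python
import os


def hidden_entries(directory):
    """Return the sorted names in `directory` that start with a full stop."""
    return sorted(name for name in os.listdir(directory) if name.startswith("."))


if __name__ == "__main__":
    home = os.path.expanduser("~")
    for name in hidden_entries(home):
        print(name)
```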
Level 2: Desktops
If you come to Linux from Windows or OS X rather than through the server room, the idea that there's something called a desktop is quite a strange one. It's like trying to explain that Microsoft Windows is an operating system to someone who just thinks it's 'the computer'.
The desktop is really just a special kind of application that has been designed to aid communication between the user and any other applications you may run.
This communication part is important, because the desktop always needs to know what's happening and where. It's only then it can do clever things like offer virtual desktops, minimise applications, or divide windows into different activities.
There are two ways that a desktop helps this to happen. The first is through something called its API, which is the Application Programming Interface. When a programmer develops an application using a desktop's API, they're able to take advantage of lots of things the desktop offers. It could be spell checking, for example, or it could be the list of contacts you keep in another app that uses the same API.
MOBLIN: Moblin and UNR make good use of the Clutter framework to offer accelerated and smooth scrolling graphics
When lots of applications use the same API, it creates a much more homogeneous and refined experience, and that's exactly what we've come to expect of both Gnome and KDE desktops.
The reason why K3b works so well with your music files is because it's using the same KDE API that your music player uses, and it's the same with many Gnome apps too.
But applications designed for a specific desktop environment don't have to use any one API exclusively. There are probably more APIs than there are Linux distributions, and they can do anything from complex mathematics to hardware interfacing.
This is where you'll hear terms like Clutter and Cairo bandied around, as these are additional toolkits that can help a programmer build more unified-looking applications.
Clutter, for example, is used by both Ubuntu Netbook Remix and Moblin to create hardware-accelerated, smoothly animated GUIs for low-power devices.
It's Clutter that scrolls the top bar down in Moblin, for instance, and provides the fade-in effects of the launch menu in UNR. Cairo helps programmers create vector graphics easily, and is the default rendering engine in GTK, the toolkit behind Gnome, for many of its icons.
Rather than locking an image to a specific resolution, vector-based images can be infinitely scaled, making them perfect for images that are going to be used in a variety of resolutions.
Inter-process communication
The second way the desktop helps is by using something called 'inter-process communication'.
As you might expect from its name, this helps one process talk to another, which in the case of a desktop, is usually one application talking to another. This is important because it helps a desktop feel cohesive: your music player might want to know when an MP3 player has been connected, for example, or your wireless networking software may want to use the system-wide notification system to let you know it's found an open network.
In general terms, inter-process communication is the reason why GTK apps perform better on the Gnome desktop, and KDE apps work well with KDE, but the great thing about both desktops is that they use the same compatible method for inter-process communication: a system called D-Bus.
So why do Gnome and KDE feel so different from each other? Well, it's because they use different window managers.
The idea of a window manager stretches right back to the time when Unix systems first crawled out of the primordial soup of the command line, and started to display a terminal within a window. You could drag this single window across the cross-hatched background, and open other terminals that you could also manipulate thanks to something called TWM, an acronym that reputedly stood for Tom's Window Manager.
It didn't do much, but it did free the user from pages of text. You could move windows freely around the display, resize them, maximize them and let them overlap one another. And this is exactly what Gnome and KDE's window managers are still doing today.
KDE's window manager, dubbed KWin, augments the moving and management components of TWM with some advanced features, such as its new-found abilities to embed any window within a tabbed border, snap applications to an area of the screen or move specific applications to preset virtual activities on their own desktops.
KWin also recreates plenty of compositing effects, such as window wobble, drop shadows and reflections, an idea pioneered by Compiz. This is yet another window manager, but rather than adding functionality, it was created specifically to add eye-candy to the previously static world of window management.
Compiz is still the default replacement for Gnome's window manager (Metacity), and you can get it on your Gnome machine if you enable the advanced effects in the Visual Effects panel. You'll find that it seamlessly replaces the default drawing routines with hardware-accelerated compositing.
One of the biggest hurdles for people when they switch to Linux is the idea that you can't simply download an executable from the internet and expect it to run.
When a new version of Firefox is released, for example, you can't just grab a file from www.mozilla.org, save it to your desktop and double-click on the file to install the new version. A few distributions are getting close to this ideal, but that's the problem.
It's distribution-dependent, and we're no closer to a single solution for application installation than we were 10 years ago. The problem is down to dependencies and the different ways distributions try to tame them.
A dependency is simply a package that an application needs if it's to work properly. These are normally the APIs that the developers have used to help them develop the application, and they need to be included because the application uses parts of their functionality.
When they're bundled in this way they're known as libraries, because an app will borrow one or two components from a library to add to its own functionality.
Clutter is a dependency for both Moblin and UNR, for instance, and it would need to be installed for both desktops to work. And while Firefox may seem relatively self-contained on the surface, it has a considerable list of dependencies, including Cairo, a selection of TrueType fonts and even an audio engine.
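The job a package manager does with this information can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular distribution's tools work: the dependency map below reuses the examples from the text, and the package names are made up for the purpose rather than taken from a real repository. The standard graphlib module (Python 3.9 and later) does the ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each package lists what it needs first.
# The names echo the examples in the text, not real package names.
DEPENDENCIES = {
    "firefox": {"cairo", "truetype-fonts", "audio-engine"},
    "moblin": {"clutter"},
    "unr": {"clutter"},
}


def install_order(dependencies):
    """Return an order in which every package follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for package in install_order(DEPENDENCIES):
        print(package)
```

Because Clutter appears only once in the result, it's installed a single time and shared by both desktops, which is the advantage of handling dependencies this way.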
Other operating systems solve this problem by bundling applications with the resources they require, often by statically linking them. This means that everything an app needs is in one file.
All dependencies are hidden within the setup.msi file on Windows, for example, or the DMG file on OS X, giving the application or utility everything it needs to be able to run without any further additions.
The main disadvantage with this approach is that you'll typically end up with several different versions of the same library on your system. This takes up more space, and if a security flaw is found, you'll have to update all the applications rather than just the single library.
Level 3: Beneath the surface
X is a stupid name for the system responsible for drawing the windows on your screen, and for managing your mouse and keyboard, but that's the name we're stuck with. As with the glut of programming languages called B, C, C++ and C#, X got its name because it's the successor to a windowing system called W, which at least makes a little more sense.
X has been one of the most important components in the Linux operating system almost from its inception. It's often criticised for its complexity and size, but there can't be many pieces of software that have lasted almost 20 years, especially when graphics and GUIs have changed so much.
But there's something even more confusing about X than its name, and that's its use of the terms 'client' and 'server'. This relationship hails back to a time before Linux, when X was developed to work on dumb, cheap screens and keyboards connected to a powerful Unix mainframe system.
XTERM: The original XTerm is still the default failsafe terminal for many distributions, including Ubuntu
The mainframe would do all the hard work, calculating the contents of windows and the shape of the GUI, while all the screen had to do was handle the interaction and display the data. To ensure that this connectivity wasn't tied to any single vendor, an open protocol was created to shuffle the data between the various devices, and the result was X.
What is counter-intuitive is that the server in this equation is the terminal – the bit with the screen and keyboard. The client is the machine with all the CPU horsepower.
Normally, in client–server environments, it's the other way around, with the more powerful machine being called the server. X swaps this around because it's the terminal that serves resources to the user, while the applications use these resources as clients.
Now that both the client and the server run on the same machine, these complications aren't an issue. Configuration is almost automatic these days, but you can still exploit X's client–server architecture. It's the reason why you can have more than one graphical session on one machine, for example, and why Linux is so good for remote desktops.
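One place where the client–server split still shows through is the display name that X clients use to find their server. The article doesn't cover it, so treat this as background: the name normally lives in the DISPLAY environment variable and takes the form host:display.screen, where an empty host means the local machine. Here's a minimal Python sketch of how such a name breaks down; the function parse_display is our own.

```python
import os


def parse_display(value):
    """Split an X display name of the form [host]:display[.screen]."""
    host, separator, rest = value.rpartition(":")
    if not separator:
        raise ValueError(f"not a display name: {value!r}")
    display, _, screen = rest.partition(".")
    return {
        "host": host or None,                  # no host means this machine
        "display": int(display),               # one number per X server
        "screen": int(screen) if screen else 0,
    }


if __name__ == "__main__":
    current = os.environ.get("DISPLAY")
    if current:
        print(parse_display(current))
    else:
        print("DISPLAY is not set, so no X server is advertised here")
```

A second graphical session on the same machine would simply use a different display number, while a remote session fills in the host part.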
The system that handles authentication when you log into your system is called PAM (Pluggable Authentication Modules), which, as its name suggests, is able to implement many different types of security systems through the use of modules.
Authentication, in this sense, is a way of securing your login details and making sure they match those in your configuration files without the data being snooped or copied in the process. If a login fails a PAM module's authentication check, then it can't be trusted.
Per-service PAM configuration files can be found in the /etc/pam.d/ directory on most distributions. If you use Gnome, there's one to authenticate your login at the GDM screen, as well as enabling the auto-login feature. There are common configurations for handling the standard login prompt for the command line, as well as popular commands like passwd, cvs and sudo.
Each will use PAM to make sure you are who you say you are, and because it's pluggable, the authentication modules don't always have to be password-based. There are modules you can configure to use biometric information, like a fingerprint, or an encrypted key held on a USB thumb drive.
The great thing about PAM is that these methods are disconnected from whatever it is you're authenticating, which means you can freely configure your system to mix and match.
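To see what that mixing and matching looks like on disk, here's a rough Python sketch that reads rules in the layout conventionally used by files in /etc/pam.d/: a type, a control flag, a module and optional arguments on each line. The parser and the sample text are ours, written for illustration; they aren't copied from any particular distribution, and real files may use features this sketch ignores.

```python
import re

# type, control (a word or a bracketed list), module, optional arguments
RULE = re.compile(r"^(-?\w+)\s+(\[[^\]]*\]|\w+)\s+(\S+)\s*(.*)$")

# Illustrative sample only, not taken from a real system.
SAMPLE = """\
# authentication rules are tried from top to bottom
auth       required   pam_env.so
auth       sufficient pam_unix.so nullok
@include common-account
session    optional   pam_keyinit.so force revoke
"""


def parse_pam_rules(text):
    """Return one dictionary per rule, skipping comments and include lines."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("@"):
            continue
        match = RULE.match(line)
        if match:
            kind, control, module, args = match.groups()
            rules.append({
                "type": kind,
                "control": control,
                "module": module,
                "args": args.split(),
            })
    return rules


if __name__ == "__main__":
    for rule in parse_pam_rules(SAMPLE):
        print(rule)
```

Swapping a password check for a fingerprint reader is then a matter of changing which module a rule names, without touching the application being authenticated.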
The interface you use to control the inner workings of your computer is known as a shell, and shells can be either graphical or text based.
Before graphical displays were used to provide interactive environments to people over a network, text-based displays were the norm, and this layer is still a vitally important part of Linux. Text-based shells hide beneath your GUI, and often protrude through the GUI level when you need to accomplish a specific task that no GUI design has yet been able to contain.
There are many graphical applications that can open a window on the world of the command line, with Gnome's Terminal and KDE's Konsole being two of the most common. But the best thing about the shell is that you don't need a GUI at all.
You may have seen what are known as virtual consoles, for example. These are the login prompts that appear when you hold the Alt key (Ctrl and Alt if you're in a graphical session) and press one of F1 to F6. If you log in with your username and password through one of these, you'll find a fully functional terminal, which can be particularly handy if your X session has crashed and you need to restart it.
Consoles like these are still used by many system administrators and normal desktop users today. It takes less bandwidth to send text information over a network and it's easier to reconstruct than its graphical counterpart, which makes it ideal for remote administration.
This also means that the command line interface is more capable than a graphical environment, if you can cope with the learning curve.
By default, if you don't install the X Window System, most distributions will fall back to what's known as the Bourne Again Shell – Bash for short.
THE TERMINAL: Most Linux installations offer more than one way of accessing a terminal, and more than one terminal
Bash is the command line that most of us use, and it enables you to execute scripts and applications from anywhere on your system. If you don't mind the terse user interface of text-based systems like this, you can accomplish almost anything with the command line.
There are many different shells, and each is tailored for a specific type of user. You might want a programming-like interface (C-Shell), for example, or a super-powerful do-everything shell (Z Shell), but they all offer the same basic functionality, and to get the best out of them, you need to understand something about the Linux filesystem.
Level 4: The kernel and friends
We're moving into the lower levels of the Linux operating system, leaving behind the realm of user interaction, GUIs, command lines and relative simplicity.
The best way of explaining what goes on at this level is to go through the booting process, from the first thing you see when you turn your machine on up to the point where you can choose either a graphical session or work with the command line.
The init process is used by many distributions, including Debian and Fedora, to launch everything your operating system needs to function from the moment it leaves the safety of Grub. It's got a long history – the version used by Linux is often written as sysvinit, which shows its Unix System V heritage.
Everything from Samba to SSH will need to be started at some point, and init does this by trawling through a script for each process in a specific order, which is defined by a number at the beginning of the script's name. Which scripts are executed is dependent on something called the runlevel of your system, and this is different from one distribution to another, and especially between distros based on Fedora and Debian.
GUFW: You don't have to mess around with Iptables manually if you don't want to. There are many GUIs, like GUFW, that make the job much easier to manage
You can see this in action by using the init command to switch runlevels manually. On Debian-based systems, type init 1 for single-user mode, and init 5 for a full graphical environment. Older versions of Fedora, on the other hand, offer a non-networking console login at runlevel 2, network functionality at level 3, and a full-blown GUI at level 5. Whatever the runlevel, each process is run in turn as your system boots. This can create a bottleneck, especially when one process is waiting for network services to be enabled.
Each script needs to wait for the previous to complete before it can run, regardless of how many other system resources are being under-utilised.
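The ordering rule is simple enough to sketch in Python. We're assuming the conventional sysvinit naming here, which the article only hints at: start scripts begin with S and a two-digit number, while scripts beginning with K stop a service. The script names in the example are hypothetical.

```python
import re

# S = start script, K = kill script, then a two-digit sequence number.
SCRIPT_NAME = re.compile(r"^([SK])(\d{2})(.+)$")


def start_order(names):
    """Return service names from 'S' scripts, lowest number first."""
    parsed = []
    for name in names:
        match = SCRIPT_NAME.match(name)
        if match and match.group(1) == "S":
            parsed.append((int(match.group(2)), match.group(3)))
    return [service for _, service in sorted(parsed)]


if __name__ == "__main__":
    # Hypothetical contents of a runlevel directory.
    scripts = ["S20ssh", "K01gdm", "S10networking", "S20samba", "README"]
    for service in start_order(scripts):
        print("starting", service)   # strictly one after another
```

The loop at the end is the whole problem in miniature: nothing further down the list starts until the script before it has finished.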
If you think the init system seems fairly antiquated, you're not alone. Many people feel the same way, and several distributions are considering a switch from init to an alternative called Upstart. Most notably, the distribution that currently sponsors its development, Ubuntu, now uses Upstart as its default booting daemon, as does Fedora, and the Debian maintainers have announced their intention to switch for the next release of their distribution.
Upstart's great advantage is that it can run scripts asynchronously. This means that when one is waiting for a network connection to appear, another can be configuring hardware or initiating X. It will even use the same scripts as init, making the boot process quicker and more efficient, which is one of the main reasons why the latest versions of Ubuntu and Fedora boot so quickly in comparison with their older counterparts.
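The difference is easier to see in a toy model. The following Python sketch is not Upstart's job format or event system; it simply uses asyncio to show one job waiting for the network while the others get on with their work. The job names come from the examples in the text.

```python
import asyncio


async def job(name, log, wait_for=None, announce=None):
    """Run one boot job, optionally waiting for an event or signalling one."""
    if wait_for is not None:
        await wait_for.wait()        # this job pauses, the others carry on
    log.append(name)
    if announce is not None:
        announce.set()


async def boot():
    log = []
    network_up = asyncio.Event()
    await asyncio.gather(
        job("samba", log, wait_for=network_up),     # needs the network
        job("hardware", log),
        job("x", log),
        job("network", log, announce=network_up),
    )
    return log


if __name__ == "__main__":
    print(asyncio.run(boot()))
```

Even though Samba is first in the list, it's the last to finish, and neither the hardware setup nor X had to wait for it.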
We've now covered almost everything, with one large exception, the kernel itself. As we've already discussed, the kernel is responsible for managing and maintaining all system resources. It's at the heart of a running Linux system, and it's what makes Linux, Linux.
The kernel handles the filesystem, manages processes, loads drivers, and implements networking, userspace, memory management and storage. And surprisingly, for the normal user, there isn't that much to see.
Other than the elements displayed through the /proc and /sys filesystems, and the various processes that happen to be running in the background, most of these management systems are transparent. But there are some elements that are visible, and the most notable of these is the driver framework used to control your hardware.
Most distributions choose to package drivers as modules rather than as part of the monolithic kernel, and this means they can be loaded and unloaded as and when you need them. Which kernel modules are included and which aren't is dependent on your distribution. But if you've installed the kernel source code, you can usually build your own modules without too much difficulty, or install them through your distribution's package manager.
To see which modules are loaded, type lsmod to list all the modules currently plugged into the kernel. Next to each module, in the 'Used by' column, you'll see any other modules that depend on it. Like the software variety, these dependencies are a requirement for a module to work correctly.
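The same information is available from /proc/modules, which is where lsmod gets its list. Here's a minimal Python sketch that reads it, assuming the usual layout of name, size, reference count and then the modules that use this one. The function name parse_modules is ours, and the sample lines in the usage note are illustrative rather than copied from a real machine.

```python
import os


def parse_modules(text):
    """Parse text in the /proc/modules layout into a list of dictionaries."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        name, size, refcount, used_by = fields[:4]
        modules.append({
            "name": name,
            "size": int(size),
            "refcount": int(refcount),
            # "-" means nothing uses this module, otherwise a comma list
            "used_by": [] if used_by == "-" else [m for m in used_by.split(",") if m],
        })
    return modules


if __name__ == "__main__":
    if os.path.exists("/proc/modules"):
        with open("/proc/modules") as handle:
            for module in parse_modules(handle.read()):
                print(module["name"], module["used_by"])
```

Given a line such as "ip_tables 32768 1 iptable_filter, Live 0x0000000000000000", the sketch reports that ip_tables is in use by iptable_filter, so it can't be unloaded until that module has gone.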
Modules are kernel-specific, which is why your Nvidia driver might sometimes break if your distribution automatically updates the kernel. Nvidia's GLX module needs to be built against the current version of the kernel, which is what it attempts to do when you run the installer.
Fortunately, you can install more than one version of a module, and each will be automatically detected when you choose a new kernel from the Grub menu. This is because all the various modules are hidden within the /lib/modules directory, which itself should contain further directories named after kernel versions.
You can find which version of the kernel you're running by typing uname -a. Depending on your distribution, you can find many kernel driver modules in the /lib/modules/kernel_name/kernel/drivers directory, and this is sometimes useful if your hardware hasn't been detected properly.
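If you'd rather not type the path by hand, the kernel release can be looked up from Python and dropped into place. This is a small sketch: driver_directory is our own helper, and the path layout is the one described above, which may differ on your distribution.

```python
import platform
import posixpath


def driver_directory(release=None):
    """Build the conventional path to the drivers for a kernel release."""
    if release is None:
        release = platform.release()    # the value `uname -r` prints
    return posixpath.join("/lib/modules", release, "kernel", "drivers")


if __name__ == "__main__":
    print(driver_directory())
```

Passing a release string explicitly, such as driver_directory("2.6.32-21-generic"), lets you look at the modules for a kernel other than the one you've booted.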
If you know exactly which module your hardware should use, for example, you can load it with the modprobe command followed by the module name. You may find that your hardware works without any further configuration, but it might also be wise to check your system logs to make sure your hardware is being used as expected.
You can remove modules from memory with the rmmod command, which is useful if Nvidia's driver installer complains that a driver is already running.
One of the more unusual modules you'll find listed with lsmod is ip_tables. This is part of one of the most powerful aspects of Linux: its online security.
Iptables is the system used by the kernel to implement the Linux firewall. It can govern all packets coming into and out of your system using a complex series of rules. You can change the configuration in real time using the iptables command, but unless you're an expert, this can be difficult to understand, especially when your computer's security is at risk.
This is a reflection of the complexity within the networking stack, rather than Iptables itself, and is a necessary side effect of trying to handle several different layers of network data at the same time. But if you're used to other systems and you want to configure the firewall yourself, we'd recommend a GUI application like Firestarter, or Ubuntu's ufw, which was developed specifically to make Iptables easier to use.
When it's installed, you can quickly enable the firewall by typing ufw enable as root, for instance. You can allow or block specific ports with the ufw allow and ufw deny commands, or substitute the port with the name of the service you want to block.
You can find a list of service names for the system in the /etc/services file, and if you're really stuck, you can install an even more user-friendly front-end to Iptables by installing the gufw package.
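The services file is plain text, with a service name, a port/protocol pair and optional aliases on each line, so it's easy to search from a script. Here's a minimal Python sketch along those lines; parse_services is our own name, and it takes the text as an argument so you can try it without touching the real file.

```python
import os


def parse_services(text):
    """Map (service name, protocol) to a port from /etc/services-style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()        # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port, protocol = fields[1].split("/", 1)
        for name in [fields[0]] + fields[2:]:        # official name and aliases
            table[(name, protocol)] = int(port)
    return table


if __name__ == "__main__":
    if os.path.exists("/etc/services"):
        with open("/etc/services", errors="replace") as handle:
            services = parse_services(handle.read())
        print(services.get(("ssh", "tcp")))
```

Python's standard library can also answer the same question directly with socket.getservbyname("ssh", "tcp"), which consults the system's service database for you.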
It's not the end
We've uncovered all the essential aspects of the Linux operating system, and we hope you've now got a much better understanding of how it all hangs together. One of the best things about Linux is that you're free to experiment and change things. This is one of the best ways of learning about the system and what it's capable of, as long as you don't try it in a production environment!
Try a virtual machine running your favourite distribution instead, and if you need any help or clarification, try the LXF Forums at www.linuxformat.co.uk/forums.
First published in Linux Format Issue 131