How Linux can help you protect your privacy online

15th Sep 2012 | 11:00

The Web is the greatest system for sharing information but how do you stop it sharing too much?

How to use Wireshark

You're not paranoid - they really are watching you. Criminals, web companies and governments all have a reason to spy on your online life, and the methods that they use are becoming increasingly sophisticated.

2011 was the most dangerous year to be an online citizen, particularly if you happened not to agree with everything your government said. 199 people around the world were arrested or detained because of content they posted online. Many are still languishing in jail.

The offending information ranged from exposes of environmental damage to religious instruction and criticism of unelected autocrats.

In addition, there has been a recent increase in the use of netizens' information by web companies. Privacy policies have been extended, and Twitter now sells the rights to users' data.

Some of the self-protection methods shown here will have an impact on how you can use a computer. For most people, implementing all of them would be over the top. What we're aiming to do here is show you who can find out what about you, and how to stop them.

What you do with that information is, of course, up to you. Whether you are concerned about the scale of information gathered by web companies, or you are hiding from a corrupt government, read on to find out how to keep your data yours.

You can find out just how much information you're revealing to the world using Wireshark. This tool captures all information passing through your network interfaces and allows you to search and filter for particular patterns. It takes information from your network interface, so any information displayed in it is visible to other (potentially malicious) people on the network.

Wireshark should be available through your package manager, or from wireshark.org. Once installed, you can start it with: sudo wireshark

You will get a message telling you that you've started it with super user privileges and this isn't a good way of doing it. If you plan on using the tool a lot, you should follow their guide on a better set-up, but for a one-off, you can ignore this.

Click on your network device in the interface list (probably eth0 for a wired network and wlan0 for a wireless one) to start a capture. As soon as you start using the network, the top part of the screen will fill with variously coloured packets. The tool has a filter to help you make some sense of this multi-coloured mess.

For example, you can keep a prying eye on duckduckgo.com searches using the filter: http.request.full_uri contains "duckduckgo.com?q" If you now do a search using http://duckduckgo.com, it will appear in the list, and the search term will be in the Info column.

A similar technique could be used on any of the popular search engines. You may not be concerned about people being able to read your search terms, but exactly the same technique can be used to pull usernames and passwords that are sent in plain text.

For example, most forums send passwords in plain text (because they're not a serious security risk, and secure certificates can be expensive). The www.linuxformat.com forums are set up in this way.

To sniff LinuxFormat.com passwords, fire up Wireshark and start a packet capture using the filter: http.request.uri contains "login.php" When you log in to www.linuxformat.com/forums/index.php (you will need to create an account if you don't already have one), the filter will capture the packet. The line-based text data will contain: Username=XXX&password=YYYY&login=Log+in
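
To show how little work an eavesdropper has left to do once the packet is captured, here is a minimal Python sketch that splits a form-encoded body like the one above into its fields. The sample string and its values are invented for illustration; they are not taken from a real capture.

```python
from urllib.parse import parse_qs

# Hypothetical body of a captured login request; the values are made up.
captured_body = "Username=alice&password=hunter2&login=Log+in"

def extract_fields(body):
    """Decode a form-encoded request body into a plain dict."""
    # parse_qs returns a list of values per key; keep the first of each.
    return {key: values[0] for key, values in parse_qs(body).items()}

fields = extract_fields(captured_body)
print(fields["Username"], fields["password"])
```

Nothing here needs to be cracked or guessed: the password is simply one of the decoded values.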

How many computers are you sharing this information with? Depending on your network set-up, probably every other computer on the LAN or wireless network.

On top of these, the information passes through every computer that sits on the route between you and the server you're communicating with. To discover what these are, use traceroute to map the path the packets take.

For example, traceroute www.google.com

If your computer's behind a firewall, you may find that this just outputs a series of asterisks. In this case, you can use a web-based traceroute such as the ones indexed at www.traceroute.org. This list is a little out of date, and not all of the servers are still hosting traceroute, but you should be able to find one that works in your area.
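
If you want to keep a list of the machines on the route, something like the following Python sketch will pull the hops that replied out of saved traceroute output. The sample text is invented in the style of Linux traceroute output (the addresses come from ranges reserved for documentation), and the regular expression is an assumption about that layout rather than a complete parser.

```python
import re

# Invented sample in the style of Linux traceroute output.
SAMPLE = """\
traceroute to www.example.com (203.0.113.80), 30 hops max, 60 byte packets
 1  router.lan (192.168.1.1)  1.2 ms  1.1 ms  1.0 ms
 2  * * *
 3  core1.isp.example (198.51.100.7)  12.3 ms  12.1 ms  12.0 ms
 4  www.example.com (203.0.113.80)  20.4 ms  20.2 ms  20.1 ms
"""

# Hop number, then a hostname, then the address in brackets.
HOP = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(([^)]+)\)")

def list_hops(output):
    """Return (hop number, hostname, address) for every hop that replied."""
    hops = []
    for line in output.splitlines():
        match = HOP.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2), match.group(3)))
    return hops

for number, name, address in list_hops(SAMPLE):
    print(number, name, address)
```

Hops that answered only with asterisks are skipped, so a gap in the numbering marks a machine that is on the route but did not identify itself.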

Do you know who's running these computers? Or who has remote access to them? Do you want these people to be able to see everything you do online? If you use services with unsecured passwords (and there's no reason you shouldn't, as long as you understand the implications), then it's important not to use the same password for a secure service.

The most basic piece of the web privacy puzzle is the Secure Sockets Layer (SSL). This rather obscure-sounding protocol is a way of creating an encrypted channel between an application running on your computer and an application running on another computer.

For each insecure network protocol, there's a secure one that does the same basic task, but through an SSL channel.

Any time you use an insecure protocol, an eavesdropper can read what you send, but if you use a secured one, only the intended recipient can see the data.

For web browsing, it's HTTPS that's important. As we saw before, many computers can read what we send in HTTP, but if we perform the same test again, but using duckduckgo's secure web page - https://www.duckduckgo.com (note the s) - then you will find that the information does not appear in Wireshark.

Some web browsers show a padlock when connected to a secure website, but this can be spoofed easily using favicons. If you're unsure, click on the icon. A legitimate padlock will open a pop-up telling you about the security on the page.

Of course, this ensures only that the information can't be read as it's being transmitted between your computer and the server. Once there, the organisation running the server could pass it on to third parties, or transmit it insecurely between their data centres. Once you send information, you lose control of it.

Before hitting Submit, always ask yourself, do you trust the organisation receiving the data? If not, don't send it.

HTTPS is a great way to keep your web browsing private. However, because of the way it has been bolted on top of HTTP, it isn't always easy to make sure you use it. For example, if you use https://www.google.com to search for 'wikipedia', it will direct you to the HTTP version of the encyclopaedia, not the HTTPS version.

The Electronic Frontier Foundation (EFF), a non-profit dedicated to defending digital rights, has developed an extension for Firefox that forces browsers to use HTTPS wherever it's available. A Chrome version is currently in beta. Get this from https://www.eff.org/https-everywhere to keep your web usage away from eavesdroppers.

Like all forms of encryption, SSL has a weakness, and that's the keys which are stored in certificates. Just as a hacker can easily get in to your accounts if they know your password, they can easily eavesdrop on SSL encrypted data - or spoof it - if they can trick your computer into using their certificates.

The main point here is that they are stored on the computer, not in your memory like passwords. If an attacker can put files on your system, they can break SSL encryption. You are at particular risk when using a computer you haven't personally installed the operating system on, such as a work machine or at an internet café.

You should be able to view the current certificates and authorities in your browser's security settings, but it isn't always easy to identify things that shouldn't be there. Here, live distros come to the rescue, since you can carry a trusted operating system with you and use that whenever you are at a computer of dubious provenance.
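
Browsers are not the only programs that keep a list of trusted authorities. As a rough illustration, the Python sketch below prints the authorities that Python's own ssl module has loaded on the machine it runs on. This is a separate list from the one in your browser, and on some systems it can come back empty because certificates kept in a directory are loaded only when needed. The helper name authority_names is ours, not part of any library.

```python
import ssl

def authority_names(ca_certs):
    """Pick a readable name out of each entry from SSLContext.get_ca_certs()."""
    names = []
    for cert in ca_certs:
        # 'subject' is a tuple of relative names, each a tuple of (key, value) pairs.
        fields = {key: value for rdn in cert.get("subject", ()) for key, value in rdn}
        names.append(fields.get("organizationName") or fields.get("commonName", "?"))
    return sorted(names)

context = ssl.create_default_context()
# A default client context refuses certificates it cannot verify.
print("verification required:", context.verify_mode == ssl.CERT_REQUIRED)
for name in authority_names(context.get_ca_certs()):
    print(name)
```

Any name in the output that you can't account for is worth investigating, although a long list of unfamiliar but legitimate authorities is normal.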

How to stop companies tracking you

Using SSL will keep your data safe from eavesdroppers, but what if the companies that you're communicating with are spying on you?

Google, Facebook, Twitter and others have built business models out of providing users with a free service in return for information about you. This information can then be used to target advertisements at you.

Twitter has even gone a step further and sold users' tweets to market researchers. Some people may consider this a fair trade, but privacy campaigners are becoming increasingly concerned about the sheer quantity of data these companies are holding about us. And this data goes way beyond what we voluntarily hand over to them.

Both Google and Facebook have established relationships with literally millions of other websites to help them track your movements around the web using cookies. These may sound like tasty treats, but are actually pieces of information stored on your computer to help sites identify you when your browser reaches them.

To find out just how much these companies are tracking us, we can use Wireshark to monitor our network connection and watch for the cookie data being sent back.

Start Wireshark and capture on your main network interface. In the filter box enter http.cookie

This will now show only packets that relate to cookies that are being sent to web servers. To display a little more of the information that is being acquired, go to the middle pane and click on the arrow next to Hypertext Transfer Protocol.

There are two sections in here that allow the web company to track us: the host and the referrer. Right-click on each of these and select Apply As Column. This will then add these fields to the main view.

Together, these two fields allow the host (the organisation receiving the cookie) to monitor your activity on the referrer (the site you were visiting at the time). In addition to this, the host uses a unique ID to track your activity between sessions. Google uses its advertising services to monitor what we do, whereas Facebook uses its Like buttons.

There's no way of knowing exactly what these companies are doing with the data they collect - we can see only what they're receiving.

Fortunately, most browsers allow you to control cookies. Depending on your personal feelings, you may choose to limit cookies to certain websites (where they can be useful to remember preferences), or block them completely.

If you use Firefox, go to Edit > Preferences > Privacy, and change Firefox Will to Use Custom Settings for History. If you untick Accept Cookies From Sites, Firefox will not store any cookies.

To do the same in Chromium go to Preferences (the spanner by the address bar) > Under the Bonnet and change Cookies to Block Sites From Setting Any Data. In Konqueror, this can be done through Settings > Configure Konqueror > Cookies and unchecking Enable Cookies. For lightweight KDE users, it can be done in Rekonq by going to Settings (the spanner by the address bar) > Network > Cookies and unchecking Enable Cookies.

As well as allowing you to completely block cookies, both Firefox and Chromium give you the option of blocking third-party cookies (In Konqueror and Rekonq, this is Only Accept Cookies From Originating Server). This means they block cookies from domains other than that of the current website. If you do this, websites can store data about you, such as your preferences, and can track your movements within the site, but other sites won't be able to follow your movements once you leave the domain. This will stop companies from tracking your movements across the web.

If you set this up, then run cookie tracking in Wireshark, as was done above, you will see that the referrer and the host are always the same domain. For many users, this will be a happy medium: cookies can still serve their original purpose of letting sites recognise returning visitors, but organisations are blocked from following your online movements.
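
The comparison between the two columns can be sketched in a few lines of Python. This is a simplified illustration, not how any browser actually decides: it treats the last two labels of a hostname as the "site", which gets suffixes such as .co.uk wrong (real implementations consult the public suffix list). The hostnames are invented.

```python
from urllib.parse import urlsplit

def site_of(hostname):
    """Crude 'site' for a hostname: its last two labels.

    Assumption: this ignores suffixes such as .co.uk, which a real
    implementation would handle with the public suffix list.
    """
    return ".".join(hostname.lower().rstrip(".").split(".")[-2:])

def is_third_party(host, referer):
    """True when the request's Host is on a different site from the Referer page."""
    referer_host = urlsplit(referer).hostname or ""
    return site_of(host) != site_of(referer_host)

# Invented Host / Referer pairs, as they might appear in the two columns.
print(is_third_party("ads.tracker.example", "http://www.news.example/story"))
print(is_third_party("static.news.example", "http://www.news.example/story"))
```

With third-party cookies blocked, every row in the capture should be of the second kind.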

Zombie cookies

Cookies aren't the only way that websites can track you. Even if you have browser cookies disabled, sites can still store tracking information on your computer using Locally Shared Objects (LSOs). These function exactly like cookies, except that they're accessed through Flash rather than directly through your browser.

To view and control what websites are using these, go to Macromedia's Website Storage Settings Panel.

Webmasters intent on tracking you can use a combination of techniques to create zombie cookies. These store the same information in more than one place so that when you destroy one, they regenerate using the others. For example, if you delete all browser cookies, the website can recreate the cookie from an LSO, and vice versa. As long as one of these remains, all the others can regenerate.
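
The regeneration logic is simple enough to model in a few lines. The Python sketch below is a toy simulation, with two dictionaries standing in for the browser's cookie store and Flash's LSO store; the store names and IDs are invented, and no real site's code is being reproduced.

```python
def visit(stores, fresh_id):
    """Simulate one page load by a site that mirrors its ID across every store."""
    # Reuse any copy that survived; only a clean sweep yields a new ID.
    surviving = [store["uid"] for store in stores.values() if "uid" in store]
    uid = surviving[0] if surviving else fresh_id
    for store in stores.values():
        store["uid"] = uid              # rewrite every copy, including deleted ones
    return uid

stores = {"browser_cookie": {}, "flash_lso": {}}
first = visit(stores, "id-1001")        # first visit: an ID is assigned
stores["browser_cookie"].clear()        # the user deletes browser cookies
second = visit(stores, "id-2002")       # the LSO copy resurrects the old ID
for store in stores.values():
    store.clear()                       # every store is wiped at once
third = visit(stores, "id-3003")        # only now does the site see a stranger
print(first, second, third)
```

The more stores a site writes to, the less likely it is that you will clear all of them in one go.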

Samy Kamkar has taken this to the extreme at samy.pl/evercookie, where he uses 12 different methods to resurrect the data! We think running the NoScript extension for Firefox should prevent this type of cookie from working, but it also disables the method of testing it! We found that neither Private mode in Firefox nor Incognito mode in Chromium was able to prevent this.

If you need to be sure that your web browsing isn't being tracked across sessions, the best solution is to use a non-persistent system. That is, a system that doesn't carry any information over from one session to the next. You can still be tracked during a browsing session, but not between them.

For Linux users, the most obvious option is a live DVD. This doesn't have to be a physical disc running live - an ISO running in a virtual machine will do the job. This means that all data that the websites can use to track you is reset each time you restart the virtual machine. You can also run more than one virtual machine simultaneously to prevent anyone linking two sessions.

If it ever comes into being, a live version of Boot To Gecko would be a particularly convenient way to do this, but this is still in development.

Beating digital fingerprinting

There is one, slightly more devious, technique that websites can use to identify you. This is by amalgamating all information about the capabilities of your browser and system into a digital fingerprint.

Because of the amount of information that your browser will, if asked, reveal about you, this fingerprint can often be used to uniquely identify you to a site. Once again, the EFF is active in this area, and hosts a website to help you understand what your fingerprint is.

Point your browser to panopticlick.eff.org to see how unique you are. At the time of writing, more than two million people had used the site to check their browsers, and we still found that most of our machines could be uniquely identified. This means any website could track us even without cookies, LSOs or any of the other storage techniques.
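
To get a feel for how a handful of harmless-looking settings becomes an identifier, consider the Python sketch below. It is our own illustration, not the EFF's method: it hashes a set of invented browser attributes into a short ID, and works out how many bits of identifying information it takes to single out one browser among a given number.

```python
import hashlib
import math

def fingerprint(attributes):
    """Hash a browser's reported attributes into one short identifier."""
    # Sort the keys so the same attributes always give the same digest.
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

def identifying_bits(one_in_n):
    """Bits of identifying information when one browser in n shares a value."""
    return math.log2(one_in_n)

# Invented attribute values, for illustration only.
browser = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0",
    "screen": "1920x1080x24",
    "timezone": "UTC+1",
    "fonts": "DejaVu Sans,Liberation Serif,Ubuntu",
}
print(fingerprint(browser))
print(round(identifying_bits(2_000_000), 1))
```

Change any one attribute and the identifier changes completely, which is why an unusual font list or plug-in set makes a browser so easy to pick out.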

At the moment, this is a theoretical vulnerability, and there have been no known cases of browser fingerprinting in the wild. If you're concerned about being tracked this way, the best way to prevent it is to stop scripts from running. This reduces massively the amount of information that a website can use to form the fingerprint.

The NoScript extension for Firefox provides an easy way to control which scripts run on a site. However, this will severely limit the function of many interactive websites. Web pages are made up of a number of different elements that your browser reassembles to make a single document. These elements may come from many different places, organisations and servers.

Any of these could contain some degree of monitoring using a technique called web bugging (also known as web beacons or pixel tags). These use images to generate HTTP requests that log your activities with a different server to one hosting the website. These potentially could be able to track you using browser fingerprinting, but they're also used more widely. They're not restricted to web pages, and can be used in any HTML document.

Most commonly, they're used by spammers to identify active email addresses. If you open an email containing one of these images, the spammer will be able to tell that the address is being checked, and that its owner can be persuaded to open spam emails. Fortunately, most email clients and web mail providers disable image loading by default.

Locating

When you connect to the internet, your service provider assigns you an IP (Internet Protocol) address. This tells web servers and other computers you communicate with where to send the information. Any computer you interact with online can tell which IP address you use.

From this, they can find out some information, mainly your service provider and approximate location. Check out www.hostip.info to find out what you're transmitting to the world. Since IP addresses change periodically, web servers can't get closer to you than this. However, government agencies can force your service provider to reveal which subscriber was allocated to which IP address at what time. In short, they can link an online act with a physical computer.

For example, in April 2004 Shi Tao, a Chinese journalist, used Yahoo web mail to email the Asia Democracy Foundation with details of the Chinese Government's attempts to stifle news reports on the 15th anniversary of the Tiananmen Square massacre. His government got the IP address he used from Yahoo and, since the ISP was state-controlled, could find out exactly where the email was sent from. In November, he was arrested, and in March 2005 he was sentenced to ten years in prison.

To protect yourself from this level of scrutiny, you need to make sure that there's no link between you (and your IP) and the server you're communicating with. Simply encrypting your communication isn't enough, because it still allows the server to know who sent it - it just prevents eavesdroppers.

Tor

You can achieve the necessary privacy by passing your data through a series of encrypted relays. This technique is called onion routing, and has been implemented by the Tor Project.

Step one: Communicate with the Tor directory server, which will reply with three random relays.

Step two: Encrypt your data with keys for each of the relays.

Step three: Send this encrypted package to the first relay. This server knows your IP address, but doesn't know what you're doing, since your data is encrypted with the keys to the other relays. The only piece of information they can access is the location of the second relay.

Step four: The first relay sends the encrypted package to the second relay, which can decrypt only the location of the third relay. This computer knows the location of the other two relays, but not your IP or what you are trying to communicate with.

Step five: The second relay sends the encrypted package to the third. This computer can decrypt your message and send it out of the Tor network on to the intended recipient. The third relay can see the final recipient of your data (and if you're using an unencrypted protocol, the actual data), and the location of the second relay, but it doesn't know your identity.

Step six: The recipient gets your request as though it had come from the third relay. They don't know your identity, or even that there is someone hidden behind the third relay. They respond to the third relay.

Step seven: The third relay passes the information back to you through the Tor network in the same manner as you sent it. No one on the network knows both the identity of the original sender, and the recipient.
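
The layering in the steps above can be sketched in Python. This is a toy model only: the cipher is a simple XOR keystream that offers no real security and is not what Tor uses, and the relay names and keys are invented. It is here purely to show how each relay strips one layer and learns nothing but the next hop.

```python
import hashlib

def toy_cipher(key, data):
    """XOR data with a keystream derived from key.

    Toy only: NOT secure and not Tor's cipher. Applying it twice with
    the same key gives back the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def build_onion(message, destination, route):
    """Wrap message in one layer per relay; route lists (address, key), first relay first."""
    next_hop, payload = destination, message
    for address, key in reversed(route):
        # Each layer hides the next hop and everything inside it.
        payload = toy_cipher(key, next_hop.encode() + b"|" + payload)
        next_hop = address
    return next_hop, payload            # where to send it, and the wrapped packet

def peel(key, packet):
    """What one relay does: strip its own layer and read only the next hop."""
    next_hop, _, inner = toy_cipher(key, packet).partition(b"|")
    return next_hop.decode(), inner

# Hypothetical relays and keys standing in for the three chosen in step one.
route = [("relay1.example", b"key-one"),
         ("relay2.example", b"key-two"),
         ("relay3.example", b"key-three")]
hop, packet = build_onion(b"GET /index.html", "www.example.com", route)
for address, key in route:
    hop, packet = peel(key, packet)
    print(address, "forwards to", hop)
print(packet)
```

Only the last relay ever holds the readable request, and only the first ever sees where the packet came from.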

However, Tor is an anonymisation system, not an encryption system. While the data is encrypted as it passes through the relays, once it leaves the network, it's no more or less secure than any other information on the internet. To keep your data private, you need to use the same precautions you would if you were not using Tor - ie use one of the encrypted protocols listed on the right of table one.

Sounds complicated? Fortunately, the Tor Project has put all the necessary tools in a single package with a secure version of Firefox. It's on the disk, or available from www.torproject.org - just unzip the file and run start-tor-browser. It will connect to the network and open a secure browser.

If you are on the run (in any sense of the phrase), you can browse securely via Orbot for Android or Covert Browser for iOS. There are potential statistical attacks against the network. For example, if an organisation can see all the data going into the network, and all the data coming out of it, the timing and quantities of packets may reveal which user sent what. However, due to the worldwide nature of the system, this would require co-ordinated and systematic monitoring across many countries.

You may think that using an internet account not linked to a physical location - such as mobile or satellite phones - will improve this situation, but it does the opposite. Mobile phone signals can be triangulated, and many satellite phones include the GPS co-ordinates of the phone in the connection to the service provider.

Polish firm TS2 sells a product that can pinpoint a satellite phone user: www.ts2.pl/en/News/1/151. It's possible that technology similar to this was used by the Syrian regime to target and kill journalists in Homs earlier this year.

Some regimes, most notably in China, appear to have taken steps to stop their citizens accessing Tor. The simplest way of doing this is to download a list of Tor relays and stop all connections to those machines.

To allow users to bypass this, Tor has introduced a series of bridges. These are routes into the network that aren't published. A game of cat and mouse has now begun between the Tor Project and organisations trying to block access to the anonymisation service.

Like many community-based projects, Tor needs volunteers. However, unusually for a free software project, programmers are not the most needed people. Running a relay or bridge will help keep people anonymous.

Translators and people working in advocacy are also in demand. To see how you can help people maintain both their privacy and freedom of speech, check out www.torproject.org/getinvolved/volunteer

Disk encryption

If you're interested in privacy, then the chances are you use full disk encryption. If you don't then you may wish to consider it. It's easy to set up, usually just a tickbox during the distro install, and on a modern system the performance penalty should be minor for most purposes.

Note that partial disk encryption is considerably less secure - in issue 154 we showed one of many methods for circumventing it. Modern encryption methods using algorithms such as AES are, for practical purposes, unbreakable without the passphrase, provided a sufficiently long key is used. AES-128 should be considered a minimum; if the CIA is on your tail, then AES-256 is better.

There are a few methods a government agency can use to acquire this passphrase. Unfortunately, the easiest (for them) is torture. The second easiest is to try to guess your passphrase using a dictionary attack. However, let's assume that you've picked an unguessable passphrase, and managed to jump out of the window and flee when the knock at the door came. Your secrets will be safe, right?
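
How unguessable is unguessable? A rough way to judge is to count the possibilities an attacker would have to try. The Python sketch below does this for passphrases made of randomly chosen words. The figures in it are assumptions for illustration: a 7,776-word list (the size used by Diceware) and an attacker managing a billion guesses per second.

```python
import math

def passphrase_bits(words, wordlist_size):
    """Entropy in bits of a passphrase built from randomly chosen words."""
    return words * math.log2(wordlist_size)

def years_to_search(bits, guesses_per_second):
    """Average time to hit the passphrase, trying half the possibilities."""
    seconds = 2 ** (bits - 1) / guesses_per_second
    return seconds / (365.25 * 24 * 3600)

# Assumed figures: a 7,776-word list and a billion guesses per second.
for count in (3, 5, 7):
    bits = passphrase_bits(count, 7776)
    years = years_to_search(bits, 1e9)
    print(count, "words:", round(bits, 1), "bits,", f"{years:.3g} years")
```

The sums only hold if the words really are picked at random; a favourite quotation or song lyric is exactly what a dictionary attack tries first.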

Well, not quite. When you're using an encrypted drive, the computer stores the decryption keys in the memory. If they smash through the door just in time to see the computer shutting down, they could put a memory scrubbing tool in your computer and restart it.

Contrary to popular belief, the RAM in your computer isn't wiped when it's powered off, just very soon afterwards. Researchers at Princeton were able to steal encryption keys from the memory of restarted computers. The tools they created to do this are available from https://citp.princeton.edu/research/memory.

If you only locked or suspended your computer, then the situation's even worse. In these cases, the spooks will have time to freeze the memory before rebooting it (or transferring to a computer set up to scrub the memory).

At room temperature, memory typically becomes unusable after a few seconds. If it's frozen to around -50˚C (which is achievable using cheap aerosols), that time increases to several minutes. To avoid this style of attack, you need to stop them being able to access usable memory.

Don't leave your machine locked or suspended. If you have valuable information on it, turn it off. And prevent booting from devices other than the hard drive without a password. This will stop them booting straight into a tool such as the Princeton researchers' USB scrubbing tool. By the time they've managed to bypass your BIOS's security, the memory will be useless.

Using longer encryption keys will also help, since slight errors often creep in during the scrubbing process. The longer your key, the more of these errors it's likely to pick up.

If the men in black are really on your tail, then you could consider running your laptop without its battery. This means that you have only to pull out the power cable before running away.
