How ARM took on the world - and won
2nd Mar 2012 | 12:00
The past, present and future of the mobile chip giant
The history of ARM
For as long as most of us care to remember, the battle for the mainstream processor market has been fought between two main protagonists, Intel and AMD, while semiconductor manufacturers like Sun and IBM traditionally concentrated on the more specialist Unix server and workstation markets.
Unnoticed by many, another company has risen to a position of dominance, with sales of chips based on its technology far surpassing those of Intel and AMD combined.
That pioneering company is ARM Holdings, and while it's not a name that's on everyone's lips in the same way that the 'big two' are, indications suggest that this company will continue to go from strength to strength.
Here we put ARM in the spotlight, investigating its past and heritage but, more importantly, also looking at the future for this unsung hero of the micro-electronics revolution.
While most of the semiconductor industry is, and always has been, based in California's Silicon Valley, it makes a refreshing change that ARM's headquarters are here in the UK in the so-called Silicon Fen area around Cambridge.
Despite ARM not being a household name, the company is no Johnny-come-lately. Indeed, if we trace ARM back to its roots we find a company that was a major force in the personal computing boom of the early '80s.
Back in 1980, the IBM PC was still in development and those personal computers that did exist were hugely expensive, costing the equivalent of several thousand pounds in today's terms. The UK had just made its mark on personal computing with the launch of the Sinclair ZX80. This was the first computer to sell for less than £100 - something that helped the UK to lead the world in home computer ownership throughout the 1980s.
One of the most influential companies to follow in Sinclair's footsteps was Acorn Computers. Just a year later, Acorn brought out the BBC Micro, which found its way into just about every school in the UK and went on to sell about one and a half million units.
Acorn's successor to the BBC Micro, the Archimedes, wasn't nearly as successful as a computer, but was far more influential because of Acorn's choice of processor. While the BBC Micro used an off-the-shelf 8-bit 6502 from MOS Technology, for the Archimedes Acorn decided to design its own high-performance 32-bit RISC (reduced instruction set computer) chip, which it called an Acorn RISC Machine or ARM processor.
In 1990, in a joint venture with Apple and VLSI Technology (a company that designed and manufactured custom and semi-custom chips), Acorn spun off its research division as a separate company called Advanced RISC Machines. In due course, this offshoot would evolve into the ARM Holdings we know today.
The RISC philosophy
Having used the term RISC to describe the ARM chip that powered the Archimedes, and because the same tag can be applied to today's ARM technology, it makes sense to start by investigating this approach to the design of a microprocessor. To do that, we need to begin with a brief history lesson.
Early 8-bit microprocessors like the Intel 8080 or the Motorola 6800 had only a few simple instructions. They didn't even have an instruction to multiply two integer numbers, for example, so this had to be done using long software routines involving multiple shifts and additions.
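As an illustration, the kind of shift-and-add multiplication routine those early chips had to run in software can be sketched in a few lines of Python (a modern stand-in for the hand-written assembly of the day):

```python
def shift_add_multiply(a, b):
    """Multiply two unsigned integers using only shifts and adds -
    the way early 8-bit processors, lacking a multiply instruction,
    had to do it in software."""
    result = 0
    while b:
        if b & 1:          # if the lowest bit of b is set...
            result += a    # ...add the current shifted value of a
        a <<= 1            # shift a left (doubling it)
        b >>= 1            # shift b right to examine the next bit
    return result

print(shift_add_multiply(13, 11))  # 143
```

On an 8-bit processor each of those shifts and adds was a separate instruction, which is why multiplication took tens of clock cycles rather than a handful.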
Working on the belief that hardware was fast but software was slow, subsequent microprocessor development involved giving processors more instructions to carry out ever more complicated functions. Called the CISC (complex instruction set computer) approach, this was the philosophy that Intel adopted and that, more or less, is still followed by today's latest Core i7 processors.
This move to increasingly complicated instructions came at a cost. Although the first microprocessors could execute most of their instructions in just a handful of clock cycles, as processors became more complicated, significantly more clock cycles were required.
In the early 1980s a radically different philosophy called RISC (reduced instruction set computer) was conceived. According to this model of computing, processors would have only a few simple instructions but, as a result of this simplicity, those instructions would be super-fast, most of them executing in a single clock cycle. So while much more of the work would have to be done in the software, an overall gain in performance would be achievable.
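To see why that trade-off can pay off, consider some purely illustrative cycle counts - the figures below are hypothetical, not measurements of any real processor:

```python
# Hypothetical figures: a CISC design executes fewer instructions,
# but each takes several clock cycles; a RISC design executes more
# instructions, but each completes in a single cycle.
cisc_instructions, cycles_per_cisc_instruction = 3, 4
risc_instructions, cycles_per_risc_instruction = 4, 1

cisc_total = cisc_instructions * cycles_per_cisc_instruction  # 12 cycles
risc_total = risc_instructions * cycles_per_risc_instruction  # 4 cycles
print(cisc_total, risc_total)
```

On these (invented) numbers the RISC chip finishes the job in a third of the clock cycles despite executing an extra instruction - the essence of the RISC argument.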
Many RISC-based processor families adopted this approach and exhibited impressive performance in their niche application of Unix-based servers and engineering workstations. Some of these families are now long gone but the fact that several - including IBM's POWER, Sun's SPARC and, of course, ARM - are giving the x86 architecture a run for its money rather suggests that less can indeed be more.
We really are talking about a minimalist approach here. In a classic RISC design, all arithmetic and logic operations are carried out on data stored in the processor's internal registers. The only instructions that access memory are a load instruction, which writes a value from memory into a processor register, and the store instruction that does the opposite.
A simple example will illustrate how this results in more instructions having to be executed. If you've ever tried your hand at programming in a high-level language like BASIC, you'll surely have written an instruction like C = A + B, which adds together the values in variables (memory locations) A and B and writes the result to a third variable called C.
With a CISC processor, this one instruction would become three, as shown in the following example for a typical accumulator-based processor:
LOAD A
ADD B
STORE C
In this example, the LOAD instruction writes the value from memory location A into the processor's accumulator (a special register used for arithmetic and logic operations), the ADD instruction adds the value from memory location B to the value in the accumulator, and the STORE instruction writes the value from the accumulator into memory location C.
In a RISC processor, the following four instructions would be needed. It's important to note that a RISC processor has several registers rather than a single accumulator, so these have to be referred to explicitly (as R1 and R2, for example) in the instructions.
LOAD A, R1
LOAD B, R2
ADD R1, R2
STORE R2, C
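The load/store model above is simple enough to mimic in a few lines of Python. This toy interpreter is purely illustrative - the mnemonics mirror the example above, not any real ARM instruction encoding:

```python
# A toy load/store machine. Only LOAD and STORE touch memory;
# ADD works purely on registers, as in a classic RISC design.
def run(program, memory):
    regs = {}
    for op, src, dst in program:
        if op == "LOAD":      # memory location -> register
            regs[dst] = memory[src]
        elif op == "ADD":     # add first register into second
            regs[dst] = regs[src] + regs[dst]
        elif op == "STORE":   # register -> memory location
            memory[dst] = regs[src]
    return memory

memory = {"A": 2, "B": 3}
program = [
    ("LOAD",  "A",  "R1"),
    ("LOAD",  "B",  "R2"),
    ("ADD",   "R1", "R2"),
    ("STORE", "R2", "C"),
]
print(run(program, memory)["C"])  # 5
```

Running the four-instruction sequence leaves the sum of A and B in memory location C, exactly as the assembly listing above describes.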
Because today's ARM processors and cores (we'll differentiate between these two terms later) are direct descendants of the original ARM chip that was designed for Acorn's Archimedes back in 1987, they are referred to as ARM architecture devices.
This is similar to today's Intel and AMD chips, which grew out of the Intel 8086 and its successors, and are therefore described as adhering to the x86 architecture. Here we'll look at what was unique about the ARM architecture when it appeared 25 years ago, how it has evolved over the years, why that evolution hasn't paralleled that of the x86 architecture, and what remains distinctive about it today.
Those who have followed the development of the PC over the years will no doubt be familiar with the concept of processor generations within the x86 architecture. In recent years the dividing lines have become somewhat less distinct, but in the early days we saw the first generation 8086 (or 8088) give way to the 80286 and subsequently to the 80386, 80486, the Pentium and so on.
The same is true of the ARM architecture, but with one important difference. In the realm of x86, new generations have, on a couple of occasions, been associated with the introduction of a new headline figure for the width of the data pathways - something that has a significant impact on performance.
So the x86 architecture launched with the 16-bit 8086, and subsequent developments have brought us 32 bits and eventually the 64-bit architecture we enjoy today. In total contrast to this, the ARM architecture made its debut at 32 bits and has yet to make the transition to 64 bits.
However, that shouldn't be interpreted as a lack of innovation, as a few facts and figures will reveal.
The first ARM processor saw the light of day in 1985, and while it was never used commercially, the ARM 2 that followed it, and which powered the first Archimedes PC, wasn't too dissimilar. Although it was based on a 32-bit architecture, it had a 26-bit address bus, which meant that it could address 64MB of memory. Although not a lot by today's standards, this was a huge amount in the mid-'80s.
The clock frequency of 8MHz also sounds rather pedestrian although, as a result of the RISC design, it was able to provide a speed of 4 MIPS (million instructions per second). To put this into context, the Intel 80386, which appeared on the scene a year later, was just a touch faster at 5 MIPS, but to achieve this it had to be clocked at 16MHz.
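These figures are easy to sanity-check with a little arithmetic - a 26-bit address bus gives 2^26 addressable bytes, and dividing each chip's MIPS rating by its clock speed gives instructions per clock cycle:

```python
# 26 address lines give 2^26 addressable bytes.
print(2**26 // 2**20)  # 64 (MB)

# Instructions per clock, from the MIPS and MHz figures quoted above.
arm2_ipc = 4_000_000 / 8_000_000     # ARM2: 4 MIPS at 8MHz
i386_ipc = 5_000_000 / 16_000_000    # 80386: 5 MIPS at 16MHz
print(arm2_ipc, i386_ipc)            # 0.5 versus 0.3125
```

In other words, clock for clock the ARM2 was getting through roughly half an instruction per cycle against the 386's third or so - the RISC philosophy paying off in practice.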
However, to see the advances the ARM architecture has enjoyed in a quarter century of development, we really need to draw some comparisons with today's offerings.
In terms of raw performance, today's top-end core is the Cortex-A15, which is based on the seventh generation architecture, known as ARMv7. Although the clock speed depends on the manufacturer (ARM Holdings itself doesn't manufacture any silicon), a figure of 2.5GHz is considered a likely ceiling, and at this speed it would clock up a performance of around 35,000 MIPS.
While this comes nowhere near the latest Intel Core i7, we're not comparing like with like. When expressed in terms of MIPS per core per MHz, although it still doesn't overtake the Core i7, the Cortex-A15 comes much closer - but even this is missing the point.
Indications are that the Cortex-A15 will consume less than a watt per core compared to tens of watts per core for the Core i7. In this respect it comes closer to the Intel Atom, but with much greater performance.
ARM: becoming an IP supplier
The world's first semiconductor companies designed and manufactured chips. Intel still adheres to this model, manufacturing its processors at Intel-owned fabs (industry jargon for fabrication plants), mainly in Arizona and Oregon.
Although AMD originally worked in the same way, it is now a fabless semiconductor company. This means that, although it designs and markets microprocessors, it subcontracts the manufacturing to a so-called silicon foundry.
ARM Holdings is even further removed from real-world products since, although it designs processors, it neither manufactures silicon chips nor markets ARM-branded hardware. Instead it sells, or more accurately licenses, intellectual property (IP), which allows other semiconductor companies to manufacture ARM-based hardware.
These chips might be microprocessors as we understand the word but, alternatively, they could be complicated chips that form the basis of, say, a mobile phone, in which the ability to execute software is just one element. So in buying the rights to manufacture an ARM-based product, exactly what does a semiconductor manufacturer receive?
We put this question to Ed Plowman, technical marketing manager at ARM's Media Processing Division. "Originally the predominant mode of delivery was via hard macros," he told us. "This is a definition of the chip's layout - what to deposit where, and how to connect it all together to make a working circuit".
Over time though, as the number of transistors in the chips increased, along with the number of processes by which each could be manufactured, this became impractical. Designs are now mainly supplied as a circuit description, from which the manufacturer creates a physical design to meet the needs of its own manufacturing processes.
But this data isn't supplied as a circuit diagram - it's provided in a hardware description language that gives a textual definition of how the building blocks connect together. The language used is RTL (register transfer level), which means, as Ed says, "the definition isn't at the transistor level, but defines how data flows between registers".
This isn't an area where one size fits all, though. ARM sometimes still chooses to deliver a hard macro, both to improve time to market and to provide optimised implementations for certain high-volume process technology nodes.
This is the way, for example, that Osprey (the dual-core Cortex-A9) is delivered, so pretty much all the manufacturer has to do is create the masks.
Processors and cores
To most technically-minded PC users, a processor is the large component that sits on the motherboard, and which forms the heart of the PC. A core, on the other hand, of which there might be two, four, six or eight, is a part of a processor that's responsible for executing instructions.
Within ARM, though, the two terms have a somewhat different meaning. A processor is pretty much what most people would expect - a design containing all the usual elements, including one or more cores, cache memories and the bus interface.
As such, it's a design that a semiconductor manufacturer can turn directly into a standard silicon component. So, for example, several companies including Toshiba, NEC and TI have ARM Cortex-A9 processors.
A core, on the other hand, is the heart of a microprocessor that semiconductor manufacturers can build into their own custom chip designs. That customised chip will often be much more than what most people would think of as a processor, and could provide a significant proportion of the functionality required in a particular device.
Referred to as a system on a chip (SoC) design, this type of chip minimises the number of components, which, in turn, keeps down both the cost and the size of the circuit board - both of which are essential for high-volume portable products such as smartphones.
A perfect example of the increasing amount of functionality that's being shoe-horned into a single chip is the Samsung Exynos 4210 SoC. Intended for smartphones, tablet PCs and netbooks, the chip features a 1.2GHz dual-core ARM Cortex-A9, plus just about everything that would be found as separate chips on the motherboard of a conventional PC.
For example, there's on-chip 3D graphics and audio hardware, 1080p video encode and decode, plus interfaces for the display, camera and keypad. There are also memory, USB, PCI Express (expansion card), SATA (hard disk) and memory card interfaces and, while the necessary RF (radio frequency) circuitry would have to be provided by a separate chip, there's support for the various communication channels including Wi-Fi, HSPA+ and LTE (3G and 4G mobile phone) and GPS.
We've seen that ARM processors and cores are used in hand-held and portable devices like smartphones, tablet PCs and netbooks, but this is only the tip of the iceberg.
As Ed Plowman put it, "You can walk around any branch of stores like Comet or Currys and be falling over ARM devices, but you won't know it." Included here are the likes of games consoles, personal media players, set-top boxes, internet radios, home automation systems, GPS receivers, ebook readers, TVs, DVD and Blu-ray players, digital cameras and home media servers.
Cheaper, less powerful chips are found in less likely sounding home products, including toys, cordless phones and even coffee makers. There's a good chance that your car could contain a fair few ARM-based devices too. They're used to drive dashboard displays, anti-lock braking, airbags and other safety-related systems, and for engine management.
Ed also mentioned healthcare products as a major growth area over the last five years, with products varying from remote patient monitoring systems to medical imaging scanners. While your desktop or laptop PC won't feature an ARM chip as its main processor, there's a good chance that there'll be one or more hidden away somewhere doing some unexpected but important job.
ARM devices are used extensively in hard disk and solid state drives. They also crop up in wireless keyboards, and are used as the driving force behind printers and networking devices like wireless router/access points.
Before you jump to the conclusion that ARM products are destined to play a supporting role in traditional computing platforms indefinitely, we really ought to mention the EU Mont-Blanc Project.
With the aim of providing high performance computing without the high energy consumption of today's top supercomputers, project partner the Barcelona Supercomputing Center is building a supercomputer from ARM-based Nvidia Tegra processors. As yet there's no indication of how many thousands of these chips will be used or what level of performance will be achieved, but this first ARM-based supercomputer will certainly break new ground in terms of energy efficiency.
While the initial goal is to provide two to five times the efficiency of x64-based supercomputers, this is expected to increase to as much as 10 times by 2014 and the ultimate project aim is a 10-30 fold improvement.
The future of ARM
The ARM architecture started out in a desktop PC, and a powerful one at that. Now, after years of being hidden away in all manner of consumer and industrial products, it has returned to the world of computing by powering the latest generation of portable platforms. So will ARM products once again power mainstream computers?
We put that question to Ed Plowman, who questioned our use of the term 'mainstream' and turned the question on its head by referring to the evolution of computing devices.
"First there was the desktop, and then the laptop, but the laptop was hindered by the lack of connectivity", he said. "All of this functionality plus connectivity can now be provided in a device that's always with us, but there's a limit to what you can do with a smartphone, which is why the tablet was developed. It would be wrong to think of a tablet as just a bigger phone, though; it provides a different way of presenting data and a different user experience. So ARM isn't moving into the mainstream but the mainstream is evolving to play into ARM's strengths such as low power consumption".
However we might define the mainstream, there seems to be little doubt that ARM devices are being called on to perform ever more processor-intensive tasks. The fact that this has been achieved with a 32-bit architecture when most of the competition at the top end has migrated to 64 bits is a testament to the ARM architecture, but surely there's a limit to how far 32-bit technology can be stretched.
At ARM's TechCon conference in Santa Barbara in October last year, Ed mentioned the company's forthcoming ARMv8 architecture, which will be 64-bit throughout. He wasn't prepared to say when 64-bit designs will become available, but Ed was enthusiastic about what it will offer.
"Availability of a 64-bit ARM core takes us into interesting areas. People think it's predominantly about taking us into an Intel-type world, but there are lots of other advantages, not least of which is the scope for vastly increasing use of memory."
Indeed, coping with huge volumes of data brings us to another new application for ARM devices as evidenced by the recent announcement by HP of the Redstone server. This new product line uses ARM Cortex-based processors produced by start-up company Calxeda.
However, Ed Plowman took issue with the suggestion that ARM's entry into the large scale server market was a radical change of direction. "People think that servers are all about high performance, but most applications are data centres where the main requirement is energy efficiency", he said.
"Electrical power is needed both to run the hardware and for cooling, and this cost overrides the equipment cost. With the continual requirement for larger and larger data centres, the number of MIPS per unit area and power efficiency are becoming increasingly important".
It's interesting to note, therefore, that an aim of HP's Project Moonshot, of which the ARM-based Redstone server is a first element, is to consume 89 per cent less energy and take up 94 per cent less space, while reducing overall costs by up to 63 per cent compared to traditional server systems.
ARM might be best known for its microprocessors and cores, but GPUs are becoming an increasingly important part of its portfolio, with the newly announced Mali-T658 representing the state of the art. Needless to say, this new product follows the familiar ARM formula of low power consumption for handheld and portable devices, without sacrificing performance.
Indeed, a recent press release refers to desktop-class graphics on mobile devices, and Ed went on to make the tantalising suggestion that it would allow mobile phones to be driven via a gesture interface.
Whether we'll ever see a desktop PC proudly displaying an 'ARM Inside' badge remains to be seen, and unless it does the company will probably never become a household name. Yet ARM Holdings appears in the FTSE 100 list of the UK's most influential companies, employs 1,700 people, turns over more than £400 million, and demonstrates that a British company can compete with the best that Silicon Valley has to offer.
With pundits continually talking down the British economy, perhaps you'll forgive us for relishing this success story for the British computer industry.
Learn to program ARM chips
Get to grips with the mbed rapid prototyping platform
Given that ARM can trace its ancestry back to the educational market, it's perhaps appropriate that the company has launched an initiative that represents a return to its roots.
In those early days of the BBC Micro, if you wanted to do something useful with the hardware, it often meant rolling up your sleeves and creating some BASIC code. In an attempt to get people programming again, ARM has launched the mbed rapid prototyping platform, which simplifies the process of developing code for ARM processors.
The name mbed comes from the concept of an embedded application - software that runs inside a device such as a microwave or a washing machine, often completely unbeknown to the user.
Although you'd have to buy a development board - a small circuit board containing an ARM processor and the necessary circuitry to interface it to external hardware and costing from around £35 - the associated C/C++ compiler is freely available at http://mbed.org.
You can find all the necessary project documentation on the mbed website, along with a large collection of ARM code, contributed by other mbed users, that you are free to adapt for use in your own software.
If you still need convincing that mbed is for you, you might be interested to know that an ARM-powered robot has set the world speed record for solving the Rubik's Cube, and other exciting amateur projects have involved racing robots around a track and controlling a remote weather station.
Spotlight on the Archimedes
How the BBC Micro's successor changed the face of computing
The Archimedes was never as well known as its hugely successful predecessor, the BBC Micro. Yet without it, the ARM architecture would probably never have come into being and today's smartphones and tablets might boast Intel rather than ARM technology.
So what was so special about this personal computer that it changed the face of the micro electronics industry? At first sight, the specification doesn't sound particularly special.
The entry-level model 305 had an 8MHz processor, 512KB of memory and a single floppy disk drive (a hard disk was an optional extra), and it cost £899. However, in comparison to most home computers of the day, which were little more than toys, cassette tape data storage and all, the Archimedes was much closer to what we'd think of as a proper PC.
Of course the IBM PC and the Apple Mac were already available, and while these were undoubtedly serious computers, they had price tags that, back in 1987, meant that they were first and foremost business tools and rarely found their way into the home.
What made the Archimedes stand out from anything else at the time with a similar price tag was its graphics. With a resolution of 640 x 256 (or 640 x 512 with the optional high resolution monitor) in 256 colours, the Archimedes stood head and shoulders above the competition.
Even the IBM PC could only boast 640 x 200 pixels in monochrome or 320 x 200 in four colours. And while Windows wouldn't become popular on the PC for another three years with the launch of Windows 3.0, the Archimedes could boast a graphical user interface from the outset.
This brings us back to that innovative 32-bit RISC ARM chip, without which the Archimedes' high resolution graphics and GUI just wouldn't have been feasible.