QuickPath Interconnect vs HyperTransport
24th Nov 2010 | 10:00
High-speed bus standards from Intel and AMD compared
Due to your insatiable appetite for ever-faster PCs, the poor old Front Side Bus has had its chips. It just can't cope any more.
Since the FSB sits between the processor and the northbridge chip, all the data the processor works on has to pass through it. Hanging off the northbridge chip are the memory, hard drive, graphics card and just about everything else.
Increased frequencies and clever tech such as double and quad data rates have helped, but what with multi-core processors, insanely powerful graphics cards and everything else, it's just become flooded. Data rates have topped out at 12,800 MB/s at best, which is just not fast enough.
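If you fancy checking that 12,800 MB/s figure, here's a minimal Python sketch of the sums. It assumes a 64-bit wide FSB on a 400MHz clock running at quad data rate (the setup usually sold as 'FSB 1600'). Those numbers aren't given above, and the function name is made up purely for illustration.

```python
def fsb_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak FSB bandwidth in MB/s: clock x transfers per clock x bytes per transfer."""
    bytes_per_transfer = bus_width_bits // 8
    return clock_mhz * transfers_per_clock * bytes_per_transfer


# Assumed figures: 400MHz clock, quad data rate, 64-bit bus
print(fsb_bandwidth_mb_s(400, 4, 64))  # 12800
```

Drop the clock or the data rate and you can see how earlier, slower FSBs stacked up.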
So it's out with the old and in with the new. Two high-speed bus standards with more serious capabilities take the place of the old FSB: Intel's QuickPath Interconnect and AMD's HyperTransport.
Both involve processors with onboard memory controllers. No longer will the processor have to funnel data from the main memory down the same bus it's using to access the hard drive and send data to the graphics card.
Moving the memory controller off the FSB means the old limitations on memory speed have been removed too – it's no longer limited to the speed of the FSB, so dual or triple channel starts to make real sense.
Dual channel DDR3-2000, for example, can deliver a maximum (on paper) of 32GB/s, considerably more than the FSB would ever be able to handle. The second aspect of these new architectures is the bus itself, which sits between the processor and the northbridge chip (or what was the northbridge chip).
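The 32GB/s memory figure quoted above can be checked with a quick sketch too. This one assumes that DDR3-2000 means 2,000 million transfers per second and that each memory channel is 64 bits wide. Neither is spelled out in the text, and the function is only an illustration.

```python
def ddr_bandwidth_gb_s(mega_transfers_per_s, channels, channel_width_bits=64):
    """Peak memory bandwidth in GB/s, taking 1GB as 1000MB."""
    bytes_per_transfer = channel_width_bits // 8
    mb_per_s = mega_transfers_per_s * bytes_per_transfer * channels
    return mb_per_s / 1000


print(ddr_bandwidth_gb_s(2000, channels=2))  # 32.0 (dual channel)
print(ddr_bandwidth_gb_s(2000, channels=3))  # 48.0 (triple channel)
```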
Strictly speaking, neither QPI nor HT is a bus; both are point-to-point connections. A bus is a set of wires that can be used to connect a number of devices, while point-to-point, as you might have guessed, is just for connecting two devices. We'll still call them buses though, because each is a set of wires on the motherboard transferring data, which to most of us means it's a bus.
Both systems use similar methods and have similar sets of features, although the technical implementations are different.
AMD was first of the big two to replace the FSB with its HyperTransport, released in 2003 with its Athlon 64 processors coupled to Nvidia nForce chipsets. Bundled together it's called the Direct Connect Architecture. HyperTransport is an open standard and has its own .org site to prove it. It was first announced to the world in 2001 and was originally called the Lightning Data Transport (super hero names appear to be de rigueur for high-speed connections).
HyperTransport has separate data paths for input and output, enabling the processor to read and write at the same time. It also employs double data rate technology to squeeze in two transfers per clock, and has a variable bit-width between two and 32 bits.
Thus a 3.2GHz clock at double data rate across 32 bits gives us 3.2 x 2 x 32 = 204.8 gigabits per second in each direction. Divide by 8 to convert into bytes and you get 25.6GB/s each way, or the headline 51.2GB/s maximum figure when both directions are added together.
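Here's that calculation as a minimal Python sketch, in case you want to plug in other clock speeds and link widths. The function name and its parameters are hypothetical; the formula is just the clock x data rate x width sum described above.

```python
def ht_bandwidth_gb_s(clock_ghz, width_bits, transfers_per_clock=2):
    """Return (each-way, combined) peak bandwidth in GB/s for a HyperTransport link."""
    each_way = clock_ghz * transfers_per_clock * width_bits / 8
    return each_way, each_way * 2


each_way, combined = ht_bandwidth_gb_s(3.2, 32)
print(f"{each_way:.1f}GB/s each way, {combined:.1f}GB/s combined")  # 25.6 and 51.2

# The same function handles narrower, slower links
each_way, combined = ht_bandwidth_gb_s(2.0, 16)
print(f"{each_way:.1f}GB/s each way, {combined:.1f}GB/s combined")  # 8.0 and 16.0
```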
However, AMD processors use a 16-bit wide HyperTransport, and the best of them are running HyperTransport at 2GHz, which gives us 8GB/s in each direction, or 16GB/s in total. HyperTransport isn't limited to PCs. It's being developed to be used on much more complex server systems, and as a bus in high-speed routers. There are even HyperTransport expansion cards.
The idea of HyperTransport-enabled expansion cards is an interesting one. It might not mean much for desktops, but for servers it opens all sorts of possibilities. Version 3.0 even supports hot plugging.
Five years after AMD released its high-speed bus on the PC, Intel responded with QuickPath Interconnect. It was developed by a team that used to be part of Digital Equipment Corporation, and was originally called the Common System Interface.
As with HyperTransport, it's a point-to-point connection, and offers similar speeds. Of course, the two systems are completely incompatible with each other. QPI first appeared in 2008 on the X58 motherboard chipset and the Xeon 5500 processor. QPI currently features on the top end of Intel's Nehalem range.
Taken together with the integrated memory controller, it forms the QuickPath Architecture. The physical structure consists of two sets of 20 data links, one set going in one direction, and one set going in the other. As with HyperTransport, it employs low-voltage differential signalling, which uses two signals on separate lines to transmit data, reading the difference between the two.
It's a method used on PCs and many other electronic devices to combat electromagnetic interference. Because each link needs two wires, the 40 data links account for 80 pins; add in a pair of clock lines for each direction, and the total pin count is 84.
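The pin count is easier to follow as a sum. This minimal sketch assumes each direction carries its own clock as one differential pair alongside the 20 data links, which is how the 84 figure works out. The function and parameter names are made up for illustration.

```python
def qpi_pin_count(data_links=20, clock_links=1, directions=2, wires_per_link=2):
    """Count the signal pins of a full-width QPI connection.

    Every link is a differential pair (two wires), and each direction has
    its own set of data links plus its own clock.
    """
    links_per_direction = data_links + clock_links
    return links_per_direction * wires_per_link * directions


print(qpi_pin_count())               # 84
print(qpi_pin_count(clock_links=0))  # 80 (data pins only)
```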
QPI transmits data in flits (yes, that's what Intel calls them) of 80 bits, which takes two clock cycles. Each flit has 8 bits of error detection, 8 bits of header, leaving 64 bits of data. The bus is divided into quadrants, each with five data links, and each can be used independently.
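To see how those flit numbers hang together, here's a small sketch. It uses only the figures quoted above (80-bit flits, 20 data links, two transfers per clock); the function name and the dictionary keys are hypothetical.

```python
def flit_breakdown(flit_bits=80, crc_bits=8, header_bits=8,
                   data_links=20, transfers_per_clock=2):
    """Break a QPI flit down into payload, transfers and clock cycles."""
    payload_bits = flit_bits - crc_bits - header_bits
    transfers = flit_bits // data_links        # each transfer moves one bit per link
    clocks = transfers // transfers_per_clock  # double data rate: two transfers per clock
    return {
        "payload_bits": payload_bits,
        "transfers": transfers,
        "clocks": clocks,
        "payload_fraction": payload_bits / flit_bits,
    }


print(flit_breakdown())
```

With the default values, 64 of the 80 bits are payload, so a fifth of the raw link capacity goes on error detection and header bits.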
QPI runs off a multiplier of the base clock (133MHz for Intel), currently either x18 or x24, giving us speeds of 2.4 or 3.2GHz. All this boils down to theoretical maximum transfer rates of 19.2GB/s for the 2.4GHz version, and 25.6GB/s for the 3.2GHz.
The industry likes to quote speeds in gigatransfers per second (GT/s). This is double the clock frequency, because it takes the double data rate tech into account (two transfers per clock), thus 6.4GT/s equates to 3.2GHz. The bandwidth figures above count both directions together, so you might prefer to halve them.
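The route from multiplier to GB/s can be sketched in a few lines of Python. Two assumptions are baked in: the '133MHz' base clock is treated as 133.33MHz (400/3), and only 16 bits of each 20-bit transfer count as data, in line with the 64-out-of-80 flit payload. The function name is hypothetical.

```python
from fractions import Fraction

BASE_CLOCK_MHZ = Fraction(400, 3)  # assumed: the "133MHz" base clock is 133.33MHz


def qpi_figures(multiplier, payload_bits_per_transfer=16):
    """Return clock (GHz), GT/s, GB/s each way and GB/s combined for a QPI link."""
    clock_ghz = BASE_CLOCK_MHZ * multiplier / 1000
    gt_s = clock_ghz * 2                                  # two transfers per clock
    gb_s_each_way = gt_s * payload_bits_per_transfer / 8  # bits to bytes
    return tuple(float(x) for x in (clock_ghz, gt_s, gb_s_each_way, gb_s_each_way * 2))


for mult in (18, 24):
    ghz, gt, each_way, combined = qpi_figures(mult)
    print(f"x{mult}: {ghz}GHz, {gt}GT/s, {each_way}GB/s each way, {combined}GB/s combined")
```

The combined column is where the 19.2GB/s and 25.6GB/s headline figures come from; the each-way column is half that.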
Only Intel's top dogs get QPI, and even pretty tasty i7s have to make do with the DMI bus instead. (Direct Media Interface is another point-to-point bus, originally used to link the northbridge and southbridge chips.)
You'll find the 2.4GHz version of QPI on Intel i7-920 through to the 970. The faster version graces the i7-975X, 980X and 990X.
Can I overclock QPI?
You've just read about how fast these systems are, and you want to overclock them already? Typical. Basically, it's not such a good idea, since you're better off trying to squeeze more performance out of your system elsewhere.
Since QPI runs off the base clock (like everything else), adjusting the base clock will overclock QPI. It isn't advised though, as it also overclocks the Uncore (so called because it isn't part of the core).
The Uncore controls the L3 cache and the memory bus. This must be clocked at twice the memory clock, and doesn't react well to being played around with. Your best bet is sticking to the simple stuff, such as experimenting with the CPU multiplier.
The only time you might want to mess with QPI speeds is if your processor has a locked multiplier. This means you'll have to resort to playing with the base clock. Your best bet here is to drop the QPI multiplier to 18 if it isn't there already and then start increasing the base clock and vCore by little steps until it all stops working.
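To get a feel for why dropping the QPI multiplier buys you headroom, here's a rough sketch of how the clocks scale with the base clock. The x20 CPU multiplier is a hypothetical example of a locked chip, and the sketch says nothing about where your particular system will fall over.

```python
def derived_clocks_mhz(base_clock_mhz, cpu_multiplier, qpi_multiplier):
    """Everything scales with the base clock, so raising it lifts both figures."""
    return {
        "cpu": base_clock_mhz * cpu_multiplier,
        "qpi": base_clock_mhz * qpi_multiplier,
    }


# Hypothetical locked chip: x20 CPU multiplier, QPI multiplier dropped to x18
for base in range(133, 174, 10):
    clocks = derived_clocks_mhz(base, cpu_multiplier=20, qpi_multiplier=18)
    print(f"base {base}MHz -> CPU {clocks['cpu']}MHz, QPI {clocks['qpi']}MHz")
```

At x18 the QPI clock stays below the 3.2GHz of the faster stock setting for longer as the base clock climbs, which is the point of dropping the multiplier first.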
It's a similar story with HyperTransport, which can be a sensitive beast if you play with the multiplier or hike the base clock up too much. You'll get more tangible gains elsewhere.
The gains that were once had by playing with the FSB speed are gone. We know this won't stop you from trying though. As you might have spotted, the replacement of the FSB with something faster gives your graphics card more breathing space, too.
For a start it doesn't have to share the processor's attention with the main memory. The current top spec for connection is PCIe 2.0, which is another high-speed point-to-point system, and can theoretically shift 8GB/s in each direction through a x16 slot. The next step, PCIe 3.0, is due to be finalised in November, and will roughly double that to 16GB/s each way, or 32GB/s in total. We won't see cards until next year though.
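The PCIe numbers can be sanity-checked the same way. This sketch assumes a x16 slot, 5GT/s per lane with 8b/10b encoding for PCIe 2.0, and 8GT/s per lane with 128b/130b encoding for PCIe 3.0. None of those details appear above, and the function name is made up.

```python
def pcie_gb_s_each_way(gt_per_s, lanes, payload_bits, encoded_bits):
    """Peak one-way PCIe bandwidth in GB/s after encoding overhead."""
    return gt_per_s * lanes * payload_bits / encoded_bits / 8


gen2 = pcie_gb_s_each_way(5, 16, 8, 10)     # assumed PCIe 2.0 figures
gen3 = pcie_gb_s_each_way(8, 16, 128, 130)  # assumed PCIe 3.0 figures
print(f"PCIe 2.0 x16: {gen2:.2f}GB/s each way")
print(f"PCIe 3.0 x16: {gen3:.2f}GB/s each way, {gen3 * 2:.2f}GB/s both ways")
```

On these assumptions PCIe 3.0 lands a touch under 16GB/s each way, so a 32GB/s figure only holds if you add both directions together and round up.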
Which system is faster?
There are a lot of numbers thrown about, and to be frank they really don't matter. QPI is currently faster, but HyperTransport is ultimately more flexible. Both are extremely capable, and can quite happily handle a single-processor desktop PC with room to spare.
It's still fairly early days too, and both will get faster. Where this tech really flies is in multi-processor systems, where every processor can be linked directly to every other in a little net of connections, enabling high-speed non-uniform memory access (NUMA).
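That little net grows quickly as you add processors. Here's a minimal sketch assuming a fully connected layout, where every processor has a direct link to every other. Real multi-socket boards don't always manage that, and the function name is hypothetical.

```python
def full_mesh(processors):
    """Links needed so every processor connects directly to every other one."""
    links_per_processor = processors - 1
    total_links = processors * links_per_processor // 2  # each link is shared by two processors
    return links_per_processor, total_links


for n in (2, 4, 8):
    per_cpu, total = full_mesh(n)
    print(f"{n} processors: {per_cpu} links each, {total} links in total")
```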
These boys are so capable that on the average desktop you're laughing. That said, you can be sure that if there is bandwidth available, somebody will find a way to use it all up, and it all starts again…