Following on from my song and dance with Nvidia's team, I decided it was time for a proper look at Nvidia's Kepler/GK104-based GTX680. The response from the online community has been one of awe and praise. The GTX680 brings value to gamers shopping in the HD7970's price range, and will let enthusiasts running SLI GTX460 rigs consider swapping their two cards for a single card with the same performance and extra benefits.
Firstly, something needs to be said about the design and execution of both companies' product refreshes: it's like Nvidia and AMD have switched sides without telling the public. AMD's GCN moves from the clunky VLIW4 architecture to a more efficient design, bringing the Radeon HD7970 abreast of Fermi. In fact, GCN could be considered a better version of Fermi, as it incorporates a more efficient hardware scheduler that sorts through code dependencies and executes the instructions other work is waiting on first. VLIW4, by contrast, worked through things one by one, letting hardware sit idle until the current thread completed; GCN fixes that by running the code that other code depends on first.
GK104 is an improvement on Fermi as well. It takes the GF104 design found in the GTX460, doubles or near-quadruples just about everything you find there, and chops out a great deal of Fermi's hardware scheduling logic. In Fermi, code went through the Multi-Port Post-Decode Queue, which took the threads the application required and fed them into the dependency checker. If there were no dependencies, the code was sent off to be re-ordered and worked through the rest of the pipeline to completion. If it was dependent, that code was queued to run after the code it depended on had finished, and was placed in the register scoreboard for later use.
GK104 does things differently, chopping out those unnecessary bits in favour of a shorter hardware stage that sorts out dependencies, runs the depended-upon code first and sends the independent code down the pipeline alongside it. Does that make it an inferior chip? Not at all – the power savings from the smaller CUDA cores more than make up for the loss of the hardware schedulers, and while DirectCompute performance does suffer, this is first and foremost a gaming graphics card. If you want better GPU-accelerated application performance, you're better off buying a Quadro or Tesla card suited to the particular workload. Another benefit is the power used per clock cycle (from the cache to the final issue stage on the graph): a Kepler card needs roughly half the power per cycle that Fermi does.
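The dependency-first idea both schedulers share can be sketched in a few lines of Python. This is a toy model, not Nvidia's actual hardware logic; the instruction names and `deps` sets are invented purely for illustration:

```python
# Toy model of dependency-first issue: anything whose inputs are ready
# goes first; dependent code waits until its producers have finished.
def schedule(instructions):
    """instructions: list of (name, deps) pairs, where deps is the set of
    instruction names that must complete before this one may issue."""
    done, order = set(), []
    pending = list(instructions)
    while pending:
        # Issue everything whose dependencies are already satisfied
        ready = [(n, d) for n, d in pending if d <= done]
        if not ready:
            raise ValueError("circular dependency - nothing can issue")
        for name, _ in ready:
            order.append(name)
            done.add(name)
        pending = [(n, d) for n, d in pending if n not in done]
    return order

# 'add' appears first in program order, but it needs 'mul', so 'mul' issues first
print(schedule([("add", {"mul"}), ("mul", set()), ("store", {"add"})]))
# → ['mul', 'add', 'store']
```

The difference between the two chips is where this bookkeeping lives: Fermi did it in dedicated hardware, while Kepler leaves most of it to the compiler and keeps only a slimmed-down hardware stage.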
[pullquote]Kepler takes it one step forward and jumps the gun, going straight to the redesign process instead of Nvidia’s typical re-hash and rebrand strategy.[/pullquote]What we’re left with is, really, a cleaner design that essentially throws three GTX460s in together and crams it all onto a single board. While GCN is a better version of the Fermi design, Kepler takes it one step forward and jumps the gun, going straight to the redesign process instead of Nvidia’s typical re-hash and rebrand strategy, building something smarter than both in the process. The performance of Fermi was actually held back quite a bit – if higher TDPs were allowed and if the 28nm process was used, the GTX580 would have sat atop with the performance crown for much longer. Mind you, early versions of the chip didn’t produce a lot more performance when overclocked, instead producing more heat. The Ti versions of cards fixed this to a great extent, and the change on the GTX560Ti version was probably the first indication of where Nvidia was going with Kepler. The GTX460 was a great mid-range card, and sold in far larger quantities than the GTX580 and GTX570.
But back to Kepler. The first thing one notices when looking at the card is the stacked 6-pin PCI Express power plugs, giving a 225-watt ceiling. It's important to remember that limit later, as it caps how far the GTX680 can overclock. While the GTX580 had an 8-pin and a 6-pin connector, the GTX680 is sure to fit into more PCs thanks to its lower power requirement of 195 watts. In fact, most 550-watt power supplies should provide enough headroom for rigs that won't be overclocked; if you're maxing it out, a 650-watt 80 Plus model is recommended.
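That 225-watt figure falls straight out of the PCI Express spec: the slot itself supplies up to 75W, and each 6-pin plug adds another 75W (an 8-pin adds 150W). A quick sketch of the arithmetic, with the card names as labels only:

```python
# PCI Express power delivery per the spec: 75 W from the slot,
# 75 W per 6-pin auxiliary plug, 150 W per 8-pin auxiliary plug.
SLOT_W, SIX_PIN_W, EIGHT_PIN_W = 75, 75, 150

def board_power_limit(six_pins=0, eight_pins=0):
    return SLOT_W + six_pins * SIX_PIN_W + eight_pins * EIGHT_PIN_W

print(board_power_limit(six_pins=2))                # GTX680: 225 W ceiling
print(board_power_limit(six_pins=1, eight_pins=1))  # GTX580: 300 W ceiling
```

With the card drawing 195 watts at stock, that leaves only about 30 watts of in-spec headroom for overclocking.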
Moving around to the back, we see two SLI fingers, allowing for up to four-way SLI. As we'll see later, SLI scaling is good, but most people haven't had a chance to test a three- or four-way setup yet (and by then I fully expect the processor, chipset and memory to hold performance back – this is where PCI Express 3.0 comes into play). At the very back is a half-size exhaust, two dual-link DVI ports, a full-size HDMI 1.4a connector and one full-size DisplayPort connector, enabling up to four monitors to be driven by the GTX680. With DisplayPort on board, we can finally have a multi-monitor setup without the need for SLI. Hopefully, as more gamers stack up their monitor counts, DisplayPort will become more widely adopted on monitors. As it is, getting properly working DP-to-DVI or HDMI converters is a pain.
Finally, looking at the card head-on, we see the reference cooler that will ship on most cards at launch. Some companies have already released details of their non-reference designs, and I'll have a look at those too as the week progresses. Underneath the single fan we find a dual-slot aluminium heatsink with fins to draw away heat, and sound dampening on the card where the fan sits. It's a welcome addition, bringing the GTX680 to the near-whisper-quiet levels normally associated with non-reference coolers.
Underneath the fins sits the board itself. The GPU packs 3.54 billion transistors on the 28nm process and measures 294mm². Nvidia's flagship GPU designs are usually around 500mm² in area; the GTX580 measured 520mm². The space savings in Kepler are enormously apparent, and it's shaping up to be an interesting contender for the high-end performance crown.