NVIDIA’s recent Pascal launch started with the GTX 1080, reviews of which went up all over the internet two days ago. It’s an absolute monster of a card, coming in around 25% faster than a GeForce GTX Titan X and easily besting two GTX 970 cards in SLI. But it’s not the only card that NVIDIA will be launching this quarter – on 10 June 2016, NVIDIA’s partners will start to sell the GTX 1070 through retail channels in the US, Canada and most of Europe, and it’s set to be at least as fast as the GTX 980 Ti for almost half the price. NVIDIA has finally revealed the full specifications of this card, and it’s a very interesting option.

NVIDIA GeForce Pascal hardware comparison

| Specification | GTX 1080 | GTX 1070 | GTX 980 | GTX 970 |
| --- | --- | --- | --- | --- |
| GPU family name | Pascal | Pascal | Maxwell | Maxwell |
| CUDA core count | 2560 | 1920 | 2048 | 1664 |
| Single-precision throughput | 9.0 TFLOPS | 6.5 TFLOPS | 4.6 TFLOPS | 3.5 TFLOPS |
| Base clock | 1607 MHz | 1506 MHz | 1127 MHz | 1050 MHz |
| Boost clock | 1733 MHz | 1683 MHz | 1216 MHz | 1178 MHz |
| Raster operators | 64 | 64 | 64 | 56 |
| Texture units | 160 | 120 | 128 | 104 |
| Shader module count | 20 | 15 | 16 | 13 |
| Memory bus width | 256-bit | 256-bit | 256-bit | 256-bit |
| Memory bandwidth | 320 GB/s | 256 GB/s | 224 GB/s | 224 GB/s |
| Outputs | DisplayPort 1.4, HDMI 2.0b, DVI | DisplayPort 1.4, HDMI 2.0b, DVI | DisplayPort 1.2a, HDMI 2.0, DVI | DisplayPort 1.2a, HDMI 2.0, DVI |
| HDCP 2.2 support | Yes | Yes | Yes | Yes |
| Maximum temperature | 95 °C | 94 °C | 98 °C | 98 °C |
| Thermal power limit | 180 W | 150 W | 165 W | 145 W |
| Power connectors | 1x 8-pin | 1x 8-pin | 2x 6-pin | 2x 6-pin |
| Launch price | $599 | $379 | $400 | $285 |
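
The throughput row can be sanity-checked from the core counts and clocks in the same table. The sketch below assumes the usual rule of thumb that each CUDA core retires one fused multiply-add (two floating-point operations) per clock; the function and variable names are mine, for illustration only. Under that assumption, the quoted Pascal figures land close to the boost-clock result, while the quoted Maxwell figures land closer to the base-clock result.

```python
# Rough check of the "Single-precision throughput" row.
# Assumption: 2 floating-point operations (one FMA) per CUDA core per clock.
CARDS = {
    # name: (CUDA cores, base clock MHz, boost clock MHz, quoted TFLOPS)
    "GTX 1080": (2560, 1607, 1733, 9.0),
    "GTX 1070": (1920, 1506, 1683, 6.5),
    "GTX 980": (2048, 1127, 1216, 4.6),
    "GTX 970": (1664, 1050, 1178, 3.5),
}


def tflops(cores: int, clock_mhz: float) -> float:
    """Theoretical single-precision TFLOPS at a given clock."""
    return 2 * cores * clock_mhz * 1e6 / 1e12


def throughput_table(cards=CARDS):
    """Return {name: (TFLOPS at base, TFLOPS at boost, quoted TFLOPS)}."""
    return {
        name: (round(tflops(cores, base), 2), round(tflops(cores, boost), 2), quoted)
        for name, (cores, base, boost, quoted) in cards.items()
    }


if __name__ == "__main__":
    for name, (at_base, at_boost, quoted) in throughput_table().items():
        print(f"{name}: base {at_base}, boost {at_boost}, quoted {quoted}")
```

Real-world clocks vary with temperature and power limits, so these are ceilings rather than measured performance.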

A lot of people are wont to compare the GTX 1070 to the GTX 970 and GTX 980 Ti when looking at the specifications, but NVIDIA’s actual targets are owners of the following cards: GeForce GTX 680, GTX 670, GTX 770, GTX 780, GTX 970, and GTX 980. If you’re upgrading from one of these cards, or anything below them, you’re getting a sizeable performance jump. You might begin to see CPU bottlenecking on Intel CPUs from the Nehalem and Sandy Bridge families, but it’ll be at such a low level that you probably won’t notice it.

On paper, the GTX 1070 is a very interesting part. It’s the first time that NVIDIA’s released a second-highest GPU in a new family that is 25% slower than the launch flagship. The price gap between the two is $220, and a lot of people are going to have to think long and hard about whether that 25% extra performance is worth the buy-in. You can probably close the gap to the GTX 1080 by about 10% through overclocking, and that’s assuming the minimum overclocks that you might see from this chip. The GTX 970 and GTX 980 were very, very close in raw performance at launch, and that gap of around 5% only extended to the 10% it is today thanks to driver optimisations.
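
To put the buy-in question in numbers, here is a minimal sketch of the value comparison. It takes the "25% slower" figure at face value (GTX 1070 at 0.75 of a GTX 1080) and uses the launch prices from the table; the relative-performance numbers are an assumption, not benchmark results, and the function name is hypothetical.

```python
# Value comparison under the assumption that the GTX 1070 delivers 75% of
# GTX 1080 performance. Prices are the launch prices quoted in the table.
GTX_1080 = {"perf": 1.00, "price": 599}
GTX_1070 = {"perf": 0.75, "price": 379}


def compare(flagship=GTX_1080, cut_down=GTX_1070):
    """Return price gap and relative value of the cut-down card."""
    price_gap = flagship["price"] - cut_down["price"]
    # How much more the flagship costs and delivers, relative to the cut-down card.
    extra_cost_pct = (flagship["price"] / cut_down["price"] - 1) * 100
    extra_perf_pct = (flagship["perf"] / cut_down["perf"] - 1) * 100
    # Performance per dollar of the cut-down card relative to the flagship.
    value_ratio = (cut_down["perf"] / cut_down["price"]) / (
        flagship["perf"] / flagship["price"]
    )
    return {
        "price_gap": price_gap,
        "extra_cost_pct": round(extra_cost_pct, 1),
        "extra_perf_pct": round(extra_perf_pct, 1),
        "value_ratio": round(value_ratio, 3),
    }


if __name__ == "__main__":
    print(compare())
```

Under those assumptions the GTX 1080 costs roughly 58% more for roughly 33% more performance, so the GTX 1070 comes out ahead on performance per dollar by a little under 20%.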

NVIDIA GeForce GTX 1070 block diagram

The block diagram for the GTX 1070 outlines this quite well. Basically one GPC is disabled entirely, but almost everything else is left intact. That includes the full complement of L2 cache as well as the 256-bit memory bus. There’s a drop in the texture unit count from 160 to 120 units, but the ROP count remains the same. With 64 ROPs on board, performance at 4K should be similar to current generation cards like the GTX 980 Ti and Radeon R9 Fury X, but with a slight lead thanks to the improved architecture.

Seeing NVIDIA segment performance in this way leaves little to the imagination about how they’re going to approach the smaller GP104 derivative cards as well as GP106 and GP108. The GTX 1080 is known as GP104-400, while the GTX 1070 is GP104-200, and there might be two more cards based on the GP104 die that we haven’t seen yet. That’s a return to NVIDIA’s old ways of having four cards derived from the high-end consumer chip, which we didn’t have with Maxwell because it only spawned the GTX 980 and GTX 970. The reason that happened is pretty clear – NVIDIA could easily make a ton of dies with high yields, and so the GTX 970 became their volume card, an unprecedented move at the time because this has always been the position taken up by the third-fastest card in the consumer stack.

In the future, I think we’ll see a GTX 1060 and GTX 1060 Ti popping out to fill in the gaps left. The GTX 1060 might have two disabled GPCs for a total of 1280 CUDA cores, 32 ROPs and 80 texture units, all on a 192-bit memory bus with GDDR5 memory. The GTX 1060 Ti, whenever it launches, will probably be more of the same, but with 1536 CUDA cores, 32 ROPs, and 100 texture units. The cards below that will probably be GP106 and GP108. NVIDIA might have a difficult time selling those chips to consumers, though, as AMD is targeting the mid-range and budget markets first with Polaris.
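
The speculated configurations can be checked against the per-module ratios implied by the comparison table. The sketch below assumes 128 CUDA cores and 8 texture units per shader module (2560 / 20 and 160 / 20 for the GTX 1080), and 5 shader modules per GPC (the GTX 1070 drops from 20 to 15 with one GPC disabled). The function names are hypothetical, and ROP count and bus width are not modelled.

```python
# Derive core and texture-unit counts from a shader module (SM) count.
# Per-SM ratios are inferred from the comparison table, not from NVIDIA docs.
CORES_PER_SM = 128          # 2560 cores / 20 SMs on the GTX 1080
TEXTURE_UNITS_PER_SM = 8    # 160 texture units / 20 SMs
SMS_PER_GPC = 5             # 20 -> 15 SMs when one GPC is disabled
FULL_GP104_GPCS = 4         # 20 SMs / 5 SMs per GPC


def config_for_sms(sm_count: int) -> dict:
    """Core and texture-unit counts for a given number of active SMs."""
    return {
        "shader_modules": sm_count,
        "cuda_cores": sm_count * CORES_PER_SM,
        "texture_units": sm_count * TEXTURE_UNITS_PER_SM,
    }


def config_with_disabled_gpcs(disabled: int) -> dict:
    """Configuration of a GP104 part with whole GPCs switched off."""
    active_sms = (FULL_GP104_GPCS - disabled) * SMS_PER_GPC
    return config_for_sms(active_sms)


def config_for_cores(cuda_cores: int) -> dict:
    """Configuration implied by a CUDA core count (must be a multiple of 128)."""
    if cuda_cores % CORES_PER_SM:
        raise ValueError("core count is not a whole number of SMs")
    return config_for_sms(cuda_cores // CORES_PER_SM)


if __name__ == "__main__":
    print("GTX 1070:", config_with_disabled_gpcs(1))
    print("Two GPCs disabled:", config_with_disabled_gpcs(2))
    print("1536 cores:", config_for_cores(1536))
```

Under these ratios, two disabled GPCs do give 1280 CUDA cores and 80 texture units, while a 1536-core part would work out to 12 shader modules and 96 texture units rather than 100. Treat both as guesses until NVIDIA publishes specifications.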

