NitroWare.net

Looking back at two years of Graphics Core Next

ATI, and later AMD, went through several major evolutions in its graphics architecture, which mirrored wider developments and revolutions in the PC and graphics industry. At times Radeon graphics hardware was used by Microsoft for the development of its DirectX API. All roads lead to GCN, which embodies AMD's vision for a graphics core that can scale from top to bottom: high-end gaming graphics cards, consoles and tablets.

AMD Radeon HD 7000 series graphics GPU evolution

 

 

Fixed Function

  • 3dfx Voodoo Graphics series
  • ATI Rage series
  • ATI Radeon 7000 series
  • Matrox G-Series
  • NVIDIA RIVA 128
  • NVIDIA TNT, TNT 2
  • NVIDIA GeForce, GeForce 2

ATI Rage Pro Turbo

 

Simple Shaders

 

  • ATI Radeon 8000-9000 series
  • ATI Radeon X1000 series
  • Matrox P-Series
  • NVIDIA GeForce 3 - 7
ATI Radeon 8500
 

Graphics Parallel Core

  • ATI/AMD Radeon HD 2000-8000 series
  • AMD Radeon R7 series
  • AMD Radeon R9 series
  • NVIDIA GeForce 8000-9000 series
  • NVIDIA GeForce 200-700 series
ATI Radeon HD 2600 Pro


AMD Radeon HD 7000 series Graphics Core Next architecture

The design goals for the original GCN at launch sound remarkably similar to those for the R9 series. The exceptions are heterogeneous computing, which is still somewhat undeveloped, and Fusion, which has since been rebranded as HSA.

AMD Radeon HD 7000 series tahiti block diagram

Compared to Hawaii, Tahiti resembles a 'stack of Lego bricks'.

AMD Radeon R9 290 series hawaii block diagram

The hardware resources of Hawaii's GCN '2.0' are organised into four units called "Shader Engines", which allows resources to be scaled and shared more effectively. This effectively mirrors NVIDIA's approach with Kepler, except that their name for the topology is SMX units.

AMD Radeon R9 290 graphics core next shader engine

This allows GPUs to be scaled down more easily by disabling an entire Shader Engine (or SMX unit), without re-spinning the whole chip to reduce the number of cores or fusing off clusters of cores. Some resources, such as renderers and caches, are still shared within each Shader Engine.

Each Shader Engine contains 1 rasteriser and 1 Geometry Unit, which can load balance. 1 Shader Engine is sufficient to operate the entire GPU.

This diagram shows a simplified overview of the graphics pipeline

  • The Graphics Command Processor oversees operations across load-balanced resources
  • Geometry is set up and tessellated in the Geometry Processors. Data can be exchanged with the Compute Units if needed, or sent to the rasteriser directly.
  • The Compute Units execute pixel shaders or perform GPU computing on the scene
  • Pixel data is then passed on to the rasterisers, which handle the assignment or partitioning of pixels on the screen as well as Hierarchical Z sorting, i.e. each pixel's depth in the scene
  • Finally, the Render Back Ends handle pixel depth testing as well as stencilling and colour operations

AMD Radeon R9 290 graphics core next gpu pipeline and geometry processing

To do the actual processing, each of Hawaii's Shader Engines contains 11 Compute Units. The Compute Unit is the smallest physical processing block of the GPU, containing all of the low-level building blocks that a compute processor needs to fetch, decode and execute instructions.

AMD Radeon R9 290 graphics core next compute unit

The final stage is the Render Back Ends, which handle operations relating to the scene's Z (depth), stencilling and colour.

AMD Radeon R9 290 graphics core next render backend

That covers the graphics and compute processing pipelines, but a GPU is many processors working in parallel, which need to be fed tasks and directed.

We need a means of scheduling and dispatching to allow the GPU to multi-task across its parallel computing units. This is where the Asynchronous Compute Engines (ACEs) come in. Hawaii has 8 of them, and they are independent of the Shader Engines. The ACEs queue, store and share data for use in GPU computing across the entire GPU. Graphics-specific commands are issued by a separate command unit.

AMD Radeon R9 290 graphics core next async compute engine

So, in summary, the layout of the Graphics Core Next architecture 'version 2', as used in the 290X, is essentially a scaled-up version of Tahiti.

GCN '2.0' supports:

  • 1-8 Asynchronous Compute Engines
  • 1-4 Shader Engines
  • 1-11 Compute Units per Shader Engine, giving 64 to 704 shaders per Shader Engine
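The ranges above imply a simple relationship between configuration and shader count. The following is a minimal sketch of that arithmetic, assuming 64 shaders per Compute Unit (inferred from the 64 to 704 range for 1 to 11 Compute Units); the function name `shader_count` is hypothetical and used only for illustration.

```python
# Shaders per Compute Unit, inferred from the article's range:
# 1 CU -> 64 shaders, 11 CUs -> 704 shaders.
SHADERS_PER_CU = 64

def shader_count(shader_engines: int, cus_per_engine: int) -> int:
    """Total shaders for a GCN '2.0' style configuration (illustrative only)."""
    if not 1 <= shader_engines <= 4:
        raise ValueError("GCN '2.0' supports 1-4 Shader Engines")
    if not 1 <= cus_per_engine <= 11:
        raise ValueError("GCN '2.0' supports 1-11 Compute Units per Shader Engine")
    return shader_engines * cus_per_engine * SHADERS_PER_CU

# Full Hawaii: 4 Shader Engines with 11 Compute Units each.
full_hawaii = shader_count(4, 11)

# A hypothetical cut-down part with two Shader Engines disabled.
half_hawaii = shader_count(2, 11)

print(full_hawaii, half_hawaii)
```

With all four Shader Engines enabled this gives the 2816 shaders quoted for Hawaii, and halving the number of Shader Engines halves the shader count without touching the per-engine layout.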

AMD Radeon R9 290 graphics core next hawaii block simplified diagram

Tahiti v Hawaii – spec comparison

 

| | AMD Graphics Core Next 'Tahiti' | AMD Graphics Core Next 'Hawaii' | Increase |
|---|---|---|---|
| Compute Units / IEEE 754-2008 Compliant Shaders | 32 / 2048 | 44 / 2816 | 1.4x |
| Geometry Processors | 2 | 4 | 2.0x |
| Render Back-Ends | 8 | 16 | 2.0x |
| Color Operations | 32 | 64 | 2.0x |
| Depth/Stencil Operations | 128 | 256 | 2.0x |
| L2 Cache | 768 KB | 1 MB | 1.3x |
| Memory Bus | 384-bit wide GDDR5, 264 GB/sec | 512-bit wide GDDR5, 320 GB/sec | 1.2x |
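The 'Increase' column can be checked by dividing each Hawaii figure by the corresponding Tahiti figure. This is a small sketch of that check using the values from the comparison above; the dictionary keys are made-up labels, and 1 MB is assumed to mean 1024 KB.

```python
# Figures from the Tahiti v Hawaii comparison.
# Assumption: the 1 MB L2 cache is treated as 1024 KB.
tahiti = {
    "compute_units": 32,
    "shaders": 2048,
    "geometry_processors": 2,
    "render_back_ends": 8,
    "l2_cache_kb": 768,
    "bandwidth_gb_s": 264,
}
hawaii = {
    "compute_units": 44,
    "shaders": 2816,
    "geometry_processors": 4,
    "render_back_ends": 16,
    "l2_cache_kb": 1024,
    "bandwidth_gb_s": 320,
}

def increase(old: float, new: float) -> float:
    """Ratio of new to old, rounded to one decimal place as in the table."""
    return round(new / old, 1)

ratios = {key: increase(tahiti[key], hawaii[key]) for key in tahiti}

for key, ratio in ratios.items():
    print(f"{key}: {ratio}x")
```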

In addition to the increased GPU resources, Hawaii adds updated display controllers for Eyefinity, AMD TrueAudio and a new version of CrossFire.

GCN v Kepler Architecture Performance & Efficiency – spec comparison

 

| | AMD Radeon HD 7970 GHz Edition 'Tahiti' | AMD Radeon R9 290X 'Hawaii' | Increase | NVIDIA GeForce GTX 780 'Kepler' | NVIDIA GeForce GTX TITAN 'Kepler' |
|---|---|---|---|---|---|
| Geometry Processing | 2.1 billion primitives/sec | 4 billion primitives/sec | 1.9x | | |
| Compute | 4.3 TFLOPS | 5.6 TFLOPS | 1.3x | 4.0 TFLOPS | 4.5 TFLOPS |
| Texture Fill Rate | 134.4 Gtexels/sec | 176 Gtexels/sec | 1.3x | 166 Gtexels/sec | 188 Gtexels/sec |
| Pixel Fill Rate | 33.6 Gpixels/sec | 64 Gpixels/sec | 1.9x | 41.4 Gpixels/sec | 40.2 Gpixels/sec |
| Peak Bandwidth | 264 GB/sec | 320 GB/sec | 1.2x | 288 GB/sec | 288 GB/sec |
| Die Area | 352 mm^2 | 438 mm^2 | 1.24x | 561 mm^2 | 561 mm^2 |
| Peak GFLOPS/mm^2 | 12.2 | 12.8 | 1.05x | 7.1 | 8 |
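The peak figures for the two Radeon cards can be roughly reproduced from the unit counts in the Tahiti v Hawaii comparison. The sketch below is an illustration rather than AMD's own method, and it relies on assumptions that are not stated in this article: engine clocks of 1.05 GHz for the HD 7970 GHz Edition and 1.0 GHz for the R9 290X, two floating-point operations per shader per clock (one fused multiply-add), four texture units per Compute Unit, and one primitive per Geometry Processor per clock.

```python
# Assumed engine clocks in GHz. These are NOT given in the article.
CLOCK_GHZ = {"Tahiti": 1.05, "Hawaii": 1.0}

# Unit counts taken from the Tahiti v Hawaii comparison.
UNITS = {
    "Tahiti": {"cus": 32, "shaders": 2048, "geometry": 2, "color_ops": 32},
    "Hawaii": {"cus": 44, "shaders": 2816, "geometry": 4, "color_ops": 64},
}

# Per-clock throughput assumptions (not from the article).
FLOPS_PER_SHADER_PER_CLOCK = 2      # one fused multiply-add
TEXTURE_UNITS_PER_CU = 4
PRIMS_PER_GEOMETRY_PER_CLOCK = 1

def peak_rates(chip: str) -> dict:
    """Theoretical peak rates for a chip, rounded to one decimal place."""
    u = UNITS[chip]
    clk = CLOCK_GHZ[chip]
    return {
        "tflops": round(u["shaders"] * FLOPS_PER_SHADER_PER_CLOCK * clk / 1000, 1),
        "gtexels_s": round(u["cus"] * TEXTURE_UNITS_PER_CU * clk, 1),
        "gpixels_s": round(u["color_ops"] * clk, 1),
        "gprims_s": round(u["geometry"] * PRIMS_PER_GEOMETRY_PER_CLOCK * clk, 1),
    }

for chip in UNITS:
    print(chip, peak_rates(chip))
```

Under these assumptions the results line up with the Geometry Processing, Compute, Texture Fill Rate and Pixel Fill Rate rows for both Radeon cards, which suggests the figures are theoretical peaks rather than measured results.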

While peak raw compute power has not increased significantly, the GPU's horsepower within its engines is much stronger, with almost 2x the throughput available for 3D graphics intensive tasks such as pixel fill and geometry, at only a 25% increase in die size. On paper the 290X is more efficient, thanks to its 'higher horsepower' design at a smaller die size than the Kepler GK110 based GTX 780.
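The efficiency figure appears to be simply peak compute divided by die area. A minimal sketch of that calculation, using the Compute and Die Area values from the comparison (the helper name `gflops_per_mm2` is hypothetical):

```python
# (peak compute in TFLOPS, die area in mm^2), taken from the comparison.
cards = {
    "HD 7970 GHz Edition": (4.3, 352),
    "R9 290X": (5.6, 438),
    "GTX 780": (4.0, 561),
    "GTX TITAN": (4.5, 561),
}

def gflops_per_mm2(tflops: float, area_mm2: float) -> float:
    """Peak GFLOPS per square millimetre of die, rounded to one decimal place."""
    return round(tflops * 1000 / area_mm2, 1)

density = {name: gflops_per_mm2(t, a) for name, (t, a) in cards.items()}

for name, value in density.items():
    print(f"{name}: {value} GFLOPS/mm^2")
```

The two Radeon parts land close together at a little over 12 GFLOPS/mm^2, while both GK110 based cards sit at roughly 7 to 8, which is the basis for the on-paper efficiency claim.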

On paper, the 290X provides a good step up from the previous generation HD 7970. The lower compute performance of the NVIDIA GeForce cards is expected, as this is a hallmark of their consumer-oriented GPUs.