The GoForce 3D architecture is a completely new architecture

• Designed from the ground up to be efficient in terms of both performance and power.
• Low power here means:
  • Implementations require less than 100 mW, or in some cases 50 mW, per 100 megapixels per second.
  • That is a significant reduction in power consumption compared with the traditional graphics pipeline.
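The power figures above can be restated as energy per pixel. The sketch below is only a unit conversion of the two quoted numbers; the function name is made up for illustration, and it assumes the quoted power scales linearly with fill rate.

```python
def energy_per_pixel_nj(power_mw: float, fill_rate_mpix_per_s: float) -> float:
    """Energy per pixel in nanojoules.

    mW / (Mpixel/s) = 1e-3 J/s / (1e6 pixel/s) = 1e-9 J/pixel = 1 nJ/pixel,
    so the ratio of the two inputs is already in nanojoules per pixel.
    """
    return power_mw / fill_rate_mpix_per_s


# Figures quoted in the slide: 100 mW or 50 mW per 100 megapixels per second.
print(energy_per_pixel_nj(100, 100))  # 1.0 nJ per pixel
print(energy_per_pixel_nj(50, 100))   # 0.5 nJ per pixel
```

In other words, the quoted budget corresponds to roughly one nanojoule per pixel or less.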

• Much shallower architecture, with fewer pipeline stages.
• Stages trigger only on activity.
  • In the traditional pipeline, the units are always triggering. Here, only the stages that are busy are clocked, which makes the design more power efficient.
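A toy model can illustrate why clocking only the busy stages saves power. This is not the GoForce hardware: the three-stage activity trace and the function name are invented for illustration, and it assumes dynamic power is roughly proportional to the number of stage-cycles that receive a clock.

```python
def count_clocked_stage_cycles(activity, gated):
    """Count how many (stage, cycle) pairs receive a clock.

    activity: one list per cycle, one boolean per stage (True = stage has work).
    gated:    if True, only busy stages are clocked; otherwise every stage is.
    """
    total = 0
    for cycle in activity:
        if gated:
            total += sum(1 for busy in cycle if busy)
        else:
            total += len(cycle)
    return total


# Hypothetical trace: one burst of work moving through three stages.
activity = [
    [True, False, False],
    [True, True, False],
    [False, True, True],
    [False, False, True],
]

always_on = count_clocked_stage_cycles(activity, gated=False)  # 12 stage-cycles
gated = count_clocked_stage_cycles(activity, gated=True)       # 6 stage-cycles
print(always_on, gated)
```

In this made-up trace, activity-based clocking halves the number of clocked stage-cycles; the real saving depends on the workload.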

• Built around a “fragment ALU”
  • A programmable unit used to implement most per-pixel operations.

• Transform and setup engine
  • VLIW-like unit that handles the transformation and setup tasks.
  • Native fixed-point and floating-point data, but setup work is computed in floating point, which ensures accurate rendering.
  • Vertex cache: vertex re-use improves performance. Indexed triangle lists or strips allow the hardware to use the cached copy of a vertex.
  • Frustum clipper: eliminates triangles that lie outside the view frustum and clips those that cross it.
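The benefit of the vertex cache with indexed geometry can be sketched by counting how many vertices actually need to be transformed. The cache size and the FIFO replacement policy below are assumptions for illustration; the slides do not state either.

```python
from collections import deque


def count_vertex_transforms(indices, cache_size):
    """Count vertex transforms for an indexed triangle list.

    A vertex whose index is still in the cache is re-used instead of being
    transformed again. The cache is modelled as a simple FIFO (assumption).
    """
    cache = deque(maxlen=cache_size)  # oldest entry is evicted when full
    transforms = 0
    for idx in indices:
        if idx not in cache:
            transforms += 1
            cache.append(idx)
    return transforms


# Two triangles sharing an edge (a quad): 6 indices, 4 unique vertices.
quad = [0, 1, 2, 2, 1, 3]
print(count_vertex_transforms(quad, cache_size=4))  # 4 transforms
print(count_vertex_transforms(quad, cache_size=0))  # 6 transforms, no re-use
```

Without indices the hardware cannot tell that two vertices are the same, which is why indexed lists or strips are needed for the cached copy to be used.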

• Raster: two roles
  • Generates pixel fragments (Z, colors, texture coordinates, etc.).
  • Manages recirculation of pixels to downstream parts of the pipeline.
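The first raster role, generating fragments with interpolated attributes, can be sketched with edge functions and barycentric weights. This is a generic textbook approach, not necessarily how the GoForce rasterizer works; it interpolates only Z, samples at pixel centers, and omits fill-rule tie-breaking and recirculation.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area term: which side of edge a->b the point p lies on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(v0, v1, v2, width, height):
    """Return (x, y, z) fragments for a triangle of (x, y, z) vertices."""
    area = edge(v0[0], v0[1], v1[0], v1[1], v2[0], v2[1])
    if area == 0:
        return []  # degenerate triangle covers no pixels
    fragments = []
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            # Dividing by the signed area makes the weights positive inside
            # the triangle regardless of winding order.
            w0 = edge(v1[0], v1[1], v2[0], v2[1], px, py) / area
            w1 = edge(v2[0], v2[1], v0[0], v0[1], px, py) / area
            w2 = edge(v0[0], v0[1], v1[0], v1[1], px, py) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                z = w0 * v0[2] + w1 * v1[2] + w2 * v2[2]
                fragments.append((x, y, z))
    return fragments


frags = rasterize((0, 0, 0.0), (4, 0, 1.0), (0, 4, 1.0), width=4, height=4)
print(len(frags), frags[0])
```

Colors and texture coordinates would be interpolated with the same three weights as Z.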

• Texture Unit
  • Z fetch and Z comparison (early)
  • Color fetch: when doing framebuffer (FB) blending, this unit fetches the FB color
  • Undithering (optional)
  • Texture fetch
    • Cache, filtering, format conversion
    • Decompression
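Doing the Z comparison early means a hidden fragment is rejected before any texture fetch is spent on it. The sketch below counts the fetches that remain. It assumes smaller Z is nearer, a cleared depth of 1.0, and an immediate depth write; the function names are hypothetical and the real unit may order these steps differently.

```python
def shade_with_early_z(fragments, depth_buffer, fetch_texel):
    """Run the early Z test, then fetch texels only for visible fragments.

    fragments:    iterable of (x, y, z); smaller z is nearer (assumption).
    depth_buffer: dict mapping (x, y) to the nearest z seen so far.
    fetch_texel:  callable standing in for the texture fetch.
    Returns the number of texture fetches actually performed.
    """
    fetches = 0
    for x, y, z in fragments:
        if z >= depth_buffer.get((x, y), 1.0):
            continue  # hidden: rejected before the texture fetch
        depth_buffer[(x, y)] = z
        fetch_texel(x, y)
        fetches += 1
    return fetches


depth = {}
incoming = [(0, 0, 0.5), (0, 0, 0.7), (0, 0, 0.2), (1, 0, 0.9)]
print(shade_with_early_z(incoming, depth, lambda x, y: None))  # 3 fetches
```

The second fragment lies behind the first, so it costs a Z fetch and compare but no texture bandwidth.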

• Fragment ALU
  • Signed 10-bit math per component
  • Programmable: used to implement Texture Combine modes
  • Per-pixel ops: fog, alpha blend, alpha test
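To make "signed 10-bit math per component" concrete, the sketch below implements a modulate combine and an alpha blend in a signed 10-bit fixed-point format. The slides do not give the bit layout, so the choice of 8 fractional bits (256 represents 1.0, range -512 to 511) is an assumption, as are all function names.

```python
FRAC_BITS = 8                # assumed number of fractional bits
ONE = 1 << FRAC_BITS         # 256 represents 1.0
MIN10, MAX10 = -512, 511     # range of a signed 10-bit integer


def clamp10(v):
    """Saturate to the signed 10-bit range."""
    return max(MIN10, min(MAX10, v))


def mul10(a, b):
    """Fixed-point multiply with saturation."""
    return clamp10((a * b) >> FRAC_BITS)


def modulate(tex, color):
    """Per-component product, as in a 'modulate' texture combine."""
    return tuple(mul10(t, c) for t, c in zip(tex, color))


def blend(src, dst, alpha):
    """src * alpha + dst * (1 - alpha), per component."""
    return tuple(
        clamp10(mul10(s, alpha) + mul10(d, ONE - alpha))
        for s, d in zip(src, dst)
    )


print(modulate((256, 128, 0), (128, 128, 256)))  # (128, 64, 0)
print(blend((256, 0, 0), (0, 256, 0), 128))      # (128, 128, 0)
```

The headroom above 1.0 and the sign bit are what let intermediate results of combine operations such as add or subtract be held before the final clamp.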

• Data Write
  • Writes data to the framebuffer
  • Format conversions when writing data to the framebuffer (i.e., it can do “dithering”)
  • Optional: recirculation
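Dithering during the framebuffer write can be sketched as an ordered dither from 8-bit channels down to RGB565. The 2x2 threshold matrix and the RGB565 target format are assumptions for illustration; the slides do not say which pattern or formats the hardware uses.

```python
BAYER_2X2 = [[0, 2],
             [3, 1]]  # ordered-dither thresholds, 0..3


def quantize_dithered(value8, bits, x, y):
    """Reduce an 8-bit channel to `bits` bits using a 2x2 ordered dither."""
    shift = 8 - bits
    # Scale the 0..3 threshold to the range of the bits being discarded.
    threshold = (BAYER_2X2[y & 1][x & 1] << shift) >> 2
    return min(value8 + threshold, 255) >> shift


def pack_rgb565(r, g, b, x, y):
    """Dither and pack 8-bit R, G, B into one 16-bit RGB565 word."""
    r5 = quantize_dithered(r, 5, x, y)
    g6 = quantize_dithered(g, 6, x, y)
    b5 = quantize_dithered(b, 5, x, y)
    return (r5 << 11) | (g6 << 5) | b5


# An 8-bit value of 4 lies halfway between 5-bit levels 0 and 1, so across a
# 2x2 block the dither rounds it up in two of the four pixel positions.
levels = [quantize_dithered(4, 5, x, y) for y in range(2) for x in range(2)]
print(levels)  # [0, 1, 1, 0]
```

Averaged over the block, the output reproduces the intensity that straight truncation would lose.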

• Key Benefits
  • Scalable
  • Low Power
  • Programmable

[Block diagram labels, grouped by pipeline stage]
• Transform and setup: floating-point VLIW machine; precision for accurate rendering; vertex buffer for vertex re-use; triangle strips, fans, meshes …; supports float or fixed formats; frustum clipper
• Raster: scoreboard, basically a “traffic cop” that manages recirculation of pixel traffic
• Texture unit: Z fetch and compare (early Z); color fetch and compare (color keying); optional un-dithering of color; texture fetch (cache, filter, format conversion, decompression)
• Fragment ALU: signed 10-bit math per component; texture combine; fog; alpha blend; alpha test
• Data write: format conversion; optional dithering; optional recirculation forwarding