Tesla G80 is NVIDIA’s first unified shader architecture and the first GPU to expose the CUDA programming model. It introduces SIMT execution: scalar ALUs execute threads in fixed warps of 32, issued in lockstep by a single in-order warp scheduler per SM. Control-flow divergence is handled by warp masking: the warp executes each branch path in turn with the non-participating lanes disabled, so threads within a warp cannot make independent forward progress.
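The masking behaviour described above can be illustrated with a small host-side model. This is only a sketch in Python, not hardware-accurate code: the function name `run_warp_if_else` and the pass counter are made up for illustration, and the model assumes a simple two-way branch where each path is issued once if at least one lane needs it.

```python
WARP_SIZE = 32  # threads per warp on G80


def run_warp_if_else(values, cond, then_fn, else_fn):
    """Model one warp executing `if cond(x): then_fn(x) else: else_fn(x)`.

    Returns the per-lane results and the number of serialized passes issued.
    (Hypothetical helper for illustration; not an NVIDIA API.)
    """
    assert len(values) == WARP_SIZE
    # Every lane evaluates the predicate in lockstep.
    taken = [cond(v) for v in values]
    results = list(values)
    passes = 0
    # Each branch path is issued for the whole warp if any lane needs it;
    # lanes whose mask bit is off are disabled for that pass.
    for mask, fn in ((taken, then_fn), ([not t for t in taken], else_fn)):
        if any(mask):
            passes += 1
            for lane in range(WARP_SIZE):
                if mask[lane]:
                    results[lane] = fn(values[lane])
    return results, passes


lanes = list(range(WARP_SIZE))

# All lanes agree on the branch: only one path is issued.
uniform, uniform_passes = run_warp_if_else(
    lanes, lambda x: x >= 0, lambda x: x * 2, lambda x: -x)

# Even and odd lanes disagree: both paths are issued back to back.
divergent, divergent_passes = run_warp_if_else(
    lanes, lambda x: x % 2 == 0, lambda x: x * 2, lambda x: -x)
```

In this model a warp whose lanes all agree issues a single pass, while a warp split between the two paths issues both, which is why divergent branches inside a warp cost roughly the sum of both paths.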
Each Texture Processor Cluster (TPC) includes a small L1 texture/read cache, while off-chip memory traffic is serviced by L2 caches, one per memory partition, each tightly coupled to that partition’s ROP unit. There is no general-purpose per-SM data cache, so latency hiding relies on high warp occupancy and explicit use of shared memory.
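Because latency hiding depends on how many warps are resident per SM, a rough occupancy estimate is often useful. The sketch below is a simplified model: the per-SM limits are the figures commonly listed for compute capability 1.0 in the CUDA programming guide (24 warps, 8 blocks, 8192 registers, 16 KB shared memory) and are assumptions here, since this page does not state them. The function names are hypothetical, and hardware allocation granularity and the shared memory consumed by kernel arguments are ignored.

```python
WARP_SIZE = 32

# Assumed per-SM limits for compute capability 1.0 (not stated on this page).
MAX_WARPS_PER_SM = 24
MAX_BLOCKS_PER_SM = 8
REGISTERS_PER_SM = 8192
SHARED_BYTES_PER_SM = 16 * 1024


def resident_warps(threads_per_block, regs_per_thread, shared_bytes_per_block):
    """Estimate how many warps can be resident on one SM at once."""
    warps_per_block = -(-threads_per_block // WARP_SIZE)  # ceiling division
    # Each resource imposes its own cap on the number of resident blocks.
    limits = [MAX_BLOCKS_PER_SM, MAX_WARPS_PER_SM // warps_per_block]
    if regs_per_thread > 0:
        regs_per_block = regs_per_thread * warps_per_block * WARP_SIZE
        limits.append(REGISTERS_PER_SM // regs_per_block)
    if shared_bytes_per_block > 0:
        limits.append(SHARED_BYTES_PER_SM // shared_bytes_per_block)
    return min(limits) * warps_per_block


def occupancy(threads_per_block, regs_per_thread, shared_bytes_per_block):
    """Fraction of the SM's warp slots that are filled."""
    return resident_warps(
        threads_per_block, regs_per_thread, shared_bytes_per_block
    ) / MAX_WARPS_PER_SM


full = occupancy(256, 10, 4096)          # limited by the warp slots themselves
register_bound = occupancy(256, 16, 4096)  # registers allow only 2 blocks
block_bound = occupancy(64, 8, 0)        # capped by the 8-block limit
```

Under these assumptions, raising register use from 10 to 16 per thread drops a 256-thread block configuration from full occupancy to two thirds, and very small blocks hit the block-count limit before any other resource.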
- Compute capability: 1.0 (SM_10)
- Execution model: Warp-synchronous SIMT, single warp scheduler per SM
- SM-visible memory model:
  - Per-TPC L1 texture/read cache
  - Per-memory-partition (ROP-coupled) L2 cache
  - Explicitly managed shared memory
- Memory support: GDDR3
CUDA / ISA feature set (SM_10)
- Baseline PTX and native ISA
- Integer and FP32 arithmetic only
- No atomic operations
- No warp vote / ballot instructions
- No FP64 (double precision)
- Strict global memory coalescing rules (half-warp, aligned, sequential)
- Denormalized FP values flushed to zero
- Architectural significance: Establishes NVIDIA’s SIMT execution model and baseline CUDA SM semantics
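The half-warp coalescing rule listed above can be made concrete with a small checker. This is a sketch under stated assumptions, not a definitive model of the memory controller. It assumes the rules as commonly documented for compute capability 1.0: 4-, 8-, or 16-byte words, thread k of the half-warp accessing the k-th word of a segment aligned to 16 times the word size, and inactive threads permitted. The names `is_coalesced_sm10` and `transactions`, and the choice to count one transaction per active thread in the uncoalesced case, are illustrative assumptions.

```python
HALF_WARP = 16  # coalescing is evaluated per half-warp on SM_10


def is_coalesced_sm10(addresses, word_size=4):
    """Check a half-warp access pattern against the assumed SM_10 rules.

    `addresses` holds 16 byte addresses, with None marking an inactive thread.
    """
    assert len(addresses) == HALF_WARP
    if word_size not in (4, 8, 16):
        return False
    active = [(k, a) for k, a in enumerate(addresses) if a is not None]
    if not active:
        return True
    # Address that thread 0 would touch if the pattern is sequential.
    k0, a0 = active[0]
    base = a0 - k0 * word_size
    # The 16 words must start on a segment boundary...
    if base % (HALF_WARP * word_size) != 0:
        return False
    # ...and thread k must access exactly the k-th word.
    return all(a == base + k * word_size for k, a in active)


def transactions(addresses, word_size=4):
    """Rough transaction count for one half-warp access."""
    active = sum(a is not None for a in addresses)
    if active == 0:
        return 0
    if is_coalesced_sm10(addresses, word_size):
        return 2 if word_size == 16 else 1
    # Modelling assumption: uncoalesced accesses are serialized per thread.
    return active


aligned = [4096 + 4 * k for k in range(HALF_WARP)]
misaligned = [4100 + 4 * k for k in range(HALF_WARP)]   # shifted by one word
permuted = [aligned[1], aligned[0]] + aligned[2:]        # lanes 0 and 1 swapped
partial = [a if k % 2 == 0 else None for k, a in enumerate(aligned)]
```

In this model only the aligned, in-order pattern (with or without inactive threads) collapses to a single transaction; shifting the base by one word or swapping two lanes falls back to one transaction per active thread.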