Nvidia Tesla GT200 (SM_13)
Memory ctrl.: GDDR3
Features: Nvidia SM_13
Notes

Tesla GT200 is the high-end, compute-oriented culmination of the Tesla architecture. At the SM level, it keeps the execution and memory hierarchy of earlier Tesla designs and adds native double-precision floating-point execution: each SM gains one dedicated FP64 unit alongside its eight FP32 scalar cores.

As on earlier Tesla parts, there is no general-purpose per-SM data cache, so performance depends on memory bandwidth, warp-level latency hiding, and explicit shared memory management.
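
To illustrate what explicit shared memory management looks like in practice, here is a minimal sketch of a per-block sum that stages data in shared memory. The names `block_sum`, `sum_on_device`, and the block size `BLOCK` are hypothetical choices made for this example, not values taken from any GT200 documentation.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Illustrative block size (assumption, not a tuned value).
#define BLOCK 128

// Each block copies its slice of `in` into shared memory, then reduces it
// with a tree reduction. With no hardware data cache, this staging step is
// how reuse has to be expressed by the programmer.
__global__ void block_sum(const float *in, float *out, int n)
{
    __shared__ float tile[BLOCK];                 // software-managed on-chip storage
    int gid = blockIdx.x * BLOCK + threadIdx.x;

    // Consecutive threads read consecutive addresses, so the loads can coalesce.
    tile[threadIdx.x] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();

    for (int stride = BLOCK / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride)
            tile[threadIdx.x] += tile[threadIdx.x + stride];
        __syncthreads();                          // all partial sums visible before next step
    }
    if (threadIdx.x == 0)
        out[blockIdx.x] = tile[0];
}

// Hypothetical host helper: returns the sum of `host_in`, or -1.0f on a CUDA error.
float sum_on_device(const std::vector<float> &host_in)
{
    int n = (int)host_in.size();
    if (n == 0) return 0.0f;
    int blocks = (n + BLOCK - 1) / BLOCK;

    float *d_in = 0, *d_out = 0;
    if (cudaMalloc((void **)&d_in, n * sizeof(float)) != cudaSuccess) return -1.0f;
    if (cudaMalloc((void **)&d_out, blocks * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return -1.0f;
    }
    cudaMemcpy(d_in, host_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    block_sum<<<blocks, BLOCK>>>(d_in, d_out, n);

    // Copy back one partial sum per block and finish the reduction on the host.
    std::vector<float> partial(blocks);
    cudaError_t err = cudaMemcpy(partial.data(), d_out, blocks * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1.0f;

    float total = 0.0f;
    for (int b = 0; b < blocks; ++b) total += partial[b];
    return total;
}
```

The kernel launches many more threads than there are scalar cores; that oversubscription is what lets the scheduler switch to other warps while one warp waits on a global memory load.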

  • Compute capability: 1.3 (SM_13)
  • Execution model: Warp-synchronous SIMT with scalar FP32 and FP64 pipelines
  • SM-visible memory model:

    • Per-TPC L1 texture/read cache
    • Per-memory-partition L2 cache (read-only, serving the texture path)
    • Explicit shared memory
  • Memory support: GDDR3 on wide buses (448-bit or 512-bit, depending on the board)
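
A program that depends on the capabilities listed above would typically check the compute capability at run time. The following is a sketch of such a check; `cudaGetDeviceProperties` and the `major`/`minor` fields are part of the CUDA runtime API, while the helper names `has_native_fp64` and `device_has_native_fp64` are hypothetical.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: true when (major, minor) is compute capability 1.3 or
// newer, i.e. the level at which native FP64 first appears.
bool has_native_fp64(int major, int minor)
{
    return major > 1 || (major == 1 && minor >= 3);
}

// Hypothetical helper: queries the given device through the CUDA runtime.
// Returns false if the query itself fails.
bool device_has_native_fp64(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
        return false;
    return has_native_fp64(prop.major, prop.minor);
}
```

Separating the pure comparison from the runtime query keeps the version logic checkable on a machine without a GPU.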

CUDA / ISA feature set (SM_13)

  • Adds native FP64 (double-precision) arithmetic

    • add.f64, mul.f64, fma.f64, etc.
  • Retains all SM_12 features:

    • Shared memory atomics
    • 64-bit global atomics
    • Warp vote (__any, __all; __ballot requires SM_20)
    • Relaxed memory coalescing
  • First CUDA compute capability to execute double precision in hardware; when targeting SM_10 to SM_12 the compiler demotes double to float

  • Architectural significance: first NVIDIA SM suitable for serious FP64 CUDA and early HPC workloads
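
As a sketch of the kind of double-precision code this feature set enables, the kernel below computes y = a*x + y using the `fma()` device function, which the compiler can lower to a PTX `fma.f64` instruction. The names `daxpy` and `daxpy_on_device` and the launch configuration are assumptions made for this example. On toolkits that still support this target, it has to be requested explicitly (for example `nvcc -arch=sm_13`); otherwise doubles may be demoted to float.

```cuda
#include <cuda_runtime.h>
#include <vector>

// y[i] = a * x[i] + y[i] in double precision, one element per thread.
__global__ void daxpy(int n, double a, const double *x, double *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = fma(a, x[i], y[i]);   // fused multiply-add with a single rounding
}

// Hypothetical host wrapper: updates `y` in place, returns false on any CUDA error.
bool daxpy_on_device(double a, const std::vector<double> &x, std::vector<double> &y)
{
    if (x.size() != y.size()) return false;
    int n = (int)x.size();
    if (n == 0) return true;
    size_t bytes = n * sizeof(double);

    double *dx = 0, *dy = 0;
    if (cudaMalloc((void **)&dx, bytes) != cudaSuccess) return false;
    if (cudaMalloc((void **)&dy, bytes) != cudaSuccess) {
        cudaFree(dx);
        return false;
    }
    cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 128;                       // illustrative block size
    daxpy<<<(n + threads - 1) / threads, threads>>>(n, a, dx, dy);

    // The copy back also surfaces errors from the kernel launch.
    cudaError_t err = cudaMemcpy(y.data(), dy, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dx);
    cudaFree(dy);
    return err == cudaSuccess;
}
```

A caller would fill two equally sized vectors, call `daxpy_on_device(a, x, y)`, and read the result from `y`.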
