Nvidia Tesla GT200 (SM_13)

Socket: Empty
Features: Nvidia SM_13
Memory ctrl.: GDDR3

Notes

Tesla GT200 is the high-end, compute-oriented culmination of the Tesla architecture. At the SM level, it keeps the execution and memory hierarchy of earlier Tesla designs but adds native double-precision floating-point execution: each SM carries a dedicated FP64 unit alongside its eight scalar FP32 cores.

As in earlier Tesla parts, there is no general-purpose per-SM data cache, so performance depends on memory bandwidth, warp-level latency hiding, and explicit shared memory management.
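To illustrate what explicit shared memory management means in practice, here is a minimal sketch of a block-level sum that stages data in shared memory by hand. The names `block_sum`, `sum_on_gpu`, and the `BLOCK` size are hypothetical, chosen for illustration only, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define BLOCK 256  // hypothetical block size; must be a power of two

// Each block copies BLOCK inputs into shared memory, then tree-reduces them.
// With no data cache to lean on, the staging step is written out explicitly.
__global__ void block_sum(const float *in, float *partial, int n)
{
    __shared__ float tile[BLOCK];
    int gid = blockIdx.x * blockDim.x + threadIdx.x;

    // Consecutive threads read consecutive addresses, which coalesces well.
    tile[threadIdx.x] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();

    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride)
            tile[threadIdx.x] += tile[threadIdx.x + stride];
        __syncthreads();  // every thread reaches this barrier on each pass
    }
    if (threadIdx.x == 0)
        partial[blockIdx.x] = tile[0];
}

// Host helper (assumption: error checks left out to keep the sketch short).
float sum_on_gpu(const std::vector<float> &host_in)
{
    int n = (int)host_in.size();
    if (n == 0) return 0.0f;
    int blocks = (n + BLOCK - 1) / BLOCK;

    float *d_in = 0, *d_partial = 0;
    cudaMalloc((void **)&d_in, n * sizeof(float));
    cudaMalloc((void **)&d_partial, blocks * sizeof(float));
    cudaMemcpy(d_in, &host_in[0], n * sizeof(float), cudaMemcpyHostToDevice);

    block_sum<<<blocks, BLOCK>>>(d_in, d_partial, n);

    std::vector<float> partial(blocks);
    cudaMemcpy(&partial[0], d_partial, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_partial);

    float total = 0.0f;  // finish the per-block partial sums on the host
    for (int i = 0; i < blocks; ++i) total += partial[i];
    return total;
}
```

The tile uses 1 KB of shared memory per block, well inside the 16 KB per SM available on compute capability 1.x devices, so several blocks can stay resident to hide memory latency.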

Compute capability: 1.3 (SM_13)
Execution model: Warp-synchronous SIMT with scalar FP32 and FP64 pipelines
SM-visible memory model:
- Per-TPC L1 texture/read cache
- Per memory-partition L2 cache
- Explicit shared memory
Memory support: GDDR3 (typically 448-bit or 512-bit buses)
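A program can check at run time whether a device reaches compute capability 1.3 before taking a double-precision path. The sketch below uses the CUDA runtime's `cudaGetDeviceCount` and `cudaGetDeviceProperties`. The helper names `supports_native_fp64` and `report_devices` are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Native FP64 first appears at compute capability 1.3 (SM_13).
bool supports_native_fp64(int major, int minor)
{
    return major > 1 || (major == 1 && minor >= 3);
}

// Prints one line per CUDA device and returns how many devices were found.
int report_devices()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess)
        return 0;  // no driver or no device: report nothing

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
            continue;
        printf("%s: compute capability %d.%d, %lu bytes shared memory per block, FP64 %s\n",
               prop.name, prop.major, prop.minor,
               (unsigned long)prop.sharedMemPerBlock,
               supports_native_fp64(prop.major, prop.minor) ? "yes" : "no");
    }
    return count;
}
```

On a GT200 board the reported capability should be 1.3. Earlier Tesla parts report 1.0 to 1.2 and fail the check.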

CUDA / ISA feature set (SM_13)

Adds native FP64 (double-precision) arithmetic
- add.f64, mul.f64, fma.f64, etc.
Retains all SM_12 features:
- Shared memory atomics
- 64-bit global atomics
- Warp vote (any/all; ballot arrived later, with SM_20)
- Relaxed memory coalescing
Architectural significance: first CUDA architecture with hardware FP64, and so the first NVIDIA SM practical for double-precision CUDA and early HPC workloads
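As a sketch of the FP64 path listed above, the kernel below computes y = a*x + y in double precision. The `fma` call on doubles is expected to compile to a fused multiply-add such as `fma.f64`, though the exact instruction selection is up to the compiler. The names `daxpy` and `daxpy_on_gpu` are hypothetical, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// One thread per element; fma() on doubles is a single fused operation.
__global__ void daxpy(int n, double a, const double *x, double *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = fma(a, x[i], y[i]);
}

// Host helper (assumption: x and y have the same length; no error checks).
void daxpy_on_gpu(double a, const std::vector<double> &x, std::vector<double> &y)
{
    int n = (int)x.size();
    if (n == 0) return;
    size_t bytes = n * sizeof(double);

    double *d_x = 0, *d_y = 0;
    cudaMalloc((void **)&d_x, bytes);
    cudaMalloc((void **)&d_y, bytes);
    cudaMemcpy(d_x, &x[0], bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, &y[0], bytes, cudaMemcpyHostToDevice);

    int threads = 128;  // hypothetical launch configuration
    int blocks = (n + threads - 1) / threads;
    daxpy<<<blocks, threads>>>(n, a, d_x, d_y);

    cudaMemcpy(&y[0], d_y, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
}
```

To target GT200 itself, a toolkit old enough to still support SM_1x is needed, built with `nvcc -arch=sm_13`. With a lower target, compilers of that era demoted `double` to `float`.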
