Architecture |
Ampere |
GPU memory |
24GB HBM2 |
Memory Bandwidth |
933GB/s |
Peak FP64 |
5.2 TF |
Peak FP64 Tensor Core |
10.3 TF |
Peak FP32 |
10.3 TF |
TF32 Tensor Core |
82 TF || 165 TF with sparsity |
BFLOAT 16 Tensor Core |
165 TF || 330 TF with sparsity |
Peak FP16 Tensor Core |
165 TF || 330 TF with sparsity |
Peak INT8 Tensor Core |
330 TOPS || 661 TOPS with sparsity |
Peak INT4 Tensor Core |
661 TOPS || 1321 TOPS with sparsity |
Media Engines |
1 * optical flow accelerator (OFA) |
|
1 * JPEC decoder (NVJPEG) |
|
4 * video decoders (NVDEC) |
Interconnect |
PCIe Gen4: 64GB/s |
|
3rd Gen NVIDIA® NVLINK® |
|
200GB/s** |
|
**NVLink Bridge for up to two GPUs |
Form Factor |
2-slot, FH, FL |
Max Thermal Design Power (TDP) |
165W |
Multi-Instance GPU (MIG) |
4 MIGs @ 6GB each |
|
2 MIGs @ 12GB each |
|
1 MIG @ 24GB |
Virtual GPU (vGPU) Software Support |
NVIDIA AI Enterprise |
|
NVIDIA Virtual Compute Server |