Architecture | Ampere |
Process Size | 7nm | TSMC |
Transistors | 54.2 Billion |
Die Size | 826mm² |
Peak FP64 | 9.9 TFLOPS |
Peak FP64 Tensor Core | 19.1 TFLOPS | Sparsity |
Peak FP32 | 19.9 TFLOPS |
TF32 Tensor Core | 159 TFLOPS | Sparsity |
Peak FP16 Tensor Core | 318.5 TFLOPS | Sparsity |
Peak INT8 Tensor Core | 637 TFLOPS | Sparsity |
Multi-instance GPU Support | 7 MIGs at 10GB/each |
3 MIGs at 20GB/each | |
2 MIGs at 40GB/each | |
1 MIG at 80GB | |
GPU Memory | 80GB HBM2e |
Memory Bandwidth | 2039GB/s |
Interconnect | PCIe Gen4 (x16 Physical, x8 Electrical) |
NVLink Bridge | |
Networking | 2 * 100Gbps ports, Ethernet or InifiniBand |
Maximum Power Consumption | 300W |
NVLink | 3rd Gen |
600GB/s Bi-directional | |
Media Engines | 1 * Optical Flow Accelerator (OFA) |
1 * JPEG Decoder (NVJPEG) | |
4 * Video Decoders (NVDEC) | |
Integrated DPU | NVIDIA BlueField-2 |
Implements NVIDIA ConnectX-6 DX functionality | |
8 Arm A72 Cores at 2 GHz | |
Implements PCIe Gen4 Switch | |
NVIDIA Enterprise Software | NVIDIA vCS (Virtual Compute Server) |
NVIDIA AI Enterprise | |
Form Factor | 2-slot, FH, FL |
Thermal Solution | Passive |