NVIDIA A100 TENSOR CORE GPU 80GB SXM

SKU: NVIDIA-A100
Special Price $24,999.00 Regular Price $75,000.00
In stock
Free shipping: could be yours in 1–5 days

Need Help Making a Decision?

Ask an expert to help you with your purchase today.

Ask An Expert → +1 833 631 7912 (North America)
The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration at every scale for AI, data analytics, and HPC to tackle the world’s toughest computing challenges. As the engine of the NVIDIA data center platform, A100 can efficiently scale up to thousands of GPUs or, using new Multi-Instance GPU (MIG) technology, can be partitioned into seven isolated GPU instances to accelerate workloads of all sizes. A100’s third-generation Tensor Core technology now accelerates more levels of precision for diverse workloads, speeding time to insight as well as time to market.
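
As a rough illustration of how those isolated instances appear to software, here is a minimal sketch using the pynvml bindings (nvidia-ml-py). It assumes an administrator has already enabled MIG mode and created instances (for example with nvidia-smi), and that the A100 sits at device index 0.

    # Minimal sketch, assuming MIG mode and instances were already
    # configured by an administrator (e.g. `nvidia-smi -i 0 -mig 1`).
    import pynvml

    pynvml.nvmlInit()
    gpu = pynvml.nvmlDeviceGetHandleByIndex(0)  # assumed: A100 at index 0

    current_mode, _pending = pynvml.nvmlDeviceGetMigMode(gpu)
    if current_mode == pynvml.NVML_DEVICE_MIG_ENABLE:
        for i in range(pynvml.nvmlDeviceGetMaxMigDeviceCount(gpu)):
            try:
                mig = pynvml.nvmlDeviceGetMigDeviceHandleByIndex(gpu, i)
            except pynvml.NVMLError:
                continue  # this MIG slot is not populated
            mem = pynvml.nvmlDeviceGetMemoryInfo(mig)
            print(f"MIG instance {i}: {mem.total / 1024**3:.0f} GiB")

    pynvml.nvmlShutdown()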

 

Details

An Order-of-Magnitude Leap for Accelerated Computing

Tap into exceptional performance, scalability, and security for every workload with the NVIDIA H100 Tensor Core GPU. With the NVIDIA NVLink™ Switch System, up to 256 H100 GPUs can be connected to accelerate exascale workloads. The GPU also includes a dedicated Transformer Engine to solve trillion-parameter language models. The H100’s combined technology innovations can speed up large language models (LLMs) by an incredible 30X over the previous generation to deliver industry-leading conversational AI.

 
Supercharge Large Language Model Inference with H100 NVL

For LLMs up to 175 billion parameters, the PCIe-based H100 NVL with NVLink bridge utilizes Transformer Engine, NVLink, and 188GB HBM3 memory to provide optimum performance and easy scaling across any data center, bringing LLMs to the mainstream. Servers equipped with H100 NVL GPUs increase GPT-175B model performance up to 12X over NVIDIA DGX™ A100 systems while maintaining low latency in power-constrained data center environments.
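
A back-of-the-envelope sketch (ours, not NVIDIA's) of why memory capacity and low-precision formats matter at this scale: weight storage alone is parameter count times bytes per parameter, before activations and KV cache are counted.

    # Rough weight-memory arithmetic for a 175B-parameter model.
    params = 175e9
    for fmt, bytes_per_param in (("FP16", 2), ("FP8", 1)):
        weights_gb = params * bytes_per_param / 1e9
        fits = "fits within" if weights_gb < 188 else "exceeds"
        print(f"{fmt}: ~{weights_gb:.0f} GB of weights ({fits} 188 GB HBM3)")
    # FP16: ~350 GB (exceeds 188 GB); FP8: ~175 GB (fits, before overheads)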

Ready for Enterprise AI?

Enterprise adoption of AI is now mainstream, and organizations need end-to-end, AI-ready infrastructure that will accelerate them into this new era. 

NVIDIA H100 GPUs for mainstream servers come with a five-year subscription, including enterprise support, to the NVIDIA AI Enterprise software suite, simplifying AI adoption with the highest performance. This ensures organizations have access to the AI frameworks and tools they need to build H100-accelerated AI workflows such as AI chatbots, recommendation engines, vision AI, and more.

Securely Accelerate Workloads From Enterprise to Exascale

Up to 4X Higher AI Training on GPT-3

Projected performance subject to change. GPT-3 175B training, A100 cluster: HDR IB network; H100 cluster: NDR IB network. Mixture of Experts (MoE) training, Transformer Switch-XXL variant with 395B parameters on a 1T-token dataset; A100 cluster: HDR IB network; H100 cluster: NDR IB network with NVLink Switch System where indicated.

Transformational AI Training

H100 features fourth-generation Tensor Cores and a Transformer Engine with FP8 precision that provides up to 4X faster training over the prior generation for GPT-3 (175B) models. The combination of fourth-generation NVLink, which offers 900 gigabytes per second (GB/s) of GPU-to-GPU interconnect; NDR Quantum-2 InfiniBand networking, which accelerates communication by every GPU across nodes; PCIe Gen5; and NVIDIA Magnum IO™ software delivers efficient scalability from small enterprise systems to massive, unified GPU clusters.
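
A minimal sketch of what FP8 training looks like in code, using the Transformer Engine PyTorch API (transformer_engine.pytorch); the layer sizes here are arbitrary, and recipe arguments vary by library version.

    # Hedged sketch: one FP8 linear layer via Transformer Engine.
    import torch
    import transformer_engine.pytorch as te
    from transformer_engine.common.recipe import DelayedScaling, Format

    layer = te.Linear(4096, 4096).cuda()        # drop-in for nn.Linear
    fp8_recipe = DelayedScaling(fp8_format=Format.HYBRID)  # E4M3 fwd, E5M2 bwd

    x = torch.randn(8, 4096, device="cuda")
    with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
        y = layer(x)                            # matmul runs on FP8 Tensor Cores
    y.sum().backward()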

Deploying H100 GPUs at data center scale delivers outstanding performance and brings the next generation of exascale high-performance computing (HPC) and trillion-parameter AI within the reach of all researchers.

 

Real-Time Deep Learning Inference

AI solves a wide array of business challenges, using an equally wide array of neural networks. A great AI inference accelerator has to not only deliver the highest performance but also the versatility to accelerate these networks.

H100 extends NVIDIA’s market-leading position in inference with several advancements that accelerate inference by up to 30X and deliver the lowest latency. Fourth-generation Tensor Cores speed up all precisions, including FP64, TF32, FP32, FP16, INT8, and now FP8, to reduce memory usage and increase performance while still maintaining accuracy for LLMs.

Up to 30X higher AI inference performance on the largest models

Megatron chatbot inference (530 billion parameters). Projected performance subject to change. Inference on a Megatron 530B-parameter-model-based chatbot for input sequence length = 128, output sequence length = 20; A100 cluster: HDR IB network; H100 cluster: NVLink Switch System, NDR IB.

Up to 7X higher performance for HPC applications

AI-fused HPC applications. Projected performance subject to change. 3D FFT (4K^3) throughput; A100 cluster: HDR IB network; H100 cluster: NVLink Switch System, NDR IB. Genome sequencing (Smith-Waterman): 1 A100 vs. 1 H100.

Exascale High-Performance Computing

The NVIDIA data center platform consistently delivers performance gains beyond Moore’s law. And H100’s new breakthrough AI capabilities further amplify the power of HPC+AI to accelerate time to discovery for scientists and researchers working on solving the world’s most important challenges.

H100 triples the floating-point operations per second (FLOPS) of double-precision Tensor Cores, delivering 60 teraflops of FP64 computing for HPC. AI-fused HPC applications can also leverage H100’s TF32 precision to achieve one petaflop of throughput for single-precision matrix-multiply operations, with zero code changes. 
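
The "zero code changes" claim refers to model code: in a framework such as PyTorch, TF32 is a global switch rather than a dtype change, as in this minimal sketch.

    # Minimal sketch: route FP32 matmuls through TF32 Tensor Cores in PyTorch.
    import torch

    torch.backends.cuda.matmul.allow_tf32 = True  # FP32 matmuls may use TF32
    torch.backends.cudnn.allow_tf32 = True        # cuDNN convolutions likewise

    a = torch.randn(4096, 4096, device="cuda")    # tensors stay plain FP32
    b = torch.randn(4096, 4096, device="cuda")
    c = a @ b                                     # executes on Tensor Cores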

H100 also features new DPX instructions that deliver 7X higher performance over A100 and 40X speedups over CPUs on dynamic programming algorithms such as Smith-Waterman for DNA sequence alignment and protein alignment for protein structure prediction.

DPX instructions comparison: NVIDIA HGX™ H100 4-GPU vs. dual-socket 32-core Ice Lake CPU.
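
For context, the inner loop DPX instructions target is the max-plus recurrence at the heart of algorithms like Smith-Waterman; here is a plain-Python reference version (scoring parameters are illustrative).

    # Smith-Waterman local alignment score: the dynamic-programming
    # pattern (max over a few sums per cell) that DPX accelerates.
    def smith_waterman(a: str, b: str, match=2, mismatch=-1, gap=-2) -> int:
        H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        best = 0
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                diag = H[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
                H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
                best = max(best, H[i][j])
        return best

    print(smith_waterman("GATTACA", "GCATGCU"))  # -> 4 with these scores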

Accelerated Data Analytics

Data analytics often consumes the majority of time in AI application development. Since large datasets are scattered across multiple servers, scale-out solutions with commodity CPU-only servers get bogged down by a lack of scalable computing performance.

Accelerated servers with H100 deliver the compute power—along with 3 terabytes per second (TB/s) of memory bandwidth per GPU and scalability with NVLink and NVSwitch™—to tackle data analytics with high performance and scale to support massive datasets. Combined with NVIDIA Quantum-2 InfiniBand, Magnum IO software, GPU-accelerated Spark 3.0, and NVIDIA RAPIDS™, the NVIDIA data center platform is uniquely able to accelerate these huge workloads with higher performance and efficiency.
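
As a flavor of what this looks like in practice, here is a minimal RAPIDS cuDF sketch; the file name and column names are invented for illustration.

    # Hedged sketch: pandas-style analytics executed on the GPU with cuDF.
    import cudf

    df = cudf.read_parquet("events.parquet")     # hypothetical dataset
    top = (
        df.groupby("user_id")["latency_ms"]      # hypothetical columns
          .mean()
          .sort_values(ascending=False)
          .head(10)
    )
    print(top)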

Enterprise-Ready Utilization

IT managers seek to maximize utilization (both peak and average) of compute resources in the data center. They often employ dynamic reconfiguration of compute to right-size resources for the workloads in use. 

H100 with MIG lets infrastructure managers standardize their GPU-accelerated infrastructure while retaining the flexibility to provision GPU resources at finer granularity, securely giving developers the right amount of accelerated compute and optimizing usage of every GPU resource.
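
A toy sketch of the right-sizing idea: the helper below is hypothetical (real provisioning happens through nvidia-smi mig, NVML, or a Kubernetes device plugin), and the profile table follows the A100 80GB MIG profiles.

    # Hypothetical helper: pick the smallest MIG profile that fits a workload.
    A100_80GB_PROFILES = {        # profile name -> (compute slices, memory GiB)
        "1g.10gb": (1, 10),
        "2g.20gb": (2, 20),
        "3g.40gb": (3, 40),
        "4g.40gb": (4, 40),
        "7g.80gb": (7, 80),
    }

    def smallest_fitting_profile(required_gib: float) -> str:
        for name, (_slices, mem) in sorted(A100_80GB_PROFILES.items(),
                                           key=lambda kv: kv[1][1]):
            if mem >= required_gib:
                return name
        raise ValueError("workload needs more than one full GPU")

    print(smallest_fitting_profile(16))  # -> "2g.20gb"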

Built-In Confidential Computing

Traditional confidential computing solutions are CPU-based, which is too limiting for compute-intensive workloads such as AI at scale. NVIDIA Confidential Computing is a built-in security feature of the NVIDIA Hopper™ architecture that made H100 the world’s first accelerator with these capabilities. With NVIDIA Blackwell, customers can increase performance exponentially while protecting the confidentiality and integrity of data and applications in use, unlocking data insights like never before. Customers can now use a hardware-based trusted execution environment (TEE) that secures and isolates the entire workload in the most performant way.


Exceptional Performance for Large-Scale AI and HPC

The Hopper Tensor Core GPU will power the NVIDIA Grace Hopper CPU+GPU architecture, purpose-built for terabyte-scale accelerated computing and providing 10X higher performance on large-model AI and HPC. The NVIDIA Grace CPU leverages the flexibility of the Arm® architecture to create a CPU and server architecture designed from the ground up for accelerated computing. The Hopper GPU is paired with the Grace CPU using NVIDIA’s ultra-fast chip-to-chip interconnect, delivering 900GB/s of bandwidth, 7X faster than PCIe Gen5. This innovative design will deliver up to 30X higher aggregate system memory bandwidth to the GPU compared to today's fastest servers and up to 10X higher performance for applications running terabytes of data.
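
A quick arithmetic check of the 7X figure, using the commonly cited ~128 GB/s for a PCIe Gen5 x16 link:

    # NVLink-C2C vs. PCIe Gen5 x16 (approximate, bidirectional figures).
    nvlink_c2c_gb_s = 900
    pcie_gen5_x16_gb_s = 128
    print(f"{nvlink_c2c_gb_s / pcie_gen5_x16_gb_s:.1f}x")  # -> 7.0x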

Tech Specs

Product Specifications

 

Form Factor: H100 PCIe
FP64: 9.7 teraFLOPS
FP64 Tensor Core: 19.5 teraFLOPS
FP32: 19.5 teraFLOPS
TF32 Tensor Core: 156 teraFLOPS | 312 teraFLOPS*
BFLOAT16 Tensor Core: 312 teraFLOPS | 624 teraFLOPS*
FP16 Tensor Core: 312 teraFLOPS | 624 teraFLOPS*
FP8 Tensor Core: 3,026 teraFLOPS*
INT8 Tensor Core: 624 TOPS | 1,248 TOPS*
GPU Memory: 80GB HBM2e
GPU Memory Bandwidth: 1,935 GB/s
Decoders: 7 NVDEC, 7 JPEG
Max Thermal Design Power (TDP): 300W
Multi-Instance GPU: Up to 7 MIGs @ 10GB each
Form Factor: PCIe dual-slot air-cooled or single-slot liquid-cooled
Interconnect: NVIDIA® NVLink® Bridge for 2 GPUs: 600 GB/s**; PCIe Gen4: 64 GB/s
Server Options: Partner and NVIDIA-Certified Systems with 1–8 GPUs
NVIDIA AI Enterprise: Included

* With sparsity.