GPU
A GPU (graphics processing unit) has thousands of small cores optimized to run the same operation across many values in parallel, which is ideal for the matrix operations used in deep learning.
While a CPU excels at varied sequential tasks, GPUs are the standard hardware for model training and much of cloud inference. A TPU is Google’s specialized variant for tensor workloads. For users of ChatGPT, the hardware is mostly invisible, but without GPU clusters, LLM systems would be slower and more expensive.
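As a minimal sketch of why matrix operations suit this hardware, the snippet below uses NumPy on the CPU: every output cell of a matrix product is an independent multiply-accumulate, so all cells can in principle be computed in parallel. GPU frameworks (e.g. PyTorch) run the same call on a GPU; the specific matrix sizes here are illustrative only.

```python
import numpy as np

# Each cell of c is an independent dot product of one row of a with one
# column of b. Because no cell depends on another, all of them can run
# in parallel -- the pattern GPU cores exploit.
a = np.arange(6, dtype=np.float32).reshape(2, 3)  # 2x3 matrix
b = np.ones((3, 4), dtype=np.float32)             # 3x4 matrix
c = a @ b                                         # 2x4 result

print(c.shape)  # (2, 4)
```

On a GPU, the identical expression is dispatched to thousands of cores at once, which is why the same deep-learning code runs orders of magnitude faster there.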
Key characteristics
- Is the standard hardware for training and inference of most neural networks because of massive parallelism.
- Directly affects cost, speed, and scalability in compute-heavy AI projects.
- Is mostly invisible to end users but central to what AI services can deliver in practice.