Neural Networks
Neural networks loosely mimic biological neurons: each node computes a weighted sum of its inputs, the result passes through an activation function, and the output feeds the next layer. During training, gradients of the error are propagated backward through the layers to update the weights (backpropagation).
Networks with many layers are the basis of deep learning. Large-scale training on GPUs enabled today's large language models (LLMs). An important architecture to know is the transformer.
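The weighted sums, activations, and backward gradient flow described above can be sketched in a few lines. Below is a minimal, illustrative example: a tiny 2-4-1 network trained on XOR with hand-written backpropagation. The layer sizes, learning rate, and iteration count are arbitrary choices for the sketch, not prescriptions.

```python
import numpy as np

# Illustrative sketch: a 2-4-1 network trained on XOR via manual
# backpropagation. Hyperparameters here are arbitrary example values.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    # Each node weights its inputs, then applies an activation function.
    h = sigmoid(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

_, out = forward(X)
initial_loss = float(np.mean((out - y) ** 2))

lr = 1.0
for _ in range(5000):
    h, out = forward(X)
    # Backpropagation: push the error gradient backward, layer by layer.
    d_out = (out - y) * out * (1 - out)   # gradient at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)    # gradient at the hidden layer
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

_, out = forward(X)
final_loss = float(np.mean((out - y) ** 2))
print(initial_loss, final_loss)
```

Frameworks such as PyTorch or JAX compute these gradients automatically; writing them out once makes the "propagated backward to reduce error" step concrete.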
Key characteristics
- Consist of layered nodes where weights are updated during training to reduce error.
- Are the core building block behind deep learning and many modern language, image, and audio models.
- Become more powerful with more data and compute, but also harder to fully interpret and control.