Attention Mechanism

A neural network component that allows a model to focus on the most relevant parts of an input sequence when producing an output, enabling context-aware processing of data.

The attention mechanism was introduced in 2014 (Bahdanau et al.) for machine translation and became the core of the transformer architecture in 2017 ("Attention Is All You Need"). Self-attention allows every token in a sequence to attend to every other token, capturing long-range dependencies that earlier architectures such as RNNs struggled with. Multi-head attention runs several attention functions in parallel, enabling the model to capture different types of relationships simultaneously. Attention is the key architectural innovation behind virtually all modern LLMs.
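To make the mechanics concrete, here is a minimal NumPy sketch of scaled dot-product self-attention and a multi-head wrapper. It is illustrative only: the random projection matrices stand in for the learned weights (W_Q, W_K, W_V) a trained model would use, and a real transformer would also apply a learned output projection after concatenating the heads.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq_len, seq_len): each query scored against every key
    weights = softmax(scores, axis=-1)  # rows sum to 1: how much token i attends to token j
    return weights @ V                  # output is a weighted sum of the value vectors

def multi_head_attention(X, num_heads, rng):
    """Split the model dimension across `num_heads` independent attention heads."""
    seq_len, d_model = X.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    outputs = []
    for _ in range(num_heads):
        # Random projections stand in for the learned W_Q, W_K, W_V of a trained model.
        W_q, W_k, W_v = (rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
                         for _ in range(3))
        outputs.append(scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v))
    return np.concatenate(outputs, axis=-1)  # head outputs concatenated back to d_model

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))   # 5 tokens, model dimension 8
out = multi_head_attention(X, num_heads=2, rng=rng)
print(out.shape)                  # (5, 8)
```

Because every token's query is compared against every other token's key, the attention weights connect arbitrarily distant positions in a single step, which is how self-attention sidesteps the sequential bottleneck of recurrent architectures.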
