What is Attention?

Short Answer

A mechanism that allows models to focus on relevant parts of input when generating output.

Attention is a key mechanism in transformer-based language models that allows the model to focus on different parts of the input sequence when generating each token of the output. This enables the model to understand context and relationships between different parts of the text.

Types of attention include:

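Self-attention: Each token attends to the other tokens in the same sequence
Cross-attention: Tokens in one sequence attend to the tokens of a second sequence (for example, a decoder attending to encoder output)
Multi-head attention: Several attention heads run in parallel, each with its own learned projections, and their outputs are combined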
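
The three variants differ mainly in where the queries, keys, and values come from. The sketch below is a simplified illustration in plain Python, not a production implementation. The helper names attend and multi_head and the toy vectors are made up for this example. It also assumes no learned projection matrices: the multi-head version just slices each vector into equal parts, whereas real transformer layers give each head its own learned projections.

```python
import math

def attend(queries, keys, values):
    """Scaled dot-product attention on plain lists of vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Score this query against every key, then normalize with softmax.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # The output is the weighted average of the value vectors.
        outputs.append([sum(w * v[c] for w, v in zip(weights, values))
                        for c in range(len(values[0]))])
    return outputs

def multi_head(queries, keys, values, num_heads):
    """Simplified multi-head attention: slice vectors, attend per slice, concatenate.

    Assumes the vector length is divisible by num_heads.
    """
    size = len(queries[0]) // num_heads
    heads = []
    for h in range(num_heads):
        lo, hi = h * size, (h + 1) * size
        heads.append(attend([q[lo:hi] for q in queries],
                            [k[lo:hi] for k in keys],
                            [v[lo:hi] for v in values]))
    # Concatenate the per-head outputs position by position.
    return [sum((head[i] for head in heads), []) for i in range(len(queries))]

# Toy vectors (made-up numbers, not real embeddings).
source = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.0]]
target = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0], [0.5, 0.5, 0.5, 0.5]]

self_out = attend(target, target, target)          # self-attention: one sequence
cross_out = attend(target, source, source)         # cross-attention: two sequences
multi_out = multi_head(target, target, target, 2)  # two heads over one sequence
```

In every case the output has one vector per query position. Only the source of the keys and values changes between self-attention and cross-attention.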

⚙️ Technical Details

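Attention mechanisms calculate a score for every pair of tokens. In the common scaled dot-product form, each token's query vector is compared with every token's key vector, the scores are scaled and passed through a softmax so that they sum to 1, and the resulting weights are used to take a weighted average of the value vectors. This lets the model weight the importance of different input tokens when generating output.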
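
As a rough illustration of the pairwise scoring described above, here is a minimal sketch of scaled dot-product attention using only the Python standard library. The function names and the toy token vectors are assumptions made for this example. The sketch also uses the same vectors as queries, keys, and values, whereas a real model would first apply learned projections to produce each of them.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Returns (outputs, weights), where weights[i][j] is how much
    query i attends to key j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # One score per (query, key) pair, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [sum(w_j * v[c] for w_j, v in zip(w, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights

# Three toy token vectors (made-up numbers, not real embeddings).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)

for i, row in enumerate(weights):
    print(f"token {i} attends with weights", [round(x, 3) for x in row])
```

Each row of weights sums to 1, and a token gives more weight to tokens whose key vectors point in a similar direction to its own query vector.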

Related Terms

Transformer

Token