A mechanism that allows models to focus on relevant parts of input when generating output.
Attention is a core mechanism in transformer-based language models. For each output token it generates, the model can focus on different parts of the input sequence, which lets it capture context and relationships between different parts of the text.
Types of attention include:
Attention mechanisms compute a score for every pair of tokens, then normalize those scores into weights that determine how much each input token contributes when the model generates output.
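The pairwise scoring described above can be illustrated with a minimal sketch. It assumes scaled dot-product attention (dot-product scores divided by the square root of the vector size, then a softmax), which is the variant commonly used in transformers; the entry itself does not name a specific scoring function. The function names and the toy token vectors are hypothetical, and the learned query, key, and value projections of a real model are omitted for brevity.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # One score per (query, key) pair, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        # Normalize the scores into weights that sum to 1.
        w = softmax(scores)
        # Output is the weighted sum of the value vectors.
        out = [
            sum(wj * v[i] for wj, v in zip(w, values))
            for i in range(len(values[0]))
        ]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

# Toy self-attention: three tokens with made-up 2-dimensional embeddings.
# Queries, keys, and values are all the same vectors here (an assumption
# made to keep the sketch short).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` holds one token's attention over all tokens, so the full result is a 3 x 3 matrix of pairwise weights. Each row of `outputs` is the corresponding weighted mix of the value vectors.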