What is a Token?

Technical Glossary term: Token
Short Answer

The basic unit of text that AI models process (words, parts of words, or characters).

Tokens are the fundamental units of text that AI language models read and generate. A token can be a complete word, part of a word, or a single character, depending on the language and the tokenization method the model uses.

Token characteristics:

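  • Variable length: A token can be a whole word, part of a word, or a single character
  • Language dependent: The same sentence can split into a different number of tokens in different languages
  • Cost implications: API usage is typically metered and priced per token
  • Processing units: Models read input and generate output as sequences of tokens
  • Context limits: A model's maximum input and output length is measured in tokens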
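
To make the cost and context-limit points concrete, here is a minimal sketch of the arithmetic involved. The function names, the prices, and the 8,000-token limit are hypothetical values chosen for illustration, and the sketch assumes that input and output share one context window, which is common but not universal.

```python
def fits_context(prompt_tokens, max_output_tokens, context_limit):
    """Return True if the prompt plus the reserved output fits in the window.

    Assumption: input and output tokens count against the same limit.
    """
    return prompt_tokens + max_output_tokens <= context_limit


def estimate_cost(prompt_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Estimate the cost of one request from per-1,000-token prices.

    The prices are placeholders; real prices depend on the provider and model.
    """
    input_cost = prompt_tokens / 1000 * price_in_per_1k
    output_cost = output_tokens / 1000 * price_out_per_1k
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens, up to 500 output tokens.
    print(fits_context(2000, 500, context_limit=8000))
    print(estimate_cost(2000, 500, price_in_per_1k=0.5, price_out_per_1k=1.5))
```

Separate input and output prices are used here because the two are often billed at different rates; if a provider charges a single rate, pass the same value for both.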

⚙️ Technical Details

Tokenization varies by model and language. Common English words are often a single token, while longer or rarer words may be split into several tokens. Special characters and punctuation are typically separate tokens.
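
The splitting behavior described above can be illustrated with a toy tokenizer. This is a simplified sketch, not how any particular model tokenizes: the vocabulary `TOY_VOCAB` and the function `toy_tokenize` are made up for this example, and real tokenizers learn much larger vocabularies from data. It separates punctuation, then greedily matches the longest known piece, falling back to single characters.

```python
import re

# Hypothetical toy vocabulary; real models learn theirs from training data.
TOY_VOCAB = {"the", "cat", "token", "ization", "un", "believ", "able"}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text into tokens using greedy longest-match against a vocabulary."""
    tokens = []
    # Separate runs of word characters from individual punctuation marks.
    for piece in re.findall(r"\w+|[^\w\s]", text.lower()):
        start = 0
        while start < len(piece):
            # Try the longest remaining substring first; a single character
            # is always accepted so the loop cannot get stuck.
            for end in range(len(piece), start, -1):
                if piece[start:end] in vocab or end == start + 1:
                    tokens.append(piece[start:end])
                    start = end
                    break
    return tokens


if __name__ == "__main__":
    print(toy_tokenize("The cat!"))        # ['the', 'cat', '!']
    print(toy_tokenize("Tokenization"))    # ['token', 'ization']
    print(toy_tokenize("Unbelievable"))    # ['un', 'believ', 'able']
```

With this vocabulary, short common words stay whole, longer words break into several pieces, and the exclamation mark becomes its own token.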