The basic unit of text that AI models process (words, parts of words, or characters).
Tokens are the fundamental units of text that AI models process. A token can be a complete word, part of a word, or a single character, depending on the language and the tokenization method used.
Token characteristics:
Tokenization varies by model and language. In English, short common words are often a single token, while longer or less common words may be split into several tokens. Punctuation and special characters are typically tokenized separately.
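
The behavior described above can be illustrated with a minimal sketch. It assumes a tiny hand-written vocabulary (`VOCAB`, hypothetical) and a greedy longest-match rule. Real tokenizers, such as those based on BPE or WordPiece, learn much larger vocabularies from data and differ in detail, so treat this only as an illustration of how words, word parts, and punctuation can end up as separate tokens.

```python
import re

# Hypothetical toy vocabulary for illustration only;
# real models learn vocabularies with tens of thousands of entries.
VOCAB = {"the", "cat", "sat", "token", "ization", "un", "believ", "able"}


def tokenize(text):
    """Split text into tokens using greedy longest-match against VOCAB."""
    tokens = []
    # Separate runs of word characters from individual punctuation marks.
    for piece in re.findall(r"\w+|[^\w\s]", text.lower()):
        start = 0
        while start < len(piece):
            # Try the longest candidate first, shrinking until one matches.
            for end in range(len(piece), start, -1):
                if piece[start:end] in VOCAB:
                    break
            else:
                # Nothing matched: fall back to a single character.
                end = start + 1
            tokens.append(piece[start:end])
            start = end
    return tokens


print(tokenize("The cat sat."))
print(tokenize("Tokenization"))
print(tokenize("Unbelievable!"))
```

With this toy vocabulary, "The cat sat." yields four tokens (three whole words plus the period), "Tokenization" splits into "token" and "ization", and "Unbelievable!" splits into "un", "believ", "able", and "!". Text that matches nothing in the vocabulary falls back to one token per character.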