Hasty Briefs

Human typing habits and token counts

14 hours ago
  • #Typing Habits
  • #Tokenization
  • #AI Billing
  • Human typing habits like typos, shorthand, filler words, and pasted data increase token counts without changing intent, affecting billing.
  • Typos (e.g., swapped or dropped letters) and word variations (e.g., suffixes) cause tokenizers to split text differently, raising token counts.
  • Conversational padding (e.g., fillers, hedges, expressive habits) adds tokens that help tone but rarely aid task completion, impacting costs.
  • Shorthand forms (e.g., 'pls' for 'please') can be less token-efficient than standard words, contrary to keystroke-saving intentions.
  • Non-conversational elements like UUIDs, timestamps, and URLs significantly inflate token counts in work contexts, contributing to billing overhead.
  • Different tokenizers, such as OpenAI's and Claude's, produce different token counts for the same text, and counts also depend on the surrounding context.
  • Humans type to save keystrokes, while tokenizers bill by learned subword patterns; this mismatch makes everyday typing habits a real cost factor.
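The mechanism behind the first two bullets can be sketched with a toy greedy longest-match tokenizer. The vocabulary below is invented for illustration; real BPE tokenizers (e.g., OpenAI's tiktoken) learn much larger vocabularies from data, but the effect is the same: a standard word matches one learned piece, while a typo or shorthand falls through to smaller fragments.

```python
# Toy greedy longest-match tokenizer. VOCAB is a hypothetical, hand-picked
# set of subword pieces, chosen only to illustrate how typos and shorthand
# can split into more tokens than the standard spelling.
VOCAB = {"please", "ple", "ase", "pl", "p", "l", "e", "a", "s"}

def tokenize(text: str) -> list[str]:
    """Greedily match the longest vocabulary piece at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try longest piece first
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown char falls back to itself
            i += 1
    return tokens

print(tokenize("please"))  # the standard word matches a single vocab entry
print(tokenize("pelase"))  # a swapped-letter typo splits into several pieces
print(tokenize("pls"))     # shorthand isn't in the vocab, so it costs more
```

In this sketch "please" is one token, the typo "pelase" becomes four, and the keystroke-saving "pls" still costs two, matching the article's point that shorthand can be less token-efficient than the full word.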