How fast is N tokens per second really?

3 days ago
  • #Throughput Demo
  • #Token Visualization
  • #LLM Benchmark
  • Local LLM benchmarks measure throughput in tokens per second, but these numbers are hard to visualize without seeing the actual streaming speed.
  • The tool demonstrates four modes: code (syntax-highlighted pseudo-code), text (Lorem ipsum prose), think (reasoning sentences with code), and agent (tool calls with pauses).
  • Users can test speeds from 5 tok/s (Raspberry Pi class) up to 800 tok/s (Cerebras class), at which point visual perception, not generation speed, becomes the bottleneck.
  • Token density varies by content: code takes more tokens than prose for the same amount of visible text, so the same throughput feels different depending on what is being generated. The tool aims to illustrate this.
  • English prose averages about 1.3 tokens per word, so 30 tok/s works out to roughly 23 words per second (30 / 1.3 ≈ 23).
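
The conversion in the last point, and the idea of watching text arrive at a fixed rate, can be sketched in a few lines of Python. This is a minimal illustration and not the tool's actual code: the function names `words_per_second` and `stream` are hypothetical, the 1.3 tokens-per-word figure is the rough prose average quoted above, and the "tokens" are simply pre-split text chunks, not the output of a real tokenizer.

```python
import sys
import time

# Rough average for English prose, taken from the summary above (an assumption,
# not a property of any particular tokenizer).
TOKENS_PER_WORD = 1.3


def words_per_second(tok_per_s, tokens_per_word=TOKENS_PER_WORD):
    """Convert a tokens-per-second figure into approximate words per second."""
    return tok_per_s / tokens_per_word


def stream(tokens, tok_per_s, out=sys.stdout, sleep=time.sleep):
    """Write each chunk to `out`, pausing 1/tok_per_s seconds after each one."""
    delay = 1.0 / tok_per_s
    for tok in tokens:
        out.write(tok)
        out.flush()
        sleep(delay)


if __name__ == "__main__":
    # Print the conversion for the slow, mid, and fast rates mentioned above.
    for rate in (5, 30, 800):
        print(f"{rate} tok/s is about {words_per_second(rate):.1f} words/s")

    # Stand-in chunks; a real model would emit tokenizer-specific pieces.
    fake_tokens = ["Lorem", " ipsum", " dolor", " sit", " amet", "\n"]
    stream(fake_tokens, tok_per_s=5)
```

Changing `tok_per_s` in the final call gives a crude feel for the difference between the speed classes, though a terminal will not reproduce the syntax highlighting or the tool-call pauses of the demo's other modes.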