How fast is N tokens per second really?

3 days ago

Local LLM benchmarks measure throughput in tokens per second, but these numbers are hard to visualize without seeing the actual streaming speed.
The tool demonstrates four modes: code (syntax-highlighted pseudo-code), text (Lorem ipsum prose), think (reasoning sentences with code), and agent (tool calls with pauses).
Users can test speeds ranging from 5 tok/s (Raspberry Pi class) to 800 tok/s (Cerebras class); at the top of that range, visual perception rather than generation speed becomes the bottleneck.
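As a rough illustration of what streaming at a fixed rate means, the sketch below emits one token every 1/N seconds. It is not the tool's actual implementation: the names `stream_tokens` and `stream_duration` are hypothetical, tokens are assumed to be pre-split strings, and real generation is rarely this evenly paced.

```python
import sys
import time
from typing import Callable, Iterable


def _write_token(token: str) -> None:
    """Write a token to stdout immediately, without a trailing newline."""
    sys.stdout.write(token)
    sys.stdout.flush()


def stream_tokens(
    tokens: Iterable[str],
    tok_per_s: float,
    emit: Callable[[str], None] = _write_token,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Emit tokens at a constant rate (hypothetical helper, uniform pacing assumed)."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    delay = 1.0 / tok_per_s  # seconds between consecutive tokens
    for token in tokens:
        emit(token)
        sleep(delay)


def stream_duration(n_tokens: int, tok_per_s: float) -> float:
    """Seconds needed to stream n_tokens at a constant tok_per_s."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return n_tokens / tok_per_s
```

Calling `stream_tokens("the quick brown fox ".split(" "), 5)` prints one piece every 0.2 seconds; under the same assumption, a 100-token reply takes 20 seconds at 5 tok/s and an eighth of a second at 800 tok/s.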
Token density varies by content: code is more token-dense than prose, so the same throughput feels different depending on what is being generated. The tool aims to illustrate that difference.
English prose averages about 1.3 tokens per word, so 30 tok/s is roughly equivalent to 23 words per second.
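The conversion above is a single division, sketched here so other speeds can be checked. The 1.3 tokens-per-word ratio is the approximate figure quoted for English prose; it will differ by tokenizer and by content, and the function names are hypothetical.

```python
TOKENS_PER_WORD = 1.3  # approximate average for English prose, per the text above


def words_per_second(tok_per_s: float, tokens_per_word: float = TOKENS_PER_WORD) -> float:
    """Convert token throughput into an approximate word rate."""
    if tokens_per_word <= 0:
        raise ValueError("tokens_per_word must be positive")
    return tok_per_s / tokens_per_word


def words_per_minute(tok_per_s: float, tokens_per_word: float = TOKENS_PER_WORD) -> float:
    """Same conversion expressed per minute, for comparison with reading speeds."""
    return words_per_second(tok_per_s, tokens_per_word) * 60


if __name__ == "__main__":
    for speed in (5, 30, 800):
        print(f"{speed} tok/s is about {words_per_second(speed):.1f} words/s")
```

At 30 tok/s this gives about 23 words per second, matching the figure in the text.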
