How fast is N tokens per second really?
3 days ago
- #Throughput Demo
- #Token Visualization
- #LLM Benchmark
- Local LLM benchmarks measure throughput in tokens per second, but these numbers are hard to visualize without seeing the actual streaming speed.
- The tool demonstrates four modes: code (syntax-highlighted pseudo-code), text (Lorem ipsum prose), think (reasoning sentences with code), and agent (tool calls with pauses).
- Users can test speeds ranging from 5 tok/s (Raspberry Pi class) to 800 tok/s (Cerebras class); at the high end, visual perception rather than generation speed becomes the bottleneck.
- Token density varies by content: code is more token-dense than prose, so the same throughput feels different depending on what is being generated. The tool aims to illustrate this difference.
- English prose averages about 1.3 tokens per word, so 30 tok/s works out to roughly 23 words per second (30 / 1.3 ≈ 23).
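
The conversion in the last point, and the idea of watching text arrive at a fixed rate, can be sketched in a few lines of Python. This is an illustrative sketch and not the tool's actual code: the names `words_per_second` and `stream` are hypothetical, the 1.3 tokens-per-word figure is the rough average quoted above, and whitespace-split words stand in for real tokens.

```python
import sys
import time

# Rough average for English prose, taken from the summary above (an approximation).
TOKENS_PER_WORD = 1.3


def words_per_second(tokens_per_second, tokens_per_word=TOKENS_PER_WORD):
    """Convert a token throughput into an approximate word throughput."""
    return tokens_per_second / tokens_per_word


def stream(tokens, tokens_per_second, out=sys.stdout, sleep=time.sleep):
    """Write tokens one at a time, pausing 1/rate seconds after each one."""
    delay = 1.0 / tokens_per_second
    for token in tokens:
        out.write(token)
        out.flush()
        sleep(delay)


if __name__ == "__main__":
    # Word-rate equivalents for the slow, typical, and fast speeds mentioned above.
    for rate in (5, 30, 800):
        print(f"{rate:>3} tok/s is about {words_per_second(rate):.0f} words/s")

    # Words are used as stand-in tokens; a real tokenizer splits text differently.
    sample = "Local LLM benchmarks report throughput in tokens per second.".split()
    stream((word + " " for word in sample), tokens_per_second=5)
    print()
```

Changing `tokens_per_second` in the final `stream` call gives a crude terminal version of the comparison the tool makes. It does not model the difference between code and prose, since that depends on how a real tokenizer splits each kind of text.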