Can LLMs do randomness?
a year ago
- #randomness
- #bias
- #LLM
- LLMs were tested for randomness in coin tosses and number generation.
- All models showed a 'heads' bias in coin tosses, with the heads rate ranging from 8 to 49 percentage points above the expected 50%.
- Claude 3.7 Sonnet was the least biased in coin tosses (58% heads) and the only model without statistically significant bias.
- OpenAI models showed stronger heads bias than Claude in coin tosses.
- In number generation, most models had a strong odd-number bias, with Claude 3.7 Sonnet showing the strongest (97% odd numbers).
- GPT-4.5-preview was perfectly balanced in number generation (50/50 odd/even).
- Claude 3.7 Sonnet showed no statistically significant bias in coin tosses but was highly biased in number generation.
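
To make "statistically significant bias" concrete, here is a minimal sketch of one way such a check could be run: an exact two-sided binomial test against a fair 50/50 split. The summary does not say which test or how many trials per model were used, so the test choice, the function name `binomial_two_sided_p`, and the sample size of 100 are all assumptions for illustration only.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial test p-value.

    Sums the probabilities of every outcome that is no more likely
    than the observed one under the null hypothesis (success rate p).
    """
    pmf = [
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small relative tolerance so floating-point ties count as "equally likely".
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# ASSUMPTION: 100 trials per model. The source does not state the sample size.
HYPOTHETICAL_TRIALS = 100

# Rates taken from the summary, converted to counts under the assumed sample size.
cases = {
    "coin toss, 58% heads": 58,
    "number generation, 97% odd": 97,
    "number generation, 50% odd": 50,
}

for label, count in cases.items():
    p_value = binomial_two_sided_p(count, HYPOTHETICAL_TRIALS)
    verdict = "significant" if p_value < 0.05 else "not significant"
    print(f"{label}: p = {p_value:.3g} ({verdict} at the 0.05 level)")
```

Under this assumed sample size, a 58% heads rate stays above the conventional 0.05 threshold while a 97% odd rate falls far below it, which is consistent with the pattern reported above. With a larger number of trials the same 58% rate could become significant, so the conclusion depends on the real sample size.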