Learning Pseudorandom Numbers with Transformers
- #Pseudo-Random Number Generators
- #Curriculum Learning
- #Transformer Models
- Transformers can learn and predict sequences from complex pseudorandom number generators, specifically Permuted Congruential Generators (PCGs), even in regimes beyond published classical attacks (a minimal PCG sketch follows this list).
- Models can jointly learn multiple distinct PRNGs during a single training run and identify, in context, which output permutation generated a given sequence.
- A scaling law emerges: the number of in-context elements needed for near-perfect prediction grows as the square root of the modulus, √m (see the worked example below).
- Curriculum learning is critical for larger moduli (m ≥ 2^20): optimization stagnates unless training data from smaller moduli is included to bootstrap learning (a hypothetical schedule sketch appears below).
- Embedding analysis reveals a novel clustering phenomenon: the top principal components group integer inputs into clusters invariant under bitwise rotation, aiding representation transfer (see the rotation-class helper below).
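
To make the setup concrete, here is a minimal sketch of a PCG-style generator in Python: an LCG state update followed by an output permutation. The constants and the choice of the XSH-RR permutation are illustrative assumptions, not necessarily the exact configurations studied in the paper.

```python
MASK64 = (1 << 64) - 1

def pcg_xsh_rr(state: int, a: int = 6364136223846793005, c: int = 1442695040888963407):
    """One PCG step: advance a 64-bit LCG, emit a permuted 32-bit output."""
    new_state = (state * a + c) & MASK64                          # LCG transition mod 2^64
    xorshifted = (((state >> 18) ^ state) >> 27) & 0xFFFFFFFF    # XSH: xorshift of high bits
    rot = state >> 59                                             # top 5 bits pick the rotation
    output = ((xorshifted >> rot) | (xorshifted << (32 - rot))) & 0xFFFFFFFF  # RR: random rotate
    return new_state, output

# Generate a short sequence of the kind the model sees in context.
state, seq = 42, []
for _ in range(8):
    state, out = pcg_xsh_rr(state)
    seq.append(out)
print(seq)
```

The permutation (xorshift plus data-dependent rotation) is what distinguishes a PCG from a plain LCG: the raw state is never emitted, so the model must see through the scrambling to recover the underlying congruential structure.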
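The √m scaling law can be read as a back-of-the-envelope context budget. The snippet below, with an assumed proportionality constant of 1, illustrates how the required context grows with the modulus; the paper's empirical fits determine the actual prefactor.

```python
import math

def context_needed(m: int, c: float = 1.0) -> int:
    """Illustrative: in-context elements for near-perfect prediction ~ c * sqrt(m)."""
    return math.ceil(c * math.sqrt(m))

for k in (16, 20, 24):
    m = 2 ** k
    print(f"m = 2^{k}: ~{context_needed(m)} in-context elements")
# sqrt(2^k) = 2^(k/2), so each +4 in the exponent of m quadruples m
# but only doubles the context requirement.
```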
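A modulus curriculum could look like the following sketch: early training batches draw mostly from small moduli, with mass shifting toward the large target modulus over time. The stage boundaries and mixing weights here are hypothetical placeholders, not the paper's recipe.

```python
import random

MODULI = [2 ** 12, 2 ** 16, 2 ** 20]  # small moduli first, target last

def sample_modulus(step: int, total_steps: int) -> int:
    """Pick a modulus for this training batch according to a staged curriculum."""
    frac = step / total_steps
    if frac < 0.3:
        weights = [0.7, 0.2, 0.1]   # mostly small moduli early on
    elif frac < 0.6:
        weights = [0.2, 0.5, 0.3]   # shift toward intermediate sizes
    else:
        weights = [0.1, 0.2, 0.7]   # emphasize the target modulus late
    return random.choices(MODULI, weights=weights)[0]

print([sample_modulus(s, 1000) for s in (0, 500, 999)])
```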
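The reported clustering groups integers related by bitwise rotation of their binary representation. The helper below, an assumption about what "rotationally-invariant clusters" means operationally, computes a canonical representative for each rotation class; integers sharing a representative would land in the same embedding cluster.

```python
def rotations(x: int, bits: int) -> set:
    """All bitwise rotations of x within a fixed bit width."""
    mask = (1 << bits) - 1
    return {((x >> r) | (x << (bits - r))) & mask for r in range(bits)}

def rotation_class(x: int, bits: int) -> int:
    """Canonical representative: the minimum over all rotations of x."""
    return min(rotations(x, bits))

# Illustrative 8-bit example: all three integers are rotations of one another,
# so they share a class and would cluster together under the reported invariance.
for x in (0b00010001, 0b01000100, 0b00100010):
    print(f"{x:3d} -> class {rotation_class(x, 8)}")
```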