Learning Pseudorandom Numbers with Transformers

  • #Pseudo-Random Number Generators
  • #Curriculum Learning
  • #Transformer Models
  • Transformers can learn to predict sequences from permuted congruential generators (PCGs), a complex class of pseudorandom number generators, including instances beyond the reach of published classical attacks (a minimal generator sketch follows this list).
  • A single model can jointly learn multiple distinct PRNGs during training and identify the structure imposed by each generator's output permutation.
  • An empirical scaling law shows that the number of in-context sequence elements needed for near-perfect prediction grows as the square root of the modulus, √m (illustrated numerically below).
  • Curriculum learning is critical for larger moduli (m ≥ 2^20): training data from smaller moduli must be included to overcome optimization stagnation (a curriculum sketch follows this list).
  • Embedding analysis reveals a novel clustering phenomenon in which the top principal components group integer inputs into clusters invariant under bitwise rotation, aiding representation transfer (see the PCA sketch below).
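
For readers unfamiliar with PCGs, the sketch below shows a standard PCG32 (XSH-RR) generator: a 64-bit linear congruential state update followed by a bit-level output permutation (xorshift plus rotation). The constants and output width are those of the reference PCG32 design; the exact generator parameters studied in the paper are not given in this summary, so treat this purely as an illustration.

```python
# Minimal sketch of a PCG-style generator (PCG32, XSH-RR variant), for illustration only.
# The paper's exact parametrization (modulus, output width) may differ.

MASK64 = (1 << 64) - 1
MULT = 6364136223846793005  # standard PCG multiplier

def pcg32_xsh_rr(seed: int, inc: int, n: int) -> list[int]:
    """Generate n 32-bit outputs: an LCG state update plus a bit-permutation output step."""
    state = seed & MASK64
    inc = ((inc << 1) | 1) & MASK64          # increment must be odd
    out = []
    for _ in range(n):
        old = state
        state = (old * MULT + inc) & MASK64  # underlying LCG step, modulus 2^64
        xorshifted = (((old >> 18) ^ old) >> 27) & 0xFFFFFFFF
        rot = old >> 59
        out.append(((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & 0xFFFFFFFF)
    return out

print(pcg32_xsh_rr(seed=42, inc=54, n=5))
```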
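To get a feel for the √m scaling law, the toy calculation below assumes the required context length is roughly c·√m with c = 1; the actual constant is not given in this summary.

```python
import math

# Back-of-the-envelope for the reported sqrt(m) scaling: if near-perfect prediction
# needs about c * sqrt(m) in-context elements, then (with c = 1, an assumption):
for bits in (16, 20, 24, 28):
    m = 2 ** bits
    print(f"m = 2^{bits:<2} -> ~{math.isqrt(m):>5} in-context elements")
```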
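A curriculum over moduli could be set up roughly as follows; the modulus set, the linear ramp-up schedule, and the LCG parameter sampling are illustrative assumptions, not the paper's exact training recipe.

```python
import random

def sample_modulus(step: int, total_steps: int,
                   small=(2**8, 2**12, 2**16), target=2**20) -> int:
    """Curriculum sketch: early steps favor small moduli, later steps the target modulus.
    The linear schedule and modulus set are illustrative assumptions."""
    p_target = min(1.0, step / (0.5 * total_steps))  # ramp up probability of the target modulus
    return target if random.random() < p_target else random.choice(small)

def lcg_sequence(m: int, a: int, c: int, x0: int, length: int) -> list[int]:
    """Plain LCG x_{n+1} = (a * x_n + c) mod m, used to build a training sequence."""
    xs, x = [], x0 % m
    for _ in range(length):
        x = (a * x + c) % m
        xs.append(x)
    return xs

# One training example: pick a modulus per the curriculum, then a random generator instance.
m = sample_modulus(step=1000, total_steps=10000)
seq = lcg_sequence(m, a=random.randrange(1, m) | 1, c=random.randrange(m),
                   x0=random.randrange(m), length=32)
print(m, seq[:8])
```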
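The embedding-clustering claim can be probed with an analysis like the one sketched below: project a token-embedding matrix onto its top principal components and group integer tokens by their bitwise-rotation equivalence class. The embedding matrix here is random stand-in data, and the procedure only illustrates one way to look for such clusters, not the paper's exact analysis.

```python
import numpy as np

bits = 8                  # token ids are integers 0 .. 2^bits - 1
vocab = 2 ** bits
d_model = 64

# Stand-in for a trained model's token-embedding matrix (shape: vocab x d_model).
emb = np.random.randn(vocab, d_model)

def rotation_class(x: int, bits: int) -> int:
    """Canonical representative of x's bitwise-rotation equivalence class."""
    return min(((x >> r) | (x << (bits - r))) & (vocab - 1) for r in range(bits))

# Top principal components via SVD of the centered embedding matrix.
centered = emb - emb.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords = centered @ vt[:2].T                     # projection onto the top 2 PCs

# Group projected points by rotation class; tight groups would support the clustering claim.
labels = np.array([rotation_class(x, bits) for x in range(vocab)])
print(coords.shape, len(set(labels.tolist())), "rotation classes")
```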