Hasty Briefs (beta)

Trained LLMs exclusively on pre-1913 texts

a day ago
  • #time-locked-models
  • #historical-research
  • #large-language-models
  • A family of 4-billion-parameter large language models (LLMs) based on the Qwen3 architecture, trained from scratch on 80B tokens of historical data up to each model's knowledge cutoff.
  • Models are time-locked, meaning they have no access to information published after their knowledge-cutoff date.
  • The project aims to create windows into the past for research in humanities, social sciences, and computer science.
  • Models will reproduce the historical biases and views present in their training data; the project treats this as a feature for studying historical discourse.
  • A responsible access framework is being developed to make models available to researchers while preventing misuse.
  • The project invites comments and suggestions on periods, regions, questions, validation methods, and access frameworks.
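The core idea behind a time-locked model is simple: every training document must predate the cutoff. A minimal sketch of that filtering step, assuming documents carry a publication date (the field names and example records here are illustrative, not the project's actual data format):

```python
from datetime import date

# Hypothetical knowledge cutoff for a pre-1913 model.
CUTOFF = date(1913, 1, 1)

# Illustrative corpus records; real training data would be far larger.
documents = [
    {"text": "On the Origin of Species ...", "published": date(1859, 11, 24)},
    {"text": "General relativity paper ...", "published": date(1915, 11, 25)},
]

def time_locked(docs, cutoff):
    """Keep only documents published strictly before the cutoff date."""
    return [d for d in docs if d["published"] < cutoff]

corpus = time_locked(documents, CUTOFF)
print(len(corpus))  # only the pre-1913 document survives
```

In practice the hard part is not the filter itself but dating sources reliably and excluding later editions, reprints, and annotations that would leak post-cutoff knowledge into the corpus.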