Hasty Briefsbeta

Bilingual

Letting Claude Play Text Adventures

4 months ago
  • #CognitiveArchitectures
  • #AI
  • #TextAdventures
  • Attended an AI hackathon focused on mech interp but worked at the API layer due to limited PyTorch knowledge.
  • Explored cognitive architectures (Soar, ACT-R) and their potential to scaffold LLMs for better performance.
  • Chose text adventures as an evaluation task due to their structured, long-horizon nature, using Anchorhead as a test case.
  • Developed a Python wrapper to interact with the dfrotz interpreter for text adventures.
  • Implemented a simple LLM agent (SimplePlayer) that interacts with the game via chat history but faced high token costs.
  • Experimented with memory harnesses to reduce token usage but observed degraded performance in task completion.
  • Created smaller, custom games to test agent performance but found them less effective than complex games like Anchorhead.
  • Proposed future improvements like domain-specific memories, automatic/manual geography tracking, and episodic memory.