Letting Claude Play Text Adventures
4 months ago
- #CognitiveArchitectures
- #AI
- #TextAdventures
- Attended an AI hackathon focused on mech interp but worked at the API layer due to limited PyTorch knowledge.
- Explored cognitive architectures (Soar, ACT-R) and their potential to scaffold LLMs for better performance.
- Chose text adventures as an evaluation task due to their structured, long-horizon nature, using Anchorhead as a test case.
- Developed a Python wrapper to interact with the dfrotz interpreter for text adventures.
- Implemented a simple LLM agent (SimplePlayer) that interacts with the game via chat history but faced high token costs.
- Experimented with memory harnesses to reduce token usage but observed degraded performance in task completion.
- Created smaller, custom games to test agent performance but found them less effective than complex games like Anchorhead.
- Proposed future improvements like domain-specific memories, automatic/manual geography tracking, and episodic memory.