What does it take to build a human-like user simulator?
a day ago
- #AI
- #Language Models
- #Human Simulation
- Defining the right training objective is crucial for eliciting new language model capabilities.
- Preference models and verifiable rewards have improved model performance in reasoning and assistance.
- Simulating real human users could be a new objective for models to solve complex problems collaboratively.
- Two language models could simulate an interaction: one acting as the assistant and the other as a human user.
- The user simulator must also judge whether the interaction succeeded, since that judgment is the signal used to update the assistant model's parameters.
- Current language models fall short as human-like user simulators.
- Key design decisions for user simulators include context, scaffold, and training objectives.
- Context involves goal descriptions, behavioral traits, and historical interactions.
- User simulation is underspecified because latent human context is difficult to capture.
- Three promising directions for addressing this: synthetic context imputation, longitudinal data collection, and new measurements.
- Scaffolds define how user simulators interface with their environment and evolve over time.
- Scaffolds can model goal fidelity, self-knowledge, influence, memory, and cognitive load.
- Changing the training objective could make user simulators more human-like.
- Humans optimize for multiple objectives, including task completion, effort minimization, and group considerations.
- Hybrid approaches combine task objectives with behavior cloning to produce more human-like simulations.
- Open questions remain about evaluation, generalization, and the utility of user simulators versus other methods.
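
The two-model setup summarized above (a user simulator conditioned on a goal, traits, and history; an assistant; a success judgment; and a hybrid objective) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the article's implementation: the names `UserContext`, `run_episode`, and `hybrid_objective`, the `"DONE"` stop signal, the toy rule-based policies standing in for language models, and the simple weighted sum used to mix a task reward with a behavior-cloning term.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, text)


@dataclass
class UserContext:
    """Hypothetical container for the context a user simulator is conditioned on."""
    goal: str
    traits: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)


def run_episode(
    user_policy: Callable[[UserContext, List[Turn]], str],
    assistant_policy: Callable[[List[Turn]], str],
    judge: Callable[[UserContext, List[Turn]], float],
    context: UserContext,
    max_turns: int = 4,
) -> Tuple[List[Turn], float]:
    """Alternate user and assistant turns, then score the whole interaction."""
    transcript: List[Turn] = []
    for _ in range(max_turns):
        user_msg = user_policy(context, transcript)
        transcript.append(("user", user_msg))
        if user_msg == "DONE":  # assumed stop signal emitted by the simulator
            break
        transcript.append(("assistant", assistant_policy(transcript)))
    # The judged score is what a training loop would use to update the assistant.
    return transcript, judge(context, transcript)


def hybrid_objective(task_reward: float, imitation_log_prob: float,
                     weight: float = 0.5) -> float:
    """Assumed mixing rule: weighted sum of task reward and a cloning term."""
    return weight * task_reward + (1.0 - weight) * imitation_log_prob


# Toy stand-ins for the two language models and the judge.
def toy_user(context: UserContext, transcript: List[Turn]) -> str:
    # Stop once any assistant turn has mentioned the goal.
    if any(s == "assistant" and context.goal in t for s, t in transcript):
        return "DONE"
    return f"I need help with {context.goal}"


def toy_assistant(transcript: List[Turn]) -> str:
    last_user_msg = transcript[-1][1]
    return f"Sure, here is a plan. You said: {last_user_msg}"


def toy_judge(context: UserContext, transcript: List[Turn]) -> float:
    # Success means the simulated user ended the conversation satisfied.
    return 1.0 if transcript and transcript[-1] == ("user", "DONE") else 0.0


if __name__ == "__main__":
    ctx = UserContext(goal="booking a flight", traits=["impatient"])
    turns, reward = run_episode(toy_user, toy_assistant, toy_judge, ctx)
    for speaker, text in turns:
        print(f"{speaker}: {text}")
    print("judged reward:", reward)
```

In a real system the two policies would be language model calls and the judge would be the simulator's own assessment of the interaction; the sketch only fixes the shape of the loop and where the context and the reward signal enter it.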