How Can Reinforcement Learning Achieve Expert-Level [Chip] Placement?
9 hours ago
- #reward model
- #reinforcement learning
- #chip placement
- Chip placement is crucial in physical design, but RL-based methods focusing on wirelength optimization often fail to achieve expert-quality layouts.
- Reward design is identified as the main cause of the performance gap between RL-based placers and human experts.
- Rather than formalizing the complex expert design process, the approach derives a reward model by learning directly from expert layouts.
- It infers step-by-step expert trajectories from final expert layouts.
- Using these trajectories as demonstrations or preferences, a model is trained to capture the implicit rewards underlying expert results.
- Experiments show the framework can learn efficiently from even a single design and generalize well to unseen cases.
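
The two steps summarized above (inferring a trajectory from a final layout, then learning a reward from preferences) can be sketched on toy data. This is a minimal illustration, not the paper's method: the largest-first placement order, the two hand-made features, the linear Bradley-Terry reward, and the "shrunk toward the center" rejected layouts are all assumptions, and every function name here is hypothetical.

```python
import numpy as np

def infer_trajectory(final_layout, areas):
    """Replay a final layout as an ordered list of (macro, position) steps.

    Assumption: macros are placed largest-first. The summary does not say
    how the real method orders the steps.
    """
    order = sorted(final_layout, key=lambda name: areas[name], reverse=True)
    return [(name, final_layout[name]) for name in order]

def features(trajectory):
    """Hypothetical trajectory features on a unit-square canvas:
    mean distance to the canvas center and mean pairwise distance."""
    xy = np.array([pos for _, pos in trajectory], dtype=float)
    center_dist = np.linalg.norm(xy - 0.5, axis=1).mean()
    pair_dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1).mean()
    return np.array([center_dist, pair_dist])

def train_reward(preferred, rejected, steps=200, lr=0.5):
    """Fit a linear reward r = w . phi with the Bradley-Terry loss
    -log sigmoid(r_preferred - r_rejected), using plain gradient descent."""
    diff = preferred - rejected
    w = np.zeros(diff.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(diff @ w)))  # P(preferred beats rejected)
        grad = -((1.0 - p)[:, None] * diff).mean(axis=0)
        w -= lr * grad
    return w

def shrink(layout, s):
    """Pull every macro toward the canvas center by factor s (toy 'worse' layout)."""
    return {k: (0.5 + s * (x - 0.5), 0.5 + s * (y - 0.5)) for k, (x, y) in layout.items()}

# Made-up expert layout: four macros near the canvas corners.
expert_layout = {"m0": (0.1, 0.1), "m1": (0.9, 0.1), "m2": (0.1, 0.9), "m3": (0.9, 0.9)}
areas = {"m0": 4.0, "m1": 1.0, "m2": 3.0, "m3": 2.0}

expert_traj = infer_trajectory(expert_layout, areas)
rejected_trajs = [infer_trajectory(shrink(expert_layout, s), areas) for s in (0.25, 0.5, 0.75)]

# Each pair says: the expert trajectory is preferred over a shrunk one.
preferred = np.stack([features(expert_traj)] * len(rejected_trajs))
rejected = np.stack([features(t) for t in rejected_trajs])
w = train_reward(preferred, rejected)
```

After training, `features(expert_traj) @ w` scores the expert trajectory above each rejected one. A learned reward of this kind would then stand in for a wirelength-only reward when training the RL placer; a real system would presumably use a learned encoder over partial layouts rather than two fixed features.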