Building a custom octocopter from scratch with no prior hardware experience
5 days ago
- #failure recovery
- #reinforcement learning
- #drone simulation
- The drone now handles single-, dual-, and some triple-motor failures in simulation.
- The author first trained a simulation-only policy, which helped identify issues quickly.
- Initial attempts with baseline PPO and high exploration failed, with entropy increasing and the drone crashing.
- Lowering exploration showed promise, but the run was stopped so that single-motor failure training could be added first.
- Introducing a curriculum covering hover, single-motor failures, and dual-motor failures broke everything and revealed several bugs.
- Issues included zombie processes overwriting checkpoints, residual actions causing saturation, and training runs failing outright.
- Two key fixes were implemented: using tanh to squash actions and adjusting the survival reward.
- The final policy is a 43.4k-parameter MLP that generalizes to triple-motor failures it was never trained on.
- It even manages some yaw cases previously thought uncontrollable, with minimal drift.
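
To illustrate the tanh fix mentioned in the list above, here is a minimal sketch of how squashing a residual action can keep motor commands out of saturation. It is not the author's code: the function name `motor_commands`, the constants `HOVER_THROTTLE` and `RESIDUAL_SCALE`, and the normalized [0, 1] throttle range are all assumptions made for the example.

```python
import math

NUM_MOTORS = 8          # octocopter
HOVER_THROTTLE = 0.5    # assumed per-motor hover command, normalized to [0, 1]
RESIDUAL_SCALE = 0.5    # assumed maximum deviation from the hover command


def motor_commands(raw_actions, failed=()):
    """Map unbounded policy outputs to per-motor throttle in [0, 1].

    raw_actions: one unbounded value per motor (e.g. the MLP's output layer).
    failed: indices of motors simulated as dead; they always output 0.
    """
    if len(raw_actions) != NUM_MOTORS:
        raise ValueError(f"expected {NUM_MOTORS} actions, got {len(raw_actions)}")

    commands = []
    for i, raw in enumerate(raw_actions):
        if i in failed:
            commands.append(0.0)  # a failed motor produces no thrust
            continue
        # tanh bounds the residual to [-RESIDUAL_SCALE, +RESIDUAL_SCALE],
        # so hover + residual can never leave the valid throttle range.
        residual = RESIDUAL_SCALE * math.tanh(raw)
        commands.append(HOVER_THROTTLE + residual)
    return commands


if __name__ == "__main__":
    # Very large raw outputs stay bounded instead of saturating past 1.0.
    print(motor_commands([10.0, -10.0, 0.0, 0.3, -0.3, 1.0, -1.0, 0.0], failed=(2,)))
```

Without the squash, an unbounded residual added to the hover command has to be hard-clipped, and a clipped action carries no gradient signal about how far out of range it was. Bounding the residual with tanh is one common way to avoid that, though the post does not say exactly where the author applies it.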