Hasty Briefs (beta)

Building a custom octocopter from scratch with no prior hardware experience

5 days ago
  • #failure recovery
  • #reinforcement learning
  • #drone simulation
  • The drone now handles single, dual, and some triple motor failures in simulation.
  • The author first trained a simulation-only policy, which helped identify issues quickly.
  • Initial attempts with a baseline PPO and high exploration failed: policy entropy kept increasing and the drone kept crashing.
  • A run with lower exploration showed promise, but the author stopped it to add single-motor failure training first.
  • Introducing a curriculum for hover, single, and dual failures broke training and exposed several bugs.
  • Issues included zombie processes overwriting checkpoints, residual actions causing saturation, and training failure.
  • Two key fixes were implemented: using tanh to squash actions and adjusting the survival reward.
  • The final policy is a 43.4k-parameter MLP that generalizes to triple-motor failures it was never trained on.
  • It even manages some yaw cases previously thought uncontrollable, with minimal drift.
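
The tanh fix mentioned above can be illustrated with a minimal sketch. It is not the author's code: the function names, the `hover_throttle` baseline, the `residual_scale` value, and the [0, 1] throttle range are all assumptions made for illustration. The idea is that an unbounded policy output is squashed by tanh before it is added to a baseline command as a residual, so the final motor command cannot leave its valid range and saturate.

```python
import math

NUM_MOTORS = 8  # an octocopter has eight motors


def squash_actions(raw_actions, hover_throttle=0.5, residual_scale=0.5):
    """Map unbounded policy outputs to per-motor throttles.

    Hypothetical parameters: hover_throttle is an assumed baseline command,
    residual_scale bounds how far the policy may move away from it.
    With the defaults, every throttle stays within [0, 1].
    """
    throttles = []
    for a in raw_actions:
        # tanh bounds the residual to [-residual_scale, residual_scale]
        residual = residual_scale * math.tanh(a)
        throttles.append(hover_throttle + residual)
    return throttles


def apply_motor_failures(throttles, failed):
    """Zero the throttle of each failed motor (indices in `failed`)."""
    return [0.0 if i in failed else t for i, t in enumerate(throttles)]


if __name__ == "__main__":
    # Extreme raw outputs still yield valid throttles after squashing.
    raw = [100.0, -100.0] + [0.0] * (NUM_MOTORS - 2)
    commands = squash_actions(raw)
    # Simulate a dual motor failure on motors 0 and 3.
    print(apply_motor_failures(commands, failed={0, 3}))
```

Compared with clipping, tanh keeps the mapping smooth near the limits, which is one plausible reason it helps a residual policy avoid sitting at the saturation boundary.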