Building a custom octocopter from scratch with no prior hardware experience

a month ago

The drone now handles single, dual, and some triple motor failures in simulation.
The author first trained a simulation-only policy, which helped identify issues quickly.
Initial attempts with a baseline PPO and high exploration failed: policy entropy kept increasing and the drone crashed.
Lowering exploration showed promise, but that run was stopped so single-motor failure training could be added first.
Introducing a curriculum for hover, single, and dual failures broke everything, revealing bugs.
The bugs included zombie processes overwriting checkpoints, residual actions causing saturation, and training runs that failed.
Two key fixes were implemented: using tanh to squash actions and adjusting the survival reward.
The final policy is a 43.4k-parameter MLP that generalizes to triple-motor failures it was never trained on.
It even manages some yaw cases previously thought uncontrollable, with minimal drift.
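
The tanh fix mentioned above can be illustrated with a minimal sketch. It is not the author's code: the function name, the baseline throttle, the residual scale, and the [0, 1] command range are all assumptions chosen for illustration. The idea is that an unbounded policy output is squashed into [-1, 1] before being added as a residual to a nominal command, so a large raw output cannot push the command arbitrarily far past the motor limits.

```python
import math

NUM_MOTORS = 8  # octocopter


def squash_residual(raw_actions, baseline=0.5, residual_scale=0.5):
    """Map unbounded policy outputs to motor commands in [0, 1].

    baseline and residual_scale are hypothetical values, not taken
    from the original write-up.
    """
    commands = []
    for raw in raw_actions:
        # tanh bounds the residual to [-residual_scale, residual_scale]
        residual = residual_scale * math.tanh(raw)
        # final clamp guards against a baseline/scale pair that overshoots
        commands.append(min(1.0, max(0.0, baseline + residual)))
    return commands


if __name__ == "__main__":
    raw = [0.0, 2.0, -2.0, 50.0, -50.0, 0.1, -0.1, 1.0]
    print(squash_residual(raw))
```

With these assumed defaults, a raw output of zero leaves the motor at the baseline command, and even very large raw outputs stay within the valid range instead of relying on a hard clip alone.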
