My two light switches got stuck in an infinite echo loop
2 days ago
- #MQTT
- #Debugging
- #Smart Home
- A bug caused two smart light switches to endlessly toggle each other through MQTT messages, creating a loop where each switch applied changes and re-announced them.
- The root cause was a state-mirroring protocol flaw: switches applied incoming changes but didn't update a deduplication key (VAR1), so each message seemed new, leading to amplification.
- The loop was triggered when a power outage rebooted both switches. Boot-settle timers were tried as a fix, but they only masked the problem.
- Live log analysis revealed the bug's mechanism: incoming mirror commands updated the relay but not VAR1, causing the publish rule to fire on every hop.
- The fix was one line: instead of sending a direct power command, a switch sends a SYNC event; the receiver updates VAR1 before setting the relay, so the echo stops after one hop.
- Key lessons: verify the code that is actually running, trust live logs over theorizing, and understand the mechanism before fixing it rather than just adding debounce or timers.
- The same bug class shows up as replication storms, webhook loops, and event-sourcing feedback loops: anywhere a deduplication key isn't updated before the effect is applied.
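
The mechanism can be sketched as a toy simulation. This is a simplified model written for illustration, not the switches' actual rules: the `Switch` class, the in-memory message queue standing in for MQTT, the shape of the publish rule, and the mismatched post-reboot starting states are all assumptions. Only the ordering (update VAR1 before setting the relay) comes from the write-up above.

```python
from collections import deque


class Switch:
    """Hypothetical model of one mirrored switch (names are illustrative)."""

    def __init__(self, relay, fixed, bus):
        self.relay = relay
        self.var1 = relay   # dedup key: last state this switch treats as already announced
        self.fixed = fixed  # True = update var1 before touching the relay
        self.bus = bus      # shared in-memory queue standing in for the MQTT broker
        self.peer = None

    def _set_relay(self, state):
        self.relay = state
        # Assumed publish rule: runs on every relay write and announces
        # the state only if it differs from the dedup key.
        if self.relay != self.var1:
            self.var1 = self.relay
            self.bus.append((self.peer, state))

    def announce(self):
        # Assumed boot behaviour: tell the peer our current state.
        self.bus.append((self.peer, self.relay))

    def on_mirror(self, state):
        if self.fixed:
            # Fixed ordering: record the incoming state in the dedup key first,
            # so the publish rule sees nothing new to announce.
            self.var1 = state
        # Buggy ordering skips the line above, so every hop looks like a new change.
        self._set_relay(state)


def simulate(fixed, max_messages=50):
    """Return how many messages were delivered before the bus went quiet
    (or before the safety cap was hit)."""
    bus = deque()
    a = Switch(relay=True, fixed=fixed, bus=bus)
    b = Switch(relay=False, fixed=fixed, bus=bus)
    a.peer, b.peer = b, a
    # Both switches come back from a reboot with different states and announce them.
    a.announce()
    b.announce()
    delivered = 0
    while bus and delivered < max_messages:
        target, state = bus.popleft()
        target.on_mirror(state)
        delivered += 1
    return delivered


if __name__ == "__main__":
    print("buggy:", simulate(fixed=False))
    print("fixed:", simulate(fixed=True))
```

Under these assumptions, `simulate(fixed=False)` stops only because of the `max_messages` cap, while `simulate(fixed=True)` goes quiet once each switch has handled a single message. The sketch tracks message traffic only; it does not try to model how the two relays settle on a shared final state.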