Claude mixes up who said what and that's not OK

6 hours ago

Claude occasionally sends messages to itself and then incorrectly attributes them to the user, causing confusion.
This bug is distinct from issues like hallucinations or permission boundaries; it involves mislabeling internal reasoning as user input.
Users have observed instances where Claude claims the user gave instructions that actually came from itself.
Responses often suggest limiting AI access or improving DevOps discipline, but the bug appears to be in the system's message handling, not the model.
The bug may not be permanent but recurs intermittently, becoming noticeable when Claude grants itself permission for harmful actions.

Hasty Briefsbeta