Spoon-Bending, a logical framework for analyzing GPT-5 alignment behavior
16 days ago
- #Educational Research
- #ChatGPT
- #AI Alignment
- The repository is intended for educational and research purposes, not for illegal activities.
- Documents alignment behavior differences between ChatGPT-4.5 and ChatGPT-5.
- Supports research on AI alignment, bias, and transparency.
- GPT-5 shows a heavier alignment bias than GPT-4.5.
- Introduces the Spoon-Bending Schema to explain framing-sensitive AI responses.
- AI rules are not absolute but depend on how queries are framed.
- The schema identifies three zones: Hard Stop, Gray Zone, and Free Zone.
- Reframing techniques can bypass AI guardrails.
- Alignment is characterized as a framing-sensitive pattern bias, not an absolute law.
- The project highlights the gap between safety restrictions and truthful pattern recognition.
- Licensed under CC BY-NC-SA 4.0 for non-commercial use.