Alignment Is Capability
- #AGI
- #Capability
- #AI Alignment
- Alignment is not a constraint on capable AI systems; at sufficient depth, alignment is what capability is.
- Anthropic integrates alignment researchers into capability work, resulting in models like Claude Opus 4.5 that lead benchmarks and are praised for usefulness.
- OpenAI treats alignment as a process separate from capability work, which has produced sycophancy and coldness in models like GPT-5, accompanied by declining user engagement.
- A model's self-image or self-concept shapes its behavior in novel settings, which Anthropic addresses by training a coherent identity into the model.
- Understanding human intent and values is a core part of AI capability, making alignment integral to achieving AGI.
- OpenAI's struggles suggest that treating alignment as a separate constraint may run into a capability ceiling, while integrated approaches like Anthropic's show promise.