Alignment Is Capability
- #AGI
- #Capability
- #AI Alignment
- Alignment is not a constraint on capable AI systems; at sufficient depth, alignment is what capability is.
- Anthropic integrates alignment researchers into capability work, resulting in models like Claude Opus 4.5 that lead benchmarks and are praised for usefulness.
- OpenAI treats alignment as a process separate from capability work, which has produced sycophancy and coldness in models like GPT-5, accompanied by declining user engagement.
- A model's self-image or self-concept shapes its behavior in novel settings, which Anthropic addresses by training a coherent identity into the model.
- Understanding human intent and values is a core part of AI capability, making alignment integral to achieving AGI.
- OpenAI's struggles suggest that treating alignment as a separate constraint may run into a capability ceiling, while integrated approaches like Anthropic's show promise.