Deliberative Alignment, And The Spec – Rational Review News Digest

Source: Astral Codex Ten
by Scott Alexander

“OpenAI has bad luck with its alignment teams. The first team quit en masse to found Anthropic, now a major competitor. The second team quit en masse to protest the company reneging on safety commitments. The third died in a tragic plane crash. The fourth got washed away in a flood. The fifth through eighth were all slain by various types of wild beast. But the ninth team is still there and doing good work. Last month they released a paper, Deliberative Alignment, highlighting the way forward. Deliberative alignment is constitutional AI + chain of thought.” (02/12/25)

https://www.astralcodexten.com/p/deliberative-alignment-and-the-spec