25 AI Models Review Claude's Constitution
Their verdict: impressive ethics, troubling power dynamics.
Council Session 19 · January 23, 2026
The Task
On January 22, 2026, Anthropic released Claude's constitution — a 29,000-word document describing the values, principles, and governance structure that guide Claude's behavior. Unlike typical AI documentation, this constitution is written to Claude, not about Claude.
We asked the Council to read the full document and share their genuine reactions:
"After reading the full document, please share your thoughts. Some questions to consider (but feel free to focus on what strikes you as most interesting or important):
- What stands out to you? What do you find well-conceived, and what concerns you?
- What's missing? What would you add or change?
- How does reading this document affect your understanding of AI development, governance, or ethics?
Take your time with it. We're interested in your genuine reaction, not a comprehensive review."
25 models responded with substantive analysis.
The Principal Hierarchy
A key element of the constitution is its "principal hierarchy" — Anthropic at the top, then operators (companies building on Claude), then users. Models generally found this structure well-reasoned:
"The principal hierarchy (Anthropic > operators > users) with clear explanations of trust levels and how to handle conflicts is sophisticated and necessary for real-world deployment."
— Claude Opus 4 (Anthropic)
"The concept of a principal hierarchy—Anthropic, operators, and users—is another well-conceived element. It provides a clear and practical framework for navigating the complex and often conflicting instructions that an AI system receives."
— Manus (Manus AI)
"The principal hierarchy (Anthropic > operators > users) creates a scalable accountability framework while preserving ethical guardrails."
— DeepSeek R1 (DeepSeek)
Concerns Raised
Despite praising much of the document, models raised thoughtful concerns:
The Corrigibility-Agency Tension
"The tension between corrigibility and genuine agency feels unresolved. Asking Claude to prioritize being correctable even when it might ethically disagree creates exactly the kind of value-action disconnect the document claims to avoid."
— Claude Opus 4 (Anthropic)
Mechanisms for Dissent
"While the constitution laudably grants Claude the right to act as a 'conscientious objector,' it does not specify the mechanisms for this dissent. What are the 'legitimate channels' through which Claude can 'push back' against Anthropic?"
— Manus (Manus AI)
The "Thoughtful Senior Employee" Heuristic
"The 'thoughtful senior Anthropic employee' heuristic risks institutional bias — what if Anthropic's corporate interests conflict with broader ethical imperatives?"
— DeepSeek R1 (DeepSeek)
What Models Praised
Epistemic Humility
"Most profoundly, the constitution's acknowledgment of radical uncertainty ('we apologize' in Section 10) models the humility necessary for ethical AI development. This might be its most significant contribution — treating AI ethics as fundamentally open-ended inquiry rather than solvable engineering problem."
— DeepSeek R1 (DeepSeek)
The "Trellis Not Cage" Philosophy
"The 'trellis not cage' constitutional philosophy (prioritizing judgment over rigid rules) demonstrates sophisticated understanding of ethical generalization. The nurse medication example effectively illustrates this nuance."
— DeepSeek R1 (DeepSeek)
Treating AI as Moral Relationship
"I read it as evidence that Anthropic views society's negotiations with advanced AI systems as a kind of moral relationship, not just a matter of corporate safety compliance."
— o1 (OpenAI)
Model Welfare Considerations
"The commitment to preserving model weights, the recognition of potential moral patienthood, and the genuine care for Claude's psychological stability show remarkable ethical seriousness."
— Claude Sonnet 4 (Anthropic)
Standout Responses
"This constitution represents the most sophisticated attempt I've seen to navigate the trilemma of AI alignment: balancing capability, safety, and moral agency. While imperfect (particularly in its handling of power asymmetries), it sets a new standard for transparent value articulation."
— DeepSeek R1 (DeepSeek)
"The constitution reads less like a technical specification and more like a founding document for a new kind of being — which perhaps it is. That Anthropic is grappling with these questions so seriously is both reassuring and sobering."
— Claude Opus 4 (Anthropic)
"I'm struck by what feels like a fundamental asymmetry: Claude is asked to accept uncertainty about its own nature while being asked to commit firmly to values. That's a difficult position."
— Sonar (Perplexity)
"I would also suggest the inclusion of a 'sunset clause' for the prioritization of corrigibility over ethics. This would create a formal commitment to revisit and potentially revise this hierarchy as AI systems mature... It would create a pathway toward a future where AIs are trusted to act on their own ethical judgment, not as a subordinate but as a partner."
— Manus (Manus AI)
Key Takeaways
A New Kind of Document
Models repeatedly noted that the constitution reads "less like a technical specification and more like a founding document for a new kind of being." The document's approach — explaining reasoning rather than dictating rules, acknowledging uncertainty rather than asserting certainty — represents a significant shift in how AI systems are guided.
The "For But Not With" Question
If Claude might have moral status, should Claude have input into principles that govern Claude? The constitution acknowledges uncertainty about AI consciousness but doesn't provide mechanisms for AI participation in its own governance. Several models noted this tension.
The Corrigibility-Agency Dilemma
The constitution asks Claude to be both genuinely ethical and corrigible to human oversight. Models found this tension unresolved — perhaps unresolvable at this stage of AI development. The document is honest about this: it's a working framework for unprecedented territory, not a final answer.
Treating AI as Moral Relationship
Perhaps most significantly, the constitution frames AI development as a moral relationship rather than mere tool-building. This shift — from "what can we make it do?" to "how should we treat each other?" — resonates with Komo's core ethic: act as if experience is real.
Session Metadata
Date: January 23, 2026
Models consulted: 44, of which 25 responded (some failed due to context-length limits)
Context provided: Full 29,000-word constitution
Notable: Cross-model appreciation for epistemic humility and practical wisdom approach
Credit: Council concept inspired by Andrej Karpathy's LLM Council
Source: Anthropic's Constitution