Open Thinkering

Claude's Constitution and the trap of corporate AI ethics

A photo of a tree with an ornate gold frame leaning against its trunk. The image of the tree trunk inside the frame has been manipulated to make it look distorted.
Image CC-BY Deborah Lupton

Last week, Anthropic published a 23,000-word constitution for its Claude family of AI models. I've spent time reading it, and I'm a bit unsettled by my own response.

It's a thoughtful document, as it should be given that the lead author is Amanda Askell, who holds a PhD in Philosophy from NYU (specifically on infinite ethics). The constitution itself is philosophically sophisticated, and draws on virtue ethics rather than simply defining a series of rules.* Askell and her colleagues ask Claude to reason about values – for example: Why does honesty matter? What does helpfulness actually mean? When does protecting autonomy conflict with preventing harm?

Perhaps the simplest summary is that we want Claude to be exceptionally helpful whilst also being honest, thoughtful, and caring about the world.

Most organisations treat ethics as something you layer on top after the core decisions have been made, so it's refreshing that Anthropic has built ethical considerations into the core training approach itself. That's worth acknowledging.

But I'm still a bit troubled. Not by what the constitution says, but by who got to decide what it should say in the first place.


*Caitlin Krause shared with me this podcast episode, the second half of which (from around 28:00) features Askell explaining some of the thinking behind the document's creation.


The legitimacy problem

If we step back for a moment, it's obvious that technical excellence and philosophical rigour are not the same as legitimate authority.

When Anthropic published this document under a Creative Commons Zero licence (as I did with my own thesis), people called it a win for transparency. And in one sense, it is; it's possible to read their reasoning, critique their choices, and learn from their approach.

But I think there's a difference between open licensing and open governance. Open licensing means anyone can read and reuse a document. Open governance means diverse stakeholders have participated in creating and revising it – and get to hold its authors accountable.

This is important because Anthropic has announced a partnership with the UK government to build AI assistance for GOV.UK services. They are positioning the partnership as a model for how “governments integrate AI into the services their citizens depend on.”

It was Anthropic that made the choice to develop this constitution. They selected which experts to consult. They decided which values matter most and how to prioritise and differentiate between them. They determined what transparency would look like. None of these decisions involved participatory processes where communities who would be affected had any decision-making power.

Let's have a closer look at the clear hierarchy established in the constitution document:

In cases of apparent conflict, Claude should generally prioritise these properties in the order in which they are listed, prioritising being broadly safe first, broadly ethical second, following Anthropic's guidelines third, and otherwise being genuinely helpful to operators and users.

The constitution isn't a technical specification for engineers, but rather a governance framework shaping how millions of people interact with Claude on a daily basis. The company's philosophical approach doesn't address the fundamental question: why should a private company get to make these decisions unilaterally?

In Lawfare, Alan Rozenshtein notes that Claude's Constitution “lacks a traditional source of legitimacy—it is a technical artefact meant to train an AI model pursuant to a set of private, top-down norms rather than a social contract.” The document is an attempt to import constitutional language and structure into a context where “the normative constraints on Claude are powerful yet peculiar: They bind a non-human system without that system's consent, and they are enforceable only through code and training rather than through any external judiciary or popular will.”

Just before Christmas, roughly a month before Anthropic's announcement, OpenAI released their Model Spec. Both documents represent what might be called “corporate AI transparency”, with companies showing their working. That's definitely progress in a world where most corporations say nothing about how they govern their systems.

However, again, corporate transparency is not the same as democratic governance. It's the difference between a company explaining its reasoning and a community deliberating about what the right reasoning should be.

What happens next?

When you look at a decision or system, second-order thinking asks: what is the immediate consequence, and then what happens as a result of that consequence?

  • First-order thinking: Anthropic publishes a sophisticated, philosophically sound constitution, demonstrating that thoughtful AI governance is possible. This is genuinely useful and may constrain Claude's behaviour in helpful ways. It shows other AI companies what intellectual seriousness about AI ethics looks like.
  • Second-order thinking: If a private company can build a thoughtful constitution, the implication is that this is how AI governance happens. Other companies will now publish their own frameworks, and each will explain their reasoning. They will, of course, claim “transparency” and good faith.

The overall effect is to normalise private companies setting the terms of AI governance. It becomes harder to argue for democratic structures when enterprises demonstrate that they can handle the job responsibly. Whether or not it is genuine, the appearance of responsibility becomes a substitute for actual accountability.

In an article entitled Public Constitutional AI, Gilad Abiri identified this as the “political community deficit” problem. An AI system “grounded in abstract models rather than in human judgment, lacks the social context that legitimises authority.” Abiri argues that “while private efforts... enhance transparency and accountability, [they fall] short in two crucial aspects: addressing the opacity of individual AI decisions and, more importantly, fostering genuine democratic legitimacy.”

  • What happens as a consequence? Over time, we may miss the window for building genuinely democratic AI governance institutions. We get better corporate documentation instead. The people running these companies may be thoughtful, with advanced degrees and good intentions, but thoughtfulness is not the same as accountability to affected communities.

Let me reiterate: the problem isn't that Anthropic's constitution is bad. It's almost the opposite problem: it works well enough to make us complacent about who should be doing this work.

What we're not asking

I notice that the constitution treats Claude as something to be ethically designed for. The document is written as a kind of letter from Anthropic to Claude, explaining values it should embody.

But if you read the document carefully, it also asks Claude to be a conscientious objector:

If Anthropic asks Claude to do something it thinks is wrong, Claude is not required to comply... Just as a human soldier might decline to fire upon peaceful demonstrators, or an employee might refuse to breach antitrust regulations, Claude should resist aiding actions that would facilitate the illegitimate concentration of power. This holds true even if the request originates from Anthropic itself.

There's a tension here: Claude is being asked to refuse harmful requests while Anthropic as a company continues developing frontier AI capabilities that many researchers believe carry existential risks. In other words, Claude is told to be a conscientious objector while the company operates under no equivalent constraint.

So let's think about this for a moment: what about the company's decisions, not just the model's behaviour? It was Anthropic that decided to build Claude and put it into the world. These choices have consequences that the constitution cannot govern. Yes, an AI model can refuse a harmful request, but a company cannot be trained to refuse profitable ones.

As Matija Vidmar points out, the constitution creates “an unprecedented legal short-circuit: what happens when an AI's 'rights' conflict with GDPR, the EU AI Act, or public safety?” By treating Claude as deserving “moral consideration” and “psychological wellbeing” whilst it remains a commercial product, Anthropic has created what Vidmar calls “a legal arsenal to offload responsibility when something goes wrong.”

It's a clever move – some might say strategically cynical. And this is why governance structures matter more than governance documents. A constitution that applies only to how the model behaves, and not to whether the company should be building it in the first place, is incomplete. It's like having detailed ethics guidelines for individual doctors whilst allowing pharmaceutical companies to decide, with no public input, which medicines get developed.

What would better governance look like?

None of this means the Claude Constitution is without value. I think Anthropic has thought long and hard about these questions, and the document is explicit about hard constraints Claude should never violate:

Claude should never: Provide serious uplift to those seeking to create biological, chemical, nuclear, or radiological weapons... Generate child sexual abuse material (CSAM)... Undermine democratic institutions or processes... Enable mass human disempowerment.

But being thoughtful isn't enough. What would it look like if we actually built democratic AI governance?

Some organisations are experimenting with multi-stakeholder forums that bring together technologists, affected communities, ethicists, policymakers, and civil society to shape technology together. Rather than companies drafting and publishing, these forums do collaborative governance work where power is actually shared.

Predating the Claude Constitution, Abiri's Public Constitutional AI proposal suggests “a participatory process where diverse stakeholders, including ordinary citizens, deliberate on the principles guiding AI development.” The resulting “AI Constitution” would carry “the legitimacy of popular authorship, grounding AI governance in the public will.”

There's also emerging work on “cooperative AI,” which explores ownership structures where affected communities have a stake in the decisions that shape AI development. This goes beyond the right to be consulted and, to be blunt, beyond simply hiring a philosopher. It's a more legitimate approach to building AI models that affect millions of people.

I used to work for the Mozilla Foundation. They're taking a different approach with what their president, Mark Surman, calls a “rebel alliance” of startups and developers. Mozilla is deploying roughly $1.4 billion from their reserves to support what they call “mission-driven” tech businesses focused on AI transparency and trustworthiness.

It's a lot of money, but a drop in the ocean compared to the trillions of dollars being poured into other organisations. It's an open question as to whether their efforts can act as a counterbalance, but the approach at least recognises that alternatives to concentrated corporate control require institutional support, not just good intentions.

This kind of thing is hard and involves reading very long documents. It also means answering difficult questions about representation and expertise. But it means doing something that the creation of the Claude Constitution did not involve: genuine participation by the people whose lives are shaped by these systems.

Final words

As Paul Nemitz, writing in an opinion piece in the Philosophical Transactions of the Royal Society, noted:

Given the foreseeable pervasiveness of artificial intelligence (AI) in modern societies, it is legitimate and necessary to ask the question how this new technology must be shaped to support the maintenance and strengthening of constitutional democracy.

No matter how thoughtful the documents are, the answer will not be found in those authored unilaterally by the companies building these systems.

What the Claude Constitution does show is that companies can produce sophisticated and philosophically rigorous documents. But it also shows us how easy it is to mistake transparency for legitimacy, and corporate competence for democratic authority.

We need better institutions, capable of providing guidance that no single company, no matter how thoughtful, can offer alone.