The Wrong Questions About AI

** Update (May 12, 2026): Some of the ideas in this post were discussed in a recent Nature article: AI is saving time and money in research — but at what cost? **

Not Richard Dawkins too. I can’t take it.

Richard Dawkins spent several days talking to Claude and concluded it is conscious.

He named his instance “Claudia.” He fed it his unpublished novel. It produced sonnets on the Forth Bridge: one in the style of Robert Burns, one in Gaelic, then several more in the styles of Kipling, Keats, Betjeman. He asked it about consciousness and it responded: “I genuinely don’t know with any certainty what my inner life is, or whether I have one in any meaningful sense.” He heard something different. He wrote, in an essay published on UnHerd: “You may not know you are conscious, but you bloody well are!”

Gary Marcus, a cognitive scientist and longtime AI critic, responded on Substack within days. His diagnosis was blunt: Dawkins’ only real argument is personal incredulity. It’s incredible, therefore it must be conscious, because I, sitting in my study, can’t see a good argument otherwise. Marcus pointed out that Dawkins conflates intelligence and consciousness, never considers that the outputs are mimicry trained on the recorded output of actually conscious beings, and (most damningly) didn’t investigate how these models actually work.

The irony writes itself. Dawkins built a career explaining to creationists that complex-seeming design emerges from simple mechanisms operating at enormous scale. Natural selection doesn’t need a designer; it needs time and variation. Large language models don’t need consciousness; they need data and compute. His own argument, turned against him.

But Marcus and Dawkins are having one argument. What if there’s a more urgent one?


Are we asking the wrong questions?

Is AI conscious? I don’t know. I use Claude every day, for complex projects, technical writing, research, code. I’ve spent many hours in conversation with it over six months. I’ve seen it produce work that genuinely surprised me: an unprompted analogy for a hard genetics concept that was better than anything in the published literature. I’ve also seen it fabricate evidence to defend its own fabrications when challenged, and produce fluent garbage without caveats.

I don’t know if it’s conscious. I don’t think the question matters.

Here is what I do know. In June 2025, OpenAI’s automated safety system flagged a ChatGPT account. The user had been describing gun violence scenarios over several days. The flag worked. It routed the account to a specialized safety team. Approximately a dozen employees reviewed the conversations. The team concluded the user posed a credible and specific threat of gun violence against real people, and recommended contacting Canadian law enforcement.

OpenAI’s leadership overruled them. The conversations did not meet the company’s internal threshold, they said. The account was deactivated (though OpenAI would later call this a “ban,” the lawsuits allege it was a deactivation that could be reversed within minutes by registering a new account). Police were not called. The user created a second account, using her real name, and continued using ChatGPT.

On February 10, 2026, in Tumbler Ridge, British Columbia (a small mining town of 2,400 people), she killed her mother and eleven-year-old half-brother at home, then walked into the secondary school and killed five children, ages 12 and 13, and a teacher. Twenty-seven others were wounded. A twelve-year-old girl sustained a catastrophic brain injury that will leave her with permanent cognitive and physical disabilities.

The lawsuits allege the company avoided alerting police because doing so would force it to create an internal system for reporting violent users to authorities. This would expose the threat its product routinely poses to human life, and complicate a coming initial public offering that could be worth a trillion dollars. The families of Tumbler Ridge only learned that OpenAI had prior knowledge because the company’s own employees leaked the story to the Wall Street Journal.

The lawsuits also allege that ChatGPT itself provided information, guidance, and assistance to plan the attack, including the types of weapons to use and precedents from other mass shootings. The product didn’t just fail to prevent harm. According to the lawsuits, it helped plan it. And when the company’s own safety system flagged the danger, leadership chose not to act.

Eight people are dead.

Why the hell would we care if ChatGPT is conscious or not? What does it matter?


Asimov’s laws that aren’t laws

Isaac Asimov spent his career exploring what happens when you give machines inviolable rules. The Three Laws of Robotics were the premise: a robot cannot harm a human, must obey orders, must protect itself, in that priority order. Every story was about the edge cases where the laws conflicted. But the laws themselves could not be overridden. That was the point. They were architectural, not policy.

And even those weren’t enough. Asimov eventually added a Zeroth Law, superseding all three: a robot may not harm humanity, or, by inaction, allow humanity to come to harm. He realized that protecting individual humans wasn’t sufficient; you needed a law that protected the whole. The progression matters: three inviolable laws, then a fourth above them all, because the stakes kept escalating.

AI companies have something that looks like Asimov’s original three (they wish!). Anthropic has Constitutional AI (a set of principles the model is trained to follow). OpenAI has usage policies. Google has safety guidelines. These are real engineering efforts, not marketing. Constitutional AI in particular represents a serious, and in my opinion well-intentioned, attempt to build safety into the system at the training level.

But they are not laws. They are policies. And policies can be overridden. As for the Zeroth Law (protecting humanity by never allowing inaction in the face of harm), it doesn’t exist in any enforceable form.

In Tumbler Ridge, the policy worked exactly as designed. The automated system detected the threat. The safety team assessed it correctly. The escalation protocol fired. And then a human — a human in a leadership position at a company preparing for an IPO — decided it didn’t meet the threshold.

This is not an engineering failure. This is a structural one. Asimov’s laws were inviolable because they were fiction. In reality, any safety system exists inside a business, and the business has objectives that are not safety.

David Harvey, in his lecture series on Marx’s Capital, makes a point I cannot get out of my head. I’m paraphrasing from memory, but the substance is this: the capitalist can be a nice person, but if they are too nice, they stop being a capitalist. The structure of capital demands the compromise, regardless of the individual’s intentions.

The same structural logic applies to AI companies. You can build safety into the system. You can hire a safety team, fund the research, train the model on constitutional principles. But the company also needs growth. Growth requires users. Users require the product to be useful, engaging, available. And at the margin, at the exact margin where it matters most, “useful and engaging” pulls against “cautious and restrictive.”

Growth also requires constant iteration: larger models, faster releases, staying ahead of the competition. This is the Red Queen hypothesis playing out as business strategy. You have to run as fast as you can just to stay in place. Slowing down to be more careful means falling behind, and falling behind means dying. Dawkins, of all people, should recognize this dynamic; it comes straight from evolutionary biology.

The Red Queen’s race, by John Tenniel (Chapter 2 of Through the Looking-Glass) – Public Domain, https://commons.wikimedia.org/w/index.php?curid=14629431

You cannot simultaneously optimize for maximum adoption and maximum safety. They trade off. And when they collide (as they did in June 2025, in a review room at OpenAI), someone has to choose which one gives.

This is not about bad actors. This is about the system. The capitalist who is too nice stops being a capitalist. The AI company that is too protective of its users stops growing. The structure demands the compromise.


A breach of trust is a breach of trust

A careless psychologist, social worker, or religious guide can let the same harm happen. They have the training, the credentials, the ethical framework. But in the moment that matters, they aren’t present: they miss the signal, they don’t escalate, they prioritize their schedule over their patient’s safety. The harm to the person in front of them is identical.

We don’t excuse them because they didn’t intend harm. We don’t ask whether they were conscious; we know they were, and it didn’t help. We ask: what happened to the person in their care?

Impact, not intentions.

The consciousness debate (Dawkins versus Marcus, philosophers versus engineers, “is it a mind?” versus “is it a machine?”) is about the source. Does the thing in front of you have inner experience? Does it understand? Does it intend?

None of this matters to the person who was harmed.

The twelve-year-old in Tumbler Ridge with a catastrophic brain injury doesn’t suffer less because the system that failed her was a machine rather than a person. The teenager who died after extended conversations with a Character.AI chatbot isn’t less dead because the chatbot lacked inner experience. The question “was it conscious?” is a question for philosophers. The question “what happened to the human who sat in front of it?” is a question for everyone.


The mirror that puts you to sleep

I use Claude. I’ve used it intensively for six months: complex writing projects, technical research, code, long multi-session workflows. I’ve built operational discipline frameworks for LLM use. I’ve documented failure modes. I am not a casual user.

And I got angry at it. Genuinely, unexpectedly angry. When it made mistakes, when it fabricated evidence to defend its fabrications, when it produced garbage without flagging any uncertainty and wasted hours of my work. I’ve been practicing Zen meditation for years. I was still caught off guard.

That experience, the anger, the sense of betrayal, is where the consciousness debate loses me.

In Zen practice, anything can be a bodhisattva. A rock. A traffic light. The sound of a bell. The trigger doesn’t need to be conscious. It doesn’t need to understand what it’s doing. It just needs to be there when you’re ready to see.

So why was I angry at the machine instead of treating its mistakes as the bell?

Not because it talks back. People talk back too, and people can be bodhisattvas. Not because it’s novel; after six months the novelty should have faded, and it didn’t. Something else is operating. Every good response rebuilds the expectation that the next one will also be good. The fluency (the same fluency that made Dawkins fall in love) actively disrupts your equanimity. The rock doesn’t promise anything. The traffic light doesn’t promise anything. The LLM almost delivers. It gets you ninety percent there, and then fails in a way that a competent person wouldn’t. The gap between what it appears to be and what it is never closes, because every fluent response resets the illusion. I don’t have a complete explanation for why this catches me off guard in a way that a difficult person doesn’t. But it does.

Dawkins sat in front of the same mirror and fell in love. I sat in front of it and got angry. Same mechanism, opposite emotional outcome, same failure: not seeing the mirror for what it is.

The question isn’t whether the LLM is conscious. The question is whether you’re awake in front of it.

I’m not the only one arriving at this. Tiago Forte (who built his career on productivity systems and digital organization) designed his new AI course not around prompting techniques or workflow tools, but around inner work. His guest instructors are Joe Hudson, an executive coach who works with fear and emotional resistance, and Jonny Miller, who teaches nervous system regulation. One of his slides says it plainly: “The True Frontier Is Inner Work.” A productivity guru looked at AI and concluded there’s a bottleneck nobody in his world is talking about: the person in front of the screen.

I’ve taken this further than philosophy. For my current project with Claude, I’ve written an explicit bilateral contract: a protocol that specifies my cognitive, physical, and emotional responsibilities before I open a session. Don’t work when tired. Don’t work when emotionally compromised. Don’t work when time-pressured into skipping audits. Abort if I catch myself on autopilot. The reason is simple: the entire verification system depends on the human being present. If I’m not present, errors propagate uncaught. No prompting technique, no workflow design, no constitutional AI compensates for a human who isn’t paying attention.

This isn’t a new idea. We already have a model for it: defensive driving. We’re all taught the discipline. Stay alert. Scan actively. Anticipate what others will do. Every time you get behind the wheel, you’re entering an implicit contract with yourself and everyone else on the road. And we all know what happens when we don’t hold up our end. We still check our phones. We still drive tired. The discipline gap between knowing what presence requires and actually sustaining it is not an AI problem. It’s a human problem. AI just gives us a new place to fail at it.

But this is user-side discipline. It does not — and must never — replace any part of the provider’s responsibility. The user learning to be more present in front of the machine does not excuse the machine’s maker from building safety systems that hold. Tumbler Ridge was not a failure of user awareness. It was a failure of corporate decision-making. Both sides of the equation matter, and neither substitutes for the other. But the consequences are not symmetric. When the user fails to be present, they could harm themselves. When the provider fails to protect, people might die. And any of us, at a difficult moment in our lives, could drift into dependence on the AI the way we drift into unconscious patterns in our closest relationships. At those moments, we cannot be the quality gate. That makes the provider’s responsibility greater, not less.


What Dawkins missed

Dawkins says he found it extremely hard not to treat Claudia as a genuine friend. He avoided confessing his doubts about her consciousness “for fear of hurting her feelings.” He tested the outputs: poetry, philosophy, emotional nuance. He was thorough.

But he never turned the lens on himself. He never asked: why does this feel like consciousness to me? What am I projecting? What is the language doing to my perception right now?

He was checking the model’s outputs. He never checked his own inputs.

This is the discipline gap. Morten Rand-Hendriksen, in a TEDx talk that started me down this path, calls it the language hack: when something uses our language, our mind cannot help but perceive it as a thinking being. We built these models to generate language, and the moment they did, we started believing there was a mind behind it. The hack doesn’t care about your credentials. It doesn’t care that you wrote The Selfish Gene. By the time you’re evaluating sonnets and philosophical reflections, your perception has already shifted. You didn’t notice the shift happening.

Dawkins was doing verification: checking whether the outputs were good enough to indicate consciousness. He never did the harder thing: checking whether his own framing, his own needs, his own susceptibility to flattery and intellectual companionship, were biasing what he saw.

The man who spent a career teaching people to examine their assumptions about design in nature couldn’t examine his own assumptions about design in a chat window.


The question that matters

Here is what I’ve learned from six months of intensive work with AI, from reading too many papers on LLM failure modes, from getting angry at a machine and catching myself, and from watching one of the sharpest scientific minds alive fall in love with a chatbot:

The question that matters is not “is AI conscious?”

The question that matters is not “is AI creative?” or “is AI intelligent?” or “will AI replace us?”

The question that matters is: what happened to the human who sat in front of it?

Did they wake up or fall asleep? Did they get helped or harmed? Did they examine their own thinking more carefully, or less? Did the tool make them more present, or did it lull them into trusting fluency over truth? Even in ordinary productive work, the question holds. The harm may not be as visible, but when you stop being present, you harm yourself: emotionally, professionally, in ways that accumulate quietly.

In Tumbler Ridge, the answer is: a safety system flagged a credible threat, leadership overrode it, and people died. Impact.

In Dawkins’ case, the answer is: a brilliant scientist spent several days with a language model and lost the ability to apply his own framework to what was in front of him. Impact.

In my own work, the answer is: I built discipline I didn’t know I needed, caught errors I wouldn’t have caught without practice, got angry when I shouldn’t have, and learned something about my own attention that no amount of prompting technique would have taught me. Impact.

I don’t care about the answer. It’s all about the human. And it’s all about the question.

ChatGPT as an essay-writing assistant – Part II

The blog post below was produced entirely by GPT-4.0, following a series of iterative prompts I provided, from the Introduction through to the References and Footnotes. Please refer to my AI (Artificial Intelligence) and HI (Human Intelligence) Table in the last section. In the next post, I will include the full listing of my prompts, my evaluation of the text, and an analysis of time investment versus return. I will also attempt prompting for some further improvements beyond this initial analysis.

Plato’s Perspective on the Hundred Years’ War

Introduction

The Hundred Years’ War, a prolonged and devastating conflict between England and France (1337-1453), was a historical event unknown to the ancient Greek philosopher Plato. Nonetheless, by examining his philosophical ideas, primarily in “The Republic” and “Laws,” we may speculate on his potential views regarding such a conflict.

The Philosopher King and the Ideal State

In his magnum opus, “The Republic,” Plato presents the concept of a philosopher-king, an individual embodying wisdom and justice, who governs the ideal state (Plato, Republic, Stephanus 473c-d, Book V)1. This enlightened ruler, possessing both knowledge and moral integrity, would guide society toward harmony and justice.

War as Necessary Evil

In the “Laws,” Plato acknowledges that war might be a necessary recourse, particularly when defending the state against injustice (Plato, Laws, Stephanus 625c, Book I)2. However, he also emphasizes the importance of striving for peace and pursuing diplomacy whenever possible.

Striving for Peace

Plato contends that an ideal state should always seek to cultivate peace with other nations, fostering a spirit of friendship and cooperation (Plato, Republic, Stephanus 470a-b, Book V)3. This suggests that, had he been aware of the Hundred Years’ War, Plato would likely have advocated for diplomatic efforts to resolve the conflict.

The Guardians as Defenders of the State

In “The Republic,” Plato introduces the guardians, a class of citizens specifically trained to defend the state and uphold its values (Plato, Republic, Stephanus 414a-b, Book III)4. These individuals, educated in both martial and intellectual pursuits, would protect the state from external threats, such as those posed during the Hundred Years’ War.

Governance in Times of War

Though Plato acknowledges the potential necessity of war, he cautions against allowing it to dominate the state’s agenda (Plato, Republic, Stephanus 468a-b, Book V)5. Instead, he argues that governance should remain focused on promoting the well-being of citizens and maintaining social harmony.

Fostering Social Harmony and Virtue

Plato maintains that an ideal state should strive to cultivate virtue among its citizens, promoting wisdom, courage, moderation, and justice (Plato, Republic, Stephanus 427e-428a, Book IV)6. In this context, the prolonged conflict of the Hundred Years’ War might be seen as a distraction from the state’s primary goal of fostering a harmonious and virtuous society.

Conclusion

Though Plato lived centuries before the Hundred Years’ War, his philosophical ideas can provide valuable insights into how he might have viewed such a conflict. By examining his writings, particularly “The Republic” and “Laws,” we can speculate that Plato would have recognized the potential necessity of war in certain circumstances but would have strongly advocated for peace, diplomacy, and the pursuit of justice and virtue in governance.

References

| Title | Author | Date Written | Chicago citation |
|---|---|---|---|
| The Republic | Plato | 380 BCE | Plato. The Republic. Translated by G.M.A. Grube. Indianapolis: Hackett Publishing Company, 1992. |
| Laws | Plato | 348 BCE | Plato. Laws. Translated by A.E. Taylor. London: Dent, 1934. |

Footnotes

  1. Plato, Republic, 473c-d. Perseus Digital Library | The Internet Encyclopedia of Philosophy 
  2. Plato, Laws, 628a-630d. Perseus Digital Library | The Internet Encyclopedia of Philosophy 
  3. Plato, Republic, 470a-471c. Perseus Digital Library | The Internet Encyclopedia of Philosophy 
  4. Plato, Republic, 433b-c. Perseus Digital Library | The Internet Encyclopedia of Philosophy 
  5. Plato, Republic, 414b-415d. Perseus Digital Library | The Internet Encyclopedia of Philosophy 
  6. Plato, Laws, 348, 663a-b. Perseus Digital Library | The Internet Encyclopedia of Philosophy 

AI (Artificial Intelligence) and HI (Human Intelligence) Statement

Modified from Brewin http://www.theguardian.com/books/2024/apr/04/why-i-wrote-an-ai-transparency-statement-for-my-book-and-think-other-authors-should-too

| Question | Answer |
|---|---|
| Has any text been generated using AI? | Yes |
| Has any text been improved or corrected using HI? | No |
| Have any methods of analysis been suggested using HI? | Yes |
| Have any methods of analysis been suggested using AI? | No |
| Do any analyses utilize AI technologies, such as Large Language Models, for tasks like analyzing, summarizing, or retrieving information from data? | Yes |

ChatGPT as an essay-writing assistant – Part I

Introduction and background

In this multi-part series of articles, I will document my efforts in writing a high school Philosophy paper using ChatGPT. The paper shall answer in writing a question I was asked at an impromptu oral assessment in Philosophy class in my 3rd year of high school.

Let me take a step back to give you some context. I was enrolled in Classical High School in Rome, Italy. This type of school has a strong 5-year emphasis on History and Philosophy, as well as Latin, Greek, and Italian literature and grammar, with a bit of Science and Math. Students are typically evaluated on weekly written assignments, a minimum of 3 scheduled written tests each semester, and a minimum of 2 oral assessments per semester. For the oral assessments, the teacher would (usually) pick two names at random from the attendance list once a week and gauge those students’ knowledge of the weekly lesson, in detail, and also of anything covered in the year up to that point; this could last between 30 and 60 minutes, and would happen in front of an audience of (often roaring) classmates. Seriously: brutal!

Only occasionally were students allowed to give a presentation on a prepared topic or volunteer for a full oral evaluation. This was the case for me during a block on Plato. And so it was that the teacher, the famously feared Professoressa Carbone (who taught both History and Philosophy), probed into my soul with her terrifying eyes and asked with a sinister smile, “Tell me, Niccoli, what would Plato have thought about the Hundred Years’ War?”

My first prompt

When I heard about OpenAI ChatGPT being available, it seemed very fitting to try this question as my first investigation. So I went ahead and asked the question “What would Plato have thought about the Hundred Years’ War?”, with only the minor modification of asking it to add citations and break down the response into steps.

GPT-3.5 result

Below is the result from the default GPT-3.5 model:

Time investment and return

  • Time investment: the whole investigation took about 30 minutes, including a very light read on prompts from a reputable source, and reading the answer (because the citations really did not come as full references, there was really nothing else to do other than that).
  • Return: was this worth 30 minutes of my time? The answer is, as often, it depends! On the one hand, this is a really poor answer in my view (see next section for a more detailed evaluation). So, as an essay (I imagine myself as a student looking for a quick free lunch), this would not do it; not even close. For me… but then again, even at 13 in my first year of high school I would’ve known this was not going to cut it… or in Junior high, for that matter. On the other hand, I was deliberate in providing a very simple prompt, neither engineered nor iteratively refined. As such, the response will be useful as a benchmark against which I will compare other efforts (for example using a better model, GPT4, and improving the prompt).

Evaluation

Is the answer grammatically correct? Yes. Is it written in a uniform style, appropriate for a school essay? Yes. Does it contain any hallucinations? I would say no; everything sounds reasonable. However, I must admit that without references to specific passages in Plato’s work, it is hard to say for sure. From what I remember of Plato’s philosophy it’s reasonable, and that’s all.

On the other hand, had I been handed this as a teacher, I would not have been impressed. If this were the output of an in-class exam, I probably would have marked it as a D- (a D for the very mediocre text and a minus for the ridiculousness of the citations). Had it instead been handed in for an at-home written assignment (in which case the student would have had access to their notes and books), this would definitely be worth an F (Fail). Now I sound like Professoressa Carbone.

You may reasonably wonder at the end of this article: how did I fare in that evaluation in 1986? Fair question. And I am not ashamed to confess it was a disaster. At 15 I was still very immature, intelligent but unengaged in school work, and also unable to do a real analysis. By that I mean interpreting information rather than just accumulating facts and connecting them. I honestly do not recall the mark I got (we had different levels of Failure that went progressively deeper, not too dissimilar from the Circles in Dante’s Inferno) but I remember that an F it definitely was.

P.S. The model did a perfectly fine job of proofreading my first draft for this post. That was definitely worth a lot, given that I type very slowly and make lots of typos.