The Challenge of a Satisfying Conclusion
When I published Part II of this series back in February 2025, I had a plan for Part III. Show the prompts I used, analyze the time investment, evaluate the result against Part I’s GPT-3.5 baseline, maybe try one more iteration with even newer tools. Straightforward. Methodical.
But I never finished it. To be honest, I lost interest. Another marginally better AI-generated essay wasn’t going to cut it—not for me, and probably not for you readers either. Another iteration showing GPT-4.5 writes slightly better than GPT-4? That’s predictable, uninspiring… so I dropped it.
But the unfinished series sat there in the back of my mind. I wasn’t actively working on it, but I also couldn’t quite let it go. It created a kind of block—I found myself not writing about anything at all, partly because this felt incomplete, partly because my interests had genuinely shifted elsewhere.
Recently, though, I came back to this question. Not because I wanted to finish the series for completeness’ sake, but because I wanted to understand what would actually make Part III worthwhile.
So I asked for help. I brainstormed with Claude (Anthropic’s AI) about what Part III should actually be about—what would make it worth writing and worth reading. And something clicked.
What Was She Really Asking For?
Looking back now, with decades between me and that moment in Professoressa Carbone’s classroom, I think I understand what she was asking for. She wasn’t looking for a recitation of Plato’s philosophy mechanically applied to medieval warfare. She wanted to see whether I could reason with philosophical frameworks in unfamiliar territory. Synthesis, not facts. Thinking, not a performance of memorization.
At 15, I wasn’t ready for that. I had volunteered for the oral examination thinking I could rely on material I had prepared from our recent lessons on Plato. Instead, she cut through my preparation with a single question that required genuine philosophical thinking: “What would Plato have thought about the Hundred Years’ War?”
It was a brilliant pedagogical move. It required understanding Plato’s ideas deeply enough to apply them to a completely different context—a context Plato never encountered, in a historical period he never knew. It required the kind of intellectual flexibility and reasoning that, honestly, I didn’t have yet.
The humiliation I felt wasn’t really about not knowing facts. It was about being exposed as someone trying to get by on memorization rather than understanding. And I think she knew it. She saw through my bluff.
So What Would Satisfy?
This brings me back to the problem of Part III. Showing that AI can now generate a more sophisticated-sounding essay than my 15-year-old self could produce doesn’t prove anything interesting. AI is very good at generating sophisticated-sounding content. That’s almost the problem.
What would actually satisfy—both as closure for this series and as something worth your time reading—is demonstrating the kind of reasoning Professoressa Carbone was asking for. Can I, now, with the benefit of intellectual maturity and AI assistance, actually think through what Plato might have thought about prolonged warfare between nations? Not just string together plausible-sounding paragraphs with proper citations, but engage in genuine philosophical reasoning?
What Would That Actually Look Like?
If I were to actually write that essay—the one demonstrating real philosophical reasoning rather than AI-generated content—what would it need?
Looking back at the GPT-4 essay from Part II, it has proper citations and coherent structure, but it’s superficial. It lists Platonic concepts (philosopher-kings, guardians, ideal states) and applies them mechanically to medieval warfare. That’s exactly the kind of recitation Professoressa Carbone’s question was designed to see past.
Real reasoning would require:
- Connecting Plato’s specific ideas to specific events or decisions during the Hundred Years’ War—not just general principles applied generally
- Exploring how Plato’s concepts might actually illuminate something about prolonged conflict between nations that we wouldn’t see otherwise
- Considering contemporary interpretations or modern applications—what do we learn about conflict, governance, or political philosophy from this exercise?
- Drawing genuine insights about both Plato and warfare, not just restating both
That’s the essay I’d want to write someday. Not as an academic exercise, but as personal closure—proving to myself I can do the kind of thinking she was asking for.
Closure for Now
But that’s not this post. This post is about giving you, the readers, closure on this series. About acknowledging honestly what I learned about AI as a writing assistant, and why simple iteration wasn’t the answer.
Here’s what I’ve learned:
AI is excellent at generating plausible content. GPT-4 produced an essay that looks credible—proper structure, citations, coherent arguments. For many purposes, that’s enough.
But AI doesn’t reason; it recognizes patterns. The essay from Part II strings together familiar ideas in familiar ways. It’s sophisticated pattern matching, not thinking. It can’t do what Professoressa Carbone was asking for: genuine synthesis that produces new insight.
The real value of AI as a writing assistant isn’t in replacing thinking—it’s in supporting it. AI can help with research, organization, articulation. It can reduce cognitive load so you can focus on the hard part: the actual reasoning. But you still have to do the reasoning.
Writing with AI requires clarity about what you’re trying to accomplish. If you want content generation, AI does that well. If you want thinking support, you need to know what thinking you’re trying to do. The tool can’t figure that out for you.
This series started with a simple question: can AI help me write an essay? The answer turned out to be more nuanced than I expected. It depends entirely on what kind of essay, and what role you want AI to play. For the essay I’d need to write to truly answer Professoressa Carbone’s question—the one that demonstrates reasoning rather than recitation—AI could help, but it couldn’t do the essential work.
Maybe someday I’ll write that essay. For now, I’m moving on to other projects where I’m excited about what AI can do: document extraction in geoscience, agentic workflows, problems where AI’s strengths align better with what I’m trying to accomplish.
Thank you for following this journey with me. Even if it didn’t end where I originally planned, I learned something worth sharing.
A Final Thought: Rigor Without Brutality
I started this series partly because of concerns about AI in education—concerns rooted in my own experience.
ChatGPT has educators calling for more in-class writing and oral examinations. I agree we need assessment that can’t be faked by AI. But I’m deeply opposed to the brutality that often came with those older systems.
Here’s the thing: the brutality was never necessary for the educational value. Professoressa Carbone’s question was pedagogically brilliant. The public humiliation didn’t make it more effective; it just made it traumatic.
We need assessment methods that demand genuine reasoning, in environments that support both students and teachers. It’s possible to have rigorous evaluation without breaking people in the process.
AI forces us to confront what we actually value in education: not the appearance of learning, but the development of genuine understanding and reasoning. The question is whether we can build systems that nurture that without the cruelty.
AI/HI Transparency Statement
Modified from Brewin: http://www.theguardian.com/books/2024/apr/04/why-i-wrote-an-ai-transparency-statement-for-my-book-and-think-other-authors-should-too
| Question | Answer |
| --- | --- |
| Has any text been generated using AI? | Yes |
| Has any text been improved or corrected using HI? | Yes |
Additional context: This post was collaboratively written through an iterative conversation with Claude (Anthropic). The human author provided the direction, constraints, personal context, and decisions about what to include/exclude. The AI assistant drafted text, which was then reviewed and revised based on feedback. Sections were rewritten multiple times to match the author’s voice and intentions. The final editorial decisions, including what content made it to publication, were made by the human author.