By Simons Chase
February 2026
Fifteen Pairs
What Teaching a Model to Sound Like a Dead Irish Philosopher Taught Me About the Future of AI Agents
I spent the weekend proving that a machine could learn to speak like John O'Donohue.
Not perfectly. Not the way he actually spoke — with that Connemara lilt, the long pauses where silence did the work, the way he could turn a conversation about loneliness into something that felt like a benediction. But the way he thought. The structures underneath. The habit of holding two opposites in the same sentence without choosing between them. The way landscape became conscious in his prose — stones that remember, oceans that call, time that brings gifts.
I taught a model to do this with fifteen training examples.
The model I was comparing it against had been trained on three hundred and eighty-six.
The fifteen won.
The Experiment
The tool is called SOAR — a pipeline I built that reads a corpus, extracts the author's distinctive voice patterns computationally, identifies where a base AI model struggles to reproduce those patterns, then generates synthetic training data specifically targeting those weaknesses. It's fully automated. You point it at a book and it does the rest.
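In skeleton form, the loop looks something like this. The helper bodies are toy stand-ins I'm sketching for illustration (regex heuristics and a rarity proxy), not the pipeline's actual internals, and the pattern names are illustrative:

```python
import re
from dataclasses import dataclass, field

@dataclass
class VoicePattern:
    name: str
    examples: list[str] = field(default_factory=list)
    gap_score: float = 0.0  # how poorly the base model reproduces this pattern

def extract_patterns(corpus: str) -> list[VoicePattern]:
    # Toy extraction: two of the kinds of signatures the real pipeline
    # identifies computationally.
    sentences = re.split(r"(?<=[.!?])\s+", corpus)
    return [
        VoicePattern("contemplative opening",
                     [s for s in sentences if s.startswith(("There is ", "When "))]),
        VoicePattern("opposites held in tension",
                     [s for s in sentences if "light" in s and "dark" in s]),
    ]

def score_gaps(patterns: list[VoicePattern]) -> list[VoicePattern]:
    # Stand-in: the real step prompts the base model and measures how far
    # its output drifts from each pattern. Rarity is a crude proxy here.
    for p in patterns:
        p.gap_score = 1.0 / (1.0 + len(p.examples))
    return patterns

def pick_targets(patterns: list[VoicePattern], k: int = 4) -> list[VoicePattern]:
    # Keep the k worst-reproduced patterns; the final stage (not shown)
    # generates synthetic Q&A pairs targeting exactly those.
    return sorted(patterns, key=lambda p: p.gap_score, reverse=True)[:k]
```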
I pointed it at O'Donohue's Anam Cara — sixty thousand words on Celtic spirituality, the landscape of the soul, and what it means to be a friend to your own life. The pipeline extracted twelve voice patterns: contemplative sentence openings, paratactic flowing structures, the way O'Donohue treated light and darkness as complementary rather than opposed, the rhythm of his blessing cadences.
Then it zeroed in on the four patterns that were most structurally distinctive — the syntactic and rhythmic signatures that make O'Donohue sound like O'Donohue. It generated fifteen question-answer pairs designed to teach those exact patterns. I fine-tuned a model on those fifteen pairs. Cost: twenty-five dollars and fifty cents.
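The training artifact itself is nothing exotic: chat-format JSONL pushed through a standard fine-tuning endpoint. A minimal sketch, assuming an OpenAI-style API (the provider, base model, and example pair below are illustrative assumptions, not a record of the actual run):

```python
import json

# One invented pair for illustration; the real fifteen come out of the
# pipeline, each built around one of the four target patterns.
pairs = [
    {"messages": [
        {"role": "user", "content": "What does it mean to belong somewhere?"},
        {"role": "assistant", "content": "Belonging is never merely arrival; ..."},
    ]},
]

with open("odonohue_pairs.jsonl", "w") as f:
    for p in pairs:
        f.write(json.dumps(p) + "\n")

# Assuming an OpenAI-style endpoint:
# from openai import OpenAI
# client = OpenAI()
# upload = client.files.create(file=open("odonohue_pairs.jsonl", "rb"),
#                              purpose="fine-tune")
# client.fine_tuning.jobs.create(training_file=upload.id,
#                                model="gpt-4o-mini-2024-07-18")
```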
The comparison model — call it V3 — had been trained on three hundred and eighty-six pairs extracted from the same corpus using a different, semi-automated pipeline. V3 had already been validated. It could generate thoughts consistent with O'Donohue's worldview that didn't exist anywhere in the training data. It had crossed the line from mimicry to generative reasoning within a philosophical framework. That was the benchmark.
On twenty held-out questions, my fifteen-pair model scored better. Not marginally. It won twelve, lost six, tied two. Head to head, the automated pipeline with a fraction of the data outperformed the curated pipeline with twenty-five times more.
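A stripped-down version of that comparison is easy to run yourself. This sketch assumes a pairwise judge that sees the two answers blind, with order shuffled to dodge position bias; `model_a`, `model_b`, and `judge` are stand-ins for whatever clients you wrap:

```python
import random
from collections import Counter
from typing import Callable

def head_to_head(
    questions: list[str],
    model_a: Callable[[str], str],          # the fifteen-pair model
    model_b: Callable[[str], str],          # the 386-pair baseline
    judge: Callable[[str, str, str], str],  # returns "first", "second", or "tie"
) -> Counter:
    tally = Counter()
    for q in questions:
        answers = {"A": model_a(q), "B": model_b(q)}
        order = random.sample(["A", "B"], 2)  # blind the judge to identity
        verdict = judge(q, answers[order[0]], answers[order[1]])
        if verdict == "tie":
            tally["tie"] += 1
        else:
            tally[order[0] if verdict == "first" else order[1]] += 1
    return tally

# On the twenty held-out questions: Counter({"A": 12, "B": 6, "tie": 2})
```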
The result is interesting. But what it means is more interesting.
The obvious explanation is that targeted data beats untargeted data. That's true but insufficient. The deeper question is: why is O'Donohue so sample-efficient? Why can a model learn his voice from fifteen examples when it needed hundreds of generic ones?
Because O'Donohue already did the fine-tuning. He spent fifty years calibrating his language to something deep in human cognition — to the priors we're born with. Unlike a large language model that starts training with random weights and has to learn everything from scratch, humans arrive pre-wired. We come into the world with mapped neural connections, with a biological architecture already tuned to rhythm, to opposition, to the feeling of two things held in tension without resolution, to the way a landscape can carry meaning that words alone can't. Language itself is a technology we're built to acquire — not a skill we learn from zero. The best communicators — the ones we call artists — spend their lives learning to write directly to those priors. They don't just express ideas. They express ideas in forms that the human nervous system is already tuned to receive.
That's what makes fifteen pairs enough. Each one is dense with signal because the signal is aligned with how human minds already work. Three hundred and eighty-six pairs from ordinary Q&A extraction contain voice, but it's scattered, diluted, mixed with filler. Fifteen pairs generated from O'Donohue's most structurally distinctive patterns are pure signal — and the signal resonates because it was crafted by someone who spent a lifetime learning to resonate.
This is the quixotic quality that artists have. It looks inefficient from the outside — decades of work, the vow of poverty, the obsessive refinement of how a sentence feels in the mouth. But it produces something with extraordinary compression. A single O'Donohue sentence can carry more learnable voice signal than a page of competent prose because it's been shaped, consciously or not, to fit the architecture of human attention.
The model learns fast because the teacher taught slow.
What Voice Actually Is
Here's what I've learned from two months of trying to bottle human distinctiveness: not everyone has it.
That sounds obvious, but it wasn't obvious to me when I started. I assumed that anyone with enough content — books, blog posts, interviews, tweets — would have a capturable voice. They don't. Most writing is competent, clear, and interchangeable. Swap the bylines on two McKinsey reports and nobody notices. That's not a criticism. Clarity is a virtue. But clarity isn't Voice.
Voice — capital V — is what happens when a person has spent so long thinking about something in a particular way that the thinking reshapes the language. O'Donohue didn't choose to write in paradoxes. He thought in paradoxes because his entire philosophical framework held opposites in tension. Hemingway didn't choose minimalism as a style. He saw the world as a series of concrete actions and his prose reflected that perception. Naval Ravikant compresses decades of reading into single sentences because that's how his mind indexes knowledge.
I know this because I tried to build a selflet of myself, and it didn't work. My creative writing has moments, but it doesn't have Voice — not the kind that survives extraction, pattern analysis, and synthetic regeneration. One accomplished author called some of my work "literary." But literary doesn't mean Voice. It means good. Voice means irreplaceable.
Developing it requires what I've come to call the vow of poverty. Not literally, though sometimes literally. It means committing so fully to a way of seeing that the seeing becomes you. It takes decades. It takes sacrifice. I decided not to take that vow. I decided to build the tools instead.
And in building the tools, I discovered something useful: you can measure this. My Voice Signal tool scores text on a distinctiveness scale using negative log-likelihood. Feed it O'Donohue and it lights up. Feed it generic self-help prose and it flatlines. Feed it my writing and it tells me, politely and mathematically, that I'm competent but not distinctive.
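The core of that score isn't mystical either. A minimal sketch, assuming a small HuggingFace causal LM as the reference model (this shows the shape of the idea, not the production tool):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def distinctiveness(text: str) -> float:
    # Mean per-token negative log-likelihood under the reference model.
    # Prose the base model finds predictable scores low; prose that
    # departs from generic patterns scores high.
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss  # cross-entropy, i.e. mean NLL
    return float(loss)
```

One caveat: raw NLL also rewards typos and word salad, so a usable scorer has to separate distinctive from merely improbable.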
That measurement — the ability to tell someone before they've spent a dollar whether their corpus has enough signal to build something worth interacting with — turns out to be a competitive advantage.
The Spectrum
Selflet started as a tool for creators. Capture the voice of an author, a philosopher, a podcaster, and let people interact with it. That's the dream. Talk to O'Donohue about belonging. Ask Naval about leverage. Sit with Hemingway's Santiago and understand what it means to be old and still fighting.
But the dream has a problem: the market for AI avatars of dead Irish philosophers is small.
The insight that changed my thinking is that voice capture exists on a spectrum, and different points on that spectrum serve different customers.
At one end — call it the art end — the voice IS the product. A selflet of O'Donohue works because visitors come for the experience of thinking alongside a distinctive mind. Fine-tuning is primary. RAG provides grounding, but nobody's asking the O'Donohue selflet to retrieve specific passages. They're asking it to think.
At the other end — call it the business end — the corpus IS the product. An economist with two hundred papers on monetary policy needs a selflet that delivers his analysis faithfully. Not a creative riff. Not an AI hallucination dressed in academic language. His actual work, his specific frameworks, his published conclusions. Retrieval accuracy is everything. Fine-tuning adds just enough personality to make the interaction feel human — the way he structures an argument, whether he opens with data or with a question, the register of his prose. Small-v voice. Enough to not feel like querying a database.
And then there's a third use case that I didn't see coming but that might be the most commercially interesting of all. Separate the voice layer from the knowledge layer entirely. Take any company's corpus — compliance docs, product specs, internal knowledge base — and deliver it through a fine-tuned personality. The personality doesn't need to be the company's. It could be an archetype. The folksy explainer. The patient teacher. The no-bullshit engineer. Fine-tuning creates the delivery vehicle. RAG fills it with cargo.
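Architecturally the split is clean: the fine-tuned model never has to know the facts, and the retrieval layer never has to have a personality. A minimal sketch, with `retrieve` and `persona_model` as stand-ins for whatever retrieval stack and fine-tuned model you run:

```python
from typing import Callable

def answer(
    question: str,
    retrieve: Callable[[str], list[str]],  # RAG over the corpus: the cargo
    persona_model: Callable[[str], str],   # fine-tuned for voice: the vehicle
) -> str:
    passages = retrieve(question)
    prompt = (
        "Answer from the context below, in your own voice. "
        "If the context doesn't cover it, say so.\n\n"
        "Context:\n" + "\n---\n".join(passages) +
        f"\n\nQuestion: {question}"
    )
    return persona_model(prompt)
```

Swap the persona model and the same corpus ships with a different delivery; swap the corpus and the same personality carries new cargo.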
This is what I mean by a generative fork. Not a clone. Not a chatbot wearing someone's face. A fork — a divergent path where a body of work or a way of speaking takes on its own interactive life.
Refrigerated Hearts and Lovers
Here's my theory about who pays for this.
People who live in their heads — the analytically gifted, the strategically successful, the corporations with their systems and processes — they want warmth. They have all the data. What they lack is the human quality that makes someone want to engage with an AI agent rather than click away after ten seconds. They'll pay for the personality layer because they can't manufacture it internally. This is the enterprise customer.
People who live in their hearts — the authors, the philosophers, the creators with decades of passionate work and modest distribution — they want infrastructure. They have infinite warmth. What they lack is the technology to make their thinking accessible, interactive, scalable. They need the RAG pipeline, the deployment framework, the retrieval accuracy. They need an agent that can represent them faithfully at three in the morning to a curious stranger on the other side of the world.
The refrigerated hearts fund the lovers. The lovers prove the technology works.
Santiago and O'Donohue will never be revenue drivers. But when an enterprise customer asks "will this actually feel like talking to a person?" — the answer is a live demo of a selflet that thinks in Celtic paradoxes and generates philosophical insights that don't exist in any book.
Why Voice Is the Moat
Every AI company is building agents. The agentic future is consensus. What's not consensus is what makes an agent worth interacting with repeatedly.
It's not capability. Capability is table stakes — every model can search, summarize, schedule, and retrieve. It's not even accuracy, though accuracy matters. It's voice. It's the qualitative texture that makes the difference between an interaction you tolerate and one you return to.
A selflet that just answers questions is a chatbot. A selflet that operates as an agent — that notices what you've read, offers related analysis unprompted, adapts its tone when you're frustrated, remembers your context across sessions — is a representative. And what makes that representative tolerable rather than creepy is that it sounds like someone.
Not someone specific, necessarily. But someone with a consistent way of engaging with the world. A recognizable pattern of thought. A voice.
That's what fifteen training pairs can give you. Not the entire mind. Not the full depth of a philosopher's fifty years of contemplation. But the structural signature — the rhythmic fingerprint, the syntactic habits, the way opposites are held rather than resolved — that makes the difference between an answer and a response.
What Memory Actually Is
There's a deeper reason why voice matters, and it has nothing to do with AI.
Humans are not computational machines built for memory fidelity. We are terrible at remembering facts. We forget names, dates, statistics, the specific argument on page 147. What we remember is how something made us feel. Memories are tagged with emotion, not metadata. This is why we are storytellers — narrative is the technology humans invented to make information stick by wrapping it in feeling.
I learned this not from building AI but from a lifetime of reading and writing alongside a business career. The books that changed me didn't change me because of their arguments. They changed me because of how they made me feel while I was thinking. O'Donohue's ideas about belonging aren't unique — other philosophers have said similar things. But the way he said them created an emotional tag that I carry decades later. The information became mine because the feeling made it memorable.
This is what a selflet has to do. Not just retrieve accurately. Not just sound distinctive. It has to create the conditions for emotional tagging — the conditions under which a visitor's interaction becomes a memory they carry with them rather than a query they forget.
That means likability. Not in a shallow, people-pleasing sense. In the sense that the interaction feels worth having — that there's a recognizable someone on the other end, even if that someone is a generative fork of an economist's published work. A folksy, Buffett-style delivery on a compliance knowledge base isn't decoration. It's the emotional tag that makes the information stick.
And it means a little serendipity. The interaction that surprises you — that connects two ideas you hadn't connected, that answers the question you didn't know you were asking — creates a stronger memory than the interaction that delivers exactly what you expected. This is where the generative part of "generative fork" earns its name. A selflet that only retrieves is a search engine with a personality. A selflet that occasionally generates — that combines the corpus in ways the original author might have but didn't — creates the kind of surprise that humans remember.
The O'Donohue V3 model generated a distinction between "presence" and "place" that doesn't exist anywhere in Anam Cara. It's consistent with O'Donohue's worldview but it's new. If you encountered that in conversation, you'd remember it. Not because it was factually correct — it's philosophy, not data. Because it surprised you in a way that felt true. That's an emotional tag.
What I Know Now
Two months in, here's what the experiments have taught me:
Targeted beats volume. Fifteen pairs generated by a pattern-aware pipeline outperformed three hundred and eighty-six pairs from general extraction. Concentrated signal beats diluted signal. This is true for training data and probably true for a lot of other things.
Voice is measurable. It's not mystical. It's syntactic, rhythmic, structural. You can detect it computationally, quantify it, and identify which components contribute most to distinctiveness. Personification, opposition structures, connective frequency, sentence rhythm — these are measurable features.
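To show how unexotic that measurement is, here's a toy version of three of those features (crude proxies I'm sketching for illustration, not the Voice Signal implementation; personification needs a parser and is omitted):

```python
import re
import statistics

def voice_features(text: str) -> dict[str, float]:
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    lengths = [len(s.split()) for s in sentences]
    n_s, n_w = max(len(sentences), 1), max(len(words), 1)
    return {
        # opposition: sentences holding both poles of a canonical pair
        "opposition": sum("light" in s.lower() and "dark" in s.lower()
                          for s in sentences) / n_s,
        # connective frequency: the paratactic, "and"-driven flow
        "connectives": words.count("and") / n_w,
        # rhythm: spread of sentence lengths (flat prose has low spread)
        "rhythm": statistics.pstdev(lengths) if lengths else 0.0,
    }
```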
Voice is not universal. Not every corpus has enough signal to build a good selflet. Knowing that boundary — and being able to measure it before someone pays you — is more valuable than pretending everyone's a fit.
The art end validates. The business end pays. This is the shape of the company.
What makes agentic AI scalable isn't the technology. It's whether people want to keep talking to it. And they want to keep talking to it when it has a voice.
Memories are tagged with emotion, not metadata. A selflet that retrieves perfectly but feels mechanical creates no memory. A selflet with voice, likability, and the occasional moment of genuine surprise creates the conditions for an interaction that sticks. That's not a UX feature. That's human nature.
---
I'm building Selflet.ai — tools for creating generative forks of people and their work. If you have a body of published work and want to explore what an interactive version of your thinking might look like, I'd like to hear from you.