Research paper · April 2026

The Education Three Cs

Q: What are the Education Three Cs?

Curiosity, Communication, and Confirmation. They are the human capabilities the current education system was never built to measure: the ability to stay curious when the easy answer is right there, to have the hard conversation, and to question what you are handed instead of swallowing it whole. They are the only skills getting harder to find and more valuable to have.

Curiosity. Communication. Confirmation. Why the skills schools were never built to teach are now the only ones that matter.

By Christopher Schafer · April 2026 · ~30 min read

The current education system rewards what AI now does for free. It under-rewards what AI cannot do at all. That gap is no longer theoretical, and the fix is not more rigor inside the old categories. It is a new set of categories.

Why I wrote this

My oldest son, Owen, left for university in September. He was seventeen. He packed his life into the back of a car and went off to start a future the way millions of kids do, trusting that the institution he was walking into knew what it was doing.

Three months later, in November, OpenAI released ChatGPT. The world changed in a weekend.

I remember the conversation I had with him in those first months, because it is the conversation this entire paper is built around. The school’s position was clear and it was loud. Stay away from it. Do not touch it. Use it and you will be fined, flagged, failed. It was all going to be terrible. So I asked Owen two questions.

Do you think AI is going anywhere, or is it only going to get stronger and more useful to you over time? And do you believe you will need to know how to use it when you graduate, more than you will need to have avoided it?

I asked because we are an outcome-focused family. The answer to those two questions determines what you actually do. Follow the outcome, not the rule.

He knew the answer. So did I. AI was not going anywhere. It was going to be the single most important tool of his working life. And the institution he was paying to prepare him for that working life was telling him to pretend it did not exist.

Sure enough, the schools caught up. Slowly. By his later years the same institutions that had threatened students over it had reversed course completely. They went from banning it to gesturing at how students were supposed to use it. And they are still on the fence today, still unsure how to handle it, still issuing guidance that contradicts last semester’s guidance.

If they were truly outcome-focused, they would have started in a different place. They would have asked what these kids need on the day they walk out the door, and worked backward from there.

They did not. By preventing students from using the most consequential tool of their generation, and then failing to teach them how to use it responsibly and effectively, the schools failed twice. First by banning it. Then by half-adopting it without understanding it.

Here is the part that should stop every superintendent and provost cold. The schools’ shift to allowing AI is still broken, because the institutions themselves do not know how to use it well. They have not automated their own work with it. They have not streamlined a single core process with it. They do not use it effectively inside their own walls. So who, exactly, did we expect to teach our children to use it? You cannot teach a discipline you have never practiced.

I watched four years of my son’s education get distorted by an institution reacting to a tool instead of preparing him for the world that tool created. Not enriched. Not accelerated. Distorted. The years were not a clean loss. Owen is sharp and he came out fine. But the institution added far less than it should have, and at moments it actively got in the way.

That is what made me sit down and do the work behind this paper.

I did not start with theory. I started with people. I thought about myself, and about every genuinely successful person I have known across twenty-five years in the room. I asked a simple question. What did they actually have that the system never measured? It was never the credential and it was rarely the raw recall. It was a set of human capabilities. The ability to stay curious when the easy answer was right there. The ability to have the hard conversation. The ability to question what they were handed instead of swallowing it whole. Call them soft skills if you want. They are the only skills that are getting harder to find and more valuable to have.

Then I noticed something else. Most of those people were not doing it consciously. The ones who ran these plays reliably had something underneath the talent. They had a system, even if they had never named it. And most people do not. Most people are not emotionally intelligent enough to do this on instinct, under pressure, every time. They need an operating system. That is why we built HELP. Hear. Evidence. Learn. Proceed. It gives the human a structure to stay in the loop when every incentive is pulling them to check out.

But HELP is the how. It came after the what. And the what is the three Cs. Curiosity. Communication. Confirmation. These are the inputs we have to build back into every school, the capabilities the system was always supposed to produce and quietly stopped producing somewhere along the way. HELP is how you teach them. The three Cs are what you are teaching.

I could not get Owen’s four years back. But I have another son.

My second son was accepted to the same university his brother attended. He turned it down. He chose a different school, on purpose, specifically because he did not want his brother’s experience. A seventeen-year-old looked at what happened to his older brother and made a colder, clearer institutional bet than most education boards have managed to make. That should tell you something about where trust in the credential is heading.

I wrote this paper because I was driven to help the next parent’s kid. The next Owen. So that the institution he walks into is ready for the world he is walking into.

This is the work.

Executive summary

The current education system rewards what AI now does for free. It under-rewards what AI cannot do at all. That gap is no longer theoretical. It produces graduates who arrive in the workforce unable to start a conversation, unable to push back on a flawed idea, and unable to function without being told what to do next. They are credentialed and capable on paper. They are dependent in practice.

This paper argues that primary, secondary, and post-secondary education must reorganize around three teachable inputs. Curiosity. Communication. Confirmation. We call these the Education Three Cs. Creativity is not a fourth input. It is the dividend that emerges when the first three are healthy.

The three Cs describe what students need. They do not describe how to teach it. The how is the HELP Operating System, a four-phase framework we co-developed for use in leadership coaching and clinical practice. Hear. Evidence. Learn. Proceed. HELP is the operating plan that makes the three Cs teachable, observable, and assessable inside a classroom or an exam room. It is described in detail in Part Three.

The evidence base is no longer thin. EEG research from MIT in 2025 found that students writing essays with ChatGPT showed measurably weaker brain connectivity than students writing without it, and the deficit persisted even when the tool was taken away. Microsoft and Carnegie Mellon, in a separate 2025 study of 319 knowledge workers, found that the more confident a worker was in AI, the less critical thinking they did. The OECD has classified curiosity as a malleable trait that can be developed through inquiry and Socratic dialogue. Standardized creativity scores, measured on the Torrance Tests across roughly 300,000 students, have declined since 1990 even as IQ has risen. Employer surveys drawing on more than 85,000 employer responses consistently place communication at or near the top of the list of skills graduates lack, with critical thinking and problem-solving alongside it.

The three Cs are teachable. A two-hour AI literacy intervention with Grade 8 and 9 students produced measurable gains in their ability to question and verify AI output. The mechanism works. It is just not in the curriculum.

This paper makes the case for three things. First, that the system as currently designed measures the wrong outputs. Second, that the three Cs are the inputs that produce judgment in the AI era. Third, that the assessment model should shift from standardized testing to apprenticeship-style outcome demonstration, the same model that already certifies competence in medicine, law, accounting, engineering, and the trades.

The audience for this paper is K to 12 district superintendents and principals, and university presidents and provosts. The recommendations are aimed at decisions you can make this academic year.

Part One: The problem the system was never built to solve

What schools have always rewarded

Every era of education has rewarded a specific information skill, and the dominant skill has always reflected what was scarce at the time.

When books were rare and information was held by a small number of teachers, the system rewarded memorization. A graduate who could recall facts on demand was useful because the facts lived in their head and almost nowhere else.

When books became cheap and libraries became universal, the system shifted to rewarding research and synthesis. A graduate who could find sources, organize them, and write a coherent argument was useful because the facts lived in books and a graduate had to bridge between them.

When the internet arrived, the system was slower to adapt. It still trained students to do the synthesis manually. The premium skill in the workforce had already moved on, but classrooms held the line.

Now AI has collapsed the cost of producing answers. A high school student with a phone can generate a research paper, a financial model, a marketing plan, a legal memo, or a working application in minutes. Production is no longer scarce. Production is free.

Information is abundant. Judgment about information is the new bottleneck.

The skills schools have spent the last century optimizing are the skills AI now performs at near-zero cost. The skills that determine whether a graduate can survive and lead in the new environment are not in the curriculum, were never in the curriculum, and in many cases are actively trained out by the time a student finishes secondary school.

A note on the factory model argument

Some critics of education reach for the phrase “factory model school” to explain how the current system was built. Education historians push back hard on this framing as ahistorical. The phrase was popularized in the 1970s as rhetoric, not history, and the actual evolution of public schooling is more varied than the metaphor allows.

The substance of the argument still holds without the metaphor. The current system measures and rewards what is easy to measure: standardized test scores, GPA, replication of expert outputs. It does not measure curiosity, communication quality, or the ability to question and verify. What gets measured gets optimized. What does not get measured atrophies. That is the structural problem, regardless of how the system originated.

What graduates are actually missing

The 2025 NACE Job Outlook survey, run across more than 1,000 employers and 20,000 students, found gaps approaching 30 percent between employer-rated importance of competencies and graduate-rated proficiency on those same competencies. Communication and critical thinking sat at the top of the gap list.

The Hult International Business School and Workplace Intelligence survey of 800 HR leaders and 800 recent graduates, published in January 2025, found that 98 percent of leaders say their organization is struggling to find talent, while 89 percent say they avoid hiring recent graduates. The reasons named most often: graduates do not have real-world experience, they do not know how to work well on a team, they have poor business etiquette, and they lack a global mindset. Seventy-seven percent of recent graduates said they learned more in six months on the job than during their entire undergraduate experience. Ninety-six percent of HR leaders said schools need to take more responsibility for workforce training.

The QS Global Employer Survey, which collects more than 85,000 employer responses across 80 countries and 27 industries, names communication, problem-solving, and resilience as the three skills employers most consistently report graduates lack.

The pattern is not regional. It is not industry-specific. It is structural. Graduates can produce work product. They cannot judge it, defend it, or operate without being told what to do next.

What AI use is doing to students who lack the inputs

Two studies published in 2025 deserve attention from every superintendent and provost. They are not commentary. They are measurements.

The MIT Media Lab study, “Your Brain on ChatGPT,” by Kosmyna and colleagues, ran EEG on 54 participants writing essays across three conditions: ChatGPT-assisted, search engine, and no tools. Brain connectivity scaled directly with the amount of cognitive work the participant had to do. The no-tools group showed the strongest and widest neural networks. The ChatGPT group showed the weakest. The most important finding for educators came in session four. When ChatGPT users were asked to write without the tool, their brains did not bounce back to baseline engagement. They showed reduced alpha and beta connectivity. The deficit persisted. The researchers named the phenomenon “cognitive debt.”

The Microsoft and Carnegie Mellon study, published in February 2025, surveyed 319 knowledge workers about their AI use. The pattern was direct and unambiguous. The more confident the worker was in the AI, the less critical thinking they did. Workers reported a perception that they were thinking critically when in fact they were deferring. Output diversity collapsed. People with access to AI tools produced a less varied set of solutions to the same problem than people without.

AI did not cause cognitive collapse. It just made disengagement frictionless.

These findings are not an argument against AI in education. They are an argument for the operating system that determines whether AI use builds judgment or replaces it. Two students using the same tool can produce opposite outcomes. One uses AI to skip the work and retains nothing. The other uses AI to test their own thinking, proofreads what came back, pushes on what does not make sense, and ends up better educated than they would have been without it. The variable is not the tool. The variable is whether the human stayed in the loop. Curiosity, communication, and confirmation are how the human stays in the loop.

Part Two: The Education Three Cs

Three teachable inputs. One emergent output. The framework is simple by design because frameworks that survive contact with classrooms have to be.

Curiosity

Curiosity is the engine. It is the willingness to keep asking, keep digging, keep noticing, when the easy path is to accept what you have been given. It determines whether a student reaches for the next question or accepts the first answer.

The OECD’s 2024 Learning Compass classifies curiosity as a malleable trait, not a fixed one. It can be developed through Socratic dialogue, open questions, inquiry projects, and learning environments that reward exploration over correctness. Trait curiosity correlates with academic achievement and job performance independent of IQ and effort. The infrastructure for teaching it exists. It is largely absent from current curricula because curricula are built around right answers, not better questions.

Curiosity is the first input to fall when a phone enters the room. The friction that previously forced a student to think their way to an answer disappears. The brain offloads what it knows it does not have to hold. This is a feature of the brain, not a flaw. The implication for educators is that you cannot fight the offloading. You have to redesign the work so the student is engaging, not just retrieving.

Curiosity cannot be assigned. A teacher cannot make a student curious about a topic the student does not care about. The curriculum has to do something it currently does not do well. It has to surface, for each student, the topics that genuinely intrigue them, and let them work questions inside those topics with rigor. Curiosity inside a real question is the only kind that builds anything.

Communication

Communication is how thinking becomes shareable, testable, and useful to other people. It is the most consistently cited deficit in employer surveys. It is also the one of the three Cs that the school system has come closest to teaching well, if only in narrow ways.

What schools teach is the surface of communication. They teach written communication in English class. They teach the mechanics of presentation in occasional public speaking units. They sometimes teach grammar, vocabulary, and structure. None of that is wrong, and most of it is necessary.

What schools do not teach is the operating framework that runs underneath a real conversation. Not chitchat. Not small talk. The kind of conversation where a real discussion is happening: a disagreement, a decision, a clarification, a moment where two people need to land somewhere together. Most adults cannot do this well, because they were never taught how. Students arrive at university and at first jobs unable to handle the operational layer of a real exchange. They wait their turn to speak. They prepare a response while the other person is still talking. They confuse hearing with agreeing. They confuse silence with discomfort. They have no model for what a working conversation looks like, because no one ever gave them one.

These are EQ skills. They are the skills graduates name when they say, in the Hult survey, that they learned more in six months on the job than in their entire degree. They are also the skills that AI cannot do for you. AI can draft your email. It cannot have the conversation. AI can summarize a meeting. It cannot navigate the moment in the meeting where the disagreement actually surfaces.

A graduate who cannot have a standard conversation with a stranger, cannot initiate without permission, and cannot articulate disagreement is not a graduate ready for the workforce. The data says we are producing those graduates at scale, and the AI era is making the dependency worse, not better, because students now have a tool that lets them avoid the conversational work entirely.

Teaching communication well requires teaching the framework that runs underneath it. We describe one such framework in Part Three.

Confirmation

Confirmation is the discipline of questioning. It is the habit of treating any answer, including one from a teacher, a textbook, a search result, or an AI, as a hypothesis that needs verification. It is the skill that determines whether a student trusts AI output appropriately or trusts it blindly.

The Microsoft and Carnegie Mellon study quantified what happens when confirmation atrophies. Workers who were confident in AI’s capabilities perceived themselves as thinking critically while in fact deferring to the tool. Output diversity dropped. Independent reasoning dropped. The skill of asking “is this right, and how would I know” disappeared in the moment when it was needed most.

Confirmation is teachable on a fast timeline. A 2025 study with Grade 8 and 9 students found that a two-hour AI literacy workshop produced measurable gains in students’ ability to reject underspecified prompts, ask follow-up questions, and accurately judge whether an AI answer was correct. Two hours. The intervention worked. It is just not in the curriculum.

Creativity is the dividend

Creativity is not a fourth input. It is what emerges when the first three are healthy. A student who is curious, can communicate, and confirms what they hear has the operating system needed to recombine ideas in novel ways. A student missing any one of the three cannot be creative in any sustained sense, regardless of innate talent.

Some students have higher creative ceilings than others. That is a fact, and pretending otherwise produces curricula that frustrate everyone. The job of the school system is not to manufacture creativity in students who do not have the raw material. The job is to ensure that whatever creative ceiling exists in each student is exposed and incubated, rather than crushed by a system that punishes divergent answers.

The Torrance Tests of Creative Thinking, the most rigorous longitudinal measure of creativity in students, show that creativity scores in the United States have declined since 1990 even as IQ has risen. The decline was sharpest in the youngest children. The researchers attribute the trend to the rise of standardized testing, the increase in screen time, and the loss of unstructured play. The dividend has been collapsing for thirty-five years. The three Cs are how the dividend gets restored.

Part Three: The operating plan. HELP.

The three Cs describe what students need. They do not describe how to teach it. The shift from describing skills to actually building them requires an operating plan. Without a plan, curiosity becomes a slogan. Communication becomes a unit on public speaking. Confirmation becomes a poster about critical thinking. The intent is right. The mechanism is missing.

The HELP Operating System was developed for use in leadership coaching and in clinical practice. It works for the same reason any operating plan works: it puts bookends on a process that otherwise runs by accident. HELP stands for Hear, Evidence, Learn, Proceed. Four phases. The phases are sequential. The discipline is in not skipping one.

We bring HELP into this paper because the three Cs map onto it directly. A school or university that wants to teach curiosity, communication, and confirmation has, in HELP, a structured way to make all three observable inside a single conversation, project, or assessment. HELP is also developmentally appropriate. Children can be taught the four phases in age-appropriate language by the end of elementary school. The same four phases scale into university seminars, residency programs, and boardrooms.

Hear

Listen fully. No interruptions. No reframing while the other person is speaking. No solving in your head before they finish. The opening discipline of any real conversation is the willingness to take in what the other person actually said, in their own words, without immediately translating it into your own.

When somebody feels truly heard, their thinking sharpens on its own. This is one of the best-documented findings in clinical psychology, and listening of this kind is one of the least practiced skills in classrooms. Students are taught to wait for their turn. They are not taught to listen. The difference matters. A student who has been heard once, properly, by a teacher who reflected back what was said without changing it, learns more about communication in that moment than in a year of presentation workshops.

Hear is the bookend that opens real communication. Without it, what follows is two people taking turns talking past each other.

Evidence

Separate fact from interpretation. This is confirmation made operational. The discipline is to take what has been said or read and split it into two columns. In the first: what is observable, verifiable, and time-stamped. In the second: what is opinion, inference, story, or assumption.

This is the phase that AI use erodes most quickly. The Microsoft and Carnegie Mellon study found that confident AI users perceive themselves as thinking critically while in fact deferring to the tool. They lose the habit of asking “is this fact, or interpretation? What evidence backs it up? What evidence is missing?” Evidence as a phase forces those questions back into the foreground.

Inside a classroom, Evidence is taught by asking students to mark up a source, an article, or an AI-generated answer with two highlighters. One color for facts. Another for interpretations. The exercise sounds simple. It is not. Most adults cannot do it well. Students who learn to do it by Grade 7 or 8 carry the discipline into university and into work, where it becomes the foundation of how they evaluate information.

Learn

Pattern recognition. This is curiosity made operational. Once the evidence is on the table, the question becomes: when have I seen something like this before? Where? Who else has seen this pattern? What was similar? What was different?

These pattern questions move the brain from “solve this for me” to “I have likely seen this before, somewhere.” The shift is from external search to internal synthesis. It is the moment where schema gets activated, recombined, and extended. It is also the moment where creativity often shows up unannounced. A student working through a Learn pass on a science problem will sometimes connect it to something from a different class, a personal experience, or an idea they read about months ago. That connection is creativity. It cannot be assigned. It can only be incubated, and the Learn phase is where the incubation happens.

The Learn phase is also where curiosity gets sharpened. Curiosity that has nothing to attach to dissipates. Curiosity that lands on a recognizable pattern from a student’s own experience deepens. Teachers who run a Learn phase well do not give students the answers. They ask the pattern questions and then wait. The waiting is not wasted time. The waiting is where the cognitive work happens.

Proceed

End with ownership, not instruction. The Proceed phase asks: what are you leaning toward? What are you going to do? Why?

In the workplace, this is the phase where ownership compresses decision time and execution speeds up. In the classroom, it is the phase where a student commits to their own answer, defends it, and accepts the consequences of being right or wrong. Proceed is where the conversation comes back around to communication, because the student now has to articulate what they have decided, and why, in language someone else can engage with.

Proceed is also where confidence is built. A student who has been heard, who has examined the evidence, who has made the pattern connections, and who then commits to a position has done something most adults cannot do reliably. The four phases working together are how that capability gets developed.

Why this works for children, not just adults

HELP was originally designed for leadership coaching, clinical practice, and high-stakes professional conversations. We are arguing in this paper that it should be taught to children, starting young, in age-appropriate forms. The reasons are clinical and practical.

Clinically, the four phases align with the developmental sequence of how children actually learn to think and communicate. Listening before speaking. Distinguishing what they observed from what they assumed. Connecting new information to what they already know. Committing to an answer and being able to explain it. Every one of those is already a developmental milestone. HELP names the milestones in a way that lets teachers reinforce them deliberately, instead of hoping they emerge.

Practically, the framework gives a school district something teachable, observable, and assessable. A Grade 5 teacher can ask students to run a HELP pass on a story they read. A Grade 9 science teacher can ask students to run HELP on a lab result that did not match the prediction. A high school English teacher can ask students to run HELP on an AI-generated essay before deciding whether to use any of it. The framework is the same in every case. The complexity scales with the student.

The clinical foundation matters here because the Hear phase, in particular, draws directly from how trained psychotherapists are taught to listen. Reflective listening, separating content from emotion, holding space without immediately problem-solving. These are skills that have been studied for decades in clinical training and that have been demonstrated to be teachable to non-clinicians, including children, with the right structure. We are not inventing the underlying mechanism. We are bringing it into the place it has always been needed and never been used.

How HELP makes the three Cs teachable

The mapping is direct. Hear builds the listening capacity that real communication requires. Evidence builds the discipline of confirmation. Learn builds the pattern recognition that curiosity needs to attach to. Proceed brings the loop back to communication and to ownership. Across the four phases, creativity often surfaces in Learn and gets expressed in Proceed.

The three Cs without HELP are inputs without an operating plan. The HELP framework without the three Cs is structure without purpose. Together, they describe both what students need to develop and how to develop it. That is what makes the combination teachable at scale, by teachers who do not need a clinical degree or twenty years of leadership experience to run it. They need a framework, training, and the institutional permission to use it.

Part Four: The memory objection, honestly addressed

The strongest critique of any framework that emphasizes inputs over content goes by various names. The Memory Paradox is one. The Hollowed Mind is another. The argument, summarized fairly, is this. Cognitive offloading to AI and digital tools weakens the brain’s declarative and procedural memory systems. The recent reversal of the Flynn Effect, the long-running rise in IQ scores in developed countries, has correlated with the rise of cognitive offloading. A student with no internal knowledge cannot evaluate AI output, cannot ask a real question, and cannot be meaningfully creative. Skip the foundation and the three Cs collapse.

The argument has real evidence behind it. It also conflates two things that should be separated.

Memorization is not the same as recall

Memorization, in the sense schools have historically used the term, is the deliberate pre-loading of information you are not currently using, for later reproduction on a test. The brain treats this exactly the way it treats any other low-priority signal. It offloads it as soon as the test is over. This is not a failure of pedagogy. It is how the brain is designed to work. The brain comes off the treadmill the moment it knows it does not have to hold something.

Recall is different. Recall is what happens when a student has actively engaged with material, questioned it, used it, applied it, argued about it, or built something with it. Recall does not require memorization in the schoolroom sense. It requires engagement. The brain does not treat material it has actively wrestled with as disposable.

This distinction matters because it explains a pattern educators are starting to see in AI-assisted student work. Two students with identical access to AI produce opposite outcomes. The first student uses AI to skip the work, never reads what comes back, and learns nothing. The second student uses AI to draft, then reads carefully, edits, pushes back on the parts that do not match what they know, and ends up with a working internal model of the topic. The second student often learns more than they would have without the tool, because they engaged with more material at higher density.

The variable is not the tool. The variable is engagement. The three Cs are how engagement happens.

Schema is built. Memorization is loaded.

What the Memory Paradox researchers correctly identify as essential is internal schema. Schema is the connected web of concepts, vocabulary, and mental models that lets a student make sense of new information and ask real questions inside a domain. A student with no schema for cellular biology cannot be curious about cellular biology in any productive way. They do not have the surface area to be curious against.

Schema gets built through active engagement, not through memorization for reproduction. A student who memorizes the Krebs cycle for a test and forgets it the next week has not built schema. A student who learns the Krebs cycle because they were curious about how their own muscles produce energy during exercise, and confirmed what they read against three sources, and explained it to someone else, has built schema. Same content. Different result. The three Cs are the difference.

So the honest position on memorization is this. Memorization in the sense of “store and reproduce facts you do not engage with” is dead. AI killed it. There is no sustainable case for asking students to do something a phone can do better. But the underlying need that memorization was supposed to serve, building usable internal knowledge, is more important than ever. The three Cs are how that internal knowledge actually gets built in the AI era. They are not anti-knowledge. They are the conditions under which knowledge sticks.

What this means for the Flynn Effect reversal

The Flynn Effect reversal is real. IQ scores in developed countries have stalled or declined since the 1990s. The Memory Paradox researchers attribute this in part to cognitive offloading. That attribution is plausible but not exclusive. The Torrance creativity decline began at roughly the same time and pre-dates widespread internet use, let alone AI. Standardized testing, screen time, and the loss of unstructured play are the more likely structural causes. AI is accelerating a trend that was already in motion.

The implication for policy is the same either way. The system has been failing to build the inputs that produce real cognition for thirty-five years. AI did not cause the problem. AI exposed it. Fixing the problem requires the three Cs, not a return to memorization.

Part Five: Outcome-based assessment

Every serious profession that handles real consequences has long since moved past pure standardized testing. Medicine has residency. Law has articling and bar exams that test applied judgment, not just recall. Accounting has the CPA designation built around supervised work. Engineering has professional designations that require years of mentored practice before a license is granted. Plumbing, electrical, and the other trades use apprenticeship because the work is too consequential to certify on a written test alone. The general education system is the outlier. It still certifies competence primarily through standardized testing.

The argument of this paper is that the general education system should adopt the assessment model the high-stakes professions already use. Outcome-based. Apprenticeship-style. Demonstrated competence on real deliverables, evaluated by people who can tell whether the work is real.

What this looks like in practice

The shift is not from grading to no grading. It is from grading reproduction to grading capability. A student demonstrates they can do something. A qualified evaluator confirms it. The credential reflects what the student can actually do, not what they can replicate on a written test.

In K to 12, this means project-based assessment as the primary mode, with traditional testing relegated to checking foundational fluency in the specific cases where fluency matters. Students should defend their work to a panel that can ask follow-up questions. The three Cs become observable. A student who has been curious shows it in the depth of their questioning. A student who can communicate shows it in the defense. A student who has practiced confirmation shows it in the way they handle pushback.

In post-secondary, this means moving toward credentials that more closely resemble professional designations. Less emphasis on credit hours and seat time. More emphasis on supervised demonstration of capability. The 96 percent of HR leaders in the Hult survey who say schools need to take more responsibility for workforce training are not asking universities to add another internship requirement. They are asking universities to redesign the credential so that finishing it means something.

The role of the curriculum

Outcome-based assessment only works if the curriculum gives students enough latitude to find topics that genuinely intrigue them. Curiosity cannot be assigned. A student forced to demonstrate capability inside a topic they do not care about will produce competent-looking work that builds no schema and no judgment.

The curriculum design challenge is to define a wide enough catalog of acceptable demonstrations that every student can find an honest entry point, while keeping the rigor high enough that the credential means something. This is harder than the current model. It is also closer to how the high-stakes professions have always worked. A medical resident does not pick their patients, but they do pick their specialty, and their depth comes from that choice.

Part Six: Recommendations

For K to 12 superintendents and principals

First, integrate AI literacy as a confirmation discipline, not as a forbidden tool or a productivity tool. The Grade 8 and 9 study cited earlier showed that two hours of structured instruction on questioning AI output produced measurable gains. The intervention is cheap. The return is large. Every middle and high school student should receive structured AI literacy instruction by the end of Grade 9, and the instruction should focus on confirmation: how to spot a wrong answer, how to verify against sources, how to detect homogenized output, how to recognize when the AI is making something up.

Second, adopt the HELP Operating System as the explicit framework for classroom discussion, project work, and student-teacher conferences. The four phases are simple enough to teach to a Grade 5 student and rigorous enough to use through senior year of high school. Train teachers on the framework. Post the four phases in classrooms. Use the language. When a class debates a topic, run a Hear pass before anyone argues. When a science experiment produces an unexpected result, run an Evidence pass before anyone speculates. When a student is stuck on a problem, run a Learn pass instead of giving them the answer. The framework provides the operating plan that the three Cs by themselves cannot.

Third, redesign assessments around demonstrated capability where possible. Project-based assessment should be the primary mode in subjects where it can be implemented, with panel defenses where students field follow-up questions. The defense is where curiosity, communication, and confirmation become observable. A student who has done the work shows it. A student who has not, cannot fake it under questioning.

Fourth, build curriculum latitude into the existing standards. Most state and provincial standards allow more flexibility than schools currently use. Where students can choose the topic of an inquiry project, the project that comes back is qualitatively different from one assigned by a teacher. The same competencies get covered. The student does the work because they care about the work.

Fifth, train teachers in Socratic methods explicitly. The OECD evidence shows curiosity is teachable through specific instructional moves. Most teachers were never trained in those moves. Professional development that focuses on asking better questions, holding space for student inquiry, and rewarding good questions over right answers should be a multi-year priority.

Sixth, name what the system is teaching. If the answer is “compliance and reproduction,” name that and decide whether it is what you want. If the answer is “curiosity, communication, and confirmation,” build the structures that actually produce them. Stating the goal out loud is a precondition for measuring whether you are hitting it.

For university presidents and provosts

First, accept that the AI question is not about plagiarism. Most universities are still treating AI use as an academic integrity problem to be policed. The framing is wrong. The question is whether your graduates can use AI well, which means using it with curiosity, communication, and confirmation, or whether they can only use it to skip work. The first kind of graduate gets hired. The second kind does not. Your accreditation and reputation depend on producing the first kind.

Second, redesign at least one core requirement around supervised, outcome-based demonstration of capability. The model should resemble a residency more than a course. Pick a domain where the cost of getting it wrong is high enough to justify the investment. Business communication, technical writing, applied research methods, and entry-level professional skills are obvious starting points. Make the credential mean something. Make completion contingent on demonstrated competence, not seat time.

Third, adopt the HELP Operating System as the framework for seminar discussion, capstone defenses, and graduate-level coursework. Universities already use frameworks for clinical reasoning in medical school, case method in business school, and Socratic dialogue in law school. HELP is a generalizable operating plan that maps onto any of those domains and is teachable inside a single course. Faculty who run their seminars through HELP find that student preparation, participation, and intellectual courage all improve, because the four phases give students a clear contract for how the conversation will run.

Fourth, treat the first-year experience as a curiosity intervention. The students arriving on campus have spent thirteen years in a system that punished divergence. They need an explicit, structured re-introduction to the discipline of asking real questions. This is not orientation. It is a course. It is taught by faculty who are good at it, not by junior staff.

Fifth, build feedback loops with employers that go past the standard advisory board. The Hult survey’s headline finding is that employers no longer trust the credential. Universities can either fix the credential or watch it lose value. Fixing it requires regular, structured contact with the people doing the hiring, and it requires acting on what they say.

Sixth, stop hiding behind the research mission. Most students do not enroll for the research mission. They enroll because they expect the credential to translate into a working career. If the credential is failing on that promise, citing the research mission is not a defense. It is an evasion.

For both audiences

The single most important shift is the assessment model. Until the system grades the right things, the curriculum will keep optimizing for the wrong ones. Outcome-based assessment, modeled on apprenticeship, is the change that makes everything else possible. It is also the change that institutions resist hardest, because it is operationally harder than running standardized tests.

Hard does not mean impossible. The medical, legal, accounting, and engineering professions have done it. The trades have done it. The general education system can do it. The question is whether the institutions doing the work will choose to lead the change or wait until enrollment, hiring outcomes, and accreditation pressure force it on them.

The cost of not acting

A student finishing high school today has spent thirteen years in a system that rewarded reproducing other people’s answers. Many of those students have spent the last two of those years using AI to do the reproduction for them. They are now arriving at universities and, shortly after that, in workplaces.

The pattern that hiring managers describe is consistent. The graduate produces work that looks competent. They cannot defend it. They cannot start a conversation that has not been scripted. They cannot push back on a flawed idea because they were never trained to confirm. They wait to be told what to do next, and when no one tells them, they do nothing. This is not a personality problem. It is a skills problem, and it was produced by a system that did not teach the skills that matter.

The cost of continuing the current model is graduates who cannot function without external instruction in an environment where instructions are increasingly absent. The cost of changing the model is operational difficulty for institutions that have spent decades optimizing the current one. One cost falls on graduates and their employers. The other cost falls on institutions. Institutions are paid to absorb operational difficulty so that graduates do not have to absorb it later.

Curiosity. Communication. Confirmation. The three Cs are not new ideas. They are the inputs the system was always supposed to build, and stopped building somewhere along the way. AI did not create the problem. It made the problem impossible to ignore.

This is the work.

About the authors

Chris Schafer is co-founder of OnDemand Leaders and co-author of the HELP Operating System. He spent 25 years in SaaS go-to-market leadership, including the journey of NetSuite from $30 million to over $1 billion in revenue through its IPO and the Oracle acquisition, and most recently served as President at ImportGenius. OnDemand Leaders provides interim CRO, President, and CEO-level leadership to growth-stage companies.

He is a public speaker who works with large classrooms and company offsites on how to implement HELP (Hear, Evidence, Learn, Proceed) and how to design solutions around strong outcomes, with a particular focus on enablement and learning systems.

Elisha Schafer, RP, is co-founder of OnDemand Leaders, co-author of the HELP Operating System, and Founder and Clinical Director of Lotus Counselling Services in Waterdown, Ontario. She is a Registered Psychotherapist with more than a decade of clinical experience treating anxiety disorders, OCD, PTSD, and addictions, and is an Assistant Clinical Professor (Adjunct) in the Department of Psychiatry and Behavioural Neurosciences at McMaster University, where she supervises clinical trainees. The HELP framework draws directly on the evidence-based clinical principles she has used in her practice and teaching.

OnDemand Leaders publishes research and frameworks at ondemandleaders.com.

Questions parents and educators ask

The Education Three Cs. The questions underneath it.

Why does the current education system fall short in the age of AI?

It rewards what AI now does for free and under-rewards what AI cannot do at all. The fix is not more rigor inside the old categories. It is a new set of categories. Schools first banned AI, then half-adopted it without understanding it, which means they failed twice and distorted years of learning that should have prepared students for the world the tool created.

Why can't schools teach students to use AI well?

Because the institutions themselves do not know how to use it. They have not automated their own work with it, have not streamlined a single core process with it, and do not use it effectively inside their own walls. You cannot teach a discipline you have never practiced. That is the gap the paper names.

What does it mean to be outcome-focused about education?

It means asking what a student needs on the day they walk out the door and working backward from there. Follow the outcome, not the rule. Two questions matter. Is AI going anywhere, or is it only going to get more useful? And will you need to know how to use it when you graduate more than you will need to have avoided it?

What do successful people have that the school system never measured?

Across twenty-five years in the room, it was never the credential and rarely raw recall. It was a set of human capabilities: staying curious when the easy answer was there, having the hard conversation, and questioning what they were handed. Most ran these plays without naming them. The work is turning that instinct into a system anyone can learn.
