Bucket Foundation · education-atlas working paper · DOI: pending · CC-BY-4.0

The Positive Gap: What Education Systems Don't Teach — Learning-to-Learn, Philosophy, and the Whole Person

Bucket Foundation deep brief 03 — the missing content, not the broken machine

A literature-grounded companion to the education-atlas. Brief 01 mapped US education and innovation; brief 02 (`docs/deep/02-world-tiers-and-the-industrial-model.md`) diagnosed **why** rich, middle, and poor systems fail differently and traced all three to one shared root — the industrial, time-based, sorting-first model. This brief asks the inverse question. Not "what is the machine doing wrong" but "**what is the machine not doing at all**." The atlas (§5) flagged this directly: "indicators miss the problems that may matter most … curriculum relevance, pedagogy, the purpose of education itself." This is the positive gap — the high-leverage skills and content that the system structurally omits, ranked by the strength of the evidence that they matter and the evidence that they are absent.

Every empirical claim below is sourced inline. Where the evidence is contested (the transfer of critical-thinking skills) or where a popular idea is **wrong** (learning styles), this brief says so plainly. Bucket reforms education on the record, not on conviction — and that includes being honest when the record undercuts a reform talking point.

0. Framing: three things schools skip, and one reason they skip all three

Brief 02 ended on a target: replace time-based, sorting-first, standardized, compliance-driven, credential-terminal schooling with mastery-based, learning-first, individualized, agency-driven, competence-verified learning. That is the operating system. This brief is about the payload — the actual capacities a reformed system would develop that the current one does not. Three stand out, in descending order of evidentiary leverage:

Learning to learn (metacognition + self-regulation). The single highest-leverage missing skill. Schools teach content; they almost never teach the skill of acquiring content. The cognitive science on how people learn is unusually settled, and almost none of it is taught to the learners themselves.

Philosophy, logic, epistemology, argument. The disciplines of how to think and how to know are largely absent from K–12 worldwide, even as every mission statement promises "critical thinking." The evidence that structured philosophical inquiry helps is real; the evidence that generic "critical thinking" transfers is genuinely contested. Both facts belong in the case.

The whole person: agency, curiosity, intrinsic motivation, creativity, play. Humans arrive curious and self-directed; the system measurably extinguishes both over time. The motivational and developmental dimension is not a soft add-on; it is the substrate every other learning outcome rides on.

And one structural reason all three are absent (§4): they are hard to standardize and hard to test, so the measurement model crowds them out. Goodhart's and Campbell's laws explain why a system that runs on test scores systematically de-prioritizes exactly the capacities that don't reduce to a test score — which, in the knowledge and AI age, are the capacities that matter most.

1. Learning to learn: the highest-leverage missing skill

1.1 The science of learning is settled; the learner is the last to know

If there is one place where education has a usable, replicated, decades-deep evidence base, it is the cognitive psychology of how learning works. The landmark synthesis is Dunlosky, Rawson, Marsh, Nathan, and Willingham's 2013 monograph in Psychological Science in the Public Interest, which graded ten common study techniques on the breadth and strength of their evidence (Dunlosky et al., 2013, *PSPI*; accessible summary in *American Educator*, "Strengthening the Student Toolbox"). Their verdict:

High utility: practice testing (retrieval practice) and distributed practice (spacing). Both work across ages, abilities, materials, and subjects.
Moderate utility: interleaved practice, elaborative interrogation, self-explanation.
Low utility: rereading, highlighting/underlining, summarization, keyword mnemonics, imagery for text.

The cruelty of the finding is the inversion: the two techniques that work best are among the least used, and the techniques students rely on most are the weakest. In the data Dunlosky cites, 84% of students study by rereading, and highlighting is described as "universal" — yet both sit in the low-utility tier (*American Educator*). The "testing effect" / retrieval-practice literature is one of the most robust results in all of psychology — actively recalling material strengthens memory far more than re-exposing yourself to it — and it now replicates in real primary-school classrooms, not just labs (NIH/PMC, retrieval practice in primary schools, 2024).

The point for reform is stark. This is not contested frontier science. It is settled, teachable, and high-leverage — and the people who would benefit most from knowing it, the students, are almost never taught it.
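
As a rough illustration of how the two high-utility techniques combine in practice, the sketch below implements a Leitner-style flashcard scheduler: every review is a retrieval attempt, and the gap before the next review grows with each success. This is not a method taken from Dunlosky et al. The `Card` class, the `review` and `due_cards` helpers, and the interval table are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass

# Hypothetical review gaps (in days) per box. Illustrative values only,
# not drawn from any of the studies cited in this brief.
INTERVALS_DAYS = {1: 1, 2: 3, 3: 7, 4: 14, 5: 30}


@dataclass
class Card:
    prompt: str
    answer: str
    box: int = 1       # box 1 holds the least well-known cards
    due_day: int = 0   # day number on which the card is next due


def review(card: Card, recalled: bool, today: int) -> Card:
    """Record one retrieval attempt and reschedule the card.

    A successful recall moves the card up one box (longer gap);
    a failed recall sends it back to box 1 (shortest gap).
    """
    if recalled:
        card.box = min(card.box + 1, max(INTERVALS_DAYS))
    else:
        card.box = 1
    card.due_day = today + INTERVALS_DAYS[card.box]
    return card


def due_cards(cards: list[Card], today: int) -> list[Card]:
    """Return the cards whose scheduled review day has arrived."""
    return [c for c in cards if c.due_day <= today]


if __name__ == "__main__":
    deck = [Card("Two high-utility techniques?", "practice testing, spacing")]
    for day in range(0, 12):
        for card in due_cards(deck, day):
            review(card, recalled=True, today=day)
            print(f"day {day}: box {card.box}, next due day {card.due_day}")
```

The design choice worth noticing is that the learner never rereads: each card is seen only as a prompt to recall from, and the schedule, not the feeling of familiarity, decides when it comes back.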

1.2 Metacognition: the construct, and why it's the leverage point

The underlying capacity is metacognition — "cognition about cognition," or thinking about one's own thinking. The foundational definition is John Flavell's: metacognition is the "active monitoring and consequent regulation and orchestration" of cognitive processes in service of a goal (Flavell, 1979, *American Psychologist* 34(10):906–911). Flavell split it into metacognitive knowledge (what you know about how you learn — declarative, procedural, and conditional) and metacognitive monitoring/regulation (noticing whether you actually understand, and adjusting). Barry Zimmerman extended this into the practical engine of self-regulated learning (SRL): a cyclical model with a forethought phase (goal-setting, planning, motivation), a performance phase (strategy use, self-monitoring), and a self-reflection phase (self-evaluation, attribution), feeding back into the next cycle (Zimmerman, *Development of Self-Regulated Learning*, ERIC ED518491; review of Zimmerman's cyclical model, ResearchGate; broader survey: Panadero, "A Review of Self-Regulated Learning: Six Models," *Front. Psychol.* 2017).

Why is this the leverage point? Because it is the meta-skill that makes all other learning compound. A learner who can set a goal, choose an effective strategy, monitor whether it's working, and correct course is, in effect, their own teacher — and can keep learning long after any teacher, course, or institution is gone. In a world where the half-life of specific knowledge is shrinking and the volume of accessible knowledge is exploding, the capacity to direct one's own learning is the highest-return skill there is. Reviews repeatedly find metacognitive and self-regulatory instruction among the highest-impact, lowest-cost interventions available (Stanton et al., "Fostering Metacognition to Support Student Learning," *CBE—Life Sci. Educ.* 2021).

1.3 The damning evidence: students systematically misjudge their own learning

Here the literature does something rare: it shows not just that students lack the skill, but that their intuitions about learning are actively wrong, in a measurable, repeatable way. Kornell and Bjork's work is the cleanest demonstration. In one study, 90% of students learned better after spaced practice than massed practice, yet 72% of them believed massing was the more effective method. They were getting better results from the effective strategy and still crediting the ineffective one (Kornell & Bjork, 2008, *Memory*; overview in Bjork, Dunlosky & Kornell, "Self-Regulated Learning: Beliefs, Techniques, and Illusions").

The mechanism is fluency illusions: rereading and highlighting produce a feeling of familiarity that students misread as a feeling of mastery. The strategies that feel productive (smooth, easy, fluent) are weak; the strategies that work (retrieval, spacing, interleaving — Bjork's "desirable difficulties") feel harder and slower, so students abandon them (Bjork on desirable difficulties, summary). A 2024 classroom study put it in the title: students don't learn the way they think they do (Bjork-tradition replication, *CBE—Life Sci. Educ.* 2024).

The corollary is the indictment: many students have never been taught how to study at all. Their strategies are folk-theories, picked up by imitation, and frequently wrong; explicit instruction in metacognitive strategy helps, but it is "not universally provided" (Stanton et al., *CBE—Life Sci. Educ.* 2021; teacher instruction and children's metacognition, NIH/PMC). The system devotes twelve-plus years to what to learn and almost none to how.

1.4 The honest caveat: learning styles is a debunked myth — say it

A reform brief that championed "learning to learn" while smuggling in pop-neuroscience would deserve to be ignored. So, plainly: the "learning styles" idea — that each student has a visual/auditory/kinesthetic style and learns best when instruction is matched to it (the "meshing hypothesis") — is not supported by evidence. Pashler, McDaniel, Rohrer, and Bjork's 2008 review found "virtually no evidence" for the crossover interaction that the theory requires, and concluded there is "no adequate evidence base to justify incorporating learning-styles assessments into general educational practice" (Pashler et al., 2008, *PSPI*). The consensus has held since (APA, "Toward a Deeper Understanding of the Learning Style Myth"), even as a 2024 meta-analysis revisited the question and still failed to vindicate matching (Frontiers, "Is it really a neuromyth?" 2024).

This matters for the argument in two ways. First, it is a precise example of the gap: the false idea (learning styles) is the one most teachers have heard of, while the true, high-leverage ideas (retrieval, spacing, metacognition) are the ones most learners have never been taught. Second, it models the standard Bucket holds itself to — when the evidence kills a popular idea, name it.

1.5 Transfer: the open hard problem, stated honestly

Learning-to-learn is high-leverage to the extent that it transfers, that is, to the extent that a strategy learned in one place is deployed in another. Transfer is real but not automatic; learners frequently fail to apply a strategy outside the context in which they acquired it. This is a genuine open problem in the science, not a solved one, and it recurs sharply in the next section on critical thinking (§2.3). The practical implication is not "give up on learning-to-learn" but "teach it with deliberate attention to transfer": across many domains, with explicit reflection on when and why a strategy applies (Flavell's conditional knowledge), and not as a one-off study-skills unit.

2. Philosophy, logic, and the disciplines of thinking

2.1 The near-absence

Most school systems contain almost no formal instruction in philosophy, logic, epistemology, or the structure of argument. Children can pass through thirteen years of schooling and never be taught what a valid inference is, how to identify an unstated premise, what distinguishes knowledge from belief, or how to steelman a position they disagree with. Where philosophy appears at all, it is typically an elective at the tertiary level for a self-selected few — not a foundational literacy taught to everyone, the way arithmetic is. The atlas (§5) flagged "the purpose of education itself" as unmeasured; philosophy is, among other things, the discipline that asks that question, and it is structurally absent from the curriculum that would benefit from it.

2.2 The evidence that structured philosophical inquiry helps: P4C

The strongest counter-case is Philosophy for Children (P4C) — Matthew Lipman's program of turning the classroom into a "community of inquiry" where children reason together about open questions. The evidence here is unusually encouraging. Trickey and Topping's systematic review found a mean effect size of about d = 0.43 with low variance across a wide range of outcomes — a consistent, moderate positive effect (Trickey & Topping, 2004, *Research Papers in Education*). More recent meta-analyses are stronger: a three-level meta-analysis of 53 effect sizes from 33 studies (4,568 participants) found an overall effect of g ≈ 0.59, with robust effects specifically for reasoning, critical thinking, and creativity (three-level meta-analysis, NIH/PMC 2025); a separate meta-analysis reported an overall effect around 0.65, placing P4C well above the ~0.40 effect of a typical educational intervention (ERIC EJ1465596, P4C meta-analysis).

The most cited applied result: in a UK trial across 48 primary schools, pupils doing P4C for a year showed higher reading and mathematics scores than controls — and disadvantaged pupils benefited most (Trickey & Topping; EEF-cited P4C results, summarized via p4c.com). Trickey and Topping also reported social-emotional gains: increased self-esteem, reduced anxiety and dependency, greater confidence. So the discipline of structured reasoning, taught explicitly and dialogically, has measurable academic and developmental payoff — and it is almost entirely outside the standard curriculum.
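
To give the P4C effect sizes reported above an intuitive scale, a standardized mean difference can be converted into the percentile of the control group that the average treated pupil would reach (Cohen's U3). The sketch below assumes normally distributed scores with equal variance in both groups, an idealization the underlying studies need not satisfy. The function name is hypothetical, and the labels simply restate the figures quoted in this section.

```python
from statistics import NormalDist


def control_percentile_of_average_treated(effect_size: float) -> float:
    """Cohen's U3 as a percentage.

    Share of the control group scoring below the average treated pupil,
    assuming both groups are normal with equal variance (an assumption).
    """
    return NormalDist().cdf(effect_size) * 100


# Effect sizes as quoted in the text above; no new data.
QUOTED_EFFECTS = {
    "typical educational intervention (~0.40)": 0.40,
    "P4C, Trickey & Topping 2004 (d ~ 0.43)": 0.43,
    "P4C, three-level meta-analysis (g ~ 0.59)": 0.59,
    "P4C, separate meta-analysis (~0.65)": 0.65,
}

if __name__ == "__main__":
    for label, es in QUOTED_EFFECTS.items():
        pct = control_percentile_of_average_treated(es)
        print(f"{label}: average treated pupil at about percentile {pct:.0f}")
```

Under those assumptions, an effect of 0.43 puts the average P4C pupil at roughly the 67th percentile of the control group, and 0.65 at roughly the 74th. This is a reading aid for the magnitudes, not a result reported by the cited studies.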

2.3 The honest caveat: "critical thinking" transfer is genuinely contested

Here the reform talking point meets its sharpest empirical resistance, and integrity demands stating it. Schools overwhelmingly claim to teach critical thinking; the evidence that generic, content-free critical-thinking instruction produces skills that transfer to new domains is weak and disputed.

The central skeptic is cognitive scientist Daniel Willingham, whose position is that critical thinking is not a free-floating general skill but is substantially domain-specific — bound up with deep knowledge of the particular subject. His summary: "an expert does not think as well outside her area of expertise, even in a closely related domain"; you cannot reason well about a topic you don't know much about (Willingham, "Critical Thinking: Why Is It So Hard to Teach?"; Willingham, "How Can Educators Teach Critical Thinking?" *American Educator* 2020). On this view, the way to produce good critical thinkers is, paradoxically, to teach a lot of content — rich domain knowledge is the precondition for domain reasoning (Willingham, Knowledge Matters brief).

The meta-analytic picture is more optimistic but modest: Abrami et al. (2015) synthesized 341 effect sizes from controlled studies using standardized critical-thinking measures and found a weighted mean effect of about g = 0.30 — real, positive, achievable, but moderate — with the largest gains when critical thinking is taught explicitly and immersed in subject-matter content rather than as an isolated generic course (Abrami et al., 2015, *Review of Educational Research*, ERIC EJ1061695).

The honest synthesis: the disciplines of thinking can be taught and do help, but not as a content-free inoculation. The evidence favors teaching reasoning, logic, and argument embedded in rich domains, with deliberate attention to transfer, and with structured dialogue (the P4C model) as a delivery mechanism. The reform claim is therefore not "add a critical-thinking class and skills will generalize" (the evidence is against that naive version), but "make reasoning, epistemology, and argument explicit and pervasive across a knowledge-rich curriculum." That is a defensible claim. The naive one is not, and Bucket should not make it.

3. The whole person: agency, curiosity, intrinsic motivation, creativity, play

3.1 The motivational substrate everything else rides on

The first two gaps are cognitive. This one is motivational and developmental — and it may be the most upstream of all, because no metacognitive strategy and no reasoning skill matters in a learner who has stopped wanting to learn. The dominant framework is Deci and Ryan's Self-Determination Theory (SDT), which holds that humans have three innate psychological needs — autonomy (volition, choice), competence (feeling effective), and relatedness (connection) — and that intrinsic motivation flourishes when all three are supported and collapses when any is thwarted (Ryan & Deci, 2000, "Self-Determination Theory and the Facilitation of Intrinsic Motivation," *American Psychologist*; Ryan & Deci, 2020 update, *Contemp. Educ. Psychol.*).

The damaging finding for conventional schooling: the very levers schools lean on hardest — grades, deadlines, directives, surveillance, competition, and tangible rewards — are precisely the controllers that SDT research shows undermine intrinsic motivation, because learners experience them as external control of their behavior (Deci, Koestner & Ryan, "Extrinsic Rewards and Intrinsic Motivation in Education," meta-analysis). Conversely, autonomy-supportive teaching catalyzes greater curiosity, intrinsic motivation, and appetite for challenge. The standard apparatus of the industrial model (brief 02) is, in SDT terms, a machine for converting intrinsic motivation into extrinsic compliance.

3.2 The measurable extinction of curiosity and motivation

This is not theoretical. Intrinsic motivation and curiosity decline measurably and steadily across the school years. Lepper, Corpus, and Iyengar's classic study documented intrinsic motivation falling monotonically from 3rd through 8th grade — a steep, grade-linked decline (Lepper, Corpus & Iyengar, 2005, *J. Educ. Psychol.*). School-related curiosity shows the same trajectory, and the leading proposed cause is educational practice itself — the rise of grades, correctness pressure, and public evaluation, which tend to intensify as standardized testing is introduced in the early grades (curiosity promotion/suppression in classrooms, *ScienceDirect*).

The Gallup Student Poll quantifies the same arc at national scale: about 74% of fifth-graders are "engaged" in school, falling to roughly one-third by 10th–12th grade, the "school cliff" (Gallup, "The School Cliff"; Gallup engagement by grade, *EdWeek*). A system that begins with curious, engaged children and ends with disengaged ones is doing something to them. The charitable reading is that disengagement is an unintended side effect of the measurement model rather than anything the system intends; either way, the extinction of intrinsic motivation is a produced outcome, not a fact of adolescence.

3.3 The whole-person and self-directed alternatives: what the evidence supports

The constructive question is whether systems built around agency, intrinsic motivation, and self-direction actually work. The strongest recent evidence is Montessori, now tested with the rigor its critics long demanded. Lottery studies (random assignment via oversubscribed admission lotteries) found Montessori children outperforming controls on reading, math, executive function, social understanding, and other measures (Lillard & Else-Quest, 2006; Lillard et al., 2017; longitudinal Montessori study, *Front. Psychol.* 2017). The 2025 capstone is a national randomized controlled trial of public Montessori preschool, reporting significant end-of-kindergarten advantages — the strongest causal evidence to date that an agency- and autonomy-centered model produces real cognitive gains, not just happier children (national RCT of public Montessori preschool, *PNAS* 2025).

Adjacent models converge. Project-based learning shows medium-to-large positive effects on achievement in meta-analysis (e.g., one synthesis of 46 effect sizes covering ~12,585 students), with particular strength on higher-order and "21st-century" skills (PBL meta-analysis, *ScienceDirect*; PBL meta-analysis, *Front. Psychol.* 2023). And the mastery result from brief 02 belongs here too: Bloom's "2 Sigma" finding, that one-to-one mastery tutoring moved the average student two standard deviations above conventionally taught peers, is fundamentally a result about what becomes possible when pace and feedback are fitted to the individual learner rather than the batch (Bloom, "The 2 Sigma Problem," 1984; see brief 02 §1.2).

Two honest qualifications. (1) Radically unstructured models — Sudbury / "free school" approaches with no curriculum at all — have inspiring case reports and graduate-survey data but lack the controlled, causal evidence base that Montessori has earned; they should be presented as a hypothesis with promising anecdote, not a proven model. (2) "Discovery learning" with minimal guidance has been seriously challenged: Kirschner, Sweller, and Clark argue that novices need substantial guidance and that pure minimally-guided discovery is inefficient (see brief 02's sources). The defensible position is not "remove all structure" but structured autonomy — high agency and intrinsic-motivation support within a well-designed, knowledge-rich, mastery-paced environment. That is exactly what high-fidelity Montessori is, and exactly what the RCT evidence supports.

4. Why these three are absent: the measurement model crowds them out

The recurring puzzle is that all three missing capacities are well-evidenced and high-leverage, yet all three are systematically omitted. The explanation is structural, and it is the same mechanism brief 02 identified as the root of the industrial model: what gets measured gets optimized, and what can't be cleanly measured gets dropped.

This is Campbell's Law — "the more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor" (Campbell, 1976) — and its cousin Goodhart's Law: when a measure becomes a target, it ceases to be a good measure. Apply them to schooling. Standardized tests can cheaply, reliably, and defensibly measure decontextualized content recall and procedural skill. They cannot cheaply measure metacognitive self-regulation, the quality of a child's reasoning in live dialogue, intrinsic curiosity, creativity, or self-directed agency. So an accountability system built on tests will pour effort into the measurable and starve the unmeasurable — not from malice, but from the logic of the metric.

The result is a precise, perverse mismatch. The capacities the knowledge-and-AI age most needs are exactly the ones the measurement model least rewards. When facts are a search away and a language model can produce a competent essay on demand, the durable human edge is the ability to direct one's own learning (metacognition), to reason and judge (philosophy/critical thinking), and to want to (intrinsic motivation, curiosity, agency). Those are the three gaps. The system optimizes for the things AI now does cheaply and neglects the things AI makes more valuable. The atlas's own §5 caveat — that the indicators "miss the problems that may matter most" — is not a data-coverage footnote; it is the mechanism. We don't teach learning-to-learn, philosophy, and agency in large part because we can't easily test them, and we run the system on tests.

5. The reform implication: an education centered on the gap

If the diagnosis is "the system omits the three highest-leverage capacities because it optimizes a metric that can't see them," the reform is to build learning around those capacities directly. Concretely:

Teach learning-to-learn explicitly and early — as core curriculum, not a study-skills bolt-on. Make retrieval practice, spacing, interleaving, and self-explanation the default modes of study, and teach learners the metacognitive frame (Flavell) and the self-regulation cycle (Zimmerman) so they can run it themselves. Confront the fluency illusion directly (Kornell & Bjork): show learners that what feels like learning often isn't. This is the cheapest, best-evidenced, highest-leverage change available, and it compounds — a learner who can teach themselves needs the institution less every year.

Make reasoning, logic, epistemology, and argument pervasive, embedded in rich knowledge rather than free-floating. Adopt the community-of-inquiry model (P4C, pooled effect sizes ≈ 0.43–0.65 across the reviews cited in §2.2) as a delivery method, but heed the transfer evidence (Willingham; Abrami): teach thinking through deep domain content, explicitly, with attention to when and why a mode of reasoning applies. Reject the naive "critical-thinking class" the evidence does not support; build the knowledge-rich, reasoning-explicit curriculum it does.

Protect and feed the whole person — design for autonomy, competence, and relatedness. Treat the school cliff (Gallup) and the grade-linked decline in intrinsic motivation (Lepper) as design failures to engineer out, not facts of adolescence to accept. Lean on the levers SDT shows build intrinsic motivation (autonomy support, genuine competence, belonging) and minimize the ones that erode it (surveillance, ranking, extrinsic-reward dependence). Use the models with real causal evidence — high-fidelity Montessori (now RCT-backed), project-based learning, and mastery pacing — as existence proofs that structured autonomy outperforms the batch.

Tie it to Bucket's open-knowledge thesis. Bucket's mission is to make primary knowledge paid-for-once and citeable-forever, and to route citation value to authors rather than gatekeepers — i.e., to put the frontier of human knowledge within direct reach. That thesis and this brief are the same argument from two ends. Open access to primary knowledge is necessary but not sufficient: a frontier you can reach is useless to a learner who was never taught to direct their own learning, to reason and judge, or who had their curiosity extinguished by twelfth grade. The complement of open knowledge is the self-directed learner — and the three capacities in this brief are precisely what produce one.

This is also where AI belongs, on the right side of a sharp line. The cognitive science gives the test: AI is a learning amplifier when it deepens metacognition, reasoning, and agency — and a crutch when it replaces them. An AI that drills you with retrieval practice, spaces your review, demands you self-explain, plays the Socratic interlocutor in a community of inquiry, and tutors to mastery at your own pace (the personalized 2-sigma tutor Bloom could only imagine; cf. the AI-tutor case studies now appearing, arXiv 2309.13060) — that AI builds exactly the three missing capacities. An AI that simply hands over the answer, the essay, and the conclusion does the opposite: it offloads the retrieval, the reasoning, and the agency, accelerating the very atrophy this brief documents. The reform is not "add AI." It is "build learning around the three things schools skip, and use AI only insofar as it strengthens them."

The factory model (brief 02) was a real democratizing achievement and is now the shared root of three failures. The gap it leaves (brief 03) is not random: it is the exact set of capacities that don't fit on a standardized test and that the knowledge-and-AI age makes most valuable. Reform is not more school, or even just better-funded school. It is teaching the things the machine was never built to teach: how to learn, how to think, and how to stay the kind of person who wants to.

Key sources

Learning to learn / metacognition / self-regulation:

Dunlosky, Rawson, Marsh, Nathan & Willingham, 2013, *Psychological Science in the Public Interest* — the ten-techniques review (high: practice testing, distributed practice; low: rereading, highlighting); plain-language version: *American Educator*, "Strengthening the Student Toolbox"
Flavell, 1979, *American Psychologist* 34(10):906–911 — foundational metacognition definition
Zimmerman, *Development of Self-Regulated Learning* (ERIC ED518491); Panadero, six-model SRL review, *Front. Psychol.* 2017
Kornell & Bjork, 2008, *Memory* (90% learn better spaced / 72% believe massed is better); Bjork, Dunlosky & Kornell, "Beliefs, Techniques, and Illusions"; desirable difficulties
Stanton et al., "Fostering Metacognition," *CBE—Life Sci. Educ.* 2021; retrieval practice in primary classrooms, NIH/PMC 2024
Debunked myth: Pashler, McDaniel, Rohrer & Bjork, 2008, *PSPI* (learning styles / meshing — no adequate evidence); APA learning-style myth; 2024 meta-analysis

Philosophy / critical thinking / reasoning:

Trickey & Topping, 2004, *Research Papers in Education* (P4C, d ≈ 0.43); three-level P4C meta-analysis, NIH/PMC 2025, g ≈ 0.59; P4C meta-analysis, ERIC EJ1465596
Contested transfer: Willingham, "Critical Thinking: Why Is It So Hard to Teach?" and *American Educator* 2020 (domain-specificity); Abrami et al., 2015, *Review of Educational Research* (ERIC EJ1061695) (g ≈ 0.30; explicit + content-embedded best)

Whole person / agency / motivation / self-directed models:

Ryan & Deci, 2000, *American Psychologist* (SDT); Ryan & Deci, 2020, *Contemp. Educ. Psychol.*; Deci, Koestner & Ryan, rewards meta-analysis
Lepper, Corpus & Iyengar, 2005, *J. Educ. Psychol.* (intrinsic motivation declines by grade); Gallup "School Cliff"; Gallup engagement by grade, *EdWeek*; curiosity suppression in classrooms, *ScienceDirect*
Montessori RCT/lottery studies, AMS summary; national Montessori RCT, *PNAS* 2025; Montessori longitudinal, *Front. Psychol.* 2017
Project-based learning meta-analysis, *ScienceDirect*; PBL meta-analysis, *Front. Psychol.* 2023
Bloom, "The 2 Sigma Problem," 1984 (mastery / individualized pacing — cited from brief 02 §1.2)

Why absent / measurement model:

Donald Campbell, "Assessing the Impact of Planned Social Change," 1976 (Campbell's Law); Goodhart's Law (cf. brief 02 §6)
atlas §5 ("indicators miss the problems that may matter most") — docs/EDUCATION_PROBLEMS.md

AI as amplifier vs. crutch:

AI-tutor learning-principles case study, arXiv 2309.13060

Note on rigor: P4C and project-based-learning effect sizes vary by meta-analysis, study quality, and outcome measure; the figures here (P4C ≈ 0.43–0.65; PBL "medium-to-large") are therefore given as ranges rather than point estimates, and the publication-bias and fidelity caveats common to education meta-analysis apply. Where the evidence is contested (critical-thinking transfer) or negative (learning styles), this brief states it rather than omitting it. A v0.2 pass should attach DOIs/OpenAlex IDs and, for the headline effect sizes, the specific confidence intervals and study counts.
