You Cannot Offload Understanding

What cognitive science says about knowledge, AI, and learning

cognition · AI · education · metacognition

Published: March 5, 2026

The claim that AI makes knowledge less important conflates knowledge with information. Knowledge integrated into a human mind restructures how that mind perceives and reasons. That capacity cannot be offloaded.

A recurring claim in discussions about the future of education goes something like this: because AI makes knowledge universally accessible, having knowledge will become less important. Therefore, education should pivot toward “future skills” like creativity and critical thinking. The argument is wrong, and it is wrong in ways that matter for how we think about education.

The confusion starts with a conflation of two very different things: knowledge integrated into a human mind, and information available in an external system. What you have learned does not sit in memory waiting to be recalled. It shapes what you notice, how you interpret it, and what you can do with it. Information in an AI system does none of that until a person retrieves it and makes sense of it. Retrieving information may be easy. Making sense of that information is the hard part, and it depends on what you already know. The claim that AI renders internal knowledge redundant assumes that the mind is a filing cabinet. Decades of research on expertise and knowledge organisation have shown otherwise.

What expertise actually looks like

Expertise research has demonstrated that experts do not simply know more than novices. They have richly organised knowledge structures, typically called schemas, that allow them to recognise patterns and chunk information into meaningful configurations that novices miss entirely. The classic demonstration comes from Chase and Simon (1973), who showed that chess masters could reconstruct meaningful board positions from brief exposure far better than novices, but showed no advantage for random arrangements. What set experts apart was not memory capacity, but the ability to see meaningful structure, which depends entirely on deep domain knowledge.

Chi, Feltovich, and Glaser (1981) found the same pattern in physics: experts categorised problems by underlying principles (e.g. conservation of energy), while novices categorised by surface features (e.g. “problems with inclined planes”). The pattern generalises. An experienced researcher examining a new study immediately notices that the control group differs from the treatment group in ways that could explain the result. A student reads the same study, sees a positive result, and concludes the treatment works. In each case, the expert’s perception is organised by a structural understanding of the domain. No external tool can reorganise someone’s perception for them.

Knowledge is also hierarchically organised. We do not merely learn individual concepts; we learn what Kemp, Perfors, and Tenenbaum (2007) call overhypotheses: higher-order regularities that constrain how new concepts in a domain can be acquired. A child learning their first words grasps not just individual labels but the principle that object names pick out shapes rather than colours or textures. This higher-order regularity narrows the hypothesis space for every subsequent word, which is why vocabulary acquisition accelerates so dramatically after the first dozen words.
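
To make the idea concrete, here is a minimal sketch of an overhypothesis being learned, assuming toy probabilities and invented hypothesis names rather than the hierarchical Bayesian model Kemp, Perfors, and Tenenbaum actually use. A learner keeps beliefs over two higher-order hypotheses, that object names track shape or that they track colour; each labelled example updates that higher-order belief, which then constrains how every subsequent label is interpreted.

```python
# Toy Bayesian learner: belief over two higher-order hypotheses about
# how object labels work. All numbers are illustrative, not fitted.
priors = {"names_track_shape": 0.5, "names_track_colour": 0.5}

# Likelihood of one labelled example in which objects sharing a label
# share their shape but not their colour, under each hypothesis.
likelihood = {"names_track_shape": 0.9, "names_track_colour": 0.1}

def update(beliefs, lik):
    """One step of Bayes' rule over the higher-order hypotheses."""
    unnorm = {h: beliefs[h] * lik[h] for h in beliefs}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

beliefs = dict(priors)
for word in ["ball", "cup", "dax"]:  # three early, shape-consistent words
    beliefs = update(beliefs, likelihood)
    print(word, {h: round(p, 3) for h, p in beliefs.items()})

# After a handful of consistent examples the learner strongly expects the
# NEXT novel label to pick out a shape too: the overhypothesis now narrows
# the hypothesis space for every subsequent word.
```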

The same hierarchical structure appears wherever expertise develops. A researcher who has internalised the general logic of confound control does not merely know a list of specific confounds. They have acquired a higher-order expectation that any observed association might be produced by an uncontrolled variable, and this generates predictions about where to look in any new study, regardless of the domain. The most cognitively valuable knowledge is the most abstract, because it constrains the hypothesis space at the level below, operating across every instance rather than just one. You can look up a specific confound, but you cannot look up the behavioural disposition to expect confounds.

Cognition as inference

The “knowledge is less important” framing implicitly models cognition as search: you need a fact, you look it up, you apply it. But the mind does not work by searching a database. It works by maintaining a generative model of the world, one that generates predictions, supports causal and counterfactual reasoning, and revises itself when those predictions fail (Tenenbaum et al. 2011). Knowledge, in the sense that matters for cognition, is this model.1 It is integrated into the agent’s perception-action loop, shaping what they notice and what they can imagine.

Schemas, the knowledge structures that expertise research has documented, can be understood as generative models. The chess master’s advantage in Chase and Simon (1973) is exactly this: a model of meaningful board positions that makes random arrangements no more memorable than they are for anyone else. In physics, the deep categorisation Chi, Feltovich, and Glaser (1981) documented reflects structured knowledge of causal relationships. And the experienced researcher who spots confounds is running a causal model of how studies produce their results.

Learning, in this framework, is driven by prediction error: the discrepancy between what the model expected and what actually occurred. That signal only exists if there was an expectation in the first place, which requires prior knowledge. Without prior knowledge there are no expectations, and without expectations there is nothing to be surprised by. Knowledge does not merely bias inference toward one answer. It determines the hypothesis space: the set of possible explanations a person can even entertain. An experienced researcher and a student reading the same unexpected result can entertain very different explanations, because the researcher’s understanding of methodology and the domain lets them consider interpretations the student would never arrive at.

Knowledge also makes inference tractable. Human minds are bounded systems, and strong prior knowledge narrows the search space so that a bounded agent can reach good answers within its actual cognitive budget (Lieder and Griffiths 2020). Even an AI-augmented human remains bounded: formulating useful queries, interpreting responses, and integrating results with ongoing reasoning all benefit from strong priors.
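
A toy illustration of the tractability point, under assumptions of my own rather than anything from Lieder and Griffiths: if a bounded agent can only examine a handful of candidate explanations, a prior that ranks candidates well finds the true one within budget far more often than an uninformed search does.

```python
import random

random.seed(0)

N_HYPOTHESES = 100   # size of the space of candidate explanations
BUDGET = 5           # how many candidates a bounded agent can actually examine
TRIALS = 10_000

def informed_position():
    # A strong prior tends to rank the true hypothesis near the top.
    # Assume (purely for illustration) it lands uniformly in the top 10.
    return random.randint(0, 9)

def uninformed_position():
    # With no prior, the truth is equally likely to sit anywhere.
    return random.randint(0, N_HYPOTHESES - 1)

def success_rate(position_of_truth):
    hits = sum(position_of_truth() < BUDGET for _ in range(TRIALS))
    return hits / TRIALS

print("informed prior:  ", success_rate(informed_position))    # ~0.5
print("uninformed prior:", success_rate(uninformed_position))  # ~0.05
```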

What a generative model does

What does it mean, concretely, for knowledge to function as a generative model? Consider an experienced researcher reading a new empirical study. Before any conscious evaluation, they see the study differently from how a student sees it. Sample composition, the choice of outcome measure, which covariates were and were not controlled, the plausibility of the proposed mechanism: these stand out as methodological decisions with consequences, because the researcher already has a model of how studies produce their results. The student, reading the same paper, sees a significant p-value and a confident abstract. But the researcher’s model does more than organise what is on the page. It can simulate: if the authors had included an active control condition, would the effect survive? What if they had used a behavioural measure instead of self-report? Would a different analytic strategy change the conclusion? This counterfactual reasoning requires a generative model rich enough to run forward, producing predictions about data that were never collected (Gerstenberg 2024).
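
One way to see what running a model forward means is a toy causal model of a study, with entirely hypothetical variable names and effect sizes: the outcome depends on the treatment, on an uncontrolled expectancy confound, and on noise. Because the model is generative, it can also simulate the counterfactual study that used an active control, data that were never collected.

```python
import random
from statistics import mean

random.seed(1)

def simulate_study(active_control: bool, n: int = 500) -> float:
    """Toy generative model of a study (hypothetical parameters).

    outcome = true_effect * treated + expectancy + noise
    In the original design only the treatment group has raised expectations
    (an uncontrolled confound); an active control raises expectations in
    both groups, removing the confound's contribution to the difference.
    """
    true_effect = 0.2
    expectancy_effect = 0.5
    treated, control = [], []
    for _ in range(n):
        treated.append(true_effect + expectancy_effect + random.gauss(0, 1))
        expectancy = expectancy_effect if active_control else 0.0
        control.append(expectancy + random.gauss(0, 1))
    return mean(treated) - mean(control)

print("observed effect, passive control:", round(simulate_study(False), 2))  # ~0.7
print("counterfactual, active control:  ", round(simulate_study(True), 2))   # ~0.2
```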

When the results contradict what the model predicted, the prediction error carries a specific learning signal: the discrepancy between the expected pattern and what actually turned up in the data. But that signal only exists if there was an expectation. An expert’s concentrated expectations make deviations diagnostic; the prediction was specific, so its failure points somewhere specific. A novice’s expectations, spread thinly across many possibilities, generate no such signal. A strong prior produces specific, falsifiable predictions; a weak prior is close to unfalsifiable, because almost any outcome is compatible with it.
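
The asymmetry can be put in one line of information theory: the learning signal an outcome carries is its surprisal, -log2 p(outcome), under the reader's predictive distribution. The numbers below are illustrative only; a concentrated prediction that fails is highly surprising, while a diffuse prediction is mildly surprised by everything and therefore diagnostic of nothing.

```python
from math import log2

# Hypothetical predictive distributions over how a replication might turn out.
expert = {"replicates": 0.90, "fails_to_replicate": 0.08, "reverses": 0.02}
novice = {"replicates": 0.40, "fails_to_replicate": 0.35, "reverses": 0.25}

def surprisal(predictive, outcome):
    """Bits of surprise carried by the observed outcome."""
    return -log2(predictive[outcome])

observed = "reverses"  # the unexpected result actually obtained
print("expert surprisal:", round(surprisal(expert, observed), 2), "bits")  # ~5.6
print("novice surprisal:", round(surprisal(novice, observed), 2), "bits")  # ~2.0
```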

This researcher was once a student. The model they now run was built through years of designing studies that had holes in them, getting reviewer feedback that exposed those holes, and slowly assembling an understanding of how methodological choices propagate into results. A student who offloads the evaluation of studies to an AI receives a competent-sounding critique but skips the process that would have built the model. The pattern holds across all knowledge domains. What kind of learning builds generative models, and what kind short-circuits the process?

Why the effort matters

A natural response to the availability of AI is to treat the effort of acquiring knowledge as a cost to be minimised. If the answer is one prompt away, why spend hours working through the material yourself? Research on desirable difficulties (R. Bjork and Bjork 1992; E. L. Bjork and Bjork 2011) shows that effort matters: testing beats rereading, and generating an answer beats passive reception. The harder path produces more robust learning because how you engage with the material determines how well it is encoded.

There is a deeper reason effort matters.2 The brain does not solve each new problem from scratch. It builds on accumulated experience, developing pattern-recognition that makes future judgements faster and more accurate. Each time a researcher evaluates a study, that experience sharpens the system that will make the next evaluation quicker and better calibrated. The desirable difficulty is not just building a better model; it is training the fast judgement that makes the model usable under real-world time pressure.

But AI offloading raises a more fundamental question: whether the relevant cognitive operations happen at all. When a student offloads cognitive work to AI, they get the output but skip the process that would have produced learning. The student gets the deliverable without building the competence necessary to produce it. This is the distinction between performance and learning (Grinschgl, Papenmeier, and Meyerhoff 2021). AI tools can reliably improve performance (the quality of what gets produced in the moment) while simultaneously undermining learning (the construction of the model that would support future independent performance). The two effects are not contradictory. Performance depends on what is currently available to the system, including external tools; learning depends on what has been processed through the mind’s own machinery.

This follows directly from bypassing the cognitive operations that give rise to learning. Struggling with a problem, hitting dead ends, revising your mental model of how the pieces fit: that process is not incidental to learning. It is a sign that the relevant cognitive work is happening. The researcher’s model was built precisely through the experience of being wrong in informative ways.

You need knowledge to evaluate knowledge

AI-assisted work has a metacognitive problem. Effective use of AI-generated content requires judging whether that content is accurate and complete. This judgement depends on domain knowledge, and the dependence creates a circularity that is difficult to escape. R. A. Bjork (1994) documented how learners systematically confuse familiarity with understanding and fluency with competence. Metacognitive monitoring, the ability to judge the accuracy of your own judgements and detect your own errors (Fleming and Daw 2017), depends on domain knowledge; it is not a free-standing skill you can apply to any content regardless of what you know.

Applied to AI, this creates a troubling asymmetry. A student reads an AI-generated explanation of something they know well and immediately notices where the explanation simplifies too much or gets something subtly wrong. They can evaluate because their model generates expectations that the text might violate. Now the same student reads an AI-generated explanation of something they know nothing about. It reads fluently, seems coherent, and arrives with the same confidence regardless of whether it is accurate. The student has no basis for surprise, no expectation the answer could have violated. The gap between what they understand and what they think they understand is invisible to the student. Far from making knowledge less important, AI tools make the metacognitive capacities that depend on knowledge more important.

You cannot separate knowing from thinking

There is a tempting compromise here: concede that domain knowledge matters for perception and evaluation, but argue that what education should really prioritise is flexible thinking, the “higher-order” capacities that knowledge merely serves. This treats knowing and thinking as separable mental faculties. They are not.

One of the most distinctive features of human intelligence is what cognitive scientists call compositional generalisation: the capacity to recombine known elements in novel configurations to handle situations never previously encountered (see e.g. Lake et al. 2017; Summerfield 2022). A child who has learned “purple” and “square” can immediately think about a purple square without ever having seen one. A researcher who understands the logic of controlling for confounds can reason about threats to validity in a study they have never seen.

Compositional generalisation requires structured representations to compose over. You cannot recombine elements you have not represented as separable, abstract components. The physicist in Chi, Feltovich, and Glaser (1981) who categorises problems by conservation of energy has decomposed the domain into abstract primitives that transfer across superficially different situations. The novice who categorises by “inclined plane” is bound to surface features; there is nothing to recombine. A researcher who understands why randomisation matters has an abstract structural component (the logic of causal identification) that transfers to any empirical question. A student who learned “use a t-test for two groups” has a procedure, not a transferable principle. Reasoning is the flexible deployment of structured knowledge.
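
A minimal contrast between structured and unstructured representations, as a sketch with made-up primitives rather than a model from the cited work: when colour and shape are represented as separable components, any novel combination is immediately interpretable; a lookup table of whole exemplars has nothing to recombine.

```python
from dataclasses import dataclass

# Structured representation: colour and shape are separable primitives.
COLOURS = {"red", "purple", "green"}
SHAPES = {"circle", "square", "triangle"}

@dataclass(frozen=True)
class Obj:
    colour: str
    shape: str

def describe(obj: Obj) -> str:
    # Because the primitives are explicit, ANY combination is interpretable,
    # including ones never encountered before.
    assert obj.colour in COLOURS and obj.shape in SHAPES
    return f"a {obj.colour} {obj.shape}"

# Unstructured representation: a lookup table of whole exemplars seen so far.
seen_exemplars = {"red circle", "green triangle"}

novel = Obj("purple", "square")           # never seen before
print(describe(novel))                    # "a purple square": composition works
print("purple square" in seen_exemplars)  # False: nothing to recombine
```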

This closes an important escape route. Someone might accept that knowledge matters but still argue that education should focus on “domain-general” capacities like critical thinking and problem solving, developed independently of any particular subject matter and then applied across domains. Willingham (2008) has argued persuasively that this gets the relationship backwards: what looks like domain-general critical thinking is actually the flexible deployment of domain knowledge. Critical thinking about a claim in molecular biology requires knowing molecular biology well enough to spot when something doesn’t add up and to see what a good experiment would look like. The same applies to evaluating a statistical analysis or a legal argument. “Critical thinking” without domain knowledge mimics the forms of evaluation without the substance that would make them reliable (Tricot and Sweller 2014).

What “future skills” advocates actually want, when pressed, is transfer: the ability to apply competence in situations that were not explicitly taught. Transfer is arguably the most important goal of education, but it is a product of abstraction, and abstraction requires deep structural understanding of a domain (Barnett and Ceci 2002). Far transfer (applying what you learned in one context to a genuinely different one) is rare, and when it occurs it is built on representations that have been decomposed into abstract, recombinable components.

The logic of offloading

None of this implies that externalising cognitive work is inherently harmful. Cognitive offloading, the use of physical actions or external tools to alter information-processing demands, is a normal and pervasive feature of human cognition (Risko and Gilbert 2016). We write things down, draw diagrams, count on our fingers, use calculators. When does offloading help and when does it do damage?

Offloading makes sense when the person already possesses the relevant knowledge but faces limits on what they can hold in mind at once. A surgeon using a checklist frees attentional resources for judgement. An expert programmer using a linter offloads routine error detection to focus on architecture. In both cases, the underlying competence is intact; the tool relieves a performance constraint. This is the extended mind as Clark and Chalmers (1998) intended it: cognitive processes genuinely distributed across brain and environment, but with the internal contribution doing the work that matters.

Performance bottlenecks and learning bottlenecks respond to offloading differently. Offloading helps when it relieves a performance constraint (working memory limits, attentional capacity) while preserving the underlying competence. It becomes harmful when the cognitive effort being avoided was itself the mechanism that would have built competence. A calculator extends the mathematician’s mind. The same calculator in the hands of someone who does not understand division just produces outputs the person cannot evaluate.

The empirical evidence confirms this. Grinschgl, Papenmeier, and Meyerhoff (2021) found that cognitive offloading boosts immediate performance but diminishes subsequent memory for the offloaded information. The gain in the moment comes at the cost of learning that would have occurred through internal processing. Hu, Luo, and Fleming (2019) showed that metamemory (awareness of what one does and does not know) plays a mediating role: people who offload habitually may also lose the metacognitive signal that tells them what they still need to learn. The offloading does not merely bypass one instance of encoding. It can erode the self-monitoring that would have prompted further study.

What matters is whether knowledge is represented in a way that supports flexible recombination (Summerfield 2022). When someone already has this structure, offloading is straightforward: externalising intermediate results while working through a problem frees capacity for the compositional operations that matter. Offloading that prevents this structure from forming in the first place is where the damage occurs. If a student never works through the reasoning behind an analysis, the abstract structural components that would have supported transfer to new problems are never acquired. The question educators should be asking is: does the student have the internal structure that makes looking things up useful?

But what about LLMs?

A critic might accept the generative-model account and still object: large language models are themselves generative models. Consulting an LLM is less like consulting a database than like consulting a system that generates predictions and produces outputs that can be genuinely surprising. Hasn’t the filing-cabinet objection been dissolved?

The objection is partly right. LLMs are not filing cabinets, and there is genuine debate about the extent to which they acquire internal world models through training on language (see e.g. Li et al. 2023). But the question is whose model it is: a researcher’s model predicts how a study’s design will shape its results and revises when those predictions fail, and whatever predictive capacities an LLM may have, they remain the LLM’s, not the user’s.

What matters is whether the model is integrated into the agent’s perception-action loop or merely consultable by the agent. The LLM’s predictions do not restructure the user’s perception or refine their causal understanding of the domain, and passively consuming its output, however fluent, does not build the internal model that would let the user perceive and reason independently. The outputs may be excellent; the user’s own inferential machinery remains unchanged unless they do something cognitively active with what they receive.

This is why the performance-learning asymmetry survives even when the external system is sophisticated. The LLM can generate the analysis and the explanation. But it cannot run the model inside the student’s head that would have allowed those cognitive operations to become their own. The sophistication of the tool merely changes what can be offloaded.

The stronger objection

There is a version of the argument that this analysis does not touch, and it deserves an answer. The claim is that the entire cognitive loop, from retrieval through comprehension and evaluation to action, can be automated. If one AI system produces information and another consumes and acts on it, the human is no longer in the loop at all. But that is an argument that humans matter less, not that knowledge matters less for humans. If it turns out to be right, this entire essay is addressing the wrong question.

But something important follows from control theory, and it applies as long as humans remain in any oversight role. Craik (1943) argued that organisms anticipate events by running small-scale internal models of external reality. Conant and Ashby (1970) proved a related result formally: every good regulator of a system must be a model of that system. Anyone overseeing an AI system is functioning as a controller. Effective control requires a generative model rich enough to predict the system’s behaviour, detect its failures, and know when to intervene.

This inverts the automation objection. As AI systems become more powerful, the controller’s model must become richer, not simpler. More capable systems produce fewer obvious errors, but the errors they do produce are harder to detect. A controller without domain knowledge can catch gross failures but not subtle ones. The argument for internal knowledge therefore does not weaken as AI advances. Whether the focus remains on understanding the domain directly or shifts toward understanding it well enough to regulate AI systems operating within it, the knowledge required is at least equally demanding. Either way, it requires integrated, model-based knowledge, schemas rich enough to predict where the system will fail.
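
A toy illustration of the regulator point, with made-up dynamics and gains: a controller tries to hold a simple system near zero. If its internal model matches the true dynamics it can cancel them and is left only with noise; with a poor model, systematic error remains. The theorem itself does not depend on these particular numbers.

```python
import random
from statistics import mean

random.seed(2)

TRUE_DYNAMICS = 0.9  # x[t+1] = TRUE_DYNAMICS * x[t] + u[t] + noise

def run(controller_model: float, steps: int = 2000) -> float:
    """Regulate x toward 0 using an internal model of the dynamics.

    The controller picks u[t] = -controller_model * x[t]: it can only
    cancel as much of the dynamics as its model captures.
    """
    x, errors = 1.0, []
    for _ in range(steps):
        u = -controller_model * x
        x = TRUE_DYNAMICS * x + u + random.gauss(0, 0.1)
        errors.append(x * x)
    return mean(errors)

print("accurate internal model:", round(run(0.9), 4))  # ~0.01, the noise floor
print("poor internal model:    ", round(run(0.0), 4))  # several times larger
```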

What does this mean for teaching?

None of this argues against AI tools in education. The argument is for building the generative models that make those tools usable. AI does, however, change something about the value of different types of knowledge. The common move of distinguishing “mere rote recall” from “higher-order thinking” and then dismissing the former is too hasty. Much of what looks like rote memorisation is actually cognitive infrastructure. A child who has automated the times table has freed working memory for algebra and mathematical reasoning. A person who knows the chronology of the French Revolution is not hoarding trivia; that chronology provides a temporal scaffold for causal explanations. Memorised knowledge and conceptual understanding are not opposed, because memorisation often provides the raw material for understanding.

What does shift is the return on different types of knowledge. Genuinely isolated facts, the kind that serve no further cognitive function, were arguably never the most important educational outcome. Conceptual understanding, the ability to frame problems and integrate information across domains, becomes more important when AI handles routine lookup and generation. But the path to conceptual understanding runs through a large body of well-organised, readily accessible knowledge.

The computational framework also suggests principles for when AI interaction could support model-building. An AI system that generates predictions for the learner to evaluate, rather than evaluations for the learner to consume, exercises the learner’s own predictive model. An AI that provides targeted feedback on the learner’s reasoning, highlighting where their expectations diverge from the evidence, can sharpen the error signal beyond what unassisted study would provide. The key criterion is whether the interaction runs the learner’s model or bypasses it. Passive consumption of AI-generated answers bypasses it. Using AI to get feedback on your own reasoning, or to generate cases that test your understanding, does not.

The implication for education is that we should be more deliberate about which knowledge we prioritise and how we help students build it. “Teach concepts instead of facts” recreates the false separation this essay has argued against. The goal is to prioritise the knowledge that is most productive for building generative models: the facts, principles, and structural relationships that participate in the most causal models and support the widest range of inference.

AI is a powerful external resource, but it is not a substitute for the internal structures that make external resources useful. A world in which people have less knowledge is a world in which people are less capable of using the tools available to them, however sophisticated those tools become.

References

Barnett, Susan M., and Stephen J. Ceci. 2002. “When and Where Do We Apply What We Learn? A Taxonomy for Far Transfer.” Psychological Bulletin 128 (4): 612–37. https://doi.org/10.1037/0033-2909.128.4.612.
Bjork, Elizabeth Ligon, and Robert A. Bjork. 2011. “Making Things Hard on Yourself, but in a Good Way: Creating Desirable Difficulties to Enhance Learning.” In Psychology and the Real World: Essays Illustrating Fundamental Contributions to Society, 56–64. New York, NY, US: Worth Publishers.
Bjork, Robert A. 1994. “Memory and Metamemory Considerations in the Training of Human Beings.” In Metacognition: Knowing about Knowing, 185–205. Cambridge, MA, US: The MIT Press. https://doi.org/10.7551/mitpress/4561.001.0001.
Bjork, Robert, and Elizabeth Bjork. 1992. “A New Theory of Disuse and an Old Theory of Stimulus Fluctuation.” In Essays in Honor of William K. Estes, Vol. 1: From Learning Theory to Connectionist Theory, 35–67.
Chase, William G., and Herbert A. Simon. 1973. “Perception in Chess.” Cognitive Psychology 4 (1): 55–81. https://doi.org/10.1016/0010-0285(73)90004-2.
Chi, Michelene T. H., Paul J. Feltovich, and Robert Glaser. 1981. “Categorization and Representation of Physics Problems by Experts and Novices.” Cognitive Science 5 (2): 121–52. https://www.sciencedirect.com/science/article/pii/S0364021381800298.
Clark, Andy, and David Chalmers. 1998. “The Extended Mind.” Analysis 58 (1): 7–19. https://www.jstor.org/stable/3328150.
Conant, Roger C., and W. Ross Ashby. 1970. “Every Good Regulator of a System Must Be a Model of That System.” International Journal of Systems Science 1 (2): 89–97. https://doi.org/10.1080/00207727008920220.
Craik, K. J. W. 1943. The Nature of Explanation. Cambridge: Cambridge University Press.
Dasgupta, Ishita, Eric Schulz, Joshua B. Tenenbaum, and Samuel J. Gershman. 2020. “A Theory of Learning to Infer.” Psychological Review 127 (3): 412–41. https://doi.org/10.1037/rev0000178.
Fleming, Stephen M., and Nathaniel D. Daw. 2017. “Self-Evaluation of Decision-Making: A General Bayesian Framework for Metacognitive Computation.” Psychological Review 124 (1): 91–114. https://doi.org/10.1037/rev0000045.
Gershman, Samuel J., and Noah D. Goodman. 2014. “Amortized Inference in Probabilistic Reasoning.” In Proceedings of the Annual Meeting of the Cognitive Science Society. https://www.semanticscholar.org/paper/Amortized-Inference-in-Probabilistic-Reasoning-Gershman-Goodman/93f5a28d16e04334fcb71cb62d0fd9b1c68883bb.
Gerstenberg, Tobias. 2024. “Counterfactual Simulation in Causal Cognition.” Trends in Cognitive Sciences. https://doi.org/10.1016/j.tics.2024.04.012.
Griffiths, Thomas L., Nick Chater, Charles Kemp, Amy Perfors, and Joshua B. Tenenbaum. 2010. “Probabilistic Models of Cognition: Exploring Representations and Inductive Biases.” Trends in Cognitive Sciences 14 (8): 357–64. https://doi.org/10.1016/j.tics.2010.05.004.
Grinschgl, Sandra, Frank Papenmeier, and Hauke S. Meyerhoff. 2021. “Consequences of Cognitive Offloading: Boosting Performance but Diminishing Memory.” Quarterly Journal of Experimental Psychology 74 (9): 1477–96. https://doi.org/10.1177/17470218211008060.
Hu, Xiao, Liang Luo, and Stephen M. Fleming. 2019. “A Role for Metamemory in Cognitive Offloading.” Cognition 193 (December): 104012. https://doi.org/10.1016/j.cognition.2019.104012.
Kemp, Charles, Andrew Perfors, and Joshua B. Tenenbaum. 2007. “Learning Overhypotheses with Hierarchical Bayesian Models.” Developmental Science 10 (3): 307–21. https://doi.org/10.1111/j.1467-7687.2007.00585.x.
Lake, Brenden M., Tomer D. Ullman, Joshua B. Tenenbaum, and Samuel J. Gershman. 2017. “Building Machines That Learn and Think Like People.” Behavioral and Brain Sciences 40: e253. https://doi.org/10.1017/S0140525X16001837.
Li, Kenneth, Aspen K. Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, and Martin Wattenberg. 2023. “Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task.” arXiv preprint. http://arxiv.org/abs/2210.13382.
Lieder, Falk, and Thomas L. Griffiths. 2020. “Resource-Rational Analysis: Understanding Human Cognition as the Optimal Use of Limited Computational Resources.” Behavioral and Brain Sciences 43: e1. https://doi.org/10.1017/S0140525X1900061X.
Risko, Evan F., and Sam J. Gilbert. 2016. “Cognitive Offloading.” Trends in Cognitive Sciences 20 (9): 676–88. https://doi.org/10.1016/j.tics.2016.07.002.
Summerfield, Christopher. 2022. Natural General Intelligence: How Understanding the Brain Can Help Us Build AI. Oxford University Press. https://doi.org/10.1093/oso/9780192843883.001.0001.
Tenenbaum, Joshua B., Charles Kemp, Thomas L. Griffiths, and Noah D. Goodman. 2011. “How to Grow a Mind: Statistics, Structure, and Abstraction.” Science 331 (6022): 1279–85. https://doi.org/10.1126/science.1192788.
Tricot, André, and John Sweller. 2014. “Domain-Specific Knowledge and Why Teaching Generic Skills Does Not Work.” Educational Psychology Review 26 (2): 265–83. https://doi.org/10.1007/s10648-013-9243-1.
Willingham, Daniel T. 2008. “Critical Thinking: Why Is It So Hard to Teach?” Arts Education Policy Review 109 (4): 21–32. https://doi.org/10.3200/AEPR.109.4.21-32.

Footnotes

  1. “Generative model” is a claim at Marr’s computational level (Griffiths et al. 2010). It characterises the problem the cognitive system is solving (inferring latent structure from observed data) without specifying the neural algorithms that implement the solution. The claim is that the functional role of knowledge is best understood as a model that generates predictions, not that the brain literally runs a probabilistic program.↩︎

  2. Computational work on amortized inference (Gershman and Goodman 2014; Dasgupta et al. 2020) provides a formal account of this process: the brain learns recognition routines, trained by accumulated experience, that approximate good inferences rapidly.↩︎
