Jake Quilty-Dunn, Nicolas Porot, and Eric Mandelbaum recently published a robust defense of the Language of Thought Hypothesis (LoTH). They make an impressive case that certain behavioral results are best explained by a LoT. They define a LoT by a cluster of properties--the use of discrete representations, variable-binding, predication, abstraction, and logical inference--which seem to co-occur in many cases.
Moreover, they focus on non-linguistic cases: vision, animal and infant studies, and implicit belief tests. In doing so they force researchers to take seriously numerous findings from psychology that reveal behaviors easily explained by a LoT but less easily explained by other theories. The aim is to shift the burden onto researchers in other traditions, especially those working with neural networks: how do you explain compositional behavior that strongly suggests discrete representations, variable-binding, and logical inference? The work is impressive and commendable.
But it isn't clear this is the right paper--or papers--to convince people of the LoTH. Critics are often concerned with different issues, which this paper leaves unaddressed: how a LoT could be implemented, and whether computational models using a LoT (what I will call symbolic systems) are justifiable. Concerning the first, since the birth of cognitive science there has been a need for an implementation story, one that grounds speculative theories treating the brain as a computer in something that can actually be cashed out neurally. That paper hasn't been written yet and, until it is, we don't have compelling grounds for accepting a diversity of findings in cognitive psychology as validating the LoTH. Concerning the second, many computational cognitive psychologists rely on LoT-based theories to explain findings centered on innate primitives, hypothesis-testing, and rapid learning; the paper justifying that reliance hasn't been written either.
This is a major takeaway of the commentaries. Without definitive reasons for accepting the LoTH as a description of neural processes, we are pushed back onto descriptions of behavior and inference to the best explanation. And, as many commentators note, it is possible to argue that things like variable-binding or logical inference are not actual evidence of a LoT, but only evidence of language-like behavior. This criticism is the focus of both Attah and Machery's and McGrath et al.'s commentaries. Language-like behavior is important and interesting, but what is needed is evidence that the internal machinery of the brain actually computes using discrete symbols, variable-binding, predication, and so on. As Chalmers notes, this stronger claim isn't argued for. The paper shows that the brain uses structured representations, not that the computational LoTH is right. We have good reasons to think that, at a certain level of abstraction, we can describe these behaviors using a LoT; we do not have reasons to think this will be the best description once we know more about how these behaviors are implemented. And we are certainly short of reasons for thinking the abstract computational models depending on these representations can be cashed out in the brain.
Why are these complaints so important? The problem of implementation has changed in the last few decades, and Quilty-Dunn et al.'s paper seems to be defending an approach to cognitive psychology from a prior era. This commentary focuses on the history of computational modelling to show why the examples in Quilty-Dunn et al.'s paper, while highly suggestive, fall short of the evidence for the LoTH many were hoping for. For many researchers, rather than settling the debate, the paper is likely to increase distrust of cognitive psychologists' methods--an issue left unaddressed in the paper, but one that needs to be addressed head-on if the LoTH is to claim its status as the best game in town.
This commentary has three parts: a history of the debate, the trade-offs between perspectives, and the present state of cognitive science. The take-away is that we are now in a different place, one where behavioral findings that accord well with a LoT model are insufficient evidence to support the LoTH. The evidentiary demand is much bigger than it used to be. This means the LoTH is a player in a bigger game, not the best game in town.
The Debate
In the early 1980s, arguably the heyday of the language of thought hypothesis (LoTH), David Marr's Vision laid out a bold agenda for cognitive science. Central to this agenda were the “three levels” which cognitive scientists should aim to unify in their work: a computational level, in which an isolated function or task of an organism is identified; an algorithmic level, in which the appropriate rules and representations for accomplishing the function or task are specified; and an implementation level, in which the other two levels are grounded in specific properties of the organism’s brain (or, potentially, the machine’s processor). The book provided a roadmap for how this approach would work by focusing on vision, highlighting different tasks the visual system might solve and algorithms and representations for solving them. These algorithms and representations relied exclusively on the resources of the LoTH: discrete representations, logical operators, and variable-binding.
Marr's book was, and remains, an incredible, influential work. But it was heavy on psychological theorizing and complex symbolic systems, and light on explanations of how these algorithms might be realized in the brain. Marr passed away tragically at a young age, so it is impossible to know how he would have developed his view over time. But a central criticism--one leveled by his friend Shimon Ullman in his foreword to the 25th anniversary re-release of the book--is that Marr's approach focused on the top two levels to the exclusion of the third. Without implementation details, Ullman argues, the theorizing is too speculative. Without at least some neural grounding, Ullman complains, there is no assurance that we are actually picking out the right problems.
In fairness to Marr, he couldn't focus on implementation details at the time. The resources for neural investigation weren't there. As Marr complains in the book, when he was studying as a neuroscientist there was little beyond single-neuron recordings; more complex readings of the brain just weren't available in the late 1970s. Moreover, there was no option besides symbolic systems for building computational models; artificial neural networks, although available, could not be very deep because backpropagation was still (largely) unknown. So it isn't quite fair to complain about Marr ignoring implementation details: neural investigation was too limited by the technology, and the only computational resources were symbolic systems. They were the best game in town for building edge- and shape-detectors, for example, because they were the only game in town.
But the limitations Marr faced came to seem, for many, objective features of the brain--rightly or wrongly. For Marr--as well as contemporaries such as T.O. Binford and Jay Tenenbaum (1973)--the assumption ran that any visual feature of an object (edges, shading, movement, texture, etc.) needed a discrete feature-detector attuned to that specific feature. On Marr's approach, this meant each feature was its own computational problem, each feature required its own algorithm and representations, and each feature needed to be implemented in a different part of the system. This discrete-module approach generated new problems higher up the chain--such as the binding problem and information sharing. It was wildly ambitious and exciting, but also remarkably speculative, based less on empirical results and more on Marr's hunches about how the visual system needed to work given all the different functions he had identified and some assumptions about how they might come together.
The Alternative
But competitor approaches (which Marr did not live to see) were on the horizon, such as Yann LeCun's convolutional neural network. It focused on broadly mimicking how an algorithm that learns a distributed representation might be implemented in the brain. LeCun did so by (roughly) modelling the visual cortex, drawing on the pioneering work of Hubel and Wiesel on simple and complex cells, and by modifying Fukushima's design of the neocognitron, a hybrid symbolic system-neural network also modeled on Hubel and Wiesel. LeCun's model involved wiring up a neural network with layers of parallel feature detectors that fed into another layer of parallel feature detectors, and so on. A simple learning algorithm was then given to the system, and it learned which features mattered for a visual system, coming up with features similar to those Hubel and Wiesel discovered and which Fukushima had to hand-program.
The result was a collection of discrete feature-detectors, all learned without direct supervision and integrated into a single, complex network which could perform a vision task. In short, the “problems” of vision Marr had identified as distinct and explicable with numerous modules could instead all be learned without any hand-built features--and without any discrete symbols, variable-binding, or other features of symbolic systems. For LeCun, Marr had carved nature up too finely, assuming there were joints where, in fact, only a single system was functioning. Marr had also run into problems, like the binding problem, that did not seem to be problems at all for those working with neural networks. The difficult work of computationally modelling all the separate parts and layers and explaining how they fit together could instead be accomplished just by training a single network end-to-end.
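To make the contrast concrete, here is a minimal sketch of that idea: stacked banks of learned feature detectors, trained end-to-end by a single gradient-based learning rule, with no hand-built features. The sketch uses PyTorch and a toy architecture of my own choosing--it is not LeCun's actual LeNet, and the random tensors merely stand in for a real dataset:

```python
# A toy convolutional network: stacked layers of learned feature detectors,
# trained end-to-end by a single learning rule. No feature is hand-programmed;
# which edges, textures, etc. matter is left entirely to the learned weights.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=5, padding=2),   # first bank of parallel feature detectors
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(8, 16, kernel_size=5, padding=2),  # second bank, built on top of the first
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 7 * 7, 10),                   # task readout (e.g., 10 digit classes)
)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Dummy 28x28 grayscale images and labels stand in for a real dataset.
images = torch.randn(32, 1, 28, 28)
labels = torch.randint(0, 10, (32,))

for _ in range(5):                               # a few end-to-end training steps
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()                              # one learning signal adjusts every layer at once
    optimizer.step()
```

The point of the sketch is only that nothing in it corresponds to a separately specified "edge problem" or "texture problem"; the whole stack is adjusted by a single error signal.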
Whether or not convolutional neural networks are a decent model of vision is an important question (see Bowers et al. 2023), but ultimately a secondary one for this commentary. The point is that those pursuing neural networks--dubbing themselves “connectionists”--rejected the idea that we can just guess what the joints in the brain are by looking at what functions organisms accomplish. While the visual system may solve many different problems, we cannot assume each problem is solved on its own and can be investigated in isolation from all the others. Doing so ends up positing a profusion of different representations and algorithms where, instead, a single learned distributed representation might do the same job.
Reasons for Caution about the Language of Thought
It is worth noting that it would make good sense if evolution had discovered how to implement discrete representations and variable-binding in the brain. It would be an incredibly efficient way to encode information, bind diverse features together into a single object, build up rules, engage in rapid learning through hypothesis-testing, and support many other capacities. There should be no question that a brain would work better in many ways if it did use a LoT, since it would permit kinds of abstraction and learning, kinds of problem-solving, and kinds of inference that range from difficult to damn-near-impossible on the connectionist model.
But evolution involves trade-offs. Evolution selected a useful, digital encoding scheme for DNA, but such a scheme is much slower and more costly to modify than what was chosen for neural networks: updating weights between neurons. Choosing neural networks, by contrast, makes encoding and binding symbols far more difficult--much less maintaining their identity over time and across inferences. If connectionist models are even approximately good models of biological neural networks, then the ways the brain can learn and solve problems are very different from--and, in many ways, much more limited than--those available to a symbolic system. To be sure, there are many benefits to connectionist models (the end-to-end learning, for example). But still.
In the 1990s, during the "systematicity debates," there were major questions about whether neural networks could perform the role of a symbolic system. While it was held to be possible in principle that a network might "implement" a symbolic system, it was seen as more likely that it would be systematically dysfunctional--approximating a LoT system in typical cases but flailing outside the narrow domain of normal scenarios. This worry has proven prescient. Ellie Pavlick and her lab have shown that the impressive capacities of ChatGPT and other large language models break down outside of typical cases. Although the models might achieve some symbolic behaviors--including abstraction and logical inference--when the situation maps onto common situations in human text, performance falls back to near chance when common sense conflicts with logical form.
Pavlick and her team nonetheless argue, in an excellent response to Quilty-Dunn et al.'s paper, that this does not mean the brain really does implement a LoT; it may simply, like ChatGPT, approximate symbolic behaviors in most cases while still not relying on discrete representations and variable-binding. And results like Lake and Baroni's (2023) suggest neural networks might be able to accomplish these behaviors if they receive the right diet and inductive biases. In short, evolution's choice of neural networks may have limited its ability to implement a LoT; the brain may instead simply have learned behaviors which accomplish similar (but subtly different) functions. No symbolic systems or discrete representations needed.
What Games are We Playing?
But it is worth noting that the debate is importantly different now than it was in Marr's day, or even during the systematicity debates. The more recent success of connectionist (now "neoconnectionist") systems, together with tools like fMRI, PET scans, and other ways of measuring brain function, has transformed cognitive science. We are now able to explore how similar the distributed representations in the brain are--and how similar their performance is--to the distributed representations in artificial neural networks. This comparison can often be precise, showing similarities in how both kinds of network respond to unexpected stimuli and to mechanical interventions (such as transcranial magnetic stimulation). We often can discern impressive homologies between brains and machines in how they encode categories.
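To give a sense of the kind of comparison involved, here is a minimal sketch of representational similarity analysis, one standard way of asking whether a brain region and a network organize the same stimuli in the same way. The technique itself is real and widely used; everything in the sketch--the array sizes, the random data--is an invented placeholder rather than data from any study discussed here:

```python
# Sketch of representational similarity analysis (RSA): compare how a brain
# region and an artificial network organize the same set of stimuli, without
# assuming either one uses discrete symbols. All data below are random placeholders.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

n_stimuli = 20
brain_responses = np.random.randn(n_stimuli, 500)     # e.g., a voxel pattern per stimulus
model_activations = np.random.randn(n_stimuli, 256)   # e.g., hidden-layer activity per stimulus

# Representational dissimilarity matrices: pairwise distances between stimuli.
brain_rdm = squareform(pdist(brain_responses, metric="correlation"))
model_rdm = squareform(pdist(model_activations, metric="correlation"))

# Compare the two geometries using only the unique stimulus pairs (upper triangle).
iu = np.triu_indices(n_stimuli, k=1)
rho, p = spearmanr(brain_rdm[iu], model_rdm[iu])
print(f"brain-model representational similarity: rho={rho:.2f} (p={p:.3f})")
```

The comparison is over the geometry of the representations, not over any posited symbols, which is why it is equally available to both sides of the debate.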
As a result, the older debates about neural networks and symbol systems play out differently these days. The LoTH defenders can highlight numerous behaviors--in animals and humans--that would be easily explained by discrete representations, variable-binding, and logical inference but are difficult to explain using distributed representations. But the connectionist can simply respond, “no representation without implementation”: it is not enough for the LoTH theorist to come up with a speculative hypothesis; they need to provide some neural justification for investigating a particular task in isolation and for positing certain representations and algorithms as the “solution.” And there is an important asymmetry here: if connectionists succeed at their work, they will, in the process of their normal investigations, also answer the LoTH theorist's concerns by showing how neural networks can approximate symbolic systems without solving each problem in isolation from the others.
But the LoTH theorist is in a different position: further behavioral research, and symbolic computational models of it, will address none of the concerns of computational neuroscience. The LoTH theorist focuses on generating symbolic computational models of psychological processes, while the task of explaining how these are implemented is left to the neuroscientists. When Marr did this, it was forgivable given the tools of the day. But, in this day and age, to be building things like inductive programming models based on a LoT without any sense of how they could be implemented is kind of wacky.
Moving the Debate Along
This is why the Quilty-Dunn et al. paper, while important and useful, is unlikely to shift the conversation in the way they hope. The paper they wrote makes the need for an implementation story more acute, not less. By providing examples of symbolic systems in animals, infants, and the early visual system, they make it even more pressing to show that the LoT is ubiquitous in the brain (contra work, such as Frankland and Greene's, which locates the LoT in the default mode network). This ups the ante markedly, but in a way that is hostage to fortune; we just don't know where the neural evidence goes next.
To be sure, contemporary neuroscience--and especially deep learning models--cannot explain the findings Quilty-Dunn et al. focus on. As such, any attempt to assert that it is empirically obvious that neurons cannot instantiate a LoT is just more armchair speculation (e.g., Piccinini), this time coming from the other direction. But the asymmetry of burdens matters: neuroscience will eventually either vindicate or invalidate the LoTH simply by pursuing its own research. Computational psychology, however, is relying on someone else to provide the heaviest lift: transforming provocative behavioral findings into a plausible understanding of the brain.
This would be a major criticism of any field, but computational psychology is in an especially uncomfortable position. It is constantly promoting new theories--such as massive modularity or Bayesianism. This would be one thing if all the results reproduced, but psychology is a byword for dubious results obtained by ambitious researchers, usually through good-faith mistakes but occasionally through statistical reaching or outright fraud. Worries about developmental psychology are especially pressing right now given the ManyBabies Project's replication efforts. Many nativist models endorse a LoTH with lots of innate primitives which can rapidly be scaled up into theories through hypothesis-testing--something that is now in question. It's a tough time for computational models in psychology.
This is not to take anything away from Quilty-Dunn et al. Their paper is necessary, alongside papers on implementation and the validity of computational modelling. But the title is overselling it. The LoTH is still in the game. But without any evidence to show from cognitive neuroscience, it can't claim to be the best around.