Computational Neuroscience Meets the 17th Century
Do we live in a world of ideas? 💡
Descartes famously kicked off an era of Western philosophy with “Cogito, ergo sum”—an attempt to establish a basis for human existence and knowledge that assumed as little as possible. Out of this idea sprang a dualistic theory of mind and body, which posits that the world is composed of fundamentally two types of ‘stuff’: stuff that extends into three-dimensional space (matter) and stuff that doesn’t (ideas). For a long time this idea was popular, partly because it left fairly explicit room in philosophy for a god concept, but also because it made some intuitive sense. How else do we account for our feeling of consciousness? In modern circles of philosophy and science, however, dualism has largely gone out of fashion in favor of materialism (one type of stuff). Stripping away this notion of “ideas” as a manifestation independent of “matter” seems more sensible nowadays, when hard-nosed empiricism so fundamentally shapes our society. And, frankly, dualism raises some difficult problems upon close examination. The hardest boils down to this question: if there are two types of “stuff,” how do they interact with each other?
In the 17th century, however, things weren’t quite as settled. Irish philosopher George Berkeley advocated for the other solution to this philosophical conundrum: the world (including all matter) is composed only of ideas1. Like modern materialists, he didn’t really understand how dualism could make much sense. Setting aside the problems of dualism for a moment, my question is this: is Berkeleyan idealism a useful tool that offers us a relevant perspective when we study neuroscience?
My line of thinking comes from reading a 2014 paper2 that summarizes, and to a degree challenges, the neuroscience research on scene analysis—the study of how our brain turns information (materialist philosophers like the term “sense-data”) into an internal, ultimately material, representation of the world. It then argues that scene analysis is not only an incredibly hard problem, but that many of the modern ways we study this problem fail to address some very fundamental questions.
The paper points out that a lot of empirical research into scene analysis is conducted using very simplified forms—lines and basic geometric figures on plain backgrounds under neutral lighting. Arguably, this simplified approach makes sense because it provides a controlled experimental environment, in which scientists can create a simple, reproducible baseline and then vary one parameter at a time. Unfortunately, while this research framework has yielded many useful insights, the paper argues that we remain relatively stuck on how to map those insights onto the ecologically relevant scene analysis problems that all animals are constantly solving.3
As I was reading, I kept thinking about Berkeley’s idealism. The more I learn about human perception, the more I realize that our perceptual world is far more like a Star Trek holodeck—a constructed reality—than some kind of direct4 perception. Of course, I am not the first person to have this idea5, and it’s generally given the name predictive coding, predictive processing, or (more creatively) controlled hallucination. In a nutshell, our brains cannot process the sensory information we receive quickly or energy-efficiently enough to reach the performance levels we do. Not only is our brain heavily involved in numerous layers of representational transformations when processing sense-data, our resulting real-time representational model of the world (essentially our engine of consciousness) is informed by a shockingly tiny subset of the available information (neurons are expensive!6). Because of this limitation, our conscious experience is mostly an informed guess about what is going on in the world. Our lived conscious experience is therefore much less objective than it seems, and is in fact the opposite: deeply subjective. So, while I don’t think the world is built of ideas, our perception of the world is literally built of ideas! It makes sense to me, then, to put ideas at the center of how materialists think about consciousness.
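To make the “informed guess” idea concrete, here is a toy sketch of the predictive-processing loop. Everything in it (the function name, the learning rate, the fake sensory stream) is my own illustrative assumption, not a model from the literature: perception is treated as an internal guess that gets nudged by prediction error, rather than rebuilt from raw sense-data each moment.

```python
# Toy predictive processing: the internal model is updated only by the
# *error* between prediction and observation, not by raw sense-data.
# All names and constants here are illustrative, not from any paper.

def perceive(sensory_stream, learning_rate=0.5):
    """Return the internal estimates formed while watching a sensory stream."""
    belief = 0.0            # the brain's current best guess about the world
    estimates = []
    for observation in sensory_stream:
        error = observation - belief      # prediction error: the surprise
        belief += learning_rate * error   # nudge the guess toward the data
        estimates.append(belief)
    return estimates

# A constant world observed through slightly noisy samples:
stream = [1.0, 0.9, 1.1, 1.0, 1.0, 0.95, 1.05, 1.0]
estimates = perceive(stream)
print(round(estimates[-1], 2))  # the belief settles near the true value, 1.0
```

The point of the caricature: the loop never stores the stream itself, only a running guess, which is the sense in which experience is “built of ideas.”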
One example from neuroscience of our constructed reality is saccades—rapid eye movements that constantly shift our gaze. These movements are essential for our ability to process visual information because (as mentioned) we physically don’t have the neural bandwidth to take in our whole visual field without heavy compression7. Our brain controls where our eyes are pointed through a mix of unconscious neural control systems tracking our head, body, and eye position. Random saccades, even smaller microsaccades, and high-level task-driven signals continuously, and mostly unconsciously, determine our attention. All of the high-acuity visual data we get (e.g. detail and color) comes via a shockingly narrow field of view (a few degrees of arc); the rest we fill in with prior knowledge and expectations (memories and saccades) and much more limited peripheral visual data.
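As a rough illustration of how little of the field gets sampled at high acuity, here is a hypothetical one-dimensional sketch. The field size, fovea width, and peripheral sampling rate are all made-up numbers chosen for readability, not physiological values:

```python
# Toy foveated sampling: full resolution only in a narrow "fovea" around
# the gaze point; the periphery is sampled sparsely. The sizes and the
# downsampling factor below are invented for illustration.

def foveate(scene, gaze, fovea_radius=2, periphery_step=8):
    """Sample a 1-D 'scene' the way a saccading eye might: full detail
    near the gaze point, sparse samples everywhere else."""
    fovea = scene[max(0, gaze - fovea_radius): gaze + fovea_radius + 1]
    periphery = scene[::periphery_step]   # coarse, low-acuity samples
    return fovea, periphery

scene = list(range(100))        # stand-in for a 100-"pixel" visual field
fovea, periphery = foveate(scene, gaze=50)
kept = len(fovea) + len(periphery)
print(f"kept {kept} of {len(scene)} samples")  # most of the field is never sampled
```

Each saccade moves `gaze`, and the brain stitches the successive foveal patches (plus expectations) into what feels like a seamless, fully detailed scene.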
Moreover, this motif recurs throughout the perceptual systems of most animals. For example, the scene analysis paper discusses jumping spiders, which have several pairs of eyes of distinct kinds. One pair provides detail, and provides it through a one-dimensional vertical slit. The spider uses these eyes to scan selected areas of its visual field horizontally, like a crazed dot-matrix printer. Like us, the spider then assembles these samples into a three-dimensional representation of the world—the only way to explain its complex behaviors. Unlike us, its visual sensing is performed by three distinct kinds of eyes, each specializing in certain kinds of information.
Bats have even more impressive autonomic systems in place to operate their neural sonar. Not only do they generate (encode) and process (decode) various sophisticated sonar signals, they focus sonic energy at targets of interest and actively adjust their sonar transmission frequencies away from those of nearby bats to reduce signal interference.
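That jamming-avoidance behavior can be caricatured as a tiny optimization: emit at the frequency farthest from every neighbor’s. This sketch is purely illustrative; the band limits, step size, and neighbor values are invented, and real bats solve this with continuous adaptive control, not a grid search:

```python
# Caricature of a bat's jamming avoidance: choose a sonar frequency (kHz)
# that maximizes the distance to every neighboring bat's frequency.
# Band limits, step size, and neighbor values are invented for illustration.

def pick_frequency(neighbor_freqs, band=(25.0, 50.0), step=0.5):
    """Grid-search the band for the frequency farthest from all neighbors."""
    n_steps = int((band[1] - band[0]) / step) + 1
    candidates = [band[0] + i * step for i in range(n_steps)]
    # Score each candidate by its distance to the *nearest* neighbor signal,
    # then take the candidate whose nearest neighbor is farthest away.
    return max(candidates, key=lambda f: min(abs(f - n) for n in neighbor_freqs))

chosen = pick_frequency([30.0, 40.0])
print(chosen)  # with neighbors at 30 and 40, the top of the band wins
```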
Meanwhile, in neuroscience land, many of our most famous computational vision models bear little resemblance to these attention-driven biological systems. Consider the incredibly successful ResNet image classifier. It reads entire images, breaks them down into spatial chunks, then assembles those chunks back together into increasingly complex features for grouping and labeling. And yes, there is ample evidence for our visual cortex doing at least highly similar spatio-temporal filtering8 and layered synthesis of these signals. However, ResNet processes the entire image, every time. It has no concept of three-dimensional space other than what it has memorized, no representation of what is most important, and no way to direct its attention to (disregard or prioritize) parts of an image. So, while pieces of ResNet resemble some pieces of biological brains, its overall computational architecture doesn’t really trace back to any biological system. The success of ResNet as a model has much more to do with its accomplishments as a technology than its ability to explain actual biological neural architectures or animal behavior9—the more fundamental aim of science.
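For contrast with the biological systems above, here is a minimal sketch of the residual connection at ResNet’s core, using NumPy. The shapes and the single toy layer are my own simplification, not the real architecture:

```python
# Minimal sketch of the residual idea behind ResNet: each block computes a
# small correction f(x) and adds it back to its input, y = x + f(x).
# One fixed random "layer" stands in for a learned convolutional block.

import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, weight):
    """y = x + f(x): the block only has to learn the residual correction."""
    f_x = np.maximum(0.0, x @ weight)   # toy 'layer': linear map + ReLU
    return x + f_x                      # identity shortcut around the layer

x = rng.normal(size=(4, 8))             # tiny stand-in for image features
w = rng.normal(size=(8, 8)) * 0.1
y = residual_block(x, w)

# The key architectural point from the text: the block consumes the *entire*
# input tensor every time; there is no gaze, fovea, or attention mechanism.
print(y.shape)  # same shape in and out, so blocks can be stacked deeply
```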
Yet scientists understandably have the urge to apply ResNet when seeking to understand the biological brain, because its impressive performance might indeed reveal things that get us closer to understanding neurobiology. But because its overall architecture resonates with our subjective experience of perception, we may also have a natural bias to trust it more readily than we should. The human vision system makes us feel very connected to the world. It feels like our eyes are taking it all in, like a camera taking pictures or video. Therefore, it seems like the mind might, at some stage, process images the way ResNet does. It is tempting to dismiss the unconscious mechanisms of eye movement as boring biological engineering details—a distraction from the really interesting stuff that makes up human perception and thought—when the very opposite (or neither!) could be true.
Back to idealism and materialism. While we are not literally living in a Berkeleyan world of ideas, the vast bulk of our perceptive reality is a somewhat Berkeleyan mental construct. Our brain has made us a personal holodeck10, very much tuned for evolutionary survival. Discarding this bias of objectivity at first leaves us with even more mysteries about how minds work. However, a major thrust of the scene analysis paper is that scientists should design experiments that account for the complexities of animal perception. What this means in practice is stepping away from experiments with “pure” dots and lines toward ones in ecologically relevant conditions. While these experiments are more difficult to perform, they end up being more productive, because the minds we are studying are highly tuned for survival in their specific environments. Doing so puts scientific observation more directly in the driver’s seat, and gives us a chance of seeing ourselves in a way that pierces the veil of our own controlled hallucinations.
-
Star Trek (TNG) not-so-famously introduced a character “Barclay” who was obsessed with spending time in the holodeck—a ship system that could create rich virtual environments. The character name might be a subtle nod to Berkeley or just a coincidence. ↩︎
-
Lewicki MS, Olshausen BA, Surlykke A, and Moss CF. (2014) Scene analysis in the natural environment. Frontiers in Psychology 5:199. doi:10.3389/fpsyg.2014.00199 ↩︎
-
Granted, it has been a bit over 10 years since this paper was published… 😂 ↩︎
-
There is an epistemological theory based on this idea of direct access to reality called direct realism: https://en.wikipedia.org/wiki/Direct_and_indirect_realism ↩︎
-
Just one example: Frith, C. D. (2007). Making Up the Mind: How the Brain Creates Our Mental World. Blackwell Publishing. ↩︎
-
I am reading From Neuron to Brain, and the introductory chapters outline how the limits of physical chemistry drive neurons to work as locally as possible, where molecular processes can occur much more efficiently than generating expensive electrical pulse trains—especially over longer distances. ↩︎
-
Saccades also compensate for our eyes acting as a kind of high-pass filter, suppressing visual signals that aren’t moving. ↩︎
-
This is really just one of many papers: Field DJ. (1987) Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A 4(12):2379–2394. doi:10.1364/josaa.4.002379 ↩︎
-
There is a really interesting paper11 that attempts to do this with a pretty famous Google-developed neural network that broke Internet CAPTCHAs, which I am probably going to write about soon. ↩︎
-
See Footnote 1. ↩︎
-
George D, Lázaro-Gredilla M, Lehrach W, Dedieu A, Zhou G, and Marino J. (2025) A detailed theory of thalamic and cortical microcircuits for predictive visual inference. Sci. Adv. 11, eadr6698. doi:10.1126/sciadv.adr6698 ↩︎