When we talk about the challenges facing artificial intelligence today, we often point to well-known AI blunders. Take, for instance, image classifiers that misidentify objects: an AI might see a banana but label it a toaster simply because a peculiar sticker has been placed on it. Or consider the unfortunate incidents involving self-driving cars that have been attributed to models misinterpreting sensory data, such as failing to distinguish a white truck against a bright sky. These errors are not just isolated incidents; they shed light on the underlying shortcomings of our current AI models. What makes them more serious is that such seemingly innocuous mistakes can escalate into life-threatening scenarios, raising ethical concerns.
Broadly speaking, there are two questions we can ask to understand these shortcomings.
- Is the mistake a result of the model’s over-reliance on patterns in its training data? This framing suggests the issue lies in data representation or in the limits of pattern recognition.
- Alternatively, could the problem be attributed to the AI’s lack of genuine comprehension or foundational reasoning about the objects and situations it encounters, or a lack of what we often term “common sense”?
Intriguingly, these questions align with cognitive theories about human intelligence. Renowned psychologist Daniel Kahneman’s theory of System 1 and System 2 provides a fitting lens through which we can examine these ideas. According to Kahneman, our brain leverages two systems: System 1, which offers quick, instinctual responses, and System 2, which facilitates deeper, more deliberate thought processes.
Kahneman himself pointed out the potential parallels between his theory and artificial intelligence. The first question above (the pattern-recognition-centric viewpoint) aligns well with System 1 thinking. This perspective is embodied in the workings of neural networks, which make decisions based on patterns ingrained during their training.
In contrast, the second question resonates with System 2 thinking, representing a more deliberate and reasoning-based approach to problem-solving. This is represented by what’s called symbolic AI, which might sound unfamiliar and even arcane to some. Symbolic AI seeks to embed explicit knowledge and reasoning capabilities into machines, ensuring they don’t just recognize patterns, but also understand the “why” behind them, mirroring the intricate, nuanced understanding humans possess. (Note: The brain is far more complex than this dichotomy suggests, and its workings are influenced by a myriad of factors beyond Systems 1 and 2.)
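To make the contrast concrete, here is a minimal sketch (in Python) of what symbolic, rule-based reasoning can look like: explicit facts and if-then rules, with new conclusions derived by forward chaining. The facts and rules below are invented for illustration; real symbolic systems are far richer.

```python
# A minimal, hypothetical sketch of symbolic reasoning: explicit facts and
# if-then rules, with conclusions derived by forward chaining.
# The specific facts and rules are illustrative, not from any real system.

facts = {"has_sticker(banana)", "is_fruit(banana)"}

# Each rule maps a set of premises to a conclusion.
rules = [
    ({"is_fruit(banana)"}, "is_edible(banana)"),
    ({"is_edible(banana)", "has_sticker(banana)"}, "still_a_banana(banana)"),
]

def forward_chain(facts, rules):
    """Repeatedly apply rules until no new facts can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print(forward_chain(facts, rules))
# The system concludes "still_a_banana(banana)" by explicit inference,
# no matter how unusual the sticker looks at the pixel level.
```

The point is not that such toy rules solve perception, but that every conclusion can be traced back to explicit knowledge, which is the “why” that pure pattern matching lacks.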
Today, the term “neural networks” has become nearly synonymous with AI, as most of the AI products and services we see are powered by this technology. When people discuss AI, what they’re really referring to is neural networks. Yet what’s often overlooked is that the rise of neural networks, and deep learning in particular, is a relatively recent phenomenon. There was a period when symbolic AI was at the forefront of AI research and application, something few people, including many in the industry today, seem to recall.
A Little Bit of History
From its inception, AI has oscillated between periods of immense optimism and challenging skepticism — these cyclical patterns are often referred to as AI’s “summers” and “winters”. When Alan Turing proposed the idea that machines could potentially imitate human intelligence, he laid the foundation for both symbolic and connectionist schools of thought. (For our purposes, we can consider “connectionist” as an older term for neural networks.)
In the initial days of AI in the 1950s and early 1960s, symbolic AI and early forms of connectionism emerged almost at the same time. Symbolic AI, with proponents like Marvin Minsky and John McCarthy, envisioned that human intelligence could be mirrored through precise rules and logic. This led to the creation of the first knowledge-based systems, rule-driven engines that attempted to emulate human reasoning. However, representing the complexities and nuances of human knowledge proved a monumental challenge, one that would become a recurring source of setbacks for the symbolists in the decades that followed.
On the other hand, there was growing interest in connectionism too. The perceptron, an early neural network model introduced by Frank Rosenblatt in the late 1950s, promised to learn from data rather than relying on hard-coded rules. While the perceptron marked an exciting development, its limitations, most famously its inability to learn functions that are not linearly separable (such as XOR) and to scale to real-world applications, led to skepticism about the potential of neural networks. The initial excitement about unlocking human intelligence waned, which led to the first “AI winter.”
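For readers who have never seen one, the perceptron’s learning rule is short enough to sketch in a few lines. The toy data, learning rate, and epoch count below are illustrative choices, not details from the original 1950s work.

```python
# A compact sketch of the classic perceptron learning rule on a toy,
# linearly separable problem (logical OR). All hyperparameters are
# illustrative choices.

def predict(weights, bias, x):
    """Threshold activation: output 1 if the weighted sum exceeds zero."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0

def train_perceptron(samples, epochs=10, lr=0.1):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            # Rosenblatt's update: nudge weights toward the correct answer.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Logical OR is linearly separable, so the perceptron can learn it.
or_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = train_perceptron(or_data)
print([predict(w, b, x) for x, _ in or_data])  # -> [0, 1, 1, 1]

# XOR, by contrast, is not linearly separable; a single perceptron cannot
# learn it, which is the limitation Minsky and Papert famously highlighted.
```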
Throughout the 1970s, the field of AI navigated through a phase of introspection. The early challenges faced by both symbolic and connectionist approaches meant that researchers were searching for new paths and methods. Funding for AI research became more selective, emphasizing practical applications. But while there were fewer headline-making breakthroughs, many researchers laid foundational groundwork, setting the stage for the innovations and resurgence that the 1980s would bring.
By the 1980s, symbolic AI experienced a brief resurgence with expert systems, which aimed to emulate human expertise in specific domains through extensive rule-based structures. While some view this period as leading to a second AI winter, owing to expert systems’ lack of adaptability and generalization, it is worth highlighting that every AI focus area of the era produced some form of practical development, even when the loftier aspirations remained elusive. And it’s crucial to recognize the significant advances made during this time, such as progress in compilers and databases, which played a role in Sebastian Thrun’s autonomous driving code too.
In the meantime, researchers like Geoffrey Hinton, Yann LeCun, and Yoshua Bengio remained committed to the connectionist approach, despite facing a lack of mainstream acceptance. The 1980s and 1990s were particularly challenging for them as they navigated an academic landscape where neural network research was often marginalized and underfunded. Hinton recalls those years as the “dark ages” for connectionism, with only a handful of researchers persisting in the field. LeCun, now a VP and Chief AI Scientist at Meta, faced periods where he struggled to get his research papers on neural networks accepted at major conferences, a stark contrast to the recognition and influence they command today.
In many ways, their current standing is a testament to their perseverance. With the surge in computational power and the influx of datasets in the late 2000s, the landscape shifted. The deep learning era of the 2010s, powered by the foundational work of these connectionist pioneers, achieved significant milestones in tasks like image and speech recognition. Connectionism, once considered the underdog, became the preferred method for a majority of developers and researchers.
The Status Quo
However, as the 21st century unfolds, the constraints of relying solely on deep learning are becoming apparent. There’s a growing recognition that while neural networks are exceptional at pattern recognition, they lack explicit reasoning. As a result, challenges such as data dependency, the “black box” problem (lack of interpretability), environmental overheads, overfitting, hallucinations, and weak commonsense reasoning have come to the fore. Proponents of deep learning argue that these challenges can be overcome with refined architectures and improved training methods.
Considering the gravity of some of these issues, it would be wise to explore every solution at our disposal. This is reigniting interest in a combined approach that merges the symbolic and connectionist paradigms, paving the way for a hybrid domain known as “neuro-symbolic AI,” an umbrella term for the wide variety of strategies researchers are using to get the best of both the neural and symbolic worlds. One such approach is MIT’s Probabilistic Computing Project, where we use probabilistic programs to manage uncertainty within a neuro-symbolic framework, as we have outlined in another blog post.
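The project’s actual tools are beyond the scope of this post, but the core intuition can be sketched in a few lines: treat a neural recognizer’s output as noisy evidence and combine it with explicit prior knowledge using Bayes’ rule. The “detector” and every number below are made up for illustration; this is a sketch of the idea, not the project’s methods.

```python
# A toy illustration (not MIT's actual tooling) of the probabilistic idea:
# treat a pattern recognizer's report as uncertain evidence and combine it
# with explicit prior knowledge via Bayes' rule. All numbers are invented.

# Prior knowledge: toasters rarely sit in fruit bowls.
prior = {"banana": 0.90, "toaster": 0.10}

# Likelihood of the (hypothetical) neural detector reporting "toaster"
# given the true object, e.g. estimated from how often stickers fool it.
likelihood_report_toaster = {"banana": 0.30, "toaster": 0.95}

def posterior(prior, likelihood):
    """Bayes' rule: P(object | report) is proportional to
    P(report | object) * P(object)."""
    unnormalized = {obj: likelihood[obj] * p for obj, p in prior.items()}
    total = sum(unnormalized.values())
    return {obj: v / total for obj, v in unnormalized.items()}

print(posterior(prior, likelihood_report_toaster))
# Even after the detector shouts "toaster", the posterior still favors
# "banana" (about 0.74), because explicit prior knowledge tempers the
# noisy neural evidence.
```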
It is worth noting that probabilistic (or Bayesian) methods were already prominent in the 1990s, when neural networks were not as popular as they are now. Leading AI researchers like Judea Pearl and Stuart Russell explored this field, and Microsoft’s Clippy was based on Bayesian networks for its first 5-10 years. This suggests our journey in understanding intelligence (whether artificial or natural) is far from over.
Even though neural networks have become the default, almost “common sense” approach, it’s critical to maintain an open-minded perspective on the future of AI. As we have seen, historical trajectories suggest we’re still in the early phases of AI, with the pendulum continually swinging between methodologies. As challenges become more pronounced and more intertwined with our daily lives and societal contexts, technologists will persist in seeking solutions, and the prevailing approaches to AI will keep evolving with them. So we might well imagine a future where AI systems are both intuitive and logical, skilled at absorbing vast datasets while also able to explain their decisions.
Joseph Park is the Content Lead at DAL (joseph@dalab.xyz)
Research: Lulu Ito & Junpei Fukuda
Illustration: Satoshi Hashimoto
Edits: Janine Liberty