Peter Gärdenfors
Artificial intelligence (AI) has undoubtedly made remarkable progress in recent years, demonstrating its ability to process vast amounts of data and perform complex tasks. However, it is crucial to recognize that AI lacks a fundamental quality that humans possess: judgment.
Aristotle distinguishes between three types of knowledge: episteme, techne and phronesis. Episteme is theoretical knowledge of how humans and their environment work; it is the form of knowledge most commonly taught in schools and universities. Techne comprises skills and abilities and concerns how things are done: knowing a craft is a good example, and technology is a later outgrowth of this kind of knowledge. Phronesis is the judgmental knowledge that enables one to make informed decisions in concrete situations; it can be translated as practical wisdom or good judgment.
AI provides us with tools to create new knowledge. The question is what kind of knowledge AI gives us. To answer it, we must first establish what AI can do. One of the earliest areas of AI research was games that require human intelligence. For example, the chess program Deep Blue beat the then world champion Garry Kasparov in 1997, and game-playing programs have since evolved to a level far beyond human ability. They are, however, highly specialized and cannot be modified to solve other tasks.
Game-playing programs are more about problem solving than about creating new knowledge. AI programs that do provide new knowledge fall under episteme. Recently, the focus has been on the programming method known as deep learning, in which an artificial neural network with many layers (hence 'deep') is trained on large amounts of data to recognize different types of patterns, for example faces. Such training can make a program categorize or identify better than humans do.
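To make this concrete, here is a minimal sketch in Python (not any of the systems mentioned above; all details are invented for illustration) of what it means to train a network with more than one layer of weights. It learns the toy XOR pattern, which no single-layer network can represent:

```python
# A toy two-layer network trained by gradient descent to recognize a pattern.
# The "pattern" here is the XOR function of two inputs.
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: inputs and the XOR pattern to be recognized.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Two layers of weights: the "depth" that gives deep learning its name.
W1 = rng.normal(0.0, 1.0, (2, 8))
W2 = rng.normal(0.0, 1.0, (8, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(5000):
    # Forward pass: each layer re-represents the input.
    h = sigmoid(X @ W1)    # hidden-layer activations
    out = sigmoid(h @ W2)  # the network's guess at the pattern

    # Backward pass: gradient of the cross-entropy loss for a sigmoid output.
    delta2 = out - y
    delta1 = (delta2 @ W2.T) * h * (1.0 - h)
    W2 -= lr * (h.T @ delta2)
    W1 -= lr * (X.T @ delta1)

print(np.round(sigmoid(sigmoid(X @ W1) @ W2), 2))  # close to [0, 1, 1, 0]
```

Deep learning systems used in practice follow the same recipe at vastly larger scale: more layers, billions of weights and enormous training sets.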
Deep learning has been used successfully in medicine to interpret X-ray images. The AlphaFold program has shown that it can predict the three-dimensional structure of a protein from its sequence of amino acids in almost 90% of cases. Such tasks were previously very difficult for researchers to solve, and the AI program opens up great opportunities for medical research and the pharmaceutical industry.
Similar methods are used in speech recognition, for example by Siri and Alexa. Automatic translation programs have mainly been based on statistical methods, but the newer large language models, such as the recent GPT series, which automatically generate text, are also based on deep learning. The programs are enormous (they contain hundreds of billions of parameters) and are trained on huge amounts of text from the Internet. The large language models generate good texts, but they are typically judged to lack creativity. The models live only within computers and have no way of acting in the real world; they have no bodies and thereby no bodily experience.
The success of different types of AI programs makes it easy to forget what they cannot do. The performance of AI systems depends heavily on the datasets they are trained on, and when an exceptional case arises, they have no way of adapting. A striking example occurred when Amazon's AI inventory system faltered under the unexpected surge in orders for toilet paper and face masks at the onset of the COVID-19 pandemic. The programs cannot find solutions to new types of problems. Here, humans are clearly superior to AI systems: if we encounter a completely new problem, we can often come up with a somewhat reasonable solution, even if it is not optimal.
Training data can also distort results. AI programs have been used in courts to adjudicate certain routine cases, such as traffic violations. Such a program is trained on a large number of previous cases and learns which factors are relevant and which sentences should be handed down. However, it turned out that one such program sentenced black people more harshly than white people, because the previous cases carried the same bias. It is therefore very important to use appropriate training data in order to obtain fair assessments from programs that determine who should receive, for example, health care benefits, social assistance or bank loans.
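The mechanism behind such bias can be shown with a deliberately synthetic sketch (all features, thresholds and numbers here are invented): a standard classifier is trained on biased 'historical' decisions and then asked to judge two otherwise identical cases.

```python
# Synthetic illustration of a model inheriting bias from its training data.
# "Past rulings" below penalize group 1: its members need a higher merit
# score to receive a favourable outcome, and the model learns that pattern.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, size=n)     # a protected attribute (0 or 1)
merit = rng.normal(0.0, 1.0, size=n)   # the legitimately relevant feature

# Biased historical outcomes: group 1 faces a stricter threshold.
favourable = (merit > np.where(group == 1, 0.5, -0.5)).astype(int)

model = LogisticRegression().fit(np.column_stack([group, merit]), favourable)

# Identical merit, different group membership -> very different predictions.
print(model.predict_proba([[0, 0.0]])[0, 1])  # group 0: high probability
print(model.predict_proba([[1, 0.0]])[0, 1])  # group 1: much lower
```

Statistically, the model has done nothing wrong: it has faithfully learned the pattern in its training data, which is exactly the problem.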
Another example is programs that complete an image when they are shown only part of it. An American study shows that if such a program is shown an image of a man's head, it completes the image to a man in a suit in 43% of cases. If it is instead shown an image of a woman's head, even one as famous as Congresswoman Alexandria Ocasio-Cortez, the program creates an image of a woman wearing a low-cut top or a bikini 53% of the time. These distortions arise because the training data reflect the images available on the internet.
AI programs based on deep learning provide opaque knowledge because they cannot justify or explain their results. A person whose loan application is rejected by such a program, for example, cannot know the reason for the rejection. There is a movement in AI research that strives for 'explainable' AI, but it has not yet produced any concrete results of note.
Robots and AI are often confused, but they are two different fields. The word 'robot', introduced by the author Karel Čapek, comes from a Czech word for 'work'. A robot, then, is something that performs concrete actions. We have been fooled by science fiction movies into thinking that intelligent and capable robots exist. But the robots that exist in reality – industrial robots, robot vacuum cleaners and robot lawnmowers – have no techne. They cannot go outside the narrow range of actions they have been programmed for, nor can they explain why they behave the way they do. They are utterly stupid and, in most cases, clumsy. Creating robots that can independently solve new practical problems requires a great deal of technical and cognitive knowledge, and progress is slow. A skilled robot carpenter is a long way off.
To return to Aristotle's types of knowledge, AI can exhibit a certain level of episteme but the programs have no techne. What about phronesis, or judgment?
While it is possible to incorporate rule-based systems into AI, the challenge lies in determining the applicability of rules in individual cases. Professionals across various fields, including nurses, doctors, healthcare practitioners, social workers, lawyers, and teachers, understand that relying solely on a rulebook is inadequate. There are always borderline cases and unique combinations of circumstances that demand novel approaches. This is precisely when human judgment becomes indispensable.
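A tiny, entirely hypothetical sketch makes the limitation visible: rules handle the cases their authors anticipated and simply fall silent in the borderline zone.

```python
# Hypothetical rules and thresholds, invented for illustration only.
def benefit_decision(income: int, dependents: int) -> str:
    if income < 20_000 and dependents >= 2:
        return "approve"
    if income >= 40_000:
        return "reject"
    return "no rule applies"  # the borderline zone the rulebook never covers

print(benefit_decision(15_000, 3))  # -> approve
print(benefit_decision(30_000, 1))  # -> no rule applies: judgment is needed
```

Adding more rules only moves the borderline; it never removes it.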
To comprehend why AI lacks judgment, it is crucial to grasp the concept itself. Phronesis involves the wise resolution of problems in specific situations, striking a balance between different values based on the knowledge available about the situation. It does not imply a conflict between emotions and reason, but rather an intricate interplay between the two. AI systems, however, lack emotions and morality, rendering them incapable of assuming responsibility. Regardless of how extensive a rulebook may be, it is insufficient for understanding how to adapt values to specific circumstances. Professionals with judgment possess the ability to do so, even though the exact mechanisms of their decision-making remain elusive. Unfortunately, the concept of judgment receives relatively little attention in the humanities and social sciences.
Judgment will also be necessary for future robots that interact with humans. Self-driving cars require judgment not only to determine whether a cyclist will turn left or whether a pedestrian is about to cross the street, but especially in situations where an accident is imminent, when decisions must be made about whether to prioritize the safety of the car's occupants or of individuals outside the vehicle.
In summary, AI systems possess a degree of episteme but lack both techne and phronesis. Consequently, these systems cannot currently function independently as decision-makers. Human involvement, particularly from individuals with experience, is necessary to handle exceptional cases that AI programs are not trained for and to prevent biased decision-making.
The upshot is that in order to create AI systems with broader capabilities, expertise in programming and technology alone is insufficient. Knowledge of human behavior, societies, and values is equally essential. Therefore, a multidisciplinary approach that combines technical expertise with insights from the humanities and social sciences is crucial for the advancement of AI and its responsible integration into various domains of society. Key research tasks should therefore be, first, to create a better understanding of what it means to have good judgment and, second, to study whether it is possible to develop AI programs that have this characteristic.
Peter Gärdenfors is professor of cognitive science at the University of Lund. His research areas include models of concept formation, semantics, their applications in robotics, and the evolution of thinking and language. He is the author of the influential book Conceptual Spaces: The Geometry of Thought, which offers a theory of conceptual representations as a bridge between the symbolic and connectionist approaches to modeling representations.
The Cognizer is a publishing platform initiated by CogIST, a cognitive science community from Turkey. It publishes articles and essays on a range of topics from the various fields of cognitive science, written to bridge the gap between the general audience and experts.