by Anna Johansson
Machine learning is amazing at many tasks. Our most advanced algorithms can process and understand data faster than any human ever could, and they can now beat our best players at some of the most complex games in existence. But even with all its inherent advantages, and even with a near-constant pace of development over the past couple of decades, machine learning is still glaringly bad at one crucial element: context.
Examples of the Context Problem
Machine learning algorithms can quickly process data, and can provide answers for some of our most complicated questions. But it’s much harder to create a program that can recognize the context for those questions and answers.
It’s easier to think about it in terms of examples:
The upside-down image. We have computer programs that can detect whether or not an image has been altered based on unexpected changes or patterns embedded in the image at a granular level. But those same algorithms find it hard to determine whether an image is upside-down—whereas a human would pick this up immediately.
Driving a car. Self-driving car technology uses many detection methods, including radar, LIDAR, and traditional video cameras. Even so, it can struggle to determine whether an object in front of the car is a human being or just a shrub with a similar outline. Flight controls, which are becoming increasingly important for consumer safety, don’t require the same attention to surroundings, but they still rely on massive levels of data processing.
Sarcasm. Though we’ve yet to experience this problem firsthand, natural language algorithms and chatbots have a tremendous amount of difficulty recognizing and responding to sarcasm, since they’re typically built to take the things we say at face value.
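To make the sarcasm example concrete, here is a minimal sketch of a program that takes words at face value. It is a toy word-counting scorer, not how any particular chatbot works; the word lists and the function name face_value_sentiment are assumptions made up for illustration.

```python
# Hypothetical, hand-picked word lists; a real system would learn these.
POSITIVE_WORDS = {"great", "love", "wonderful"}
NEGATIVE_WORDS = {"terrible", "hate", "awful"}


def face_value_sentiment(text: str) -> str:
    """Label text by counting positive and negative words, ignoring context."""
    # Lowercase each word and strip common punctuation from its edges.
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# A human hears the sarcasm here; the scorer only sees "great" and "love".
print(face_value_sentiment("Oh great, another flat tire. I love Mondays."))
```

Because the scorer has no notion of the situation being described (a flat tire is bad news), it labels the sarcastic sentence as positive.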
What We Can Do to Solve the Problem
So what can we do to solve the problem?
Multi-INT analysis. One of the most popular approaches to solving the problem is multiple-intelligence (Multi-INT) analysis. As the name suggests, this is a process that depends on forming conclusions based on inputs from multiple, distinct sources. A program would hypothetically be able to call on multiple different algorithms to determine different elements of a given situation, then use a “master” process to stitch them together into a coherent conclusion. That way, you don’t have to “teach” a machine how to recognize human faces. Instead, you can teach it to recognize eyes, ears, and noses, then indicate a face when all three are present.
Sub-problems (and new tools). We’re also addressing the problem of context by specializing in specific instances of the problem. Rather than tackling the challenge directly, we’re studying sub-problems and creating new tools that can be applied to broader contexts. For example, natural language processing is attempting to bring more intuitive dialogue to conversations between humans and machines; companies can then use existing natural language tools to enhance their own products, which may otherwise lack nuance.
More brain-like interfaces. Another solution lies in the path that led us to deep learning—an attempt to mimic the connections and activation patterns that already exist in the human neocortex. If the human brain has already mastered context, and we’re trying to bring machines up to our level, it makes sense that we’d take advantage of the same architecture. The idea here is to use a neural network, full of complex connections between different nodes that activate in response to different types of stimuli. Already, this type of algorithm has led to machines making moves and decisions that seem like a leap in intuition, so it may lead us to even more advancement in the future.
Results-oriented training. Finally, we should stop emphasizing the process of “teaching” a machine how to recognize a pattern the conventional way, with tons of upfront data or rules that dictate what constitutes a positive result. Instead, we should allow machines to learn on their own, letting them make mistakes and correcting them as necessary until they develop their own rubrics for evaluating a situation.
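The Multi-INT idea described above, with separate detectors for eyes, ears, and noses feeding a “master” process, can be sketched in a few lines. This is only an illustration under loud assumptions: the detectors here are hypothetical stand-ins that check for a label in a set, where real ones would be trained models working on pixels, and every name in the code is invented for the example.

```python
from typing import Callable, Dict, Set

# A detector takes the set of features observed in a scene and reports
# whether its part is present. Hypothetical stand-in for a trained model.
Detector = Callable[[Set[str]], bool]


def make_label_detector(label: str) -> Detector:
    """Build a stand-in detector that fires when `label` was observed."""
    def detect(observed: Set[str]) -> bool:
        return label in observed
    return detect


# One independent "source" per facial part.
PART_DETECTORS: Dict[str, Detector] = {
    "eyes": make_label_detector("eyes"),
    "ears": make_label_detector("ears"),
    "nose": make_label_detector("nose"),
}


def detect_parts(observed: Set[str]) -> Dict[str, bool]:
    """Run every part detector and collect the individual verdicts."""
    return {name: detector(observed) for name, detector in PART_DETECTORS.items()}


def is_face(observed: Set[str]) -> bool:
    """The "master" process: indicate a face only when all parts are present."""
    return all(detect_parts(observed).values())


print(is_face({"eyes", "ears", "nose"}))  # all three parts found
print(is_face({"eyes", "nose"}))          # ears missing, so no face
```

The design choice worth noticing is that the master process never looks at the raw input; it only combines the verdicts of the specialized detectors, so each detector can be improved or replaced on its own.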
Do We Even Need Context?
Do we even need context to maintain our rate of progress with machine learning? The short answer is no—at least not for the time being. For example, it doesn’t matter if a machine learning algorithm “knows” that an image is upside-down if it’s looking for other information; it only becomes a problem if we need the algorithm to tell us whether the image is upside-down.
Humans excel at context because our brains are naturally programmed for pattern recognition. We don’t really need machines to step in and tell us basic things about our environments, at least not yet.
For the foreseeable future, our best tools will be ones that complement or enhance the human brain, rather than replacing it, so context will continue to be a secondary problem.