Deep learning is not really the future of AI

Deep learning is mentioned everywhere these days. People talk about it as if it were the future of AI. Oftentimes, it is even treated as the definition of AI.

First things first: no. AI is not deep learning. Deep learning is a special case of neural networks, which are just one of the many machine learning models out there. Machine learning itself is just one branch of AI.

So, what is the "future" we expect? Let's say we imagine a future with robots and humanoids that have human-level intelligence. I am not even sure we have settled on a definition of "intelligence", and I will not say too much here, as I am not into the philosophical side of AI. But let's just say that "human-level" means the agents can solve general tasks and make decisions, just like humans do. Can deep learning achieve that?

In its current state, no. What do we have now? A model that can "learn" from data. We can see the model as a machine with a bunch of configurations and internal states. The machine accepts inputs, processes them, and spits out an answer. For example, when we give it a picture of a cat, it will say "it is a cat!" We usually provide it with a bunch of examples to learn from: input samples and the corresponding (assumed-to-be) correct answers. The model aims to find the configuration that makes its prediction for an input as close as possible to the correct answer.
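To make the "machine with a bunch of configurations" picture concrete, here is a minimal sketch in Python. The sizes, labels, and random inputs are made-up illustrations, not a real cat detector:

```python
import numpy as np

# The "machine": its whole configuration is one weight matrix and one bias vector.
rng = np.random.default_rng(0)
weights = rng.normal(size=(3072, 2))   # a 32x32 RGB image flattened into 3072 numbers
bias = np.zeros(2)                     # two possible answers: "cat" or "not a cat"

def predict(image_pixels):
    """Accept an input, process it, and spit out an answer."""
    scores = image_pixels @ weights + bias   # one simple transformation of the input
    return "it is a cat!" if scores.argmax() == 0 else "not a cat"

fake_image = rng.random(3072)   # stand-in for a real picture
print(predict(fake_image))      # with a random configuration, this is just a guess
```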

Initially, we start with a stupid model: it produces garbage predictions. Fortunately, we also provide the correct answers, so the model knows how wrong its predictions are compared to the actual ones. The prediction quality is measured by some "stupidity score". The score is high in the beginning, which means the model is still dumb and its configuration is purely random. Based on that stupidity score, the model then updates its internal configuration so that its predictions in the next iterations get better. In other words, the stupidity score gets lower as the model keeps making predictions and updating its configuration, iteration after iteration. Finally, once we are confident enough in the model, we hope it performs well on unseen data in the wild.
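As a rough sketch of that loop, here is a toy single-layer model trained on made-up data. It is not a deep network, but it shows the same predict, score, and update cycle:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))                  # 200 made-up input samples, 5 features each
true_w = np.array([1.5, -2.0, 0.5, 0.0, 3.0])
y = (X @ true_w > 0).astype(float)             # the "assumed-to-be correct" answers

w = rng.normal(size=5)                         # start with a randomly configured (stupid) model

for step in range(500):
    pred = 1 / (1 + np.exp(-(X @ w)))          # current predictions
    stupidity = -np.mean(y * np.log(pred + 1e-9)
                         + (1 - y) * np.log(1 - pred + 1e-9))  # the "stupidity score" (cross-entropy loss)
    gradient = X.T @ (pred - y) / len(y)       # how to nudge the configuration
    w -= 0.1 * gradient                        # update so the next iteration is less stupid
    if step % 100 == 0:
        print(f"step {step}: stupidity score = {stupidity:.3f}")
```

Real deep learning frameworks automate the gradient computation and stack many more layers, but the loop is the same idea: predict, measure the stupidity score, update, repeat.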

Inability to explain its own decisions

If you have ever watched the anime Psycho-Pass (No? What are you doing? Go watch it!), you will have seen an example of an idealized AI-assisted law enforcement system. A group of police officers is equipped with an AI-powered gun called the Dominator.

When the Dominator is aimed at a target, it continuously reads the individual's psychological data ‒ their Psycho-Pass ‒ and sends it to the Sibyl System, which calculates their Crime Coefficient. When this value exceeds a certain level, indicating that the target is mentally unstable and likely to commit a violent crime, the gun becomes operable. If the value stays below that level, the muzzle will not open and a safety device prevents the user from pulling the trigger. ...

Simply put, the officers rely on the AI to determine a suspect's crime level. The higher the level, the more lethal the shot. The predictions are highly accurate, and the officers seem to trust them. Pretty neat, isn't it?

Is there any chance of adopting such AI-assisted decision making with the current state of deep learning? Reading a person's physiological signals and fitting a model that predicts a crime level is not the hard part; researchers already use physiological data for many purposes. The problem is that existing deep learning models are bad at explaining their own decisions.

Under the hood, a deep learning model works by applying a sequence of transformations (convolutions, matrix multiplications, and other arithmetic operations) to the input data to produce the predicted output. A deep learning model may contain thousands to billions of individual numbers, and the transformations are parameterized by those numbers. Unfortunately, we cannot directly interpret them. Neither can the model. What we can rely on for now is the model's low stupidity score, which doesn't say much about the reasoning behind a decision. The score only tells us how well the model predicted the previous data.
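To give a feel for what "a sequence of transformations parameterized by a pile of numbers" looks like, here is a sketch of a tiny fully connected network. The layer sizes are arbitrary assumptions; real models follow the same structure at a far larger scale:

```python
import numpy as np

rng = np.random.default_rng(2)

# The model's entire "knowledge": a sequence of weight matrices.
layers = [rng.normal(size=(3072, 512)),
          rng.normal(size=(512, 128)),
          rng.normal(size=(128, 2))]

def forward(x):
    for W in layers[:-1]:
        x = np.maximum(x @ W, 0.0)   # matrix multiplication followed by a simple nonlinearity
    return x @ layers[-1]            # final transformation produces the prediction scores

total_parameters = sum(W.size for W in layers)
print(f"{total_parameters:,} individual numbers in this toy model's configuration")
print(layers[0][:2, :3])   # a peek at the raw configuration: it is just numbers,
                           # with no human-readable story about "why" a decision was made
```

Even for this toy network, the configuration is roughly 1.6 million raw numbers, and inspecting them tells us nothing about why a particular input was classified one way or another.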

I doubt we can expect much from an AI that cannot explain its own decisions, especially if we are talking about people's lives. Imagine an AI judge in a court accusing you of being a level-5 criminal, and when you ask why, it says, "I don't know why. What I know is that I am good enough at predicting previous people's crime levels. Thus, you should trust me."

Very specific to one or a few tasks

With the current learning scheme, we only get a very specific model. When we train a model to recognize objects in images, it will work solely on that object recognition task. It is pretty much useless if we feed it the emails in our inbox and ask it to distinguish which ones are spam and which ones aren't.

I understand that there is a training variant called multi-task learning, which aims to solve several tasks with a single model. Self-driving cars use this type of model to detect, for example, moving objects, road signs, and traffic light colors at the same time, and it can accept multiple types of input. However, the tasks have to be closely related. A self-driving car's AI can only handle driving- and traffic-related tasks; it would have a hard time predicting the weather over the next few hours or future stock market prices.
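Here is a sketch of how such a multi-task model is commonly structured: one shared trunk that processes the input, plus one small head per task. The task names and sizes below are illustrative assumptions, not taken from any real self-driving system:

```python
import numpy as np

rng = np.random.default_rng(3)

# One shared "trunk" that turns the camera input into a common representation...
shared_trunk = rng.normal(size=(3072, 256))

# ...and one small "head" per task. The tasks must be close enough
# that the shared representation is useful for all of them.
heads = {
    "moving_object": rng.normal(size=(256, 4)),   # e.g. car / pedestrian / cyclist / none
    "road_sign":     rng.normal(size=(256, 10)),  # e.g. 10 sign categories
    "traffic_light": rng.normal(size=(256, 3)),   # red / yellow / green
}

def predict_all(camera_frame):
    features = np.maximum(camera_frame @ shared_trunk, 0.0)   # shared computation
    return {task: int((features @ head).argmax()) for task, head in heads.items()}

print(predict_all(rng.random(3072)))   # three related answers from one model; there is
                                       # no head for weather or stock prices
```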

But it is still useful

Despite these limitations at the current stage, deep learning in practice still works incredibly well in several domains. To name a few: image classification, speech recognition, machine translation, and recommendation systems.

Nonetheless, if you think this kind of technology is the future you are talking about, then the future is now. The journey still continues.