Note: This post is a Gemini space version of my post originally published on September 13, 2020
I used to think neural networks were a reasonably good technology, until I decided to try setting up a neural network myself. After some experimentation, I dramatically reversed my stance. I now believe neural networks should not be in charge of any decisions that could have a life-or-death impact. Find out why below.
Disclosure: These are my personal thoughts about neural networks based on my limited experience with them. I do not claim to be an expert in neural networks.
First I'm going to propose a definition for a neural network:
An interpolation system whose behavior becomes undefined when extrapolating beyond its training data.
In simple language, "undefined" may be thought of as "random by human standards". OK, how did this definition come about? It all started a year or two ago, when I set out to make a system for classifying fonts.
At the time, even Google Fonts had very few parameters for manually finding suitable fonts. The font categories that most font sites use (serif, sans-serif, handwriting, display) were, to me, almost useless for quickly picking a font for a website. Google Fonts did offer a few extra parameters, such as the width/narrowness of a font, but little more than that. The result was a lot of man-hours spent selecting a suitable font. I wanted to be able to ask a neural network for the qualitative features I was seeking in a font, and have it name a suitable font for me.
To this end, my pilot goal was to set up a neural network that could successfully distinguish the simplest of cases: serif versus sans-serif fonts. Then I'd move on to other font parameters.
But I'd never worked with neural networks before! So before setting up a network that could handle font images, I wanted to make sure I could set up a simple neural network at all. For my experiments, I used the excellent open-source Fast Artificial Neural Network Library (FANN), which is designed for ease of use, installed on my local Linux computer.
I purposely set this up to be a simple experiment: I wasn't setting up the type of neural network that keeps adapting itself in an ongoing manner after its training; I set up a network that learns from a training set of data and then does not change itself further.
First, I set up training data where the input was a number x, and the desired output was x/2 + 1. I trained this on many (but not all) numbers from 1 to 100. Most humans would be able to spot this formula pretty easily.
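For anyone curious what that looks like in practice, here is a minimal sketch of that kind of setup using the FANN C API. The layer sizes, epoch counts, hold-out rule and file names are illustrative guesses to show the shape of the thing, not my exact settings:

```
/* train_half_plus_one.c - a sketch of the training setup described above.
 * Compile with something like: gcc train_half_plus_one.c -lfann -lm */
#include <stdio.h>
#include <fann.h>

int main(void)
{
    /* Write a FANN-format training file: a header line of
     * "num_pairs num_inputs num_outputs", then alternating
     * input and desired-output lines. */
    FILE *f = fopen("half_plus_one.data", "w");
    int x, n = 0;
    for (x = 1; x <= 100; x++)
        if (x % 7 != 0)              /* hold some numbers back from training */
            n++;
    fprintf(f, "%d 1 1\n", n);
    for (x = 1; x <= 100; x++) {
        if (x % 7 == 0)
            continue;
        fprintf(f, "%d\n%f\n", x, x / 2.0 + 1.0);
    }
    fclose(f);

    /* A small fully connected network: 1 input, 8 hidden neurons, 1 output. */
    struct fann *ann = fann_create_standard(3, 1, 8, 1);
    fann_set_activation_function_hidden(ann, FANN_SIGMOID_SYMMETRIC);
    fann_set_activation_function_output(ann, FANN_LINEAR);

    /* Train for up to 5000 epochs, reporting every 500 epochs. */
    fann_train_on_file(ann, "half_plus_one.data", 5000, 500, 0.001f);

    fann_save(ann, "half_plus_one.net");
    fann_destroy(ann);
    return 0;
}
```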
When querying my neural network for a number that was not in the training data but was within the 1 to 100 range, it gave me a number that was not exact but reasonably close to the expected answer. However, when I asked it for the answer for an input of 50,000, the result shocked and scared me. It was nowhere near the correct answer of 25,001.
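Querying the trained network, under the same assumptions as the sketch above, looked roughly like this:

```
/* query_half_plus_one.c - load the trained network and compare its answers
 * with the true formula, inside and far outside the training range. */
#include <stdio.h>
#include <fann.h>

static void query(struct fann *ann, float x)
{
    fann_type input[1] = { x };
    fann_type *output = fann_run(ann, input);
    printf("x = %8.1f  network says: %10.3f  formula says: %10.3f\n",
           x, output[0], x / 2.0f + 1.0f);
}

int main(void)
{
    struct fann *ann = fann_create_from_file("half_plus_one.net");
    query(ann, 42.0f);      /* inside the 1-100 training range */
    query(ann, 50000.0f);   /* far outside it */
    fann_destroy(ann);
    return 0;
}
```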
So with the simple formula given, the neural network was able to interpolate reasonably well within the parameters of the training data. However, when outside the limits of the training data, the behavior seemed random.
You may be pointing out right now that this was a small training set, the experiment was tiny, and so on. All of that is true. But my point is that a human would have been able to spot the relationship between inputs and outputs and extrapolate far more accurately when asked for an answer for 50,000.
For those wondering what happened with the rest of the project: I went on to serif and sans-serif font images, and the network was able to tell the difference. I started on some of the other parameters; however, beyond serif/sans-serif it became much more time-consuming (of my own time, not just processor time) than I had anticipated, so I decided to abandon the project. But the experience left me alarmed at what sorts of decisions we as humans are entrusting to neural networks.
This experience also made me think more deeply about neural networks in general, and how they "think" completely differently from humans.
Let's take an example where a neural network has been tested to be 99% accurate. That sounds pretty good, doesn't it? Would you be comfortable being driven by a 99% accurate car? Let's take a quick look at that other 1%. As humans, we like to assume that the 1% was wrong because it was "a little bit off the target", since that's what our own mistakes usually look like. That's not the case with neural networks. The 1% of errors from a machine may be very, very far from correct. They can be so far off as to appear almost random by human standards.
This magnitude of error might not be a problem if you're asking the neural network to tag people in your photos. If it gets some of the people wildly wrong, so what? But it will be a problem if you're asking it to make self-driving car decisions.
Let's look at a more concrete example: a stop sign covered in snow. A human driver who gets it wrong will probably still register "some kind of sign" and slow down cautiously; a neural network that has never been trained on snow-covered stop signs may confidently classify it as something else entirely.
This is a theoretical example. But it's not far away from reality - an article from Ars Technica in 2017 showed that hacking street signs with camouflaged stickers could confuse self-driving cars.
So you can see how a human looking at a stop sign covered in snow or in camouflaged stickers might get it wrong by a tiny bit, but a neural network could get it wrong by a long way.
Do you still want to use a neural network for anything that could have a life-or-death impact? You might point out that the solution is simple: just expand the training sets to include more edge cases, such as stop signs covered in snow. That way, the neural network will recognize a snow-covered stop sign as a stop sign. In other words, you're expanding the training parameters so that a particular case is now interpolated instead of extrapolated. But this is merely a "quick fix" that deals with a particular symptom (stop signs being classified wrongly) and not the cause (a neural network giving undefined results when extrapolating).
The other, bigger issue is that even if you use massive training sets to get the error rate down, the user has no way of knowing when they've hit a case where the neural network is extrapolating outside its training data. It would be really nice if a neural network could tell you, "I'm not sure what that funny-looking blob is. It's outside my training data." If it could tell you that, it would be a step in the right direction. But it can't: the neural network itself isn't even aware that it's being posed something outside its training data.
Where to from here? Should we shun neural networks altogether? Not necessarily. I think we as humans need to do 2 things:
1. Be aware of the limitations of neural networks - for example, by not using them in situations with life-or-death consequences.
2. Continue to look at and develop the technology to overcome its problems. For example, try to build in a mechanism by which the neural network can tell you when it's experiencing uncertainty because an input is outside the training data in at least one parameter, and then have it refuse to output a result or decision for those cases, instead requesting a human decision. A rough sketch of what I mean follows below.
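As a toy illustration of that second point (my own idea, not an existing FANN feature), one could record the range each input parameter covered in the training data and refuse to answer when a query falls outside that envelope:

```
/* guarded_query.c - toy illustration: record the min/max each input took in
 * the training data, and refuse to answer when a query falls outside that
 * envelope. A sketch of an idea, not an existing FANN feature. */
#include <stdio.h>
#include <fann.h>

struct input_envelope { fann_type min, max; };

/* Returns 1 and writes the network's answer to *out, or returns 0 when any
 * input parameter lies outside the range seen during training, i.e. when
 * the network would be extrapolating and a human should decide instead. */
static int guarded_run(struct fann *ann, const struct input_envelope *env,
                       fann_type *input, unsigned int num_inputs,
                       fann_type *out)
{
    unsigned int i;
    for (i = 0; i < num_inputs; i++)
        if (input[i] < env[i].min || input[i] > env[i].max)
            return 0;                      /* refuse: outside training data */
    *out = fann_run(ann, input)[0];
    return 1;
}

int main(void)
{
    struct fann *ann = fann_create_from_file("half_plus_one.net");
    struct input_envelope env[1] = { { 1.0f, 100.0f } };  /* seen in training */
    fann_type x = 50000.0f, answer;

    if (guarded_run(ann, env, &x, 1, &answer))
        printf("network says: %f\n", answer);
    else
        printf("outside my training data - please ask a human\n");

    fann_destroy(ann);
    return 0;
}
```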
Just because some neural networks are currently being used in life-or-death situations, does that mean they should be? These are ethical questions we should be asking ourselves.
Non-learning neural networks and learning ones both suffer from these problems.
The neural network I had made was non-learning: it couldn't adapt or change itself after the training phase.
By contrast, learning neural networks are able to feed new data back into themselves and thereby improve their accuracy. One example would be a neural network that makes a prediction about oil prices, then checks it against the actual updated oil price and feeds the new data point back into its training set. This type of neural network will tend to be more accurate than the non-learning type. But even a learning network still suffers from all of the same underlying problems as a non-learning one; the only difference is that it has a bigger, ever-growing training set. We as humans still need to understand that it's still possible to ask it for a result from an input that lies outside its training data, and that this still poses a big problem. Even in a neural network that learns and adapts itself, we should still aim to build in the ability to refuse to give a decision, and instead demand a human decision, when it's experiencing uncertainty.
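To make that feedback loop concrete, here is a rough sketch of what one such update step might look like with FANN. The oil-price scenario, the numbers, and the file name "oil_price.net" are all hypothetical placeholders:

```
/* online_update.c - toy sketch of the "learning" case: when a real-world
 * outcome arrives, fold it back into the network as one extra training step.
 * The oil-price numbers and the file name are hypothetical placeholders. */
#include <stdio.h>
#include <fann.h>

int main(void)
{
    /* Assume a network was previously trained to predict tomorrow's price
     * from today's price and saved to this (hypothetical) file. */
    struct fann *ann = fann_create_from_file("oil_price.net");

    fann_type todays_price[1]    = { 71.3f };  /* placeholder input        */
    fann_type actual_tomorrow[1] = { 73.9f };  /* placeholder real outcome */

    /* One incremental training step on the new observation, so the
     * effective training set keeps growing over time. */
    fann_train(ann, todays_price, actual_tomorrow);
    fann_save(ann, "oil_price.net");

    fann_destroy(ann);
    printf("network updated with today's observation\n");
    return 0;
}
```

The key point remains: even with this ongoing feedback, nothing stops you from querying the network far outside everything it has ever seen.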
Thus, I think my definition of a neural network still holds:
A neural network is an interpolation system whose behavior becomes undefined when extrapolating beyond its training data.
I won't be putting myself in a self-driving car anytime soon.