The Flaw Lurking In Every Deep Neural Net

mikejuk (1801200) writes "A recent paper, 'Intriguing properties of neural networks,' by Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow and Rob Fergus, a team that includes authors from Google's deep learning research project, outlines two pieces of news about the way neural networks behave that run counter to what we believed, and one of them is frankly astonishing. Every deep neural network has 'blind spots,' in the sense that there are inputs very close to correctly classified examples that are misclassified. To quote the paper: 'For all the networks we studied, for each sample, we always manage to generate very close, visually indistinguishable, adversarial examples that are misclassified by the original network.' To be clear, the adversarial examples looked to a human like the originals, but the network misclassified them. You can have two photos that look not only like a cat but like the same cat, indeed the same photo, to a human, yet the machine gets one right and the other wrong. What is even more shocking is that the adversarial examples seem to have some sort of universality: a large fraction of them were misclassified by networks with different architectures trained on the same data, and even by networks trained on a different data set. You might be thinking, 'So what if a photo that is clearly a cat is recognized as a dog?' Change the situation just a little: what does it matter if a self-driving car that uses a deep neural network misclassifies a view of a pedestrian standing in front of the car as a clear road? There is also a philosophical question raised by these blind spots. If a deep neural network is biologically inspired, we can ask whether the same result applies to biological networks. Put more bluntly, 'Does the human brain have similar built-in errors?' If it doesn't, how is it so different from the neural networks that are trying to mimic it?"
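To make the 'adversarial example' idea concrete, here is a minimal sketch of the general recipe: start from a correctly classified image and repeatedly nudge it in the direction that increases the classifier's loss, keeping each nudge tiny. This is a generic gradient-sign illustration (assuming PyTorch and a placeholder `model`), not the paper's box-constrained L-BFGS procedure, which explicitly minimizes the size of the perturbation.

```python
import torch
import torch.nn.functional as F

def make_adversarial(model, image, true_label, step=1e-2, max_steps=50):
    """Nudge a correctly classified image until `model` stops predicting
    `true_label`, keeping the perturbation tiny. `image` is assumed to be
    a single (C, H, W) tensor with values in [0, 1]."""
    adv = image.clone().detach()
    for _ in range(max_steps):
        adv.requires_grad_(True)
        logits = model(adv.unsqueeze(0))
        if logits.argmax(dim=1).item() != true_label:
            break                                   # already misclassified
        loss = F.cross_entropy(logits, torch.tensor([true_label]))
        loss.backward()
        with torch.no_grad():
            # take a tiny step that increases the loss for the true class
            adv = (adv + step * adv.grad.sign()).clamp(0.0, 1.0)
    return adv.detach()
```

The striking claim in the paper is that an image perturbed this way can be visually indistinguishable from the original, yet is misclassified, and often fools other networks trained separately.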
  • by jaeztheangel ( 2644535 ) on Tuesday May 27, 2014 @09:43AM (#47098875)
    Deep neural networks are implicitly generating dynamic ontologies. The 'mis-categorisation' occurs when you only have one functional exit point. The fact is that if you are within the network itself, the adversarial examples are held in-frame alongside other possibilities, and the network only tilts towards one when the prevailing system requires it through external stimulus. From the outside it will look like an error (because we already decided that), but internally each possible interpretation is valid.
  • by bunratty ( 545641 ) on Tuesday May 27, 2014 @09:51AM (#47098933)
    More importantly, the human brain has feedback loops. All the artificial neural nets I've seen are only feed-forward, except during the training phase, in which case signals flow only forward or only backward, never in a loop. In effect, the human brain is always training itself.
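For contrast, here is a toy sketch of the structural difference the comment above points at: a purely feed-forward net next to one with a feedback loop, where the hidden state from the previous step is fed back in alongside the new input. PyTorch is assumed and the layer sizes are arbitrary placeholders.

```python
import torch
import torch.nn as nn

class FeedForwardNet(nn.Module):
    """Plain feed-forward net: information flows input -> output, once."""
    def __init__(self, n_in=10, n_hidden=32, n_out=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, n_hidden), nn.ReLU(), nn.Linear(n_hidden, n_out))

    def forward(self, x):
        return self.net(x)

class RecurrentNet(nn.Module):
    """Adds a feedback loop: the previous hidden state is fed back in
    together with each new input."""
    def __init__(self, n_in=10, n_hidden=32, n_out=5):
        super().__init__()
        self.cell = nn.GRUCell(n_in, n_hidden)
        self.readout = nn.Linear(n_hidden, n_out)

    def forward(self, x_seq):                 # x_seq: (time, batch, n_in)
        h = torch.zeros(x_seq.size(1), self.cell.hidden_size)
        for x_t in x_seq:
            h = self.cell(x_t, h)             # feedback: h carries the past
        return self.readout(h)
```

This only illustrates the architectural distinction; it says nothing about whether feedback would remove the blind spots discussed in the paper.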
  • by ganv ( 881057 ) on Tuesday May 27, 2014 @09:56AM (#47098975)

    Your model of the brain as multiple neural nets and a voter is a good and useful simplification. I think we still know relatively little about how accurate it is. You would expect evolution to have optimized the brain to avoid blind spots that threatened survival, and redundancy makes sense as a way to do this.

    However, I wouldn't classify blind spots as 'no problem whatsoever'. If the simple model of multiple neural nets and a voter is a good one, then there will be cases where several nets give errors and the conclusion is wrong. Knowing what kinds of errors are produced after what kind of training is critical to understanding when a redundant system will fail. In the end, though, I suspect that the brain is quite a bit more complicated than a collection of neural nets like those this research is working with.

  • Ensemble neural nets (Score:3, Interesting)

    by Theorem Futile ( 638969 ) on Tuesday May 27, 2014 @10:01AM (#47099017)
    That makes sense. Rare errors will be screened out if instead of a single deterministic selection process you use a distribution of schemes and select based on the most probable outcome... I am wondering what our brain does with its minority reports...
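A minimal sketch of the ensemble-vote idea the two comments above describe, assuming PyTorch; `models` stands in for a list of independently trained classifiers and `image` for a single input tensor.

```python
import torch

def ensemble_predict(models, image):
    """Majority vote over several independently trained classifiers.
    A blind spot specific to one model is unlikely to sway the vote."""
    votes = []
    with torch.no_grad():
        for model in models:
            logits = model(image.unsqueeze(0))
            votes.append(logits.argmax(dim=1).item())
    # return the most common predicted class among the voters
    return max(set(votes), key=votes.count)
```

Note that the paper's transfer result cuts against this somewhat: if the same adversarial examples fool networks trained with different architectures and data, the voters are not as independent as one would hope.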
  • by Rashdot ( 845549 ) on Tuesday May 27, 2014 @10:03AM (#47099027)

    Apparently these neural nets are taught to classify "images", instead of breaking these images down into recognizable forms and properties first.

  • Re:Errors (Score:3, Interesting)

    by Anonymous Coward on Tuesday May 27, 2014 @10:04AM (#47099029)
    OK, I need to share a story about my boss. Hope it is relevant.

    My boss was a hardware engineer and had a total blind spot for software. We involved him many times in discussions to make sure he understood the different layers of software, but all in vain.

    It used to create funny situations. For example, one of the developers was working on a UI and had a bug in his code. He had been stuck for an hour when my boss happened to ask him how he was doing. After hearing the problem, my boss jumped up, declared that the problem was in the power supply, and ordered a replacement immediately.

    Hundreds of times, the developers got ICs replaced, capacitors replaced, boards replaced, complete laptops replaced, CPUs replaced, monitors replaced (for a bug in Qt code).

    I wasted hours trying to make him understand that it was not a hardware issue, but I always failed. It was painful to deal with him.

  • by Anonymous Coward on Tuesday May 27, 2014 @10:05AM (#47099035)

    Indeed, remembering the experiments done in the 1960s by Sperry and Gazzaniga on patients with a divided corpus callosum, there are clearly multiple systems that can argue with each other about recognising objects. Maybe part of what makes us really good at it is not relying on one model of the world, but on many overlaid views of the same data by different mechanisms.

  • Re:Errors (Score:5, Interesting)

    by TapeCutter ( 624760 ) on Tuesday May 27, 2014 @10:56AM (#47099443) Journal
    A NNet is basically trying to fit a curve; the problem of "overfitting" manifests itself as two almost identical data points being separated because the curve has contorted itself to fit one of them. So yes, a video input would likely help. The really interesting bit is that it seems all NNets make the same misclassification, even when trained with different data. What these guys are saying is "that's odd"; I think mathematicians will go nuts trying to explain this, and it will probably lead to AI insights.

    The AI system in an autonomous car is much more than a Boltzmann machine running on a video card. The problem for man or machine when driving a car is that its "life" depends on predicting the future, and neither man nor machine can confirm the calculation before the future happens. If the universe fails to co-operate with the prediction, it's too late. What's important from a public safety POV is who gets it right more often; if cars killing people were totally unacceptable, we wouldn't allow cars in the first place.
  • by Gibgezr ( 2025238 ) on Tuesday May 27, 2014 @11:00AM (#47099485)

    Just to back up what James Clay said, I took a course from Sebastian Thrun (the driving force behind the Google cars) on programming robotic cars, and no neural networks were involved, nor mentioned with regards to the Google car project. As far as I can tell, if the LIDAR says something is in the way, the deterministic algorithms attempt to avoid it safely; if you can't avoid it safely, you brake and halt. That's it. Maybe someone who actually worked on the Google car can comment further?
    Does anyone know of any neural networks used in potentially dangerous conditions? This study, www-isl.stanford.edu/~widrow/papers/j1994neuralnetworks.pdf, states that accuracy and robustness issues need to be addressed when using neural network algorithms, and gives a baseline of more than 95% accuracy as a useful performance metric to aim for. This makes neural nets useful for things like auto-focus in cameras and handwriting recognition for tablets, but means that using a neural network as the primary decision-maker to drive a car is perhaps something best left to video games (where it has been used to great success) rather than real cars with real humans involved.

  • by jcochran ( 309950 ) on Tuesday May 27, 2014 @11:11AM (#47099553)

    incompleteness theorem. And as some earlier posters stated, the correction is simple: simply look again. The second image collected will be different from the previous one and, if the NN is otherwise correct, will resolve to the correct interpretation.

  • Re:Errors (Score:5, Interesting)

    by dinfinity ( 2300094 ) on Tuesday May 27, 2014 @11:46AM (#47099857)

    The neural network "problem" they're talking about was while identifying a single image frame

    Yes, and even more importantly: they designed an algorithm to generate exactly the images that the network performs badly on. The nature of these images is explained in the paper:

    Indeed, if the network can generalize well, how can it be confused by these adversarial negatives, which are indistinguishable from the regular examples? The explanation is that the set of adversarial negatives is of extremely low probability, and thus is never (or rarely) observed in the test set, yet it is dense (much like the rational numbers), and so it is found near virtually every test case.

    A network that generalizes well correctly classifies a large part of the test set. If you had the perfect dog classifier, trained on millions of dog images and tested with 100% accuracy on its test set, it would be really weird if the given 'adversarial negatives' still existed. Considering that the networks did not generalize 100%, it isn't at all surprising that they made errors on seemingly easy images (humans would probably have very little problem getting 100% accuracy on the test sets used). That is just how artificial neural networks are currently performing.

    The slightly surprising part is that the misclassified images seem so close to those in the training set. If I'm interpreting the results correctly (IANANNE), what happens is that their algorithm modifies the images in such a way that the feature detectors in the 10 neuron wide penultimate layer fire just under the required threshold for the final binary classifier to fire.

    Maybe the greatest thing about this research is that it contains a new way to automatically increase the size of the training set with these meaningful adversarial examples:

    We have successfully trained a two layer 100-100-10 non-convolutional neural network with a test error below 1.2% by keeping a pool of adversarial examples a random subset of which is continuously replaced by newly generated adversarial examples and which is mixed into the original training set all the time. For comparison, a network of this size gets to 1.6% errors when regularized by weight decay alone and can be improved to around 1.3% by using carefully applied dropout. A subtle, but essential detail is that adversarial examples are generated for each layer output and are used to train all the layers above. Adversarial examples for the higher layers seem to be more useful than those on the input or lower layers.

    It might prove to be much more effective in terms of learning speed than just adding noise to the training samples, as it grows the training set based on the features the network already uses in its classification, instead of the naive noise approach. In fact, the authors hint at exactly that (a rough sketch of such a training loop follows after the quote):

    Already, a variety of recent state of the art computer vision models employ input deformations during training for increasing the robustness and convergence speed of the models [9, 13]. These deformations are, however, statistically inefficient, for a given example: they are highly correlated and are drawn from the same distribution throughout the entire training of the model. We propose a scheme to make this process adaptive in a way that exploits the model and its deficiencies in modeling the local space around the training data.
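A rough, hedged sketch of what the training scheme quoted above might look like, assuming PyTorch, a `make_adversarial` helper like the one sketched under the summary, and placeholder names for the model, data loader, pool size and mixing fraction; the paper's per-layer generation detail is omitted.

```python
import random
import torch
import torch.nn.functional as F

def train_with_adversarial_pool(model, loader, optimizer,
                                pool_size=512, adv_fraction=0.3, epochs=10):
    """Keep a pool of adversarial examples and mix a random subset of it
    into every batch, continuously refreshing the pool with newly
    generated examples (loosely following the quoted scheme)."""
    pool = []   # list of (adversarial image, correct label) pairs
    for _ in range(epochs):
        for images, labels in loader:
            # Mix a random subset of the adversarial pool into the batch.
            if pool:
                k = min(len(pool), int(adv_fraction * images.size(0)))
                adv_imgs, adv_labels = zip(*random.sample(pool, k))
                images = torch.cat([images, torch.stack(adv_imgs)])
                labels = torch.cat([labels, torch.tensor(adv_labels)])

            optimizer.zero_grad()
            loss = F.cross_entropy(model(images), labels)
            loss.backward()
            optimizer.step()

            # Refresh the pool with an adversarial version of a fresh input.
            x, y = images[0], labels[0].item()
            pool.append((make_adversarial(model, x, y), y))
            if len(pool) > pool_size:
                pool.pop(0)   # drop the oldest example
    return model
```

The numbers quoted above (weight decay alone 1.6%, dropout about 1.3%, adversarial pool below 1.2% test error on a 100-100-10 network) are the paper's evidence that this is competitive with standard regularizers, with the caveat that adversarial examples generated for the higher layers seem to matter most.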

  • Re:Errors (Score:4, Interesting)

    by tlhIngan ( 30335 ) <slashdot.worf@net> on Tuesday May 27, 2014 @12:33PM (#47100247)

    Actually, not only is this common in humans, but the "fix" is the same for neural networks as it is for humans. When you misidentify a paper bag as a dog, you only do so for a split second. Then it moves (or you move, or your eyes move; they constantly vibrate so that the picture isn't static!), and you get another, slightly different image milliseconds later which the brain does identify correctly (or which at least tells your brain, "wait a minute, there's a confusing exception here, let's turn the head and try a different angle").

    The neural network "problem" they're talking about was while identifying a single image frame. In the context of a robot or autonomous car, the same process a human goes through above would correct the issue within milliseconds, because confusing and/or misleading frames (at the level we're talking about here) are rare. Think of it as a realtime error detection algorithm.

    For some humans, it's a smack in the head, though.

    The human wetware is powerful but easy to mislead. For example, the face-recognition bit in human vision is extremely easy to fool - which is why we see a face on the Moon, a face on a rock on Mars, or Jesus on toast, a potato chip, or whatever.

    Human vision is especially vulnerable - see optical illusions. The resolution of the human eye is quite low (approx. 1MP concentrated in a tiny area of central vision, and another 1MP for peripheral vision); however, the vision system is coordinated with the motor system to control the eye muscles, so the eyeball moves ~200 times a second to get a higher resolution image from a low-resolution camera (which results in an image that is approximately 40+MP over the entire visual field).

    But then you have blind spots, which the wetware interpolates (to great amusement at times), and annoying habits: unidentifiable objects that are potentially in our way can lead to target fixation while the brain attempts to identify them.

    Hell, humans are very vulnerable to this - the brain is wired for pattern recognition, and seeing patterns where there are none is a VERY common human habit.

    Fact is, the only reason we're not constantly making errors is that we do just that - we take more glances and more time to look closer, giving more input to the recognition system.

    Likewise, an autonomous vehicle would have plenty of information to derive recognition from - including a history of frames. These vehicles will have a history of the images they received and processed, and new anomalous frames could be temporally compared with the images before and after them (a rough sketch of that kind of temporal check follows).
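A minimal sketch of the "look again" idea from this thread: smooth classifications over a short history of frames so that a single confusing or adversarial frame cannot flip the decision on its own. The window size and the `classify` callable are placeholders.

```python
from collections import Counter, deque

def temporal_vote(classify, frames, window=5):
    """Classify each frame, but report the majority label over the last
    `window` frames, so one misclassified frame is outvoted by its
    neighbours in time."""
    history = deque(maxlen=window)
    smoothed = []
    for frame in frames:
        history.append(classify(frame))
        smoothed.append(Counter(history).most_common(1)[0][0])
    return smoothed
```

This only helps if misleading frames really are rare and uncorrelated in time, which the comments above assume; a persistently misleading view would still fool it.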
