
Preventing Overfitting in Neural Networks with the Dropout Technique
"Learn about the Dropout technique proposed by Srivastava et al. to prevent overfitting in neural networks. The technique involves randomly dropping units during training to prevent co-adaptation, leading to improved performance on various supervised learning tasks in vision, speech recognition, and more."
Presentation Transcript
Dropout: A Simple Way to Prevent Neural Networks from Overfitting
NITISH SRIVASTAVA, GEOFFREY HINTON, ALEX KRIZHEVSKY, ILYA SUTSKEVER, AND RUSLAN SALAKHUTDINOV
Journal of Machine Learning Research, 15 (2014) 1929-1958
Presenter: Ke-Xin Zhu
Date: 2018/12/25
Abstract (1/2)
Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much. During training, dropout samples from an exponential number of different thinned networks.
Abstract (2/2)
At test time, it is easy to approximate the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights. This significantly reduces overfitting and gives major improvements over other regularization methods. We show that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
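The weight-scaling rule described in the abstract can be made concrete with a short sketch. The NumPy snippet below is not part of the original slides; the function name dropout_forward and the value p = 0.5 are illustrative, and p denotes the probability of retaining a unit, as in the paper.

import numpy as np

def dropout_forward(h, p, train):
    # h: activations of one layer; p: probability of *retaining* a unit.
    if train:
        # Sample a Bernoulli(p) mask and drop units, giving one thinned network.
        mask = (np.random.rand(*h.shape) < p).astype(h.dtype)
        return h * mask
    # At test time, scale activations by p to approximate averaging the
    # predictions of all thinned networks with a single unthinned network.
    return h * p

# Toy usage on a batch of hidden activations.
h = np.random.randn(4, 8)
h_train = dropout_forward(h, p=0.5, train=True)
h_test = dropout_forward(h, p=0.5, train=False)

Equivalently, the outgoing weights of each unit can be multiplied once by p after training, which is how the paper states the test-time approximation.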
INTRODUCTION (1/2)
Overfitting, regularization, and model combination.
INTRODUCTION (2/2)
Training effectively samples from 2^n thinned networks, while a single unthinned network is used at test time.
Motivation: sexual reproduction, which breaks up complex co-adaptations of genes.
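The counting argument behind "train 2^n, test one" can be written out explicitly; the statement below summarizes the paper's approximation rather than transcribing the slide.

A network with $n$ units defines $2^{n}$ possible thinned sub-networks, since each unit is either kept or dropped. At test time this exponential ensemble is approximated by one unthinned network whose weights are scaled by the retention probability $p$:
\[
W_{\text{test}}^{(l)} = p\,W^{(l)} .
\]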
Model Description (1/2)
Model Description (2/2)
The standard feed-forward computation with weights W is modified so that each unit's output is also multiplied by an independent binary variable r ∈ {0, 1}, sampled anew for every training case.
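For reference, the dropout feed-forward operation can be written out in full; the equations below follow the paper's model description rather than the slide itself.

\[
r_j^{(l)} \sim \mathrm{Bernoulli}(p), \qquad
\tilde{\mathbf{y}}^{(l)} = \mathbf{r}^{(l)} \ast \mathbf{y}^{(l)}, \qquad
z_i^{(l+1)} = \mathbf{w}_i^{(l+1)} \tilde{\mathbf{y}}^{(l)} + b_i^{(l+1)}, \qquad
y_i^{(l+1)} = f\!\left(z_i^{(l+1)}\right),
\]

where $\ast$ denotes the element-wise product and $f$ is the activation function; without dropout the mask $\mathbf{r}^{(l)}$ is absent and $\mathbf{y}^{(l)}$ feeds the next layer directly.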
Experimental Results (1/8)
Experimental Results (2/8)
Experimental Results (3/8)
Experimental Results (4/8)
Experimental Results (5/8)
Experimental Results (6/8)
Experimental Results (7/8)
Experimental Results (8/8)
Effect on Features
Effect on Sparsity
Effect of Dropout Rate
Dropout Restricted Boltzmann Machines (1/2)
Dropout Restricted Boltzmann Machines (2/2)
Logistic Regression and Deep Networks