Authors
Geoffrey E Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan R Salakhutdinov
Publication date
2012/7/3
Journal
arXiv preprint arXiv:1207.0580
Description
When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful for producing the correct answer given the combinatorially large variety of internal contexts in which it must operate. Random "dropout" gives big improvements on many benchmark tasks and sets new records for speech and object recognition.
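The description above explains the core mechanism: on each training case, each hidden unit is randomly omitted with probability 0.5, and at test time the full network is used with outputs scaled to match the training-time expectation (the paper equivalently halves the outgoing weights). As a rough illustration only, here is a minimal NumPy sketch of that idea; the function name and arguments are invented for this example and do not come from the paper or its code.

```python
import numpy as np

def dropout_forward(activations, drop_prob=0.5, train=True, rng=None):
    """Illustrative dropout on one layer's activations.

    Training: each unit is zeroed independently with probability `drop_prob`.
    Test: no units are dropped; activations are scaled by the keep
    probability, approximating the average over the many "thinned" networks.
    """
    rng = rng or np.random.default_rng()
    if train:
        # Keep each unit with probability 1 - drop_prob, zero it otherwise.
        mask = rng.random(activations.shape) >= drop_prob
        return activations * mask
    # Scale so the expected input to the next layer matches training time.
    return activations * (1.0 - drop_prob)

# Example: a hidden layer of 8 units with p = 0.5 dropout, as in the paper.
hidden = np.array([0.3, 1.2, 0.0, 0.7, 2.1, 0.5, 0.9, 1.4])
print(dropout_forward(hidden, train=True))   # roughly half the units zeroed
print(dropout_forward(hidden, train=False))  # all units, scaled by 0.5
```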
Total citations
(per-year citation chart omitted; counts not recoverable from the page extraction)
Scholar articles
GE Hinton, N Srivastava, A Krizhevsky, I Sutskever… - arXiv preprint arXiv:1207.0580, 2012