On Convergence and Generalization of Dropout Training
- Resource Type
- Working Paper
- Authors
- Mianjy, Poorya; Arora, Raman
- Source
- Advances in Neural Information Processing Systems (NeurIPS), 2020
- Subject
- Computer Science - Machine Learning
- Statistics - Machine Learning
- Language
- English
- Abstract
- We study dropout in two-layer neural networks with rectified linear unit (ReLU) activations. Under mild overparametrization and assuming that the limiting kernel can separate the data distribution with a positive margin, we show that dropout training with logistic loss achieves $\epsilon$-suboptimality in test error in $O(1/\epsilon)$ iterations.
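The abstract describes dropout training of a two-layer ReLU network under the logistic loss. Below is a minimal NumPy sketch of that setup; the synthetic data, network width, dropout rate, step size, iteration count, and the choice to train only the first layer are illustrative assumptions, not the paper's exact construction or scaling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and hyperparameters (assumptions, not the paper's settings).
d, m, n = 10, 256, 200      # input dim, hidden width, sample count
p = 0.5                     # dropout keep probability -- assumed
lr = 0.1                    # step size -- assumed

# Synthetic linearly separable data with labels in {-1, +1}.
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = np.sign(X @ w_star)

# Two-layer ReLU network f(x) = a . relu(W x); only W is trained here
# (a common simplification in this line of analysis -- an assumption).
W = rng.standard_normal((m, d)) / np.sqrt(d)
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)

for it in range(500):
    i = rng.integers(n)
    x, yi = X[i], y[i]
    # Inverted dropout: keep each hidden unit w.p. p, rescale kept units by 1/p.
    mask = (rng.random(m) < p) / p
    h = np.maximum(W @ x, 0.0) * mask   # dropped-out ReLU hidden layer
    out = a @ h
    # Logistic loss l(z) = log(1 + exp(-y z)); derivative w.r.t. the output z:
    g = -yi / (1.0 + np.exp(yi * out))
    # Gradient w.r.t. row j of W: g * a_j * mask_j * 1[w_j . x > 0] * x.
    act = (W @ x > 0).astype(float)
    W -= lr * np.outer(g * a * mask * act, x)

# Classification error on the training set with dropout disabled
# (inverted dropout needs no rescaling at evaluation time).
pred = np.sign(np.maximum(X @ W.T, 0.0) @ a)
print("train error:", np.mean(pred != y))
```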