Academic Paper

An initial alignment between neural network and target is needed for gradient descent to learn

Document Type

Working Paper

Author

Abbe, Emmanuel; Cornacchia, Elisabetta; Hązła, Jan; Marquis, Christopher

Source

Proceedings of the International Conference on Machine Learning, 2022

Subject

Computer Science - Machine Learning

Language

Abstract

This paper introduces the notion of "Initial Alignment" (INAL) between a neural network at initialization and a target function. It is proved that if a network and a Boolean target function do not have a noticeable INAL, then noisy gradient descent on a fully connected network with normalized i.i.d. initialization will not learn in polynomial time. Thus a certain amount of knowledge about the target (measured by the INAL) is needed in the architecture design. This also provides an answer to an open problem posed in [AS20]. The results are based on deriving lower bounds for descent algorithms on symmetric neural networks without explicit knowledge of the target function beyond its INAL.

Online Access

Open Access (arXiv)
