Academic Paper

Understanding the Distributions of Aggregation Layers in Deep Neural Networks

Document Type

Periodical

Author

Source

IEEE Transactions on Neural Networks and Learning Systems (IEEE Trans. Neural Netw. Learning Syst.), 35(4):5536-5550, Apr. 2024

Subject

Computing and Processing
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
General Topics for Engineers
Convolution
Tensors
Task analysis
Probability distribution
Mathematical models
Gamma distribution
Deep learning
Activation probability distributions
aggregation layers
deep neural networks (DNNs)

Language

ISSN

2162-237X
2162-2388

Abstract

Aggregation is ubiquitous in almost all deep net models. It functions as an important mechanism for consolidating deep features into a more compact representation while increasing the robustness to overfitting and providing spatial invariance in deep nets. In particular, the proximity of global aggregation layers to the output layers of DNNs means that aggregated features directly influence the performance of a deep net. A better understanding of this relationship can be obtained using information theoretic methods. However, this requires knowledge of the distributions of the activations of aggregation layers. To achieve this, we propose a novel mathematical formulation for analytically modeling the probability distributions of output values of layers involved with deep feature aggregation. An important outcome is our ability to analytically predict the Kullback–Leibler (KL)-divergence of output nodes in a DNN. We also experimentally verify our theoretical predictions against empirical observations across a broad range of different classification tasks and datasets.
