Academic Paper

Attention in a family of Boltzmann machines emerging from modern Hopfield networks
Document Type
Working Paper
Subject
Computer Science - Machine Learning
Computer Science - Neural and Evolutionary Computing
Statistics - Machine Learning
Abstract
Hopfield networks and Boltzmann machines (BMs) are fundamental energy-based neural network models. Recent studies on modern Hopfield networks have broadened the class of energy functions and led to a unified perspective on general Hopfield networks, including an attention module. In this letter, we consider the BM counterparts of modern Hopfield networks using the associated energy functions, and study their salient properties from a trainability perspective. In particular, the energy function corresponding to the attention module naturally introduces a novel BM, which we refer to as the attentional BM (AttnBM). We verify that AttnBM has a tractable likelihood function and gradient for certain special cases and is easy to train. Moreover, we reveal the hidden connections between AttnBM and some single-layer models, namely the Gaussian-Bernoulli restricted BM and the denoising autoencoder with softmax units coming from denoising score matching. We also investigate BMs introduced by other energy functions and show that the energy function of dense associative memory models gives BMs belonging to Exponential Family Harmoniums.
Comment: 15 pages, 3 figures. v2: added figures and various corrections/improvements, especially in the Introduction and Section 3. Published version.