How do Boltzmann machines work, part 1 (Artificial Intelligence) | by Monodeep Mukherjee | October 2022

1. Thermodynamics of the Ising model encoded in restricted Boltzmann machines (arXiv)

Authors: Jing Gu, Kai Zhang

Summary: The restricted Boltzmann machine (RBM) is a two-layer energy-based model that uses its hidden-visible connections to learn the underlying distribution of visible units, whose interactions are often complicated by high-order correlations. Previous studies on the Ising model at small system sizes have shown that RBMs can accurately learn the Boltzmann distribution and reconstruct thermal quantities at temperatures far from the critical point Tc. How the RBM encodes the Boltzmann distribution and captures the phase transition, however, is not well explained. In this work, we perform RBM learning of the 2D and 3D Ising models and carefully examine how the RBM extracts useful probabilistic and physical information from Ising configurations. Several indicators derived from the weight matrix can characterize the Ising phase transition. We verify that the hidden encoding of a visible state tends to have an equal number of positive and negative units, whose sequence is randomly assigned during training and can be inferred by analyzing the weight matrix. We also explore the physical significance of the visible energy and loss function (pseudo-likelihood) of the RBM and show that they could be exploited to predict the critical point or estimate physical quantities such as entropy.
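To make the "two-layer energy-based model" concrete, here is a minimal sketch of an RBM's energy function and conditional distributions. The sizes and parameter values are hypothetical, chosen only for illustration; the paper trains on actual Ising configurations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes for illustration only.
n_visible, n_hidden = 16, 8

W = 0.01 * rng.standard_normal((n_visible, n_hidden))  # visible-hidden couplings
a = np.zeros(n_visible)  # visible biases
b = np.zeros(n_hidden)   # hidden biases

def energy(v, h):
    """Joint energy E(v, h) = -a.v - b.h - v.W.h; the model assigns
    Boltzmann probability p(v, h) proportional to exp(-E(v, h))."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

def hidden_probs(v):
    """Conditional P(h_j = 1 | v): a logistic sigmoid. The restricted
    (bipartite) connectivity makes hidden units conditionally
    independent given the visible layer."""
    return 1.0 / (1.0 + np.exp(-(b + v @ W)))

def visible_probs(h):
    """Conditional P(v_i = 1 | h), by the symmetric argument."""
    return 1.0 / (1.0 + np.exp(-(a + W @ h)))

# Sample a hidden encoding of a random visible state -- the kind of
# encoding whose +/- balance the paper analyzes via the weight matrix.
v = rng.integers(0, 2, n_visible).astype(float)
h = (hidden_probs(v) > rng.random(n_hidden)).astype(float)
```

Because the conditionals factorize, both layers can be sampled in one vectorized step each, which is what makes RBM training and analysis tractable.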

2. Three stages of learning and precision-efficiency trade-off of restricted Boltzmann machines (arXiv)

Authors: Lennart Dabelow, Masahito Ueda

Summary: Restricted Boltzmann machines (RBMs) offer a versatile architecture for unsupervised machine learning that can in principle approximate any target probability distribution with arbitrary precision. However, the RBM model is generally not directly accessible due to its computational complexity, so Markov chain sampling is invoked to analyze the learned probability distribution. For training and eventual applications, it is therefore desirable to have a sampler that is both accurate and efficient. We emphasize that these two objectives are usually in competition and cannot be achieved simultaneously. More specifically, we identify and quantitatively characterize three RBM learning regimes: independent learning, where accuracy improves without losing efficiency; correlation learning, where higher accuracy leads to lower efficiency; and degradation, where accuracy and efficiency no longer improve or even deteriorate. These results are based on numerical experiments and heuristic arguments.
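The Markov chain sampling the summary refers to is typically block Gibbs sampling, alternating between the two layers. A minimal self-contained sketch (toy sizes, random weights; the paper's experiments are larger and use trained models):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy RBM with hypothetical sizes and untrained random weights.
n_visible, n_hidden = 12, 6
W = 0.1 * rng.standard_normal((n_visible, n_hidden))
a = np.zeros(n_visible)
b = np.zeros(n_hidden)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_chain(v, steps, rng):
    """Block Gibbs sampling: alternately resample the hidden layer
    given the visible layer and vice versa. Longer chains mix better
    (more accurate samples) but cost proportionally more compute --
    the accuracy/efficiency tension the paper quantifies."""
    for _ in range(steps):
        h = (sigmoid(b + v @ W) > rng.random(n_hidden)).astype(float)
        v = (sigmoid(a + W @ h) > rng.random(n_visible)).astype(float)
    return v

v0 = rng.integers(0, 2, n_visible).astype(float)
sample = gibbs_chain(v0, steps=100, rng=rng)
```

In the "correlation learning" regime described above, the learned weights induce longer autocorrelation times in this chain, so more steps are needed per independent sample.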

3. Untangling Representations in Restricted Boltzmann Machines Without Adversaries (arXiv)

Authors: Jorge Fernandez-de-Cossio-Diaz, Simona Cocco, Rémi Monasson

Summary: A goal of unsupervised machine learning is to untangle complex high-dimensional data representations, allowing significant latent variation factors in the data to be interpreted as well as manipulated to generate new data with desirable characteristics. These methods often rely on an adversarial scheme, in which the representations are adjusted to prevent discriminators from being able to reconstruct specific data information (labels). We propose a simple and efficient way to disentangle representations without the need to train adversarial discriminators, and apply our approach to Restricted Boltzmann Machines (RBMs), one of the simplest representation-based generative models. Our approach relies on introducing adequate constraints on the weights during training, which allows us to focus label information on a small subset of latent variables. The efficiency of the approach is illustrated on the MNIST dataset, the two-dimensional Ising model, and the taxonomy of protein families. Moreover, we show how our framework allows us to calculate the cost, in terms of log-likelihood of the data, associated with disentangling their representations.
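The abstract does not spell out the exact form of the weight constraints, but the core idea of "focusing label information on a small subset of latent variables" can be sketched as a masked update: label units are only allowed to couple to a few designated hidden units, so the rest of the representation stays label-free by construction. All sizes, the mask shape, and the gradient below are hypothetical illustrations, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: visible layer = data units plus one-hot label units.
n_data, n_label, n_hidden = 20, 4, 10
k = 2  # subset of hidden units allowed to carry label information

W = 0.01 * rng.standard_normal((n_data + n_label, n_hidden))

# Constraint mask: label rows couple only to the first k hidden units.
mask = np.ones_like(W)
mask[n_data:, k:] = 0.0
W *= mask  # enforce the constraint at initialization

def constrained_update(W, grad, lr=0.01):
    """Apply a gradient step while keeping the forbidden label-hidden
    couplings exactly zero throughout training."""
    return W + lr * (grad * mask)

grad = rng.standard_normal(W.shape)  # placeholder for a real CD gradient
W = constrained_update(W, grad)
```

Masking the gradient (rather than projecting after the fact) keeps the constraint exact at every step, which is one simple way to realize the constrained training the summary describes.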
