How Boltzmann Machines Work, Part 2 (Artificial Intelligence) | by Monodeep Mukherjee | October 2022

1. Improved Gaussian-Bernoulli restricted Boltzmann machines for UAV-to-ground communication systems (arXiv)

Authors: Osamah A. Abdullah, Michael C. Batistatos, Hayder Al-Hraisawi

Summary: Unmanned aerial vehicles (UAVs) are steadily growing as a promising technology for next-generation communication systems thanks to attractive features such as wide high-altitude coverage, low-cost on-demand deployment, and quick response. UAV communications differ fundamentally from conventional terrestrial and satellite communications because of the high mobility and the unique channel characteristics of air-to-ground links. However, obtaining accurate channel state information (CSI) is difficult due to the dynamic propagation environment and variable transmission delay. In this paper, a deep learning (DL)-based CSI prediction framework is proposed to solve the channel aging problem by extracting the most discriminative features from UAV wireless signals. Specifically, we develop a Gaussian-Bernoulli Restricted Boltzmann Machine (GBRBM) for dimensionality reduction and pre-training, incorporated into an autoencoder-based deep neural network (DNN). To evaluate the proposed approach, real measurements from a UAV communicating with base stations of a commercial cellular network are collected and used for training and validation. The numerical results demonstrate that the proposed method acquires the channel accurately across various UAV flight scenarios and outperforms conventional DNNs.
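
To make the dimensionality-reduction building block concrete, here is a minimal NumPy sketch of a Gaussian-Bernoulli RBM trained with one-step contrastive divergence (CD-1). It assumes unit visible variances and synthetic data; the layer sizes, learning rate, and the idea of feeding hidden activations to a downstream network are illustrative choices, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GBRBM:
    """Gaussian-Bernoulli RBM with unit visible variances, trained by CD-1."""

    def __init__(self, n_vis, n_hid, lr=1e-3):
        self.W = 0.01 * rng.standard_normal((n_vis, n_hid))
        self.b = np.zeros(n_vis)   # visible biases (Gaussian means)
        self.c = np.zeros(n_hid)   # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        # p(h_j = 1 | v) = sigmoid(c_j + sum_i W_ij v_i), with sigma_i = 1
        return sigmoid(v @ self.W + self.c)

    def sample_visible(self, h):
        # p(v | h) = N(b + W h, I): real-valued visible units
        mean = self.b + h @ self.W.T
        return mean + rng.standard_normal(mean.shape)

    def cd1_step(self, v0):
        # Positive phase: hidden activations driven by the data.
        ph0 = self.hidden_probs(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step away from the data.
        v1 = self.sample_visible(h0)
        ph1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b += self.lr * (v0 - v1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)

# Toy usage: compress 64-dimensional "channel features" into 16 hidden codes.
X = rng.standard_normal((256, 64))
rbm = GBRBM(n_vis=64, n_hid=16)
for _ in range(200):
    rbm.cd1_step(X)
codes = rbm.hidden_probs(X)   # reduced representation of the input
```

In a pipeline along the lines of the paper's, the hidden probabilities would act as the compressed, pre-trained features handed to the autoencoder-based DNN.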

2. Learning a Restricted Boltzmann Machine Using Biased Monte Carlo Sampling (arXiv)

Authors: Nicolas Béreux, Aurélien Decelle, Cyril Furtlehner, Beatriz Seoane

Summary: Restricted Boltzmann machines are simple yet powerful generative models that can encode any complex dataset. Despite all their advantages, in practice training is often unstable, and it is difficult to assess the quality of a trained model because its dynamics are affected by extremely slow time dependencies. This situation becomes critical when dealing with low-dimensional clustered datasets, where the time required to sample the trained model ergodically becomes computationally prohibitive. In this work, we show that this divergence of Monte Carlo mixing times is related to a phenomenon of phase coexistence, similar to the one that occurs in physics near a first-order phase transition. We show that sampling the equilibrium distribution with Markov chain Monte Carlo can be significantly accelerated by biased sampling techniques, in particular the Tethered Monte Carlo (TMC) method. This sampling technique efficiently solves the problem of assessing the quality of a given trained model and of generating new samples in a reasonable time. Moreover, we show that it can also be used to improve the computation of the log-likelihood gradient during training, leading to dramatic improvements when training RBMs on artificial clustered datasets. On real low-dimensional datasets, this new training method yields RBM models with significantly faster relaxation dynamics than those obtained with standard persistent contrastive divergence (PCD) recipes. We also show that TMC sampling can be used to recover the free-energy profile of an RBM, which is extremely useful for computing the probability distribution of a given model and for improving the generation of new, uncorrelated samples in slowly mixing PCD-trained models.
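
As a rough sketch of the biased-sampling idea, the snippet below runs Gibbs/Metropolis updates on a small binary RBM while pinning the hidden magnetization near a chosen tether value with a harmonic (umbrella-style) penalty. The paper's TMC method constrains the order parameter exactly through an auxiliary variable and reconstructs the free energy from the constraint forces; the harmonic bias, the choice of the hidden magnetization as order parameter, and all parameter values here are simplifying assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def biased_gibbs_sweep(v, h, W, a, c, m_star, kappa):
    """One sweep over a binary RBM with the hidden magnetization
    m(h) = mean(h) softly pinned to m_star by a harmonic bias."""
    n_vis, n_hid = W.shape
    # Visible units: ordinary block Gibbs; the bias does not involve v.
    v = (rng.random(n_vis) < sigmoid(a + h @ W.T)).astype(float)
    # Hidden units: Metropolis single-spin flips under energy plus bias.
    m = h.mean()
    fields = c + v @ W                 # local fields on the hidden units
    for j in rng.permutation(n_hid):
        dh = 1.0 - 2.0 * h[j]          # +1 for a 0->1 flip, -1 for 1->0
        m_new = m + dh / n_hid
        dE = -dh * fields[j] + 0.5 * kappa * n_hid * (
            (m_new - m_star) ** 2 - (m - m_star) ** 2)
        if dE <= 0 or rng.random() < np.exp(-dE):
            h[j] = 1.0 - h[j]
            m = m_new
    return v, h

# Toy usage: tether the chain at two different magnetizations of a random
# RBM, forcing it to visit regions a free chain would rarely cross between.
n_vis, n_hid = 20, 10
W = 0.5 * rng.standard_normal((n_vis, n_hid))
a, c = np.zeros(n_vis), np.zeros(n_hid)
for m_star in (0.2, 0.8):
    v = (rng.random(n_vis) < 0.5).astype(float)
    h = (rng.random(n_hid) < 0.5).astype(float)
    for _ in range(500):
        v, h = biased_gibbs_sweep(v, h, W, a, c, m_star, kappa=50.0)
    print(f"tether {m_star}: hidden magnetization {h.mean():.2f}")
```

Because each tethered chain only has to equilibrate within its constrained slice of configuration space, mixing no longer hinges on rare barrier crossings between coexisting phases; stitching the slices together is what recovers the free-energy profile.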

3. Pattern reconstruction with restricted Boltzmann machines (arXiv)

Author: Giuseppe Genovese

Summary: Restricted Boltzmann machines are energy-based models consisting of a visible layer and a hidden layer. We identify an effective energy function that describes the zero-temperature landscape on the visible units and depends only on the tail behavior of the hidden layer's prior distribution. By studying the location of the local minima of this energy function, we show that the ability of a restricted Boltzmann machine to reconstruct a random pattern indeed depends only on the tail of the hidden prior. We find that hidden priors with strictly super-Gaussian tails incur only a logarithmic loss in pattern recovery, whereas efficient recovery is much harder with hidden units whose priors have strictly sub-Gaussian tails; if the hidden prior has Gaussian tails, recoverability is determined by the number of hidden units, as in the Hopfield model.
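
To see the mechanism behind this tail dependence, one can numerically integrate out a single hidden unit and watch how the resulting effective potential grows with the visible field. The sketch below is a toy check, not the paper's derivation: the prior family exp(-|h|^p) and the two field values used to estimate the growth exponent are illustrative assumptions:

```python
import numpy as np

def eff_potential(x, p):
    """u(x) = -log integral dh exp(-|h|**p + h*x): the per-hidden-unit
    term of the effective visible energy once a hidden unit with
    (unnormalized) prior exp(-|h|**p) is integrated out. The argument x
    plays the role of the local field W_j . v on hidden unit j."""
    h = np.linspace(-600.0, 600.0, 120001)   # wide grid, step 0.01
    logf = -np.abs(h) ** p + h * x
    m = logf.max()                           # log-sum-exp for stability
    return -(m + np.log(np.sum(np.exp(logf - m)) * (h[1] - h[0])))

# The tail exponent p of the hidden prior sets how fast the effective
# potential deepens with the field: a saddle-point argument gives
# -u(x) ~ x**(p / (p - 1)) for large x, i.e. exactly x**2 in the
# Gaussian case, which reproduces the quadratic Hopfield-like energy.
for p in (1.5, 2.0, 4.0):   # heavier-, Gaussian-, lighter-than-Gaussian tails
    u10, u20 = -eff_potential(10.0, p), -eff_potential(20.0, p)
    measured = np.log(u20 / u10) / np.log(2.0)
    print(f"p = {p}: growth exponent ~ {measured:.2f} "
          f"(theory {p / (p - 1):.2f})")
```

With p = 2 the integral is Gaussian and the effective energy is exactly quadratic in the field, which is why the Gaussian-prior case behaves like a Hopfield network in the summary above.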
