Paul M. Baggenstoss

h-index 17

10 papers

14 citations

Novelty 42%

AI Score 24

Ranked #173,152 of 194,257 authors (top 89%); #37,533 in LG (top 93%)

10 Papers

4.6 | LG | Apr 25, 2022

Using the Projected Belief Network at High Dimensions

Paul M Baggenstoss

The projected belief network (PBN) is a layered generative network (LGN) with a tractable likelihood function, and is based on a feed-forward neural network (FFNN). There are two versions of the PBN: stochastic and deterministic (D-PBN), and each has theoretical advantages over other LGNs. However, implementation of the PBN requires an iterative algorithm that includes the inversion of a symmetric matrix of size M × M in each layer, where M is the layer output dimension. This, and the fact that the network must always be dimension-reducing in each layer, can limit the types of problems where the PBN can be applied. In this paper, we describe techniques to avoid or mitigate these restrictions and use the PBN effectively at high dimension. We apply the discriminatively aligned PBN (PBN-DA) to classifying and auto-encoding high-dimensional spectrograms of acoustic events. We also present the discriminatively aligned D-PBN for the first time.

3.8 | LG | Nov 24, 2023

A Comparison of PDF Projection with Normalizing Flows and SurVAE

Paul M. Baggenstoss, Felix Govaers

Normalizing flows (NF) have recently gained attention as a way to construct generative networks with exact likelihood calculation out of composable layers. However, NF is restricted to dimension-preserving transformations. Surjection VAE (SurVAE) has been proposed to extend NF to dimension-altering transformations. Such networks are desirable because they are expressive and can be precisely trained. We show that the approaches are a re-invention of PDF projection, which appeared over twenty years earlier and is much further developed.
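
For orientation, the exact-likelihood property of a normalizing flow is the standard change-of-variables identity for an invertible, dimension-preserving map f. PDF projection, in the form it is commonly stated, instead handles a dimension-reducing feature map z = T(x) by way of a reference hypothesis H_0 under which both densities are known. The symbols below (f, T, H_0, g, G) are assumed notation for illustration and are not taken from the abstract.

```latex
% Normalizing flow: invertible f with z = f(x), same dimension as x
\log p_X(x) = \log p_Z\bigl(f(x)\bigr)
            + \log\left|\det \frac{\partial f(x)}{\partial x}\right|

% PDF projection: dimension-reducing z = T(x), feature density g(z)
G(x) = \frac{p(x \mid H_0)}{p(z \mid H_0)}\, g(z), \qquad z = T(x)
```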

6.6 | LG | Sep 14, 2023

Improved Auto-Encoding using Deterministic Projected Belief Networks

Paul M Baggenstoss

In this paper, we exploit the unique properties of a deterministic projected belief network (D-PBN) to take full advantage of trainable compound activation functions (TCAs). A D-PBN is a type of auto-encoder that operates by "backing up" through a feed-forward neural network. TCAs are activation functions with complex monotonic-increasing shapes that change the distribution of the data so that the linear transformation that follows is more effective. Because a D-PBN operates by "backing up", the TCAs are inverted in the reconstruction process, restoring the original distribution of the data, thus taking advantage of a given TCA in both analysis and reconstruction. In this paper, we show that a D-PBN auto-encoder with TCAs can significantly out-perform standard auto-encoders including variational auto-encoders.

4.6 | LG | Apr 25, 2022

Trainable Compound Activation Functions for Machine Learning

Paul M. Baggenstoss

Activation functions (AFs) are necessary components of neural networks that allow approximation of functions, but AFs in current use are usually simple monotonically increasing functions. In this paper, we propose trainable compound AFs (TCAs), each composed of a sum of shifted and scaled simple AFs. TCAs increase the effectiveness of networks while using fewer parameters than adding layers would. TCAs have a special interpretation in generative networks because they effectively estimate the marginal distributions of each dimension of the data using a mixture distribution, reducing modality and making linear dimension reduction more effective. When used in restricted Boltzmann machines (RBMs), they result in a novel type of RBM with mixture-based stochastic units. Improved performance is demonstrated in experiments using RBMs, deep belief networks (DBN), projected belief networks (PBN), and variational auto-encoders (VAE).
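
The "sum of shifted and scaled simple AFs" can be sketched in a few lines. This is a minimal illustration, not the paper's parameterization: the function name `tca`, the choice of tanh as the base AF, and the example weights, shifts, and scales are all assumptions. In a trainable setting these three vectors would be learned per unit.

```python
import numpy as np

def tca(x, weights, shifts, scales, base=np.tanh):
    """Compound activation (illustrative): sum_k weights[k] * base(scales[k] * (x - shifts[k]))."""
    x = np.asarray(x, dtype=float)[..., None]   # trailing axis indexes the components
    return np.sum(weights * base(scales * (x - shifts)), axis=-1)

# Hypothetical parameters; positive weights and scales keep the sum monotonic
# increasing when the base activation is itself increasing.
weights = np.array([0.5, 0.3, 0.2])
shifts = np.array([-2.0, 0.0, 2.0])
scales = np.array([1.0, 2.0, 1.0])

x = np.linspace(-5.0, 5.0, 201)
y = tca(x, weights, shifts, scales)
```

With three components the resulting curve has up to three "steps", which is the kind of complex monotonic shape a single tanh cannot produce.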

2.6 | LG | Jul 19, 2024

On Maximum Entropy Linear Feature Inversion

Paul M Baggenstoss

We revisit the classical problem of inverting dimension-reducing linear mappings using the maximum entropy (MaxEnt) criterion. In the literature, solutions are problem-dependent, inconsistent, and use different entropy measures. We propose a new unified approach that not only specializes to the existing approaches, but offers solutions to new cases, such as when data values are constrained to [0, 1], which has new applications in machine learning.
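
As a point of reference for the unconstrained case only: under a zero-mean, unit-variance Gaussian prior on x, the conditional mean of x given the features z = Wᵀx is the minimum-norm solution W(WᵀW)⁻¹z, which is a standard result. The sketch below illustrates that special case and is not the unified method of the paper; the dimensions, the random W, and the variable names are assumptions. It solves the M × M symmetric system rather than forming an explicit inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 8, 3                              # input and feature dimensions (M < N)
W = rng.standard_normal((N, M))          # columns define the M linear features
x = rng.standard_normal(N)               # an example input
z = W.T @ x                              # dimension-reducing linear mapping

# Gaussian-prior conditional mean: x_hat = W (W^T W)^{-1} z.
# W^T W is a symmetric M x M matrix; solve the system instead of inverting it.
x_hat = W @ np.linalg.solve(W.T @ W, z)
```

The reconstruction reproduces the features exactly (Wᵀx_hat = z) and is the orthogonal projection of x onto the column space of W. Cases with constrained data values, such as [0, 1], do not reduce to this linear formula.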

2.6 | LG | Jan 20, 2024

Projected Belief Networks With Discriminative Alignment for Acoustic Event Classification: Rivaling State of the Art CNNs

Paul M. Baggenstoss, Kevin Wilkinghoff, Felix Govaers et al.

The projected belief network (PBN) is a generative stochastic network with a tractable likelihood function based on a feed-forward neural network (FFNN). The generative function operates by "backing up" through the FFNN. The PBN is two networks in one: an FFNN that operates in the forward direction, and a generative network that operates in the backward direction. Both networks co-exist based on the same parameter set, have their own cost functions, and can be separately or jointly trained. The PBN therefore has the potential to possess the best qualities of both discriminative and generative classifiers. To realize this potential, a separate PBN is trained on each class, maximizing the generative likelihood function for the given class, while minimizing the discriminative cost for the FFNN against "all other classes". This technique, called discriminative alignment (PBN-DA), aligns the contours of the likelihood function to the decision boundaries and attains vastly improved classification performance, rivaling that of state-of-the-art discriminative networks. The method may be further improved using a hidden Markov model (HMM) as a component of the PBN, called PBN-DA-HMM. This paper provides a comprehensive treatment of PBN, PBN-DA, and PBN-DA-HMM. In addition, the results of two new classification experiments are provided. The first experiment uses air-acoustic events, and the second uses underwater acoustic data consisting of marine mammal calls. In both experiments, PBN-DA-HMM attains performance comparable to or better than that of a state-of-the-art CNN, and attains a factor of two error reduction when combined with the CNN.
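
The "one generative model per class" decision rule can be sketched generically: evaluate each class model's log-likelihood on the input and pick the largest. In the sketch below, simple Gaussian densities stand in for the per-class PBNs, which is purely an assumption for illustration; `classify` and `gaussian_loglik` are hypothetical helper names, and no discriminative alignment is performed.

```python
import numpy as np

def classify(x, log_likelihoods, log_priors=None):
    """Return the index of the class model with the highest (log) score for x."""
    scores = np.array([ll(x) for ll in log_likelihoods])
    if log_priors is not None:
        scores = scores + np.asarray(log_priors)
    return int(np.argmax(scores))

def gaussian_loglik(mean, var):
    """Stand-in class model: isotropic Gaussian log-density (not a PBN)."""
    mean = np.asarray(mean, dtype=float)
    def ll(x):
        d = np.asarray(x, dtype=float) - mean
        return -0.5 * np.sum(d * d / var + np.log(2 * np.pi * var))
    return ll

# Two hypothetical classes with different means.
models = [gaussian_loglik([0.0, 0.0], 1.0), gaussian_loglik([3.0, 3.0], 1.0)]
label = classify([2.5, 2.9], models)
```

Any model exposing a tractable log-likelihood could be dropped in place of `gaussian_loglik` without changing `classify`.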

1.6 | LG | Apr 13, 2021

Maximum Entropy Auto-Encoding

Paul M Baggenstoss

In this paper, it is shown that an auto-encoder using optimal reconstruction significantly outperforms a conventional auto-encoder. Optimal reconstruction uses the conditional mean of the input given the features, under a maximum entropy prior distribution. The optimal reconstruction network, which is called the deterministic projected belief network (D-PBN), resembles a standard reconstruction network, but with special non-linearities that must be iteratively solved. The method, which can be seen as a generalization of maximum entropy image reconstruction, extends to multiple layers. In experiments, mean square reconstruction error was reduced by up to a factor of two. The performance improvement diminishes for deeper networks, or for input data with unconstrained values (Gaussian assumption).

1.4 | ML | Aug 14, 2020

The Projected Belief Network Classfier : both Generative and Discriminative

Paul M Baggenstoss

The projected belief network (PBN) is a layered generative network with a tractable likelihood function, and is based on a feed-forward neural network (FF-NN). It can therefore share an embodiment with a discriminative classifier and can inherit the best qualities of both types of network. In this paper, a convolutional PBN is constructed that is both fully discriminative and fully generative and is tested on spectrograms of spoken commands. It is shown that the network displays excellent qualities from either the discriminative or generative viewpoint. Random data synthesis and visible data reconstruction from low-dimensional hidden variables are shown, while classifier performance approaches that of a regularized discriminative network. Combination with a conventional discriminative CNN is also demonstrated.

3.8 | ML | Feb 18, 2020

A Neural Network Based on First Principles

Paul M Baggenstoss

In this paper, a neural network is derived from first principles, assuming only that each layer begins with a linear dimension-reducing transformation. The approach appeals to the principle of Maximum Entropy (MaxEnt) to find the posterior distribution of the input data of each layer, conditioned on the layer output variables. This posterior has a well-defined mean, the conditional mean estimator, that is calculated using a type of neural network with theoretically-derived activation functions similar to sigmoid, softplus, and ReLU. This implicitly provides a theoretical justification for their use. A theorem that finds the conditional distribution and conditional mean estimator under the MaxEnt prior is proposed, unifying results for special cases. Combining layers results in an auto-encoder with a conventional feed-forward analysis network and a type of linear Bayesian belief network in the reconstruction path.

1.3 | ML | Jun 21, 2016

Kernel-based Generative Learning in Distortion Feature Space

Bo Tang, Paul M. Baggenstoss, Haibo He

This paper presents a novel kernel-based generative classifier, named the Kernel-Distortion (KD) classifier, which is defined in a distortion subspace using polynomial series expansion. An iterative kernel selection algorithm is developed to steadily improve classification performance by repeatedly removing and adding kernels. The experimental results on a character recognition application not only show that the proposed generative classifier performs better than many existing classifiers, but also illustrate that its recognition capability differs from that of the state-of-the-art discriminative classifier, the deep belief network. The recognition diversity indicates that a hybrid combination of the proposed generative classifier and the discriminative classifier could further improve the classification performance. Two hybrid combination methods, cascading and stacking, have been implemented to verify the diversity and the improvement of the proposed classifier.