LG Jul 14, 2024
Deep Learning Activation Functions: Fixed-Shape, Parametric, Adaptive, Stochastic, Miscellaneous, Non-Standard, Ensemble
M. M. Hammad
In the architecture of deep learning models, inspired by biological neurons, activation functions (AFs) play a pivotal role. They significantly influence the performance of artificial neural networks. By modulating the non-linear properties essential for learning complex patterns, AFs are fundamental in both classification and regression tasks. This paper presents a comprehensive review of various types of AFs, including fixed-shape, parametric, adaptive, stochastic/probabilistic, non-standard, and ensemble/combining types. We begin with a systematic taxonomy and a detailed classification framework that delineates the principal characteristics of AFs and organizes them based on their structural and functional distinctions. Our in-depth analysis covers primary groups such as sigmoid-based, ReLU-based, and ELU-based AFs, discussing their theoretical foundations, mathematical formulations, and specific benefits and limitations in different contexts. We also highlight key attributes of AFs such as output range, monotonicity, and smoothness. Furthermore, we explore miscellaneous AFs that do not conform to these categories but have shown unique advantages in specialized applications. Non-standard AFs are also explored, showcasing cutting-edge variations that challenge traditional paradigms and offer enhanced adaptability and model performance. We examine strategies for combining multiple AFs to leverage complementary properties. The paper concludes with a comparative evaluation of 12 state-of-the-art AFs, using rigorous statistical and experimental methodologies to assess their efficacy. This analysis not only aids practitioners in selecting and designing the most appropriate AFs for their specific deep learning tasks but also encourages continued innovation in AF development within the machine learning community.
LG Jul 27, 2024
Comprehensive Survey of Complex-Valued Neural Networks: Insights into Backpropagation and Activation Functions
M. M. Hammad
Artificial neural networks (ANNs), particularly those employing deep learning models, have found widespread application in fields such as computer vision, signal processing, and wireless communications, where complex numbers are crucial. Despite the prevailing use of real-number implementations in current ANN frameworks, there is a growing interest in developing ANNs that utilize complex numbers. This paper presents a comprehensive survey of recent advancements in complex-valued neural networks (CVNNs), focusing on their activation functions (AFs) and learning algorithms. We delve into the extension of the backpropagation algorithm to the complex domain, which enables the training of neural networks with complex-valued inputs, weights, AFs, and outputs. This survey considers three complex backpropagation algorithms: the complex derivative approach, the partial derivatives approach, and algorithms incorporating the Cauchy-Riemann equations. A significant challenge in CVNN design is the identification of suitable nonlinear complex-valued activation functions (CVAFs), due to the conflict between boundedness and differentiability over the entire complex plane, as stated by Liouville's theorem. We examine both fully complex AFs, which strive for boundedness and differentiability, and split AFs, which offer a practical compromise despite not preserving analyticity. This review provides an in-depth analysis of various CVAFs essential for constructing effective CVNNs. Moreover, this survey not only offers a comprehensive overview of the current state of CVNNs but also contributes to ongoing research and development by introducing a new set of CVAFs (fully complex, split, and complex amplitude-phase AFs).
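The trade-off between split and fully complex AFs can be illustrated with a minimal NumPy sketch. The choice of tanh and the function names `split_tanh` and `fully_complex_tanh` are assumptions made for illustration; they are not taken from the survey. The split version applies tanh to the real and imaginary parts separately, so it stays bounded but is not analytic. The fully complex version is analytic away from its poles, and by Liouville's theorem it cannot also be bounded on the whole plane.

```python
import numpy as np

def split_tanh(z):
    """Split AF (illustrative): tanh applied separately to real and imaginary parts."""
    z = np.asarray(z, dtype=complex)
    return np.tanh(z.real) + 1j * np.tanh(z.imag)

def fully_complex_tanh(z):
    """Fully complex AF (illustrative): the analytic tanh on the complex plane."""
    return np.tanh(np.asarray(z, dtype=complex))

# Sample inputs; 1.5j lies close to the pole of tanh at i*pi/2.
z = np.array([0.5 + 0.5j, 3.0 - 4.0j, 0.0 + 1.5j])

# Each component of the split output lies in (-1, 1), so the modulus is below sqrt(2).
print(np.abs(split_tanh(z)))

# The analytic tanh grows without bound near its poles.
print(np.abs(fully_complex_tanh(z)))
```

The last input shows the difference most clearly: the split output stays below sqrt(2) in modulus, while the fully complex output becomes large because tanh(iy) = i tan(y) diverges as y approaches pi/2.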
LG Aug 12, 2024
Artificial Neural Network and Deep Learning: Fundamentals and Theory
M. M. Hammad
"Artificial Neural Network and Deep Learning: Fundamentals and Theory" offers a comprehensive exploration of the foundational principles and advanced methodologies in neural networks and deep learning. This book begins with essential concepts in descriptive statistics and probability theory, laying a solid groundwork for understanding data and probability distributions. As the reader progresses, they are introduced to matrix calculus and gradient optimization, crucial for training and fine-tuning neural networks. The book delves into multilayer feed-forward neural networks, explaining their architecture, training processes, and the backpropagation algorithm. Key challenges in neural network optimization, such as activation function saturation, vanishing and exploding gradients, and weight initialization, are thoroughly discussed. The text covers various learning rate schedules and adaptive algorithms, providing strategies to optimize the training process. Techniques for generalization and hyperparameter tuning, including Bayesian optimization and Gaussian processes, are also presented to enhance model performance and prevent overfitting. Advanced activation functions are explored in detail, categorized into sigmoid-based, ReLU-based, ELU-based, miscellaneous, non-standard, and combined types. Each activation function is examined for its properties and applications, offering readers a deep understanding of their impact on neural network behavior. The final chapter introduces complex-valued neural networks, discussing complex numbers, functions, and visualizations, as well as complex calculus and backpropagation algorithms. This book equips readers with the knowledge and skills necessary to design and optimize advanced neural network models, contributing to the ongoing advancements in artificial intelligence.
NA Jan 5
Variational (Energy-Based) Spectral Learning: A Machine Learning Framework for Solving Partial Differential Equations
M. M. Hammad
We introduce variational spectral learning (VSL), a machine learning framework for solving partial differential equations (PDEs) that operates directly in the coefficient space of spectral expansions. VSL offers a principled bridge between variational PDE theory, spectral discretization, and contemporary machine learning practice. The core idea is to recast a given PDE \[ \mathcal{L}u = f \quad \text{in} \quad Q=\Omega\times(0,T), \] together with boundary and initial conditions, into differentiable space-time energies built from strong-form least-squares residuals and weak (Galerkin) formulations. The solution is represented as a finite spectral expansion \[ u_N(x,t)=\sum_{n=1}^{N} c_n\,\phi_n(x,t), \] where $\phi_n$ are tensor-product Chebyshev bases in space and time, with Dirichlet-satisfying spatial modes enforcing homogeneous boundary conditions analytically. This yields a compact linear parameterization in the coefficient vector $\mathbf{c}$, while all PDE complexity is absorbed into the variational energy. We show how to construct strong-form and weak-form space-time functionals, augment them with initial-condition and Tikhonov regularization terms, and minimize the resulting objective with gradient-based optimization. In practice, VSL is implemented in TensorFlow using automatic differentiation and Keras cosine-decay-with-restarts learning-rate schedules, enabling robust optimization of moderately sized coefficient vectors. Numerical experiments on benchmark elliptic and parabolic problems, including one- and two-dimensional Poisson, diffusion, and Burgers-type equations, demonstrate that VSL attains accuracy comparable to classical spectral collocation with Crank-Nicolson time stepping, while providing a differentiable objective suitable for modern optimization tooling.
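The strong-form least-squares idea can be sketched on the simplest elliptic case, the one-dimensional Poisson problem $-u''=f$ on $(-1,1)$ with $u(\pm 1)=0$. This is a rough illustration under several assumptions, not the paper's implementation. The Dirichlet-satisfying modes are taken to be $\phi_n = T_{n+2} - T_n$ (one common choice, not specified in the abstract). The Tikhonov and initial-condition terms are omitted. Because the energy is quadratic in $\mathbf{c}$, a closed-form NumPy least-squares solve stands in for the TensorFlow gradient-based optimization described above.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

N = 16  # number of spectral modes (illustrative choice)

def mode(n):
    """Chebyshev-series coefficients of phi_n = T_{n+2} - T_n, which vanishes at x = -1 and x = 1."""
    a = np.zeros(n + 3)
    a[n], a[n + 2] = -1.0, 1.0
    return a

# Collocation points and right-hand side for the manufactured solution u(x) = sin(pi x).
x = np.cos(np.pi * (np.arange(64) + 0.5) / 64)
f = np.pi**2 * np.sin(np.pi * x)

# Column n holds -phi_n''(x_i), so A @ c is -u_N'' at the collocation points.
A = np.stack([-C.chebval(x, C.chebder(mode(n), 2)) for n in range(N)], axis=1)

# The energy ||A c - f||^2 is quadratic in c, so its minimizer is a linear least-squares solve.
c, *_ = np.linalg.lstsq(A, f, rcond=None)

# Evaluate u_N on a test grid and compare with the exact solution.
xt = np.linspace(-1.0, 1.0, 201)
Phi = np.stack([C.chebval(xt, mode(n)) for n in range(N)], axis=1)
max_err = np.max(np.abs(Phi @ c - np.sin(np.pi * xt)))
print(f"max error: {max_err:.2e}")
```

For nonlinear problems such as Burgers-type equations the residual is no longer linear in the coefficients, which is where an iterative, gradient-based minimizer becomes necessary.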
LG Oct 27, 2025
Schrodinger Neural Network and Uncertainty Quantification: Quantum Machine
M. M. Hammad
We introduce the Schrodinger Neural Network (SNN), a principled architecture for conditional density estimation and uncertainty quantification inspired by quantum mechanics. The SNN maps each input to a normalized wave function on the output domain and computes predictive probabilities via the Born rule. The SNN departs from standard parametric likelihood heads by learning complex coefficients of a spectral expansion (e.g., Chebyshev polynomials) whose squared modulus yields the conditional density $p(y|x)=\left|\psi_x(y)\right|^2$ with analytic normalization. This representation confers three practical advantages: positivity and exact normalization by construction, native multimodality through interference among basis modes without explicit mixture bookkeeping, and closed-form (or efficiently computable) functionals, such as moments and several calibration diagnostics, expressed as quadratic forms in coefficient space. We develop the statistical and computational foundations of the SNN, including (i) training by exact maximum-likelihood with unit-sphere coefficient parameterization, (ii) physics-inspired quadratic regularizers (kinetic and potential energies) motivated by uncertainty relations between localization and spectral complexity, (iii) scalable low-rank and separable extensions for multivariate outputs, (iv) operator-based extensions that represent observables, constraints, and weak labels as self-adjoint matrices acting on the amplitude space, and (v) a comprehensive framework for evaluating multimodal predictions. The SNN provides a coherent, tractable framework to elevate probabilistic prediction from point estimates to physically inspired amplitude-based distributions.
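A minimal sketch of the Born-rule density with a unit-sphere coefficient parameterization follows. It rests on stated assumptions and is not the paper's implementation. An orthonormal Legendre basis on $[-1,1]$ is swapped in for the Chebyshev polynomials named above, so that normalization under the plain Lebesgue measure reduces to $\sum_n |c_n|^2 = 1$. A fixed random complex vector stands in for the output of a network conditioned on $x$. The name `born_density` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L

def born_density(raw, y):
    """Density |psi(y)|^2 on [-1, 1] from an unnormalized complex coefficient vector."""
    raw = np.asarray(raw, dtype=complex)
    c = raw / np.linalg.norm(raw)              # project coefficients onto the unit sphere
    n = np.arange(len(c))
    scale = np.sqrt((2 * n + 1) / 2.0)         # makes the Legendre modes orthonormal on [-1, 1]
    basis = L.legvander(y, len(c) - 1) * scale # basis[i, n] = phi_n(y_i)
    psi = basis @ c                            # psi(y) = sum_n c_n * phi_n(y)
    return np.abs(psi) ** 2

# Stand-in for network outputs: six random complex coefficients (illustrative only).
rng = np.random.default_rng(0)
raw = rng.normal(size=6) + 1j * rng.normal(size=6)

# Gauss-Legendre quadrature integrates the polynomial density exactly at this order.
nodes, weights = L.leggauss(32)
total = np.sum(weights * born_density(raw, nodes))
print(total)
```

Because the basis is orthonormal, the integral of the density equals the squared norm of the coefficient vector, so projecting onto the unit sphere gives exact normalization without any numerical integration during training.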