LG · ML · May 12

A Composite Activation Function for Learning Stable Binary Representations

arXiv:2605.11558 · 51.1
Predicted impact top 49% in LG · last 90 days · Originality: Synthesis-oriented
AI Analysis

For researchers training neural networks with binary activations, this work provides a practical solution to the non-differentiability problem, though it is an incremental improvement over existing smooth approximations.

The paper proposes HTAF, a smooth approximation to the Heaviside function that enables stable training of binary-activation networks with gradient-based optimization. On image datasets, the method achieves prediction performance comparable to or better than standard models.

Activation functions play a central role in neural networks by shaping internal representations. Recently, learning binary activation representations has attracted significant attention because of their computational and memory efficiency as well as their interpretability. However, training neural networks with Heaviside activations remains challenging, because their non-differentiability obstructs standard gradient-based optimization. In this paper, we propose the Heavy Tailed Activation Function (HTAF), a smooth approximation to the Heaviside function that enables stable training with gradient-based optimization. We construct HTAF as a composite of the sigmoid and hyperbolic tangent functions and show theoretically that it maintains a large gradient mass around zero inputs while exhibiting slower gradient decay in the tail regions. We show that Spiking Neural Networks, Binary Neural Networks, and Deep Heaviside Neural Networks can be trained stably using HTAF with gradient-based optimization. Finally, we introduce Implicit Concept Bottleneck Models (ICBMs), interpretable image models that leverage HTAF to induce discrete feature representations. Extensive experiments across various architectures and image datasets demonstrate that ICBMs enable stable discretization while achieving prediction performance comparable to or better than standard models.
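To make the motivation concrete, the sketch below compares the Heaviside step with a plain scaled sigmoid surrogate and prints the surrogate's gradient near zero and in the tail. This is not HTAF: the abstract does not give HTAF's exact formula, so the scaled sigmoid, the function names, and the steepness parameter `k` are all assumptions chosen only to illustrate the problem that a heavier-tailed gradient is meant to address.

```python
import math


def heaviside(x: float) -> float:
    """Hard step: 1 for x >= 0, else 0. Its derivative is zero almost everywhere."""
    return 1.0 if x >= 0 else 0.0


def sigmoid_surrogate(x: float, k: float = 5.0) -> float:
    """Scaled sigmoid sigma(k * x), a common smooth stand-in for the step.

    Illustrative baseline only (assumed here, NOT the paper's HTAF).
    Written in a numerically stable form for large |k * x|.
    """
    z = k * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_surrogate_grad(x: float, k: float = 5.0) -> float:
    """Derivative of sigma(k * x) with respect to x: k * s * (1 - s)."""
    s = sigmoid_surrogate(x, k)
    return k * s * (1.0 - s)


if __name__ == "__main__":
    # The gradient is largest at x = 0 and shrinks exponentially as |x| grows.
    for x in (0.0, 1.0, 4.0):
        print(x, heaviside(x), sigmoid_surrogate(x), sigmoid_surrogate_grad(x))
```

With this baseline, increasing `k` tightens the approximation to the step but makes the tail gradient vanish faster, so units far from zero receive almost no learning signal. The abstract's claim is that HTAF keeps a large gradient mass near zero while decaying more slowly in the tails.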
