LG AR | Mar 25

TsetlinWiSARD: On-Chip Training of Weightless Neural Networks using Tsetlin Automata on FPGAs

Shengyu Duan, Marcos L. L. Sartori, Rishad Shafik, Alex Yakovlev

arXiv:2603.24186 | 8.1 | h-index: 4

AI Analysis

This work addresses the need for efficient, on-chip training of ML algorithms at the edge, offering a novel method that improves accuracy and hardware efficiency for WNNs, though it is incremental in advancing existing WNN techniques.

The paper tackles overfitting and inefficiency in training Weightless Neural Networks (WNNs) on-chip. It proposes TsetlinWiSARD, which uses Tsetlin Automata for probabilistic, feedback-driven learning. The authors report over 1000x faster training than the traditional WiSARD implementation, along with 22% reduced resource usage, 93.3% lower latency, and 64.2% lower power consumption compared to FPGA-based training accelerators implementing other ML algorithms.

Increasing demands for adaptability, privacy, and security at the edge have persistently pushed the frontiers for a new generation of machine learning (ML) algorithms with on-chip training and inference capabilities. The Weightless Neural Network (WNN) is one such algorithm, built on simple neuron structures based on lookup tables. As a result, it offers architectural benefits, such as low-latency, low-complexity inference, compared to deep neural networks that depend heavily on multiply-accumulate operations. However, traditional WNNs rely on memorization-based one-shot training, which either leads to overfitting and reduced accuracy or requires tedious post-training adjustments, limiting their effectiveness for efficient on-chip training. In this work, we propose TsetlinWiSARD, a training approach for WNNs that leverages Tsetlin Automata (TAs) to enable probabilistic, feedback-driven learning. It overcomes the overfitting of WiSARD's one-shot training with iterative optimization, while maintaining simple, continuous binary feedback for efficient on-chip training. Central to our approach is a field programmable gate array (FPGA)-based training architecture that delivers state-of-the-art accuracy while significantly improving hardware efficiency. Our approach provides over 1000x faster training when compared with the traditional WiSARD implementation of WNNs. Further, we demonstrate 22% reduced resource usage, 93.3% lower latency, and 64.2% lower power consumption compared to FPGA-based training accelerators implementing other ML algorithms.
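To make the contrast in the abstract concrete, here is a minimal Python sketch of a one-shot (memorizing) lookup table next to a lookup table whose entries are two-action Tsetlin Automata. It is an illustration under stated assumptions, not the paper's method: the class names, the `update` rule, and the state count `n` are hypothetical, and the feedback here is deterministic, whereas the abstract describes probabilistic feedback whose details are not given on this page.

```python
class TsetlinAutomaton:
    """Two-action Tsetlin automaton with 2*n states.

    States 1..n select action 0; states n+1..2n select action 1.
    """

    def __init__(self, n=3):
        self.n = n
        self.state = n  # start next to the decision boundary, on the action-0 side

    def action(self):
        return 1 if self.state > self.n else 0

    def update(self, target):
        # Hypothetical feedback rule (an assumption, not taken from the paper):
        # reward when the current action matches the target bit, penalize otherwise.
        if self.action() == target:
            # Reward: move deeper into the current action's half (saturating).
            if target == 1:
                self.state = min(self.state + 1, 2 * self.n)
            else:
                self.state = max(self.state - 1, 1)
        else:
            # Penalty: take one step toward the other action.
            self.state += 1 if target == 1 else -1


def lut_address(bits):
    """Pack a tuple of input bits into a lookup-table address (MSB first)."""
    addr = 0
    for b in bits:
        addr = (addr << 1) | b
    return addr


class OneShotLUT:
    """WiSARD-style RAM node: any address seen during training is set to 1."""

    def __init__(self, k):
        self.mem = [0] * (1 << k)

    def train(self, bits):
        self.mem[lut_address(bits)] = 1

    def read(self, bits):
        return self.mem[lut_address(bits)]


class TsetlinLUT:
    """RAM node whose entries are Tsetlin automata instead of single bits."""

    def __init__(self, k, n=3):
        self.tas = [TsetlinAutomaton(n) for _ in range(1 << k)]

    def feedback(self, bits, target):
        self.tas[lut_address(bits)].update(target)

    def read(self, bits):
        return self.tas[lut_address(bits)].action()


pattern = [1, 0, 1]  # sub-pattern that genuinely belongs to the class
noise = [0, 1, 1]    # sub-pattern that appears once by accident

one_shot = OneShotLUT(3)
one_shot.train(pattern)
one_shot.train(noise)  # a single outlier is memorized permanently

ta_lut = TsetlinLUT(3, n=3)
for _ in range(3):  # repeated, consistent feedback
    ta_lut.feedback(pattern, 1)
    ta_lut.feedback(noise, 0)
ta_lut.feedback(noise, 1)  # one outlier only nudges the automaton's state
```

In this toy setup the one-shot table fires on the outlier after seeing it once, while the automaton-based entry keeps its earlier decision until it receives repeated contradicting feedback. That is the intuition behind replacing memorization with iterative, feedback-driven updates; the actual TsetlinWiSARD update rule and its FPGA mapping are described in the paper itself.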
