CVAug 20, 2024
Hierarchical Attention Diffusion Networks with Object Priors for Video Change DetectionAndrew Kiruluta, Eric Lundy, Andreas Lemos
We present a unified change detection pipeline that combines instance level masking, multi\-scale attention within a denoising diffusion model, and per pixel semantic classification, all refined via SSIM to match human perception. By first isolating only temporally novel objects with Mask R\-CNN, then guiding diffusion updates through hierarchical cross attention to object and global contexts, and finally categorizing each pixel into one of C change types, our method delivers detailed, interpretable multi\-class maps. It outperforms traditional differencing, Siamese CNNs, and GAN\-based detectors by 10\-25 points in F1 and IoU on both synthetic and real world benchmarks, marking a new state of the art in remote sensing change detection.
LGAug 19, 2024
Unsupervised Machine Learning Hybrid Approach Integrating Linear Programming in Loss Function: A Robust Optimization TechniqueAndrew Kiruluta, Andreas Lemos
This paper presents a novel hybrid approach that integrates linear programming (LP) within the loss function of an unsupervised machine learning model. By leveraging the strengths of both optimization techniques and machine learning, this method introduces a robust framework for solving complex optimization problems where traditional methods may fall short. The proposed approach encapsulates the constraints and objectives of a linear programming problem directly into the loss function, guiding the learning process to adhere to these constraints while optimizing the desired outcomes. This technique not only preserves the interpretability of linear programming but also benefits from the flexibility and adaptability of machine learning, making it particularly well-suited for unsupervised or semi-supervised learning scenarios.
CVApr 4, 2025
A Hybrid Wavelet-Fourier Method for Next-Generation Conditional Diffusion ModelsAndrew Kiruluta, Andreas Lemos
We present a novel generative modeling framework,Wavelet-Fourier-Diffusion, which adapts the diffusion paradigm to hybrid frequency representations in order to synthesize high-quality, high-fidelity images with improved spatial localization. In contrast to conventional diffusion models that rely exclusively on additive noise in pixel space, our approach leverages a multi-transform that combines wavelet sub-band decomposition with partial Fourier steps. This strategy progressively degrades and then reconstructs images in a hybrid spectral domain during the forward and reverse diffusion processes. By supplementing traditional Fourier-based analysis with the spatial localization capabilities of wavelets, our model can capture both global structures and fine-grained features more effectively. We further extend the approach to conditional image generation by integrating embeddings or conditional features via cross-attention. Experimental evaluations on CIFAR-10, CelebA-HQ, and a conditional ImageNet subset illustrate that our method achieves competitive or superior performance relative to baseline diffusion models and state-of-the-art GANs, as measured by Fréchet Inception Distance (FID) and Inception Score (IS). We also show how the hybrid frequency-based representation improves control over global coherence and fine texture synthesis, paving the way for new directions in multi-scale generative modeling.
AIFeb 14, 2025
A Self-Supervised Reinforcement Learning Approach for Fine-Tuning Large Language Models Using Cross-Attention SignalsAndrew Kiruluta, Andreas Lemos, Priscilla Burity
We propose a novel reinforcement learning framework for post training large language models that does not rely on human in the loop feedback. Instead, our approach uses cross attention signals within the model itself to derive a self supervised reward, thereby guiding iterative fine tuning of the model policy. By analyzing how the model attends to the input prompt during generation, we construct measures of prompt coverage, focus, and coherence. We then use these measures to rank or score candidate responses, providing a reward signal that encourages the model to produce well aligned, on topic text. In empirical comparisons against standard policy gradient methods and RL fine tuning with synthetic preference models, our method shows significant gains in prompt relevance and consistency over a non RL baseline. While it does not yet match the performance of fully human supervised RLHF systems, it highlights an important direction for scaling alignment with minimal human labeling. We provide a detailed analysis, discuss potential limitations, and outline future work for combining cross-attention based signals with smaller amounts of human feedback.
LGMar 16, 2025
State Fourier Diffusion Language Model (SFDLM): A Scalable, Novel Iterative Approach to Language ModelingAndrew Kiruluta, Andreas Lemos
In recent years, diffusion based methods have emerged as a powerful paradigm for generative modeling. Although discrete diffusion for natural language processing has been explored to a lesser extent, it shows promise for tasks requiring iterative denoising of token based data. In standard approaches to text generation, transformers dominate, but their reliance on self attention often incurs high computational costs. This paper introduces a fully diffusion driven discrete text generation model built without any transformer or large convolution modules. Instead, the model integrates structured state space dynamics in the time domain with a novel Complex Fourier Multi Layer Perceptron module that operates in the frequency domain. The forward noising process randomly samples the vocabulary to replace tokens with a controlled probability, while the learned reverse model systematically reverts corrupted sequences toward their original states. By composing local state space updates with global Fourier based mixing, the approach effectively captures both short and long range dependencies.
LGJul 27, 2025
Beyond Neural Networks: Symbolic Reasoning over Wavelet Logic Graph SignalsAndrew Kiruluta, Andreas Lemos, Priscilla Burity
We present a fully non neural learning framework based on Graph Laplacian Wavelet Transforms (GLWT). Unlike traditional architectures that rely on convolutional, recurrent, or attention based neural networks, our model operates purely in the graph spectral domain using structured multiscale filtering, nonlinear shrinkage, and symbolic logic over wavelet coefficients. Signals defined on graph nodes are decomposed via GLWT, modulated with interpretable nonlinearities, and recombined for downstream tasks such as denoising and token classification. The system supports compositional reasoning through a symbolic domain-specific language (DSL) over graph wavelet activations. Experiments on synthetic graph denoising and linguistic token graphs demonstrate competitive performance against lightweight GNNs with far greater transparency and efficiency. This work proposes a principled, interpretable, and resource-efficient alternative to deep neural architectures for learning on graphs.
LGJul 27, 2025
Operator-Based Machine Intelligence: A Hilbert Space Framework for Spectral Learning and Symbolic ReasoningAndrew Kiruluta, Andreas Lemos, Priscilla Burity
Traditional machine learning models, particularly neural networks, are rooted in finite-dimensional parameter spaces and nonlinear function approximations. This report explores an alternative formulation where learning tasks are expressed as sampling and computation in infinite dimensional Hilbert spaces, leveraging tools from functional analysis, signal processing, and spectral theory. We review foundational concepts such as Reproducing Kernel Hilbert Spaces (RKHS), spectral operator learning, and wavelet-domain representations. We present a rigorous mathematical formulation of learning in Hilbert spaces, highlight recent models based on scattering transforms and Koopman operators, and discuss advantages and limitations relative to conventional neural architectures. The report concludes by outlining directions for scalable and interpretable machine learning grounded in Hilbertian signal processing.
CLJun 8, 2025
History-Aware Cross-Attention Reinforcement: Self-Supervised Multi Turn and Chain-of-Thought Fine-Tuning with vLLMAndrew Kiruluta, Andreas Lemos, Priscilla Burity
We present CAGSR-vLLM-MTC, an extension of our Self-Supervised Cross-Attention-Guided Reinforcement (CAGSR) framework, now implemented on the high-performance vLLM runtime, to address both multi-turn dialogue and chain-of-thought reasoning. Building upon our original single-turn approach, we first instrumented vLLM's C++/CUDA kernels to asynchronously capture per-layer, per-head cross-attention weights during generation. We then generalized our self-supervised reward function to accumulate attention signals over entire conversation histories and intermediate chain-of-thought steps. We discuss practical trade-offs, including an entropy-based clamping mechanism to prevent attention collapse on early context, and outline future directions for multi-party dialogues and hierarchical reasoning.
LGMar 4, 2025
FourierNAT: A Fourier-Mixing-Based Non-Autoregressive Transformer for Parallel Sequence GenerationAndrew Kiruluta, Eric Lundy, Andreas Lemos
We present FourierNAT, a novel non-autoregressive Transformer (NAT) architecture that employs Fourier-based mixing in the decoder to generate output sequences in parallel. While traditional NAT approaches often face challenges with capturing global dependencies, our method leverages a discrete Fourier transform to mix token embeddings across the entire sequence dimension, coupled with learned frequency-domain gating. This allows the model to efficiently propagate context without explicit autoregressive steps. Empirically, FourierNAT achieves competitive results against leading NAT baselines on standard benchmarks like WMT machine translation and CNN/DailyMail summarization, providing significant speed advantages over autoregressive Transformers. We further demonstrate that learned frequency-domain parameters allow the model to adaptively focus on long-range or short-range dependencies, partially mitigating the well-known coherence gaps in one-pass NAT generation. Overall, FourierNAT highlights the potential of integrating spectral-domain operations to accelerate and improve parallel text generation. This approach can potentially provide great computational and time savings in inference tasks LLMs.
CLNov 25, 2021
Beyond Self Attention: A Subquadratic Fourier Wavelet Transformer with Multi Modal FusionAndrew Kiruluta, Andreas Lemos, Eric Lundy
We revisit the use of spectral techniques to replaces the attention mechanism in Transformers through Fourier Transform based token mixing, and present a comprehensive and novel reformulation of this technique in next generation transformer models. We provide expanded literature context, detailed mathematical formulations of Fourier mixing and causal masking, and introduce a novel MultiDomain Fourier Wavelet Attention(MDFWA) that integrates frequency and time localized transforms to capture both global and local dependencies efficiently. We derive the complexity bounds, gradient formulas, and show that MDFWA achieves sub quadratic time and memory cost while improving expressive power. We validate our design on an abstractive summarization task using PubMed dataset, by enhancing the proposed approach with learned frequency bases, adaptive scale selection, and multi-modal extensions.