LG May 25
Classification and detection of multiple UAVs using rational Gaussian wavelet neural networks
Ungvári Gergő, Ferenc Braun, Attila Ámon et al.
The detection of unmanned aerial vehicles (UAVs) is important for the protection of civilian and military infrastructure. In this paper we propose a cost-effective UAV detection system using sound signals obtained from microphones. The recorded signals are passed through a signal processing pipeline which employs interpretable adaptive feature extractors using so-called rational Gaussian wavelets. These adaptive wavelet transformations are embedded into and trained together with an underlying small neural network which detects and classifies UAVs based on the obtained features. This leads to a physically interpretable machine learning algorithm that, in addition to classifying UAVs, is also capable of detecting UAV swarms. We demonstrate our results using data collected in indoor studio and noisy outdoor environments. We conclude that the proposed method outperforms traditional machine learning approaches for detecting and classifying single UAVs as well as drone swarms, while retaining a high degree of interpretability. Our implementation of the proposed methods is made publicly available for reproducibility.
LG Jul 4, 2023
Learning ECG Signal Features Without Backpropagation Using Linear Laws
Péter Pósfay, Marcell T. Kurbucz, Péter Kovács et al.
This paper introduces LLT-ECG, a novel method for electrocardiogram (ECG) signal classification that leverages concepts from theoretical physics to automatically generate features from time series data. Unlike traditional deep learning approaches, LLT-ECG operates in a forward manner, eliminating the need for backpropagation and hyperparameter tuning. By identifying linear laws that capture shared patterns within specific classes, the proposed method constructs a compact and verifiable representation, enhancing the effectiveness of downstream classifiers. We demonstrate LLT-ECG's state-of-the-art performance on real-world ECG datasets from PhysioNet, underscoring its potential for medical applications where speed and verifiability are crucial.
ML Feb 3, 2025
Rational Gaussian wavelets and corresponding model driven neural networks
Attila Miklós Ámon, Kristian Fenech, Péter Kovács et al.
In this paper we consider the continuous wavelet transform using Gaussian wavelets multiplied by an appropriate rational term. The zeros and poles of this rational modifier act as free parameters and their choice highly influences the shape of the mother wavelet. This allows the proposed construction to approximate signals with complex morphology using only a few wavelet coefficients. We show that the proposed rational Gaussian wavelets are admissible and provide numerical approximations of the wavelet coefficients using variable projection operators. In addition, we show how the proposed variable projection based rational Gaussian wavelet transform can be used in neural networks to obtain a highly interpretable feature learning layer. We demonstrate the effectiveness of the proposed scheme through a biomedical application, namely, the detection of ventricular ectopic beats (VEBs) in real ECG measurements.
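To make the role of the zeros and poles concrete, here is a minimal sketch of a Gaussian multiplied by a rational factor. The parametrization is an assumption chosen for illustration (real zeros, complex poles paired with their conjugates so the denominator stays positive on the real line); the paper's exact construction and normalization may differ, and the function name `rational_gaussian` is hypothetical.

```python
import numpy as np

def rational_gaussian(t, zeros, poles):
    """Gaussian window times a rational factor (illustrative form only).

    zeros: real zeros of the numerator
    poles: complex poles with nonzero imaginary part; each contributes
           |t - p|^2, so the denominator is real and positive for real t
    """
    t = np.asarray(t, dtype=float)
    num = np.ones_like(t)
    for z in zeros:
        num = num * (t - z)
    den = np.ones_like(t)
    for p in poles:
        p = complex(p)
        den = den * ((t - p.real) ** 2 + p.imag ** 2)
    return num / den * np.exp(-t ** 2 / 2)

# Symmetric grid; one zero at the origin and one pole on the imaginary axis
# give an odd function, hence a zero-mean candidate mother wavelet.
t = np.linspace(-8.0, 8.0, 4001)
dt = t[1] - t[0]
psi = rational_gaussian(t, zeros=[0.0], poles=[1.0j])
mean_estimate = np.sum(psi) * dt  # Riemann-sum estimate of the integral

# Moving the pole closer to the real axis sharpens the waveform.
psi_sharp = rational_gaussian(t, zeros=[0.0], poles=[0.2j])
```

In this toy setting the zero fixes where the waveform crosses the axis, while the imaginary part of the pole controls how peaked it is, which is the kind of shape control the abstract describes.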
LG Apr 18, 2025
Transformer Encoder and Multi-features Time2Vec for Financial Prediction
Nguyen Kim Hai Bui, Nguyen Duy Chien, Péter Kovács et al.
Financial prediction is a complex and challenging task in time series analysis and signal processing, as it is expected to model both short-term fluctuations and long-term temporal dependencies. Transformers have achieved remarkable success, mostly in natural language processing, using the attention mechanism, which has also influenced the time series community. The ability to capture both short- and long-range dependencies helps to understand the financial market and to recognize price patterns, leading to successful applications of Transformers in stock prediction. However, previous research predominantly focuses on individual features and singular predictions, which limits the model's ability to understand broader market trends. In reality, within sectors such as finance and technology, companies belonging to the same industry often exhibit correlated stock price movements. In this paper, we develop a novel neural network architecture by integrating Time2Vec with the Encoder of the Transformer model. Based on the study of different markets, we propose a novel correlation feature selection method. Through a comprehensive fine-tuning of multiple hyperparameters, we conduct a comparative analysis of our results against benchmark models. We conclude that our method outperforms other state-of-the-art encoding methods such as positional encoding, and that selecting correlation features enhances the accuracy of predicting multiple stock prices.
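For readers unfamiliar with Time2Vec, the following sketch shows the commonly cited formulation: one linear component plus k sinusoidal components of the time value. It is not the authors' implementation; the frequencies `omega` and phases `phi` would be learned in a real model and are random placeholders here, and the function name `time2vec` is hypothetical.

```python
import numpy as np

def time2vec(tau, omega, phi):
    """Time2Vec-style embedding of scalar time values.

    tau:   shape (T,)    time values
    omega: shape (k+1,)  frequencies (placeholders for learned weights)
    phi:   shape (k+1,)  phase shifts (placeholders for learned weights)
    Returns shape (T, k+1): column 0 is the linear term omega[0]*tau + phi[0],
    columns 1..k are sin(omega[i]*tau + phi[i]).
    """
    tau = np.asarray(tau, dtype=float)
    z = tau[:, None] * omega[None, :] + phi[None, :]
    out = np.sin(z)
    out[:, 0] = z[:, 0]  # keep the first component linear (non-periodic)
    return out

rng = np.random.default_rng(0)
k = 4
omega = rng.normal(size=k + 1)
phi = rng.normal(size=k + 1)
steps = np.arange(10)
emb = time2vec(steps, omega, phi)  # one (k+1)-dimensional vector per time step
```

The resulting vectors can be concatenated with, or added to, the per-step input features before the encoder, in the same place a positional encoding would normally go.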
GR Mar 12, 2025
Hybrid Rendering for Multimodal Autonomous Driving: Merging Neural and Physics-Based Simulation
Máté Tóth, Péter Kovács, Zoltán Bendefy et al.
Neural reconstruction models for autonomous driving simulation have made significant strides in recent years, with dynamic models becoming increasingly prevalent. However, these models are typically limited to handling in-domain objects closely following their original trajectories. We introduce a hybrid approach that combines the strengths of neural reconstruction with physics-based rendering. This method enables the virtual placement of traditional mesh-based dynamic agents at arbitrary locations, adjustments to environmental conditions, and rendering from novel camera viewpoints. Our approach significantly enhances novel view synthesis quality -- especially for road surfaces and lane markings -- while maintaining interactive frame rates through our novel training method, NeRF2GS. This technique leverages the superior generalization capabilities of NeRF-based methods and the real-time rendering speed of 3D Gaussian Splatting (3DGS). We achieve this by training a customized NeRF model on the original images with depth regularization derived from a noisy LiDAR point cloud, then using it as a teacher model for 3DGS training. This process ensures accurate depth, surface normals, and camera appearance modeling as supervision. With our block-based training parallelization, the method can handle large-scale reconstructions (greater than or equal to 100,000 square meters) and predict segmentation masks, surface normals, and depth maps. During simulation, it supports a rasterization-based rendering backend with depth-based composition and multiple camera models for real-time camera simulation, as well as a ray-traced backend for precise LiDAR simulation.
LG Jun 28, 2020
VPNet: Variable Projection Networks
Péter Kovács, Gergő Bognár, Christian Huber et al.
We introduce VPNet, a novel model-driven neural network architecture based on variable projection (VP). Applying VP operators to neural networks results in learnable features, interpretable parameters, and compact network structures. This paper discusses the motivation and mathematical background of VPNet and presents experiments. The VPNet approach was evaluated in the context of signal processing, where we classified a synthetic dataset and real electrocardiogram (ECG) signals. Compared to fully connected and one-dimensional convolutional networks, VPNet offers fast learning ability and good accuracy at a low computational cost of both training and inference. Based on these advantages and the promising results obtained, we anticipate a profound impact on the broader field of signal processing, in particular on classification, regression and clustering problems.
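As a rough illustration of what a variable projection operator computes, the sketch below fits a signal with a basis that depends nonlinearly on a few parameters and solves for the linear coefficients in closed form by least squares. The exponential basis and the names `basis` and `vp_project` are assumptions made for this example, not VPNet's actual function system or API.

```python
import numpy as np

def basis(theta, t):
    """Hypothetical nonlinear basis: one decaying exponential per entry of theta."""
    return np.exp(-np.outer(t, theta))  # shape (len(t), len(theta))

def vp_project(theta, t, y):
    """For fixed theta, return the least-squares coefficients, the projection
    of y onto the span of the basis, and the residual."""
    Phi = basis(theta, t)
    coeffs, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # c = pinv(Phi) @ y
    approx = Phi @ coeffs
    return coeffs, approx, y - approx

# Synthetic signal generated from known nonlinear and linear parameters.
t = np.linspace(0.0, 2.0, 50)
true_theta = np.array([1.0, 4.0])
y = basis(true_theta, t) @ np.array([2.0, -1.0])

coeffs, approx, residual = vp_project(true_theta, t, y)
_, _, residual_bad = vp_project(np.array([0.5, 6.0]), t, y)
```

Because the linear coefficients are eliminated this way, only the few nonlinear parameters (here `theta`) remain to be trained, and either the coefficients or the residual can serve as compact features for later layers.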