SE Mar 31
Automatic Identification of Parallelizable Loops Using Transformer-Based Source Code Representations
Izavan dos S. Correia, Henrique C. T. Santos, Tiago A. E. Ferreira
Automatic parallelization remains a challenging problem in software engineering, particularly in identifying code regions where loops can be safely executed in parallel on modern multi-core architectures. Traditional static analysis techniques, such as dependence analysis and polyhedral models, often struggle with irregular or dynamically structured code. In this work, we propose a Transformer-based approach to classify the parallelization potential of source code, focusing on distinguishing independent (parallelizable) loops from undefined ones. We adopt DistilBERT to process source code sequences using subword tokenization, enabling the model to capture contextual syntactic and semantic patterns without handcrafted features. The approach is evaluated on a balanced dataset combining synthetically generated loops and manually annotated real-world code, using 10-fold cross-validation and multiple performance metrics. Results show consistently high performance, with mean accuracy above 99% and low false positive rates, demonstrating robustness and reliability. Compared to prior token-based methods, the proposed approach simplifies preprocessing while improving generalization and maintaining computational efficiency. These findings highlight the potential of lightweight Transformer models for practical identification of parallelization opportunities at the loop level.
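To make the notion of an independent loop concrete, the sketch below contrasts a loop whose iterations touch disjoint elements with one that carries a dependence between iterations. It is only an illustration, not the classifier from the abstract: the function names are hypothetical, and running the iterations in reverse order is a dynamic sanity check that can expose a dependence but cannot prove independence.

```python
def run_loop(body, n, order):
    """Apply body(data, i) for each i in the given iteration order."""
    data = list(range(n))
    for i in order:
        body(data, i)
    return data


def independent_body(a, i):
    # Each iteration reads and writes only element i.
    a[i] = a[i] * 2


def carried_body(a, i):
    # Each iteration reads the value written by the previous iteration
    # (a running prefix sum), so the iteration order matters.
    if i > 0:
        a[i] = a[i - 1] + a[i]


def order_insensitive(body, n):
    """Return True if forward and reversed execution give the same result."""
    forward = run_loop(body, n, range(n))
    backward = run_loop(body, n, reversed(range(n)))
    return forward == backward


if __name__ == "__main__":
    print(order_insensitive(independent_body, 5))
    print(order_insensitive(carried_body, 5))
```

A loop that fails this check is certainly not safe to parallelize as written; a loop that passes it still needs a proper dependence argument or, as in the abstract, a learned classification.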
SD Sep 26, 2025
Text-Independent Speaker Identification Using Audio Looping With Margin Based Loss Functions
Elliot Q C Garcia, Nicéias Silva Vilela, Kátia Pires Nascimento do Sacramento et al.
Speaker identification has become a crucial component in various applications, including security systems, virtual assistants, and personalized user experiences. In this paper, we investigate the effectiveness of CosFace Loss and ArcFace Loss for text-independent speaker identification using a Convolutional Neural Network architecture based on the VGG16 model, modified to accommodate mel spectrogram inputs of variable sizes generated from the VoxCeleb1 dataset. We implement both loss functions to analyze their effects on model accuracy and robustness, with the Softmax loss function as a comparative baseline. Additionally, we examine how the sizes of the mel spectrograms and their varying time lengths influence model performance. The experimental results show that the margin-based losses achieve higher identification accuracy than the traditional Softmax loss. Furthermore, we discuss the implications of these findings for future research.
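The difference between the three losses named above lies in how the target-class logit is formed from the cosine between an embedding and its class weight. The sketch below shows the standard formulations: CosFace subtracts a margin from the cosine, ArcFace adds a margin to the angle, and the Softmax baseline applies no margin. The function names and the scale and margin values are illustrative assumptions, not settings reported in the abstract, and the ArcFace branch omits the special handling used in practice when the shifted angle exceeds pi.

```python
import math


def margin_logits(cosines, target, scale=30.0, margin=0.35, kind="cosface"):
    """Scale the cosines, applying the margin to the target class only."""
    logits = []
    for j, c in enumerate(cosines):
        if j == target:
            if kind == "cosface":
                c = c - margin                        # cos(theta) - m
            elif kind == "arcface":
                theta = math.acos(max(-1.0, min(1.0, c)))
                c = math.cos(theta + margin)          # cos(theta + m)
            elif kind != "softmax":                   # softmax: no margin
                raise ValueError(f"unknown kind: {kind}")
        logits.append(scale * c)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single example."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


if __name__ == "__main__":
    cosines = [0.8, 0.1, -0.2]   # hypothetical cosines to three speaker classes
    for kind in ("softmax", "cosface", "arcface"):
        print(kind, cross_entropy(margin_logits(cosines, 0, kind=kind), 0))
```

For the same embedding, the margin variants produce a larger loss than the plain Softmax logits, which is what pushes the network toward tighter, better separated speaker clusters.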
LG Sep 5, 2025
Discovering Software Parallelization Points Using Deep Neural Networks
Izavan dos S. Correia, Henrique C. T. Santos, Tiago A. E. Ferreira
This study proposes a deep learning-based approach for classifying loops in source code according to their potential for parallelization. Two genetic algorithm-based code generators were developed to produce two distinct types of code: (i) independent loops, which are parallelizable, and (ii) ambiguous loops, whose dependencies are unclear, making it impossible to determine whether the loop is parallelizable. The generated code snippets were tokenized and preprocessed to ensure a robust dataset. Two deep learning models, a Deep Neural Network (DNN) and a Convolutional Neural Network (CNN), were implemented to perform the classification. A statistical analysis over 30 independent runs was used to estimate the expected performance of both models. The CNN showed a slightly higher mean performance, but the two models had similar variability. Experiments with varying dataset sizes highlighted the importance of data diversity for model performance. These results demonstrate the feasibility of using deep learning to automate the identification of parallelizable structures in code, offering a promising tool for software optimization and performance improvement.
NE Jan 15, 2021
A New Artificial Neuron Proposal with Trainable Simultaneous Local and Global Activation Function
Tiago A. E. Ferreira, Marios Mattheakis, Pavlos Protopapas
The activation function plays a fundamental role in the artificial neural network learning process. However, there is no obvious choice or procedure to determine the best activation function, which depends on the problem. This study proposes a new artificial neuron, named the global-local neuron, with a trainable activation function composed of two components, one global and one local. Here, the global component is a mathematical function that describes a general feature present across the whole problem domain. The local component is a function that can represent a localized behavior, like a transient or a perturbation. This new neuron can define the importance of each activation function component in the learning phase. Depending on the problem, it results in a purely global, a purely local, or a mixed global and local activation function after the training phase. Here, the trigonometric sine function was employed for the global component and the hyperbolic tangent for the local component. The proposed neuron was tested on problems where the target was a purely global function, a purely local function, or a composition of global and local functions. Two classes of test problems were investigated: regression problems and solving differential equations. The experimental tests demonstrated the superior performance of the Global-Local Neuron network compared with simple neural networks with a sine or hyperbolic tangent activation function, and with a hybrid network that combines these two simple neural networks.
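One way to picture such a neuron is as a weighted sum of a sine term and a hyperbolic tangent term, with the two weights learned alongside the usual synaptic weights. The sketch below assumes exactly that form; the abstract does not give the formula, so the linear combination and the names `w_global` and `w_local` are assumptions made for illustration.

```python
import math


def global_local_activation(z, w_global, w_local):
    """Assumed form: weighted sum of a global (sine) and a local (tanh) term."""
    return w_global * math.sin(z) + w_local * math.tanh(z)


def neuron(inputs, weights, bias, w_global, w_local):
    """Single neuron: affine pre-activation followed by the mixed activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return global_local_activation(z, w_global, w_local)


if __name__ == "__main__":
    z = 0.7
    print(global_local_activation(z, 1.0, 0.0))  # purely global: sin(z)
    print(global_local_activation(z, 0.0, 1.0))  # purely local: tanh(z)
    print(global_local_activation(z, 0.5, 0.5))  # mixed
```

Setting one weight to zero recovers a plain sine or plain tanh unit, which matches the abstract's description of ending up with a purely global, purely local, or mixed activation after training.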
GR-QC Mar 22, 2020
Gravitational Wave Detection and Information Extraction via Neural Networks
Gerson R. Santos, Marcela P. Figueiredo, Antonio de Pádua Santos et al.
The Laser Interferometer Gravitational-Wave Observatory (LIGO) was the first laboratory to measure gravitational waves. An exceptional experimental design was needed to measure distance changes much smaller than the radius of a proton. Likewise, the data analysis needed to confirm a detection and extract information from it is a tremendously hard task. Here, we present a computational procedure based on artificial neural networks to detect a gravitational wave event and extract its ring-down time from the LIGO data. With this proposal, it is possible to build a probabilistic thermometer for gravitational wave detection and obtain physical information about the astronomical body system that created the phenomenon. The ring-down time is determined by a direct measurement on the data, without the need for numerical relativity techniques or high computational power.
SP Jan 16, 2020
Wine quality rapid detection using a compact electronic nose system: application focused on spoilage thresholds by acetic acid
Juan C. Rodriguez Gamboa, Eva Susana Albarracin E., Adenilton J. da Silva et al.
It is crucial for the wine industry to have methods such as electronic nose systems (E-Noses) for real-time monitoring of acetic acid thresholds in wines, to prevent spoilage or determine quality. In this paper, we show that a portable and compact self-developed E-Nose, based on thin-film semiconductor (SnO2) sensors and trained with a deep Multilayer Perceptron (MLP) neural network, can perform early detection of wine spoilage thresholds in routine wine quality control tasks. To obtain rapid, online detection, we propose a rising-window method that operates on raw data to find the early portion of the sensor signals with the best recognition performance. Our approach was compared with the conventional approach employed in E-Noses for gas recognition, which preprocesses the data with feature extraction and selection techniques and then applies a Support Vector Machine (SVM) classifier. The results show that it is possible to classify three wine spoilage levels within 2.7 seconds of the gas injection point, making the methodology 63 times faster than the conventional approach in our experimental setup.
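The rising-window idea can be sketched as a search over growing prefixes of the raw sensor signals, scoring each prefix length and keeping the earliest one with the best score. The code below is a minimal sketch under stated assumptions: windows start at the first sample and grow by a fixed step, and `toy_accuracy` is a hypothetical stand-in for training and validating the MLP, which the abstract does not detail.

```python
def best_rising_window(signals, labels, score_fn, start, step):
    """Score growing prefixes of every signal; return (window_end, score).

    The strict comparison keeps the earliest window when scores tie.
    """
    shortest = min(len(s) for s in signals)
    best_end, best_score = None, float("-inf")
    end = start
    while end <= shortest:
        windows = [s[:end] for s in signals]   # early portion of each signal
        score = score_fn(windows, labels)
        if score > best_score:
            best_end, best_score = end, score
        end += step
    return best_end, best_score


def toy_accuracy(windows, labels):
    """Hypothetical scorer: predict class 1 when the window mean exceeds 0.5."""
    preds = [1 if sum(w) / len(w) > 0.5 else 0 for w in windows]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


if __name__ == "__main__":
    signals = [[0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 1, 1]]  # made-up sensor traces
    labels = [0, 1]
    print(best_rising_window(signals, labels, toy_accuracy, start=2, step=2))
```

In a real setting the scorer would be a cross-validated classifier, and the selected window length, multiplied by the sampling period, gives the detection time after gas injection.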