Dat Thanh Tran

CV
22papers
579citations
Novelty51%
AI Score47

22 Papers

96.9NEJun 2Code
Beyond Static Priors: Dynamic Neural Guidance for Large-Scale Ant Colony Optimization

Dat Thanh Tran, Van Khu Vu, Yining Ma

Neural-guided Ant Colony Optimization (ACO) suffers from a fundamental training-inference misalignment: policies are typically trained to generate static priors (e.g., heatmaps), yet deployed to guide iterative, long-horizon search processes. In this paper, we present DyNACO, a novel framework that achieves dynamic neural guidance by periodically observing the pheromone distribution and the incumbent solution. To make DyNACO tractable at scale, we pair the policy with a perturbation-based ACO backend and a scope-restricted refinement mechanism that jointly ensure efficacy and stable credit assignment. On TSP, DyNACO scales to 100,000-node instances and outperforms neural baselines while often reducing total runtime compared to the unguided solver. We extend DyNACO to CVRP via a capacity-aware backend, consistently improving the unguided baseline with less than 1% neural overhead. We further provide in-depth analysis validating the model's generalization capabilities and elucidating why dynamic guidance outperforms static priors. Our work underscores the necessity of aligning neural training with iterative search dynamics in learning-guided optimization. The code is available at https://github.com/shoraaa/DyNACO.

CVNov 1, 2022
Recognition of Defective Mineral Wool Using Pruned ResNet Models

Mehdi Rafiei, Dat Thanh Tran, Alexandros Iosifidis

Mineral wool production is a non-linear process that makes it hard to control the final quality. Therefore, having a non-destructive method to analyze the product quality and recognize defective products is critical. For this purpose, we developed a visual quality control system for mineral wool. X-ray images of wool specimens were collected to create a training set of defective and non-defective samples. Afterward, we developed several recognition models based on the ResNet architecture to find the most efficient model. In order to have a light-weight and fast inference model for real-life applicability, two structural pruning methods are applied to the classifiers. Considering the low quantity of the dataset, cross-validation and augmentation methods are used during the training. As a result, we obtained a model with more than 98% accuracy, which in comparison to the current procedure used at the company, it can recognize 20% more defective products.

LGJul 23, 2022
Augmented Bilinear Network for Incremental Multi-Stock Time-Series Classification

Mostafa Shabani, Dat Thanh Tran, Juho Kanniainen et al.

Deep Learning models have become dominant in tackling financial time-series analysis problems, overturning conventional machine learning and statistical methods. Most often, a model trained for one market or security cannot be directly applied to another market or security due to differences inherent in the market conditions. In addition, as the market evolves through time, it is necessary to update the existing models or train new ones when new data is made available. This scenario, which is inherent in most financial forecasting applications, naturally raises the following research question: How to efficiently adapt a pre-trained model to a new set of data while retaining performance on the old data, especially when the old data is not accessible? In this paper, we propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities and adapt it to achieve high performance in new ones. In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed, and this knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data. The auxiliary connections are constrained to be of low rank. This not only allows us to rapidly optimize for the new task but also reduces the storage and run-time complexity during the deployment phase. The efficiency of our approach is empirically validated in the stock mid-price movement prediction problem using a large-scale limit order book dataset. Experimental results show that our approach enhances prediction performance as well as reduces the overall number of network parameters.

LGJul 4, 2022
Variational Neural Networks

Illia Oleksiienko, Dat Thanh Tran, Alexandros Iosifidis

Bayesian Neural Networks (BNNs) provide a tool to estimate the uncertainty of a neural network by considering a distribution over weights and sampling different models for each input. In this paper, we propose a method for uncertainty estimation in neural networks which, instead of considering a distribution over weights, samples outputs of each layer from a corresponding Gaussian distribution, parametrized by the predictions of mean and variance sub-layers. In uncertainty quality estimation experiments, we show that the proposed method achieves better uncertainty quality than other single-bin Bayesian Model Averaging methods, such as Monte Carlo Dropout or Bayes By Backpropagation methods.

LGOct 2, 2023
Cryptocurrency Portfolio Optimization by Neural Networks

Quoc Minh Nguyen, Dat Thanh Tran, Juho Kanniainen et al.

Many cryptocurrency brokers nowadays offer a variety of derivative assets that allow traders to perform hedging or speculation. This paper proposes an effective algorithm based on neural networks to take advantage of these investment products. The proposed algorithm constructs a portfolio that contains a pair of negatively correlated assets. A deep neural network, which outputs the allocation weight of each asset at a time interval, is trained to maximize the Sharpe ratio. A novel loss term is proposed to regulate the network's bias towards a specific asset, thus enforcing the network to learn an allocation strategy that is close to a minimum variance strategy. Extensive experiments were conducted using data collected from Binance spanning 19 months to evaluate the effectiveness of our approach. The backtest results show that the proposed algorithm can produce neural networks that are able to make profits in different market situations.

NESep 21, 2025
NeuFACO: Neural Focused Ant Colony Optimization for Traveling Salesman Problem

Dat Thanh Tran, Khai Quang Tran, Khoi Anh Pham et al.

This study presents Neural Focused Ant Colony Optimization (NeuFACO), a non-autoregressive framework for the Traveling Salesman Problem (TSP) that combines advanced reinforcement learning with enhanced Ant Colony Optimization (ACO). NeuFACO employs Proximal Policy Optimization (PPO) with entropy regularization to train a graph neural network for instance-specific heuristic guidance, which is integrated into an optimized ACO framework featuring candidate lists, restricted tour refinement, and scalable local search. By leveraging amortized inference alongside ACO stochastic exploration, NeuFACO efficiently produces high-quality solutions across diverse TSP instances.

LGJan 14, 2022
Multi-head Temporal Attention-Augmented Bilinear Network for Financial time series prediction

Mostafa Shabani, Dat Thanh Tran, Martin Magris et al.

Financial time-series forecasting is one of the most challenging domains in the field of time-series analysis. This is mostly due to the highly non-stationary and noisy nature of financial time-series data. With progressive efforts of the community to design specialized neural networks incorporating prior domain knowledge, many financial analysis and forecasting problems have been successfully tackled. The temporal attention mechanism is a neural layer design that recently gained popularity due to its ability to focus on important temporal events. In this paper, we propose a neural layer based on the ideas of temporal attention and multi-head attention to extend the capability of the underlying neural network in focusing simultaneously on multiple temporal instances. The effectiveness of our approach is validated using large-scale limit-order book market data to forecast the direction of mid-price movements. Our experiments show that the use of multi-head temporal attention modules leads to enhanced prediction performances compared to baseline models.

CVSep 2, 2021
Remote Multilinear Compressive Learning with Adaptive Compression

Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis

Multilinear Compressive Learning (MCL) is an efficient signal acquisition and learning paradigm for multidimensional signals. The level of signal compression affects the detection or classification performance of a MCL model, with higher compression rates often associated with lower inference accuracy. However, higher compression rates are more amenable to a wider range of applications, especially those that require low operating bandwidth and minimal energy consumption such as Internet-of-Things (IoT) applications. Many communication protocols provide support for adaptive data transmission to maximize the throughput and minimize energy consumption. By developing compressive sensing and learning models that can operate with an adaptive compression rate, we can maximize the informational content throughput of the whole application. In this paper, we propose a novel optimization scheme that enables such a feature for MCL models. Our proposal enables practical implementation of adaptive compressive signal acquisition and inference systems. Experimental results demonstrated that the proposed approach can significantly reduce the amount of computations required during the training phase of remote learning systems but also improve the informational content throughput via adaptive-rate sensing.

STSep 1, 2021
Bilinear Input Normalization for Neural Networks in Financial Forecasting

Dat Thanh Tran, Juho Kanniainen, Moncef Gabbouj et al.

Data normalization is one of the most important preprocessing steps when building a machine learning model, especially when the model of interest is a deep neural network. This is because deep neural network optimized with stochastic gradient descent is sensitive to the input variable range and prone to numerical issues. Different than other types of signals, financial time-series often exhibit unique characteristics such as high volatility, non-stationarity and multi-modality that make them challenging to work with, often requiring expert domain knowledge for devising a suitable processing pipeline. In this paper, we propose a novel data-driven normalization method for deep neural networks that handle high-frequency financial time-series. The proposed normalization scheme, which takes into account the bimodal characteristic of financial multivariate time-series, requires no expert knowledge to preprocess a financial time-series since this step is formulated as part of the end-to-end optimization process. Our experiments, conducted with state-of-the-arts neural networks and high-frequency data from two large-scale limit order books coming from the Nordic and US markets, show significant improvements over other normalization techniques in forecasting future stock price dynamics.

CVMar 31, 2021
Knowledge Distillation By Sparse Representation Matching

Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis

Knowledge Distillation refers to a class of methods that transfers the knowledge from a teacher network to a student network. In this paper, we propose Sparse Representation Matching (SRM), a method to transfer intermediate knowledge obtained from one Convolutional Neural Network (CNN) to another by utilizing sparse representation learning. SRM first extracts sparse representations of the hidden features of the teacher CNN, which are then used to generate both pixel-level and image-level labels for training intermediate feature maps of the student network. We formulate SRM as a neural processing block, which can be efficiently optimized using stochastic gradient descent and integrated into any CNN in a plug-and-play manner. Our experiments demonstrate that SRM is robust to architectural differences between the teacher and student networks, and outperforms other KD techniques across several datasets.

CVSep 22, 2020
Performance Indicator in Multilinear Compressive Learning

Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis

Recently, the Multilinear Compressive Learning (MCL) framework was proposed to efficiently optimize the sensing and learning steps when working with multidimensional signals, i.e. tensors. In Compressive Learning in general, and in MCL in particular, the number of compressed measurements captured by a compressive sensing device characterizes the storage requirement or the bandwidth requirement for transmission. This number, however, does not completely characterize the learning performance of a MCL system. In this paper, we analyze the relationship between the input signal resolution, the number of compressed measurements and the learning performance of MCL. Our empirical analysis shows that the reconstruction error obtained at the initialization step of MCL strongly correlates with the learning performance, thus can act as a good indicator to efficiently characterize learning performances obtained from different sensor configurations without optimizing the entire system.

LGMay 25, 2020
Attention-based Neural Bag-of-Features Learning for Sequence Data

Dat Thanh Tran, Nikolaos Passalis, Anastasios Tefas et al.

In this paper, we propose 2D-Attention (2DA), a generic attention formulation for sequence data, which acts as a complementary computation block that can detect and focus on relevant sources of information for the given learning objective. The proposed attention module is incorporated into the recently proposed Neural Bag of Feature (NBoF) model to enhance its learning capacity. Since 2DA acts as a plug-in layer, injecting it into different computation stages of the NBoF model results in different 2DA-NBoF architectures, each of which possesses a unique interpretation. We conducted extensive experiments in financial forecasting, audio analysis as well as medical diagnosis problems to benchmark the proposed formulations in comparison with existing methods, including the widely used Gated Recurrent Units. Our empirical analysis shows that the proposed attention formulations can not only improve performances of NBoF models but also make them resilient to noisy data.

CVFeb 17, 2020
Multilinear Compressive Learning with Prior Knowledge

Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis

The recently proposed Multilinear Compressive Learning (MCL) framework combines Multilinear Compressive Sensing and Machine Learning into an end-to-end system that takes into account the multidimensional structure of the signals when designing the sensing and feature synthesis components. The key idea behind MCL is the assumption of the existence of a tensor subspace which can capture the essential features from the signal for the downstream learning task. Thus, the ability to find such a discriminative tensor subspace and optimize the system to project the signals onto that data manifold plays an important role in Multilinear Compressive Learning. In this paper, we propose a novel solution to address both of the aforementioned requirements, i.e., How to find those tensor subspaces in which the signals of interest are highly separable? and How to optimize the sensing and feature synthesis components to transform the original signals to the data manifold found in the first question? In our proposal, the discovery of a high-quality data manifold is conducted by training a nonlinear compressive learning system on the inference task. Its knowledge of the data manifold of interest is then progressively transferred to the MCL components via multi-stage supervised training with the supervisory information encoding how the compressed measurements, the synthesized features, and the predictions should be like. The proposed knowledge transfer algorithm also comes with a semi-supervised adaption that enables compressive learning models to utilize unlabeled data effectively. Extensive experiments demonstrate that the proposed knowledge transfer method can effectively train MCL models to compressively sense and synthesize better features for the learning tasks with improved performances, especially when the complexity of the learning task increases.

LGFeb 17, 2020
Subset Sampling For Progressive Neural Network Learning

Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis

Progressive Neural Network Learning is a class of algorithms that incrementally construct the network's topology and optimize its parameters based on the training data. While this approach exempts the users from the manual task of designing and validating multiple network topologies, it often requires an enormous number of computations. In this paper, we propose to speed up this process by exploiting subsets of training data at each incremental training step. Three different sampling strategies for selecting the training samples according to different criteria are proposed and evaluated. We also propose to perform online hyperparameter selection during the network progression, which further reduces the overall training time. Experimental results in object, scene and face recognition problems demonstrate that the proposed approach speeds up the optimization procedure considerably while operating on par with the baseline approach exploiting the entire training set throughout the training process.

CVMay 17, 2019
Multilinear Compressive Learning

Dat Thanh Tran, Mehmet Yamac, Aysen Degerli et al.

Compressive Learning is an emerging topic that combines signal acquisition via compressive sensing and machine learning to perform inference tasks directly on a small number of measurements. Many data modalities naturally have a multi-dimensional or tensorial format, with each dimension or tensor mode representing different features such as the spatial and temporal information in video sequences or the spatial and spectral information in hyperspectral images. However, in existing compressive learning frameworks, the compressive sensing component utilizes either random or learned linear projection on the vectorized signal to perform signal acquisition, thus discarding the multi-dimensional structure of the signals. In this paper, we propose Multilinear Compressive Learning, a framework that takes into account the tensorial nature of multi-dimensional signals in the acquisition step and builds the subsequent inference model on the structurally sensed measurements. Our theoretical complexity analysis shows that the proposed framework is more efficient compared to its vector-based counterpart in both memory and computation requirement. With extensive experiments, we also empirically show that our Multilinear Compressive Learning framework outperforms the vector-based framework in object classification and face recognition tasks, and scales favorably when the dimensionalities of the original signals increase, making it highly efficient for high-dimensional multi-dimensional signals.

LGMar 5, 2019
Data-driven Neural Architecture Learning For Financial Time-series Forecasting

Dat Thanh Tran, Juho Kanniainen, Moncef Gabbouj et al.

Forecasting based on financial time-series is a challenging task since most real-world data exhibits nonstationary property and nonlinear dependencies. In addition, different data modalities often embed different nonlinear relationships which are difficult to capture by human-designed models. To tackle the supervised learning task in financial time-series prediction, we propose the application of a recently formulated algorithm that adaptively learns a mapping function, realized by a heterogeneous neural architecture composing of Generalized Operational Perceptron, given a set of labeled data. With a modified objective function, the proposed algorithm can accommodate the frequently observed imbalanced data distribution problem. Experiments on a large-scale Limit Order Book dataset demonstrate that the proposed algorithm outperforms related algorithms, including tensor-based methods which have access to a broader set of input information.

NEAug 20, 2018
Progressive Operational Perceptron with Memory

Dat Thanh Tran, Serkan Kiranyaz, Moncef Gabbouj et al.

Generalized Operational Perceptron (GOP) was proposed to generalize the linear neuron model in the traditional Multilayer Perceptron (MLP) and this model can mimic the synaptic connections of the biological neurons that have nonlinear neurochemical behaviours. Progressive Operational Perceptron (POP) is a multilayer network composing of GOPs which is formed layer-wise progressively. In this work, we propose major modifications that can accelerate as well as augment the progressive learning procedure of POP by incorporating an information-preserving, linear projection path from the input to the output layer at each progressive step. The proposed extensions can be interpreted as a mechanism that provides direct information extracted from the previously learned layers to the network, hence the term "memory". This allows the network to learn deeper architectures with better data representations. An extensive set of experiments show that the proposed modifications can surpass the learning capability of the original POPs and other related algorithms.

NEApr 13, 2018
Heterogeneous Multilayer Generalized Operational Perceptron

Dat Thanh Tran, Serkan Kiranyaz, Moncef Gabbouj et al.

The traditional Multilayer Perceptron (MLP) using McCulloch-Pitts neuron model is inherently limited to a set of neuronal activities, i.e., linear weighted sum followed by nonlinear thresholding step. Previously, Generalized Operational Perceptron (GOP) was proposed to extend conventional perceptron model by defining a diverse set of neuronal activities to imitate a generalized model of biological neurons. Together with GOP, Progressive Operational Perceptron (POP) algorithm was proposed to optimize a pre-defined template of multiple homogeneous layers in a layerwise manner. In this paper, we propose an efficient algorithm to learn a compact, fully heterogeneous multilayer network that allows each individual neuron, regardless of the layer, to have distinct characteristics. Based on the complexity of the problem, the proposed algorithm operates in a progressive manner on a neuronal level, searching for a compact topology, not only in terms of depth but also width, i.e., the number of neurons in each layer. The proposed algorithm is shown to outperform other related learning methods in extensive experiments on several classification problems.

CEDec 4, 2017
Temporal Attention augmented Bilinear Network for Financial Time-Series Data Analysis

Dat Thanh Tran, Alexandros Iosifidis, Juho Kanniainen et al.

Financial time-series forecasting has long been a challenging problem because of the inherently noisy and stochastic nature of the market. In the High-Frequency Trading (HFT), forecasting for trading purposes is even a more challenging task since an automated inference system is required to be both accurate and fast. In this paper, we propose a neural network layer architecture that incorporates the idea of bilinear projection as well as an attention mechanism that enables the layer to detect and focus on crucial temporal information. The resulting network is highly interpretable, given its ability to highlight the importance and contribution of each temporal instance, thus allowing further analysis on the time instances of interest. Our experiments in a large-scale Limit Order Book (LOB) dataset show that a two-hidden-layer network utilizing our proposed layer outperforms by a large margin all existing state-of-the-art results coming from much deeper architectures while requiring far fewer computations.

CVOct 29, 2017
Multilinear Class-Specific Discriminant Analysis

Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis

There has been a great effort to transfer linear discriminant techniques that operate on vector data to high-order data, generally referred to as Multilinear Discriminant Analysis (MDA) techniques. Many existing works focus on maximizing the inter-class variances to intra-class variances defined on tensor data representations. However, there has not been any attempt to employ class-specific discrimination criteria for the tensor data. In this paper, we propose a multilinear subspace learning technique suitable for applications requiring class-specific tensor models. The method maximizes the discrimination of each individual class in the feature space while retains the spatial structure of the input. We evaluate the efficiency of the proposed method on two problems, i.e. facial image analysis and stock price prediction based on limit order book data.

CVSep 28, 2017
Improving Efficiency in Convolutional Neural Network with Multilinear Filters

Dat Thanh Tran, Alexandros Iosifidis, Moncef Gabbouj

The excellent performance of deep neural networks has enabled us to solve several automatization problems, opening an era of autonomous devices. However, current deep net architectures are heavy with millions of parameters and require billions of floating point operations. Several works have been developed to compress a pre-trained deep network to reduce memory footprint and, possibly, computation. Instead of compressing a pre-trained network, in this work, we propose a generic neural network layer structure employing multilinear projection as the primary feature extractor. The proposed architecture requires several times less memory as compared to the traditional Convolutional Neural Networks (CNN), while inherits the similar design principles of a CNN. In addition, the proposed architecture is equipped with two computation schemes that enable computation reduction or scalability. Experimental results show the effectiveness of our compact projection that outperforms traditional CNN, while requiring far fewer parameters.

CESep 5, 2017
Tensor Representation in High-Frequency Financial Data for Price Change Prediction

Dat Thanh Tran, Martin Magris, Juho Kanniainen et al.

Nowadays, with the availability of massive amount of trade data collected, the dynamics of the financial markets pose both a challenge and an opportunity for high frequency traders. In order to take advantage of the rapid, subtle movement of assets in High Frequency Trading (HFT), an automatic algorithm to analyze and detect patterns of price change based on transaction records must be available. The multichannel, time-series representation of financial data naturally suggests tensor-based learning algorithms. In this work, we investigate the effectiveness of two multilinear methods for the mid-price prediction problem against other existing methods. The experiments in a large scale dataset which contains more than 4 millions limit orders show that by utilizing tensor representation, multilinear models outperform vector-based approaches and other competing ones.