LGMay 27, 2022
Incorporating the Barzilai-Borwein Adaptive Step Size into Sugradient Methods for Deep Network TrainingAntonio Robles-Kelly, Asef Nazari
In this paper, we incorporate the Barzilai-Borwein step size into gradient descent methods used to train deep networks. This allows us to adapt the learning rate using a two-point approximation to the secant equation which quasi-Newton methods are based upon. Moreover, the adaptive learning rate method presented here is quite general in nature and can be applied to widely used gradient descent approaches such as Adagrad and RMSprop. We evaluate our method using standard example network architectures on widely available datasets and compare against alternatives elsewhere in the literature. In our experiments, our adaptive learning rate shows a smoother and faster convergence than that exhibited by the alternatives, with better or comparable performance.
STMar 26, 2023
Causal Modelling of Cryptocurrency Price Movements Using Discretisation-Aware Bayesian NetworksRasoul Amirzadeh, Asef Nazari, Dhananjay Thiruvady et al.
This study identifies the key factors influencing the price movements of major cryptocurrencies, Bitcoin, Binance Coin, Ethereum, Litecoin, Ripple, and Tether, using Bayesian networks (BNs). This study addresses two key challenges: modelling price movements in highly volatile cryptocurrency markets and enhancing predictive performance through discretisation-aware Bayesian Networks. It analyses both macro-financial indicators (gold, oil, MSCI, S and P 500, USDX) and social media signals (tweet volume) as potential price drivers. Moreover, since discretisation is a critical step in the effectiveness of BNs, we implement a structured procedure to build 54 BNs models by combining three discretisation methods (equal interval, equal quantile, and k-means) with several bin counts. These models are evaluated using four metrics, including balanced accuracy, F1 score, area under the ROC curve and a composite score. Results show that equal interval with two bins consistently yields the best predictive performance. We also provide deeper insights into each network's structure through inference, sensitivity, and influence strength analyses. These analyses reveal distinct price-driving patterns for each cryptocurrency, underscore the importance of coin-specific analysis, and demonstrate the value of BNs for interpretable causal modelling in volatile cryptocurrency markets.
AIOct 14, 2023
A Framework for Empowering Reinforcement Learning Agents with Causal Analysis: Enhancing Automated Cryptocurrency TradingRasoul Amirzadeh, Dhananjay Thiruvady, Asef Nazari et al.
Despite advances in artificial intelligence-enhanced trading methods, developing a profitable automated trading system remains challenging in the rapidly evolving cryptocurrency market. This research focuses on developing a reinforcement learning (RL) framework to tackle the complexities of trading five prominent altcoins: Binance Coin, Ethereum, Litecoin, Ripple, and Tether. To this end, we present the CausalReinforceNet~(CRN) framework, which integrates both Bayesian and dynamic Bayesian network techniques to empower the RL agent in trade decision-making. We develop two agents using the framework based on distinct RL algorithms to analyse performance compared to the Buy-and-Hold benchmark strategy and a baseline RL model. The results indicate that our framework surpasses both models in profitability, highlighting CRN's consistent superiority, although the level of effectiveness varies across different cryptocurrencies.
LGJun 13, 2023
Dynamic Bayesian Networks for Predicting Cryptocurrency Price Directions: Uncovering Causal RelationshipsRasoul Amirzadeh, Dhananjay Thiruvady, Asef Nazari et al.
Cryptocurrencies have gained popularity across various sectors, especially in finance and investment. Despite their growing popularity, cryptocurrencies can be a high-risk investment due to their price volatility. The inherent volatility in cryptocurrency prices, coupled with the effects of external global economic factors, makes predicting their price movements challenging. To address this challenge, we propose a dynamic Bayesian network (DBN)-based approach to uncover potential causal relationships among various features including social media data, traditional financial market factors, and technical indicators. Six popular cryptocurrencies, Bitcoin, Binance Coin, Ethereum, Litecoin, Ripple, and Tether are studied in this work. The proposed model's performance is compared to five baseline models of auto-regressive integrated moving average, support vector regression, long short-term memory, random forests, and support vector machines. The results show that while DBN performance varies across cryptocurrencies, with some cryptocurrencies exhibiting higher predictive accuracy than others, the DBN significantly outperforms the baseline models.
SIFeb 16
An Intelligent Hybrid Cross-Entropy System for Maximising Network Homophily via Soft Happy ColouringMohammad Hadi Shekarriz, Asef Nazari, Dhananjay Thiruvady
The Soft Happy Colouring (SHC) problem serves as a rigorous mathematical framework for identifying homophilic structures in complex networks. The SHC seeks to maximise the number of $Ï$-happy vertices, which are those vertices that the proportion of their neighbours sharing colour with them is at least $Ï$. The problem is NP-hard, making optimal solutions computationally intractable for large-scale networks. Consequently, metaheuristic approaches are useful, yet existing methods often struggle with premature convergence. Based on the problem's solution structure and the characteristics of the feasible region, an effective solution method needs to navigate efficiently among promising solutions while utilising information learned from less favourable ones. The Cross-Entropy method is suitable for this because it has a smoothing mechanism that adaptively balances exploration and exploitation, informed by the knowledge accumulated during the search process. This paper introduces a novel intelligent hybrid algorithm, CE+LS, which synergises the adaptive probabilistic learning of the Cross-Entropy method with a fast, structure-aware local search (LS) mechanism. We conduct a comprehensive experimental evaluation on an extensive dataset of 28,000 randomly generated graphs using the Stochastic Block Model as the ground-truth benchmark. Test results demonstrate that CE+LS consistently outperforms existing heuristic and memetic algorithms in homophily maximisation, exhibiting superior scalability and solution quality. Notably, the proposed algorithm remains efficient even in the tight regime, which is the most challenging category of problem instances where comparative algorithms fail to yield effective solutions.
SDDec 4, 2025
Physics-Guided Deepfake Detection for Voice Authentication SystemsAlireza Mohammadi, Keshav Sood, Dhananjay Thiruvady et al.
Voice authentication systems deployed at the network edge face dual threats: a) sophisticated deepfake synthesis attacks and b) control-plane poisoning in distributed federated learning protocols. We present a framework coupling physics-guided deepfake detection with uncertainty-aware in edge learning. The framework fuses interpretable physics features modeling vocal tract dynamics with representations coming from a self-supervised learning module. The representations are then processed via a Multi-Modal Ensemble Architecture, followed by a Bayesian ensemble providing uncertainty estimates. Incorporating physics-based characteristics evaluations and uncertainty estimates of audio samples allows our proposed framework to remain robust to both advanced deepfake attacks and sophisticated control-plane poisoning, addressing the complete threat model for networked voice authentication.
MLDec 30, 2023
SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series DataAli Eshragh, Luke Yerbury, Asef Nazari et al.
We develop a new efficient sequential approximate leverage score algorithm, SALSA, using methods from randomized numerical linear algebra (RandNLA) for large matrices. We demonstrate that, with high probability, the accuracy of SALSA's approximations is within $(1 + O({\varepsilon}))$ of the true leverage scores. In addition, we show that the theoretical computational complexity and numerical accuracy of SALSA surpass existing approximations. These theoretical results are subsequently utilized to develop an efficient algorithm, named LSARMA, for fitting an appropriate ARMA model to large-scale time series data. Our proposed algorithm is, with high probability, guaranteed to find the maximum likelihood estimates of the parameters for the true underlying ARMA model. Furthermore, it has a worst-case running time that significantly improves those of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale data strongly support these theoretical results and underscore the efficacy of our new approach.
CLDec 20, 2020
A hybrid deep-learning approach for complex biochemical named entity recognitionJian Liu, Lei Gao, Sujie Guo et al.
Named entity recognition (NER) of chemicals and drugs is a critical domain of information extraction in biochemical research. NER provides support for text mining in biochemical reactions, including entity relation extraction, attribute extraction, and metabolic response relationship extraction. However, the existence of complex naming characteristics in the biomedical field, such as polysemy and special characters, make the NER task very challenging. Here, we propose a hybrid deep learning approach to improve the recognition accuracy of NER. Specifically, our approach applies the Bidirectional Encoder Representations from Transformers (BERT) model to extract the underlying features of the text, learns a representation of the context of the text through Bi-directional Long Short-Term Memory (BILSTM), and incorporates the multi-head attention (MHATT) mechanism to extract chapter-level features. In this approach, the MHATT mechanism aims to improve the recognition accuracy of abbreviations to efficiently deal with the problem of inconsistency in full-text labels. Moreover, conditional random field (CRF) is used to label sequence tags because this probabilistic method does not need strict independence assumptions and can accommodate arbitrary context information. The experimental evaluation on a publicly-available dataset shows that the proposed hybrid approach achieves the best recognition performance; in particular, it substantially improves performance in recognizing abbreviations, polysemes, and low-frequency entities, compared with the state-of-the-art approaches. For instance, compared with the recognition accuracies for low-frequency entities produced by the BILSTM-CRF algorithm, those produced by the hybrid approach on two entity datasets (MULTIPLE and IDENTIFIER) have been increased by 80% and 21.69%, respectively.
MENov 27, 2019
LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series DataAli Eshragh, Fred Roosta, Asef Nazari et al.
We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that the accuracy of approximations lies within $(1+\bigO{\varepsilon})$ of the true leverage scores with high probability. These theoretical results are subsequently exploited to develop an efficient algorithm, called LSAR, for fitting an appropriate AR model to big time series data. Our proposed algorithm is guaranteed, with high probability, to find the maximum likelihood estimates of the parameters of the underlying true AR model and has a worst case running time that significantly improves those of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale synthetic as well as real data highly support the theoretical results and reveal the efficacy of this new approach.