SP Mar 10, 2023
Machine learning-based detection of cardiovascular disease using ECG signals: performance vs. complexity
Huy Pham, Konstantin Egorov, Alexey Kazakov et al.
Cardiovascular disease remains a significant problem in modern society. Among non-invasive techniques, the electrocardiogram (ECG) is one of the most reliable methods for detecting abnormalities in cardiac activity. However, ECG interpretation requires expert knowledge and is time-consuming. Developing a method to detect the disease early could prevent death and complications. The paper presents several novel approaches for classifying cardiac diseases from ECG recordings. The first approach combines the Poincaré representation of the ECG signal with deep-learning-based image classifiers (ResNet50 and DenseNet121 were trained on Poincaré diagrams), which showed decent performance in predicting AF (atrial fibrillation) but not other types of arrhythmia. XGBoost, a gradient-boosting model, showed acceptable performance on long-term data but had a long inference time due to computationally expensive calculations in the pre-processing phase. Finally, the 1D convolutional model, specifically the 1D ResNet, showed the best results on both datasets studied, CinC 2017 and CinC 2020, reaching F1 scores of 85% and 71%, respectively, which was superior to the first-ranking solution of each challenge. The paper also investigated efficiency metrics such as power consumption and equivalent CO2 emissions, with one-dimensional models like the 1D CNN and 1D ResNet being the most energy efficient. Model interpretation analysis showed that the DenseNet detected AF using heart rate variability, while the 1D ResNet assessed AF patterns in raw ECG signals.
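The abstract does not say how the Poincaré diagrams were built. As a minimal sketch, assuming they are computed from successive RR intervals (a common convention, not confirmed by the source), each point pairs one interval with the next, and the spread across and along the identity line gives the usual SD1 and SD2 descriptors. The function names `poincare_points` and `sd1_sd2` are hypothetical.

```python
from math import sqrt
from statistics import pstdev


def poincare_points(rr):
    """Pair each RR interval with its successor: (RR_n, RR_{n+1})."""
    return list(zip(rr[:-1], rr[1:]))


def sd1_sd2(rr):
    """Return (SD1, SD2) for a sequence of RR intervals.

    SD1 is the spread perpendicular to the identity line (short-term
    variability); SD2 is the spread along it (long-term variability).
    Assumption: population standard deviation is used.
    """
    pts = poincare_points(rr)
    across = [(y - x) / sqrt(2) for x, y in pts]
    along = [(y + x) / sqrt(2) for x, y in pts]
    return pstdev(across), pstdev(along)


# A perfectly regular rhythm collapses to one point on the identity line,
# while an alternating rhythm spreads the points across it.
regular = [0.8, 0.8, 0.8, 0.8]
alternating = [0.6, 1.0, 0.6, 1.0]
print(sd1_sd2(regular))
print(sd1_sd2(alternating))
```

An image classifier such as those named in the abstract would consume a rendered scatter plot of these points rather than the summary statistics; the statistics are shown here only to make the geometry concrete.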
IT May 11
A Fast Hierarchical Splitting Approach for Non-Adaptive Learning of Random Hypergraphs
Huy Pham, Hoang Ta
This work focuses on the problem of learning an unknown $3$-uniform hypergraph using edge-detecting queries. Our goal is to design a querying strategy that recovers the hyperedge set using as few queries as possible. We restrict our attention to random hypergraphs under the Erdős--Rényi (ER) model, in which each potential hyperedge appears independently with probability $q = \Theta(n^{-3(1-\theta)})$ for $\theta \in (0,1)$. Prior work [Austhof-Reyzin-Tani, ISIT 2025] presents a testing-decoding scheme that uses $O(\bar{m}\log n)$ tests but requires a decoding time of $\Omega(n^3)$, where $\bar{m} = q\binom{n}{3}$ denotes the expected number of hyperedges. In this work, we extend the binary splitting framework and adapt it to the $3$-uniform hypergraph setting. We obtain a testing-decoding scheme that recovers the hyperedge set with high probability using $O(\bar{m} \log n)$ tests and achieves decoding time $O(\bar{m}^{5/3}\log n)$ for the case $\theta > \frac{2}{3}$ and $O(\bar{m}^{5/3}\log^2{\bar{m}}\log n)$ for the case $\theta \leq \frac{2}{3}$. Thus, compared with prior work, our result significantly improves the decoding complexity while maintaining optimal query complexity.
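To make the query model concrete, here is a minimal sketch of an ER $3$-uniform hypergraph sampler and an edge-detecting query, which answers whether the queried vertex set fully contains at least one hyperedge. It illustrates the problem setup only, not the paper's splitting scheme; the names `sample_er_hypergraph` and `edge_detecting_query` are hypothetical.

```python
import random
from itertools import combinations


def sample_er_hypergraph(n, q, seed=0):
    """Include each of the C(n, 3) triples independently with probability q."""
    rng = random.Random(seed)
    return {
        frozenset(triple)
        for triple in combinations(range(n), 3)
        if rng.random() < q
    }


def edge_detecting_query(hyperedges, subset):
    """Return True iff some hyperedge lies entirely inside `subset`."""
    s = set(subset)
    return any(edge <= s for edge in hyperedges)


# A hidden hypergraph with a single hyperedge {0, 1, 2}.
hidden = {frozenset({0, 1, 2})}
print(edge_detecting_query(hidden, [0, 1, 2, 5]))  # contains the hyperedge
print(edge_detecting_query(hidden, [0, 1, 3]))     # misses vertex 2
```

In a non-adaptive scheme all query sets are fixed before any answer is seen, so a decoder would have to recover the hyperedges from the answer vector alone.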
CL Dec 18, 2023
VinaLLaMA: LLaMA-based Vietnamese Foundation Model
Quan Nguyen, Huy Pham, Dung Dao
In this technical report, we present VinaLLaMA, an open-weight, state-of-the-art (SOTA) Large Language Model for the Vietnamese language, built upon LLaMA-2 and trained on an additional 800 billion tokens. VinaLLaMA not only demonstrates fluency in Vietnamese but also exhibits a profound understanding of Vietnamese culture, making it a truly indigenous model. VinaLLaMA-7B-chat, trained on 1 million high-quality synthetic samples, achieves SOTA results on key benchmarks, including VLSP, VMLU, and Vicuna Benchmark Vietnamese, marking a significant advancement in the Vietnamese AI landscape and offering a versatile resource for various applications.
DS Mar 28, 2024
Constructing Decision Trees from Data Streams
Huy Pham, Hoang Ta, Hoa T. Vu
In this work, we present data stream algorithms to compute optimal splits for decision tree learning. In particular, given a data stream of observations \(x_i\) and their corresponding labels \(y_i\), without the i.i.d. assumption, the objective is to identify the optimal split \(j\) that partitions the data into two sets, minimizing the mean squared error (for regression) or the misclassification rate and Gini impurity (for classification). We propose several efficient streaming algorithms that require sublinear space and use a small number of passes to solve these problems. These algorithms can also be extended to the MapReduce model. Our results, while not directly comparable, complement the seminal work of Domingos-Hulten (KDD 2000) and Hulten-Spencer-Domingos (KDD 2001).
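For reference, the regression objective can be sketched as an offline baseline that stores and sorts all observations, which is exactly what a sublinear-space streaming algorithm avoids. This is not the paper's algorithm; the function name `best_regression_split` and the convention that the left set is \(\{i : x_i \le j\}\) are assumptions made for illustration.

```python
def best_regression_split(xs, ys):
    """Return (j, mse): the threshold j minimizing the mean squared error
    when each side {x <= j} and {x > j} is predicted by its own mean.

    Offline baseline: uses O(n) memory and O(n log n) time.
    Returns (None, inf) if all x values are equal (no valid split).
    """
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total = sum(y for _, y in pairs)
    total_sq = sum(y * y for _, y in pairs)

    best_sse, best_j = float("inf"), None
    left_sum = left_sq = 0.0
    for i in range(n - 1):
        x, y = pairs[i]
        left_sum += y
        left_sq += y * y
        if pairs[i + 1][0] == x:
            continue  # cannot split between equal x values
        k = i + 1  # size of the left set
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        # Sum of squared errors around each side's mean.
        sse = (left_sq - left_sum ** 2 / k) + (right_sq - right_sum ** 2 / (n - k))
        if sse < best_sse:
            best_sse, best_j = sse, x
    return best_j, best_sse / n


j, mse = best_regression_split([1, 2, 3, 4], [0, 0, 10, 10])
print(j, mse)
```

The running sums make each candidate threshold cheap to evaluate once the data is sorted; the difficulty in the streaming setting is that neither the sort nor the full set of candidate thresholds fits in memory.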
MED-PH Aug 16, 2021
MRI-compatible electromagnetic servomotors for image-guided robotic procedures
Lorne W. Hofstetter, Rock Hadley, Robb Merrill et al.
Combining the unmatched soft-tissue imaging capabilities of magnetic resonance imaging (MRI) with high-precision robotics has the potential to improve the accuracy, precision, and safety of a wide range of image-guided medical procedures. However, the goal of highly functional MRI-compatible robotic systems has not yet been realized because conventional electromagnetic servomotors used by medical robots can become dangerous projectiles near the strong magnetic field of an MRI scanner. Here we report a novel electromagnetic servomotor design that is constructed from non-magnetic components and can operate within the patient area of clinical scanners. We show that this design enables high-torque and precisely controlled rotary actuation during imaging. Using this servomotor design, an MRI-compatible robot was constructed and tested. The robot demonstrated that the linear forces required to manipulate large-diameter surgical instruments in tissue could be achieved during simultaneous imaging with MRI. This work presents the first fully functional electromagnetic servomotor that can be safely operated (while imaging) in the patient area of a 3 Tesla clinical MRI scanner.