Theerawit Wilaiprasitporn

SP

h-index 19

16 papers

455 citations

Novelty 39%

AI Score 31

Ranked #132,040 of 194,257 authors (top 68%); #345 in SP (top 47%)

16 Papers

6.6 SP Aug 18, 2022

RRWaveNet: A Compact End-to-End Multi-Scale Residual CNN for Robust PPG Respiratory Rate Estimation

Pongpanut Osathitporn, Guntitat Sawadwuthikul, Punnawish Thuwajit et al.

Respiratory rate (RR) is an important biomarker, as changes in RR can reflect severe medical events such as heart disease, lung disease, and sleep disorders. Unfortunately, standard manual RR counting is prone to human error and cannot be performed continuously. This study proposes RRWaveNet, a method for continuously estimating RR. The method is a compact end-to-end deep learning model that requires no feature engineering and can use low-cost raw photoplethysmography (PPG) as the input signal. RRWaveNet was tested subject-independently and compared to baselines on four datasets (BIDMC, CapnoBase, WESAD, and SensAI) using three window sizes (16, 32, and 64 seconds). RRWaveNet outperformed current state-of-the-art methods, with mean absolute errors (MAE) at the optimal window size of 1.66 ± 1.01, 1.59 ± 1.08, 1.92 ± 0.96, and 1.23 ± 0.61 breaths per minute for each dataset. In remote monitoring settings, such as in the WESAD and SensAI datasets, we apply transfer learning to improve performance, using two other ICU datasets as pretraining datasets and reducing the MAE by up to 21%. This shows that the model allows accurate and practical estimation of RR on affordable, wearable devices. Our study also shows the feasibility of remote RR monitoring in telemedicine and at home.
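The error figures in the abstract above are mean absolute errors with a spread. A minimal sketch of how such a figure can be computed follows, assuming paired per-window reference and estimated RR values. The helper name `mae_with_spread` and the sample numbers are hypothetical, and the paper may aggregate errors differently (for example, per subject).

```python
from statistics import mean, pstdev


def mae_with_spread(reference, estimate):
    """Return (mean, population std) of absolute errors, in breaths per minute."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(r - e) for r, e in zip(reference, estimate)]
    return mean(errors), pstdev(errors)


# Hypothetical per-window values, not data from the paper.
reference_rr = [12.0, 15.0, 18.0, 20.0]
estimated_rr = [13.0, 14.0, 18.0, 23.0]

mae, spread = mae_with_spread(reference_rr, estimated_rr)
print(f"MAE = {mae:.2f} ± {spread:.2f} breaths per minute")
```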

1.2 QM Jul 6, 2023 Code

PseudoCell: Hard Negative Mining as Pseudo Labeling for Deep Learning-Based Centroblast Cell Detection

Narongrid Seesawad, Piyalitt Ittichaiwong, Thapanun Sudhawiyangkul et al.

Patch classification models based on deep learning have been applied to whole-slide images (WSI) of H&E-stained tissue samples to assist pathologists in grading follicular lymphoma patients. However, these approaches still require pathologists to manually identify centroblast cells and provide refined labels for optimal performance. To address this, we propose PseudoCell, an object detection framework to automate centroblast detection in WSI (source code is available at https://github.com/IoBT-VISTEC/PseudoCell.git). This framework incorporates centroblast labels from pathologists and combines them with pseudo-negative labels obtained from undersampled false-positive predictions using the cell's morphological features. With PseudoCell, pathologists' workload can be reduced, as it accurately narrows down the areas requiring their attention during tissue examination. Depending on the confidence threshold, PseudoCell can eliminate 58.18-99.35% of non-centroblast tissue areas on WSI. This study presents a practical centroblast prescreening method that does not require pathologists' refined labels for improvement. Detailed guidance on the practical implementation of PseudoCell is provided in the discussion section.

4.8 IV Jul 25, 2022 Code

OCTAve: 2D en face Optical Coherence Tomography Angiography Vessel Segmentation in Weakly-Supervised Learning with Locality Augmentation

Amrest Chinkamol, Vetit Kanjaras, Phattarapong Sawangjai et al.

Although a growing body of research uses deep learning techniques to extract vascular structure from 2D en face OCTA, annotating curvilinear structures such as the retinal vasculature is known to be very costly and time-consuming, and few studies have tried to address this annotation problem. In this work, we propose applying a scribble-based weakly-supervised learning method to automate pixel-level annotation. The proposed method, called OCTAve, combines weakly-supervised learning on scribble-annotated ground truth with adversarial and novel self-supervised deep supervision. Our novel mechanism is designed to utilize the discriminative outputs from the discrimination layer of a UNet-like architecture, where the Kullback-Leibler divergence between the aggregate discriminative outputs and the segmentation map prediction is minimized during training. This combined method leads to better localization of the vascular structure, as shown in our experiments. We validate the proposed method on large public datasets, i.e., ROSE and OCTA-500. The segmentation performance is compared against both state-of-the-art fully-supervised and scribble-based weakly-supervised approaches. The implementation of our work used in the experiments is located at [LINK].
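The abstract above describes minimizing a Kullback-Leibler divergence between two maps. As a rough illustration of the quantity itself, the sketch below computes KL(p || q) for two flattened maps treated as discrete distributions. The normalization step, the direction of the divergence, and the function name `kl_divergence` are assumptions made here for illustration; the abstract does not specify how OCTAve forms or compares the two maps.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two equal-length lists of non-negative weights.

    Both inputs are normalized to sum to 1 before comparison.
    """
    if len(p) != len(q) or not p:
        raise ValueError("inputs must be non-empty and of equal length")
    p_sum, q_sum = sum(p), sum(q)
    if p_sum <= 0 or q_sum <= 0:
        raise ValueError("inputs must have positive total mass")
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i, q_i = p_i / p_sum, q_i / q_sum
        if p_i > 0:  # terms with p_i == 0 contribute nothing
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


# Hypothetical flattened maps (illustrative values only).
aggregate_outputs = [0.5, 0.5]
segmentation_map = [0.9, 0.1]
print(kl_divergence(aggregate_outputs, segmentation_map))
```

The divergence is zero when the two maps agree and grows as they differ, which is why it can serve as a consistency term during training.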

7.9 LG Sep 6, 2024 Code

MixNet: Joining Force of Classical and Modern Approaches Toward the Comprehensive Pipeline in Motor Imagery EEG Classification

Phairot Autthasan, Rattanaphon Chaisaen, Huy Phan et al.

Recent advances in deep learning (DL) have significantly impacted motor imagery (MI)-based brain-computer interface (BCI) systems, enhancing the decoding of electroencephalography (EEG) signals. However, most studies struggle to identify discriminative patterns across subjects during MI tasks, limiting MI classification performance. In this article, we propose MixNet, a novel classification framework designed to overcome this limitation by utilizing spectral-spatial signals from MI data, along with a multitask learning architecture named MIN2Net, for classification. Here, the spectral-spatial signals are generated using the filter-bank common spatial patterns (FBCSPs) method on MI data. Since the multitask learning architecture is used for the classification task, the learning in each task may exhibit different generalization rates and potential overfitting across tasks. To address this issue, we implement adaptive gradient blending, simultaneously regulating multiple loss weights and adjusting the learning pace for each task based on its generalization/overfitting tendencies. Experimental results on six benchmark data sets of different data sizes demonstrate that MixNet consistently outperforms all state-of-the-art algorithms in subject-dependent and -independent settings. Finally, the low-density EEG MI classification results show that MixNet outperforms all state-of-the-art algorithms, offering promising implications for Internet of Things (IoT) applications, such as lightweight and portable EEG wearable devices based on low-density montages.

2.7 IV Dec 9, 2022

PACMAN: a framework for pulse oximeter digit detection and reading in a low-resource setting

Chiraphat Boonnag, Wanumaidah Saengmolee, Narongrid Seesawad et al.

During the COVID-19 pandemic, patients were required to manually enter their daily oxygen saturation (SpO2) and pulse rate (PR) values into a health monitoring system. Unfortunately, this process is prone to typing errors. Several studies have attempted to detect physiological values from captured images using optical character recognition (OCR). However, the technology has limited availability and high cost. This study therefore proposes a novel framework called PACMAN (Pandemic Accelerated Human-Machine Collaboration) based on low-resource deep learning computer vision. We compared state-of-the-art object detection algorithms (scaled YOLOv4, YOLOv5, and YOLOR), as well as commercial OCR tools, for digit recognition on captured images of pulse oximeter displays. All images came from crowdsourced data collection and varied in quality and alignment. YOLOv5 was the best-performing model in the comparison across all datasets, notably on the correctly oriented image dataset. We further improved model performance with a digit auto-orientation algorithm and applied a clustering algorithm to extract SpO2 and PR values. The accuracy of YOLOv5 with these additions was approximately 81.0-89.5%, an improvement over the model without them. Accordingly, this study highlights the completion of the PACMAN framework for detecting and reading digits in real-world datasets. The proposed framework has been integrated into the patient monitoring system used by hospitals nationwide.

1.2 QM Sep 17, 2022

ANet: Autoencoder-Based Local Field Potential Feature Extractor for Evaluating An Antidepressant Effect in Mice after Administering Kratom Leaf Extracts

Jakkrit Nukitram, Rattanaphon Chaisaen, Phairot Autthasan et al.

Kratom (KT) typically exerts antidepressant (AD) effects. However, evaluating which form of KT extract possesses AD properties similar to the standard AD fluoxetine (flu) has remained challenging. Here, we adopted an autoencoder (AE)-based anomaly detector called ANet to measure the similarity of mice's local field potential (LFP) features that responded to KT leaf extracts and AD flu. The features that responded to KT syrup had the highest similarity to those that responded to the AD flu, at 85.62 ± 0.29%. This finding suggests that KT syrup is a more feasible alternative substance for depression therapy than KT alkaloids and KT aqueous, the other candidates in this study. Apart from the similarity measurement, we utilized ANet as a multi-task AE and evaluated its performance in simultaneously discriminating multi-class LFP responses corresponding to the effects of different KT extracts and AD flu. Furthermore, we visualized the learned latent features among LFP responses qualitatively and quantitatively, as a t-SNE projection and a maximum mean discrepancy distance, respectively. The classification results showed an accuracy and F1-score of 79.78 ± 0.39% and 79.53 ± 0.00%, respectively. In summary, the outcomes of this research might aid the design of therapeutic devices for evaluating alternative substance profiles, such as Kratom-based forms, in real-world applications.

0.5 CL Nov 18, 2023

Combining EEG and NLP Features for Predicting Students' Lecture Comprehension using Ensemble Classification

Phantharach Natnithikarat, Theerawit Wilaiprasitporn, Supavit Kongwudhikunakorn

Electroencephalography (EEG) and Natural Language Processing (NLP) can be applied in education to measure students' comprehension of classroom lectures; so far, the two measures have been used separately. In this work, we propose a classification framework for predicting students' lecture comprehension in two tasks: (i) students' confusion after listening to the simulated lecture and (ii) the correctness of students' responses to the post-lecture assessment. The proposed framework includes EEG and NLP feature extraction, processing, and classification. EEG and NLP features are extracted to construct integrated features from recorded EEG signals and sentence-level syntactic analysis, which provide information about specific biomarkers and sentence structures. An ensemble stacking classification method (a combination of multiple individual models that produces an enhanced predictive model) is studied to learn from the features and make accurate predictions. Furthermore, we used subjective confusion ratings as another integrated feature to enhance classification performance. Experimental results show that this framework performs better than the baselines, achieving an F1 score of up to 0.65 for predicting confusion and 0.78 for predicting correctness, which indicates that adding these ratings helped improve classification performance.

1.2 SP Aug 17, 2022

EEG-BBNet: a Hybrid Framework for Brain Biometric using Graph Connectivity

Payongkit Lakhan, Nannapas Banluesombatkul, Natchaya Sricom et al.

Brain biometrics based on electroencephalography (EEG) have been used increasingly for personal identification. Traditional machine learning techniques as well as modern-day deep learning methods have been applied with promising results. In this paper we present EEG-BBNet, a hybrid network which integrates convolutional neural networks (CNN) with graph convolutional neural networks (GCNN). The benefit of the CNN in automatic feature extraction and the capability of the GCNN in learning connectivity between EEG electrodes through graph representation are jointly exploited. We examine various connectivity measures, namely the Euclidean distance, Pearson's correlation coefficient, phase-locked value, phase-lag index, and Rho index. The performance of the proposed method is assessed on a benchmark dataset consisting of various brain-computer interface (BCI) tasks and compared to other state-of-the-art approaches. We found that our models outperform all baselines in the event-related potential (ERP) task, with an average correct recognition rate of up to 99.26% using intra-session data. EEG-BBNet with Pearson's correlation and the Rho index provides the best classification results. In addition, our model demonstrates greater adaptability using inter-session and inter-task data. We also investigate the practicality of our proposed model with a smaller number of electrodes. Electrode placements over the frontal lobe region appear to be most appropriate, with minimal loss in performance.
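Among the connectivity measures listed in the abstract above, Pearson's correlation coefficient is the simplest to show. The sketch below builds a channel-by-channel connectivity matrix from short signals, which could then serve as a graph adjacency matrix. The function names and sample signals are hypothetical, and how EEG-BBNet actually windows, thresholds, or feeds such a matrix to the GCNN is not stated in the abstract.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / (var_x * var_y) ** 0.5


def connectivity_matrix(channels):
    """Symmetric matrix of pairwise correlations between channels."""
    n = len(channels)
    return [[pearson(channels[i], channels[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical three-channel recording, four samples per channel.
eeg = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with channel 0
    [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with channel 0
]
adjacency = connectivity_matrix(eeg)
for row in adjacency:
    print([round(v, 2) for v in row])
```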

3.1 LG Jun 18, 2021

Deep Reinforcement Learning Models Predict Visual Responses in the Brain: A Preliminary Result

Maytus Piriyajitakonkij, Sirawaj Itthipuripat, Theerawit Wilaiprasitporn et al.

Supervised deep convolutional neural networks (DCNNs) are currently among the best computational models for explaining how the primate ventral visual stream solves object recognition. However, embodied cognition has not been considered in existing visual processing models. From an ecological standpoint, humans learn to recognize objects by interacting with them, allowing better classification, specialization, and generalization. Here, we ask whether computational models under the embodied learning framework can explain the mechanisms underlying object recognition in the primate visual system better than existing supervised models. To address this question, we use reinforcement learning to train neural network models to play a 3D computer game, and we find that these reinforcement learning models achieve neural response prediction accuracy scores in the early visual areas (e.g., V1 and V2) at levels comparable to those of the supervised neural network model. In contrast, the supervised neural network models yield better neural response predictions in the higher visual areas than the reinforcement learning models. Our preliminary results suggest a future direction for visual neuroscience in which deep reinforcement learning should be included to fill in the missing embodiment concept.

4.4 LG Mar 5, 2021

A Pilot Study on Visually Stimulated Cognitive Tasks for EEG-Based Dementia Recognition

Supavit Kongwudhikunakorn, Suktipol Kiatthaveephong, Kamonwan Thanontip et al.

At present, dementia has no cure. Precise diagnosis prior to the onset of symptoms can prevent the rapid progression of emerging cognitive impairment. Recent progress has shown that electroencephalography (EEG) is a promising and cost-effective test for detecting neurocognitive disorders. However, most existing works have used only resting-state EEG. The effectiveness of EEG signals from various cognitive tasks for dementia classification has yet to be thoroughly investigated. In this study, we designed four cognitive tasks that engage different cognitive performances: attention, working memory, and executive function. We investigated these tasks using statistical analysis in both the time and frequency domains of EEG signals from three classes of human subjects: Dementia (DEM), Mild Cognitive Impairment (MCI), and Normal Control (NC). We also evaluated the classification performance of two feature extraction methods: Principal Component Analysis (PCA) and Filter Bank Common Spatial Pattern (FBCSP). We found that the working-memory-related tasks yielded good performance for dementia recognition with both PCA and FBCSP. Moreover, FBCSP with features combined from the four tasks yielded the best sensitivity of 0.87 and specificity of 0.80. To the best of our knowledge, this is the first work to concurrently investigate several cognitive tasks for dementia recognition using both statistical analysis and classification scores. Our results provide essential information for designing and conducting further experimental tasks for the early diagnosis of dementia.
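The sensitivity and specificity quoted in the abstract above follow from the counts of a binary confusion matrix. The sketch below shows the arithmetic under the assumption of a two-class setup with one class marked positive. The label encoding, the function name, and the sample labels are hypothetical, and the abstract does not say how the three subject classes were reduced to these two scores.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have equal length")
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: 1 = patient group, 0 = normal control.
truth = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```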

10.8 SP Feb 7, 2021 Code

MIN2Net: End-to-End Multi-Task Learning for Subject-Independent Motor Imagery EEG Classification

Phairot Autthasan, Rattanaphon Chaisaen, Thapanun Sudhawiyangkul et al.

Advances in motor imagery (MI)-based brain-computer interfaces (BCIs) allow control of several applications by decoding neurophysiological phenomena, which are usually recorded by electroencephalography (EEG) using a non-invasive technique. Despite great advances in MI-based BCI, EEG rhythms are subject-specific and vary over time. These issues pose significant challenges to enhancing classification performance, especially in a subject-independent manner. To overcome these challenges, we propose MIN2Net, a novel end-to-end multi-task learning approach. We integrate deep metric learning into a multi-task autoencoder to learn a compact and discriminative latent representation from EEG and perform classification simultaneously. This approach reduces pre-processing complexity and results in a significant performance improvement in EEG classification. Experimental results in a subject-independent manner show that MIN2Net outperforms state-of-the-art techniques, achieving F1-score improvements of 6.72% and 2.23% on the SMR-BCI and OpenBMI datasets, respectively. We demonstrate that MIN2Net improves discriminative information in the latent representation. This study indicates the possibility and practicality of using this model to develop MI-based BCI applications for new users without the need for calibration.

3.7 HC Jan 30, 2021 Code

Revealing Preference in Popular Music Through Familiarity and Brain Response

Soravitt Sangnark, Phairot Autthasan, Puntawat Ponglertnapakorn et al.

Music preference has been reported as a factor that can elicit innermost music emotion, entailing accurate ground-truth data and music therapy efficiency. This study uses statistical analysis to investigate the distinction of music preference through familiarity scores, response times (response rates), and brain response (EEG). Twenty participants completed a self-assessment after listening to the chorus sections of two types of popular music: music without lyrics (Melody) and music with lyrics (Song). We then conduct music preference classification using a support vector machine, random forest, and k-nearest neighbors, with the familiarity scores, the response rates, and EEG as the feature vectors. The statistical analysis and the F1-score of EEG are congruent: the brain's right side outperformed its left side in classification performance. Finally, these behavioral and brain studies support that preference, familiarity, and response rates can contribute to the design of music emotion experiments for understanding music, emotion, and the listener. Beyond the music industry, the biomedical and healthcare industries can also use this experiment to collect data from patients and improve the efficiency of healing by music.

11.1 HC Apr 7, 2020 Code

Decoding EEG Rhythms During Action Observation, Motor Imagery, and Execution for Standing and Sitting

Rattanaphon Chaisaen, Phairot Autthasan, Nopparada Mingchinda et al.

Event-related desynchronization and synchronization (ERD/S) and movement-related cortical potential (MRCP) play an important role in brain-computer interfaces (BCI) for lower limb rehabilitation, particularly in standing and sitting. However, little is known about the differences in the cortical activation between standing and sitting, especially how the brain's intention modulates the pre-movement sensorimotor rhythm as they do for switching movements. In this study, we aim to investigate the decoding of continuous EEG rhythms during action observation (AO), motor imagery (MI), and motor execution (ME) for the actions of standing and sitting. We developed a behavioral task in which participants were instructed to perform both AO and MI/ME in regard to the transitioning actions of sit-to-stand and stand-to-sit. Our results demonstrated that the ERD was prominent during AO, whereas ERS was typical during MI at the alpha band across the sensorimotor area. A combination of the filter bank common spatial pattern (FBCSP) and support vector machine (SVM) for classification was used for both offline and classifier testing analyses. The offline analysis indicated the classification of AO and MI providing the highest mean accuracy at 82.73 ± 2.54% in the stand-to-sit transition. By applying the classifier testing analysis, we demonstrated the higher performance of decoding neural intentions from the MI paradigm in comparison to the ME paradigm. These observations led us to the promising aspect of using our developed tasks based on the integration of both AO and MI to build future exoskeleton-based rehabilitation systems.

3.6 IV Aug 23, 2019

Parkinson's Disease Recognition Using SPECT Image and Interpretable AI: A Tutorial

Theerasarn Pianpanit, Sermkiat Lolak, Phattarapong Sawangjai et al.

In the past few years, several studies have addressed Parkinson's disease (PD) recognition using single-photon emission computed tomography (SPECT) images with a deep learning (DL) approach. However, the complexity of DL models usually makes them difficult to interpret in clinical use. Although multiple interpretation methods are available for DL models, there is no evidence of which method is suitable for the PD recognition application. This tutorial aims to demonstrate a procedure for choosing a suitable interpretation method for the PD recognition model. We present four DCNN architectures as examples and introduce six well-known interpretation methods. Finally, we propose an evaluation method to measure interpretation performance and a method that uses the interpreted feedback to assist in model selection. The evaluation demonstrates that the guided backpropagation and SHAP interpretation methods are suitable for PD recognition in different respects. Guided backpropagation has the best ability to show fine-grained importance, as shown by the highest Dice coefficient and lowest mean square error. On the other hand, SHAP can generate a better-quality heatmap at the uptake depletion location, outperforming other methods in discriminating between PD and NC subjects. In short, the introduced interpretation methods can contribute not only to the PD recognition application but also to sensor data processing in the AI era (interpretable AI), as feedback for constructing well-suited deep learning architectures for specific applications.
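The Dice coefficient mentioned in the abstract above measures the overlap between two regions. As a minimal sketch, the function below computes it for two binary masks given as nested lists. Treating an interpretation heatmap as an already-thresholded binary mask is an assumption made here for illustration, and the masks and function name are hypothetical.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two same-shaped binary masks."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        return 1.0  # two empty masks are treated as a perfect match
    return 2 * intersection / total


# Hypothetical 2x2 masks: thresholded heatmap vs. reference region.
heatmap_mask = [[1, 1],
                [0, 0]]
reference_mask = [[1, 0],
                  [0, 0]]
print(dice_coefficient(heatmap_mask, reference_mask))
```

A value of 1 means the two regions coincide exactly, and 0 means they do not overlap at all.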

6.6 SP Oct 10, 2018

Consumer Grade Brain Sensing for Emotion Recognition

Payongkit Lakhan, Nannapas Banluesombatkul, Vongsagon Changniam et al.

For several decades, electroencephalography (EEG) has been one of the most commonly used tools in emotional state recognition via monitoring of distinctive brain activities. An array of datasets has been generated using diverse emotion-eliciting stimuli, with the resulting brainwave responses conventionally captured by high-end EEG devices. However, the applicability of these devices is to some extent limited by practical constraints, and they may prove difficult to deploy in the highly mobile contexts that are omnipresent in everyday life. In this study, we evaluate the potential of OpenBCI to bridge this gap by first comparing its performance to a research-grade EEG system, employing the same algorithms that were applied to benchmark datasets. Moreover, for the purpose of emotion classification, we propose a novel method to facilitate the selection of audio-visual stimuli of high/low valence and arousal. Our setup entailed recruiting 200 healthy volunteers of varying ages to identify the top 60 affective video clips from a total of 120 candidates through standardized self-assessment, genre tags, and unsupervised machine learning. An additional 43 participants were enrolled to watch the pre-selected clips, during which emotional EEG brainwaves and peripheral physiological signals were collected. These recordings were analyzed, and the extracted features were fed into a classification model to predict whether the elicited signals were associated with a high or low level of valence and arousal. Our prediction accuracies were comparable to those of previous studies that used more costly EEG amplifiers for data acquisition.

3.3 SP Aug 31, 2018

Towards Asynchronous Motor Imagery-Based Brain-Computer Interfaces: a joint training scheme using deep learning

Patcharin Cheng, Phairot Autthasan, Boriwat Pijarana et al.

In this paper, a deep learning (DL) approach is applied to a joint training scheme for asynchronous motor imagery-based Brain-Computer Interfaces (BCI). The proposed DL approach is a cascade of one-dimensional convolutional neural networks and fully-connected neural networks (CNN-FC). The focus is mainly on three types of brain responses: non-imagery EEG (background EEG), pure imagery EEG, and EEG during the transitional period between background EEG and pure imagery (transitional imagery). The study of transitional imagery signals should provide greater insight into real-world scenarios. It may be inferred that pure imagery and transitional EEG are high and low power EEG imagery, respectively. Moreover, the results from the CNN-FC are compared to the conventional approach for motor imagery-BCI, namely the common spatial pattern (CSP) for feature extraction and support vector machine (SVM) for classification (CSP-SVM). Under the joint training scheme, pure and transitional imagery are treated as the same class, while background EEG is another class. Ten-fold cross-validation is used to evaluate whether the joint training scheme significantly improves performance on the task of classifying pure and transitional imagery signals from background EEG. Using a sparse set of just a few electrode channels (Cz, C3, and C4), mean accuracy reaches 71.52% and 70.27% for CNN-FC and CSP-SVM, respectively. On the other hand, mean accuracy without the joint training scheme reaches only 62.68% and 52.41% for CNN-FC and CSP-SVM, respectively.
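The labeling rule of the joint training scheme described in the abstract above (pure and transitional imagery share one class, background EEG is the other) can be written as a simple mapping. This is a sketch of the labeling step only. The segment names, integer class codes, and function name are hypothetical, and the abstract does not describe how training is set up without the joint scheme.

```python
# Joint scheme: both imagery types map to class 1, background EEG to class 0.
JOINT_LABELS = {
    "background": 0,
    "pure_imagery": 1,
    "transitional_imagery": 1,
}


def to_joint_labels(segment_types):
    """Map per-segment type names to binary labels under the joint scheme."""
    return [JOINT_LABELS[name] for name in segment_types]


# Hypothetical sequence of EEG segments from one recording.
segments = ["background", "transitional_imagery", "pure_imagery", "background"]
print(to_joint_labels(segments))
```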