Ngoc Duy Nguyen

h-index12

9papers

1,366citations

Novelty32%

AI Score23

Ranked #173,651 of 194,257 authors (top 89%)#37,620 in LG (top 94%)

9 Papers

1.4CVSep 3, 2022

vieCap4H-VLSP 2021: Vietnamese Image Captioning for Healthcare Domain using Swin Transformer and Attention-based LSTM

Thanh Tin Nguyen, Long H. Nguyen, Nhat Truong Pham et al.

This study presents our approach on the automatic Vietnamese image captioning for healthcare domain in text processing tasks of Vietnamese Language and Speech Processing (VLSP) Challenge 2021, as shown in Figure 1. In recent years, image captioning often employs a convolutional neural network-based architecture as an encoder and a long short-term memory (LSTM) as a decoder to generate sentences. These models perform remarkably well in different datasets. Our proposed model also has an encoder and a decoder, but we instead use a Swin Transformer in the encoder, and a LSTM combined with an attention module in the decoder. The study presents our training experiments and techniques used during the competition. Our model achieves a BLEU4 score of 0.293 on the vietCap4H dataset, and the score is ranked the 3$^{rd}$ place on the private leaderboard. Our code can be found at \url{https://git.io/JDdJm}.

2.3SDSep 6, 2021

Fruit-CoV: An Efficient Vision-based Framework for Speedy Detection and Diagnosis of SARS-CoV-2 Infections Through Recorded Cough Sounds

Long H. Nguyen, Nhat Truong Pham, Van Huong Do et al.

SARS-CoV-2 is colloquially known as COVID-19 that had an initial outbreak in December 2019. The deadly virus has spread across the world, taking part in the global pandemic disease since March 2020. In addition, a recent variant of SARS-CoV-2 named Delta is intractably contagious and responsible for more than four million deaths over the world. Therefore, it is vital to possess a self-testing service of SARS-CoV-2 at home. In this study, we introduce Fruit-CoV, a two-stage vision framework, which is capable of detecting SARS-CoV-2 infections through recorded cough sounds. Specifically, we convert sounds into Log-Mel Spectrograms and use the EfficientNet-V2 network to extract its visual features in the first stage. In the second stage, we use 14 convolutional layers extracted from the large-scale Pretrained Audio Neural Networks for audio pattern recognition (PANNs) and the Wavegram-Log-Mel-CNN to aggregate feature representations of the Log-Mel Spectrograms. Finally, we use the combined features to train a binary classifier. In this study, we use a dataset provided by the AICovidVN 115M Challenge, which includes a total of 7371 recorded cough sounds collected throughout Vietnam, India, and Switzerland. Experimental results show that our proposed model achieves an AUC score of 92.8% and ranks the 1st place on the leaderboard of the AICovidVN Challenge. More importantly, our proposed framework can be integrated into a call center or a VoIP system to speed up detecting SARS-CoV-2 infections through online/recorded cough sounds.

2.3LGFeb 27, 2020

Review, Analysis and Design of a Comprehensive Deep Reinforcement Learning Framework

Ngoc Duy Nguyen, Thanh Thi Nguyen, Hai Nguyen et al.

The integration of deep learning to reinforcement learning (RL) has enabled RL to perform efficiently in high-dimensional environments. Deep RL methods have been applied to solve many complex real-world problems in recent years. However, development of a deep RL-based system is challenging because of various issues such as the selection of a suitable deep RL algorithm, its network configuration, training time, training methods, and so on. This paper proposes a comprehensive software framework that not only plays a vital role in designing a connect-the-dots deep RL architecture but also provides a guideline to develop a realistic RL application in a short time span. We have designed and developed a deep RL-based software framework that strictly ensures flexibility, robustness, and scalability. By inheriting the proposed architecture, software managers can foresee any challenges when designing a deep RL-based system. As a result, they can expedite the design process and actively control every stage of software development, which is especially critical in agile development environments. To enforce generalization, the proposed architecture does not depend on a specific RL algorithm, a network configuration, the number of agents, or the type of agents. Using our framework, software developers can develop and integrate new RL algorithms or new types of agents, and can flexibly change network configuration or the number of agents.

3.3LGFeb 27, 2020

A Visual Communication Map for Multi-Agent Deep Reinforcement Learning

Ngoc Duy Nguyen, Thanh Thi Nguyen, Doug Creighton et al.

Deep reinforcement learning has been applied successfully to solve various real-world problems and the number of its applications in the multi-agent settings has been increasing. Multi-agent learning distinctly poses significant challenges in the effort to allocate a concealed communication medium. Agents receive thorough knowledge from the medium to determine subsequent actions in a distributed nature. Apparently, the goal is to leverage the cooperation of multiple agents to achieve a designated objective efficiently. Recent studies typically combine a specialized neural network with reinforcement learning to enable communication between agents. This approach, however, limits the number of agents or necessitates the homogeneity of the system. In this paper, we have proposed a more scalable approach that not only deals with a great number of agents but also enables collaboration between dissimilar functional agents and compatibly combined with any deep reinforcement learning methods. Specifically, we create a global communication map to represent the status of each agent in the system visually. The visual map and the environmental state are fed to a shared-parameter network to train multiple agents concurrently. Finally, we select the Asynchronous Advantage Actor-Critic (A3C) algorithm to demonstrate our proposed scheme, namely Visual communication map for Multi-agent A3C (VMA3C). Simulation results show that the use of visual communication map improves the performance of A3C regarding learning speed, reward achievement, and robustness in multi-agent problems.

14.2ROFeb 14, 2019

Manipulating Soft Tissues by Deep Reinforcement Learning for Autonomous Robotic Surgery

Ngoc Duy Nguyen, Thanh Nguyen, Saeid Nahavandi et al.

In robotic surgery, pattern cutting through a deformable material is a challenging research field. The cutting procedure requires a robot to concurrently manipulate a scissor and a gripper to cut through a predefined contour trajectory on the deformable sheet. The gripper ensures the cutting accuracy by nailing a point on the sheet and continuously tensioning the pinch point to different directions while the scissor is in action. The goal is to find a pinch point and a corresponding tensioning policy to minimize damage to the material and increase cutting accuracy measured by the symmetric difference between the predefined contour and the cut contour. Previous study considers finding one fixed pinch point during the course of cutting, which is inaccurate and unsafe when the contour trajectory is complex. In this paper, we examine the soft tissue cutting task by using multiple pinch points, which imitates human operations while cutting. This approach, however, does not require the use of a multi-gripper robot. We use a deep reinforcement learning algorithm to find an optimal tensioning policy of a pinch point. Simulation results show that the multi-point approach outperforms the state-of-the-art method in soft pattern cutting task with respect to both accuracy and reliability.

25.3LGDec 31, 2018

Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications

Thanh Thi Nguyen, Ngoc Duy Nguyen, Saeid Nahavandi

Reinforcement learning (RL) algorithms have been around for decades and employed to solve various sequential decision-making problems. These algorithms however have faced great challenges when dealing with high-dimensional environments. The recent development of deep learning has enabled RL methods to drive optimal policies for sophisticated and capable agents, which can perform efficiently in these challenging environments. This paper addresses an important aspect of deep RL related to situations that require multiple agents to communicate and cooperate to solve complex tasks. A survey of different approaches to problems related to multi-agent deep RL (MADRL) is presented, including non-stationarity, partial observability, continuous state and action spaces, multi-agent training schemes, multi-agent transfer learning. The merits and demerits of the reviewed methods will be analyzed and discussed, with their corresponding applications explored. It is envisaged that this review provides insights about various MADRL methods and can lead to future development of more robust and highly useful multi-agent learning methods for solving real-world problems.

4.1LGApr 5, 2018

A Human Mixed Strategy Approach to Deep Reinforcement Learning

Ngoc Duy Nguyen, Saeid Nahavandi, Thanh Nguyen

In 2015, Google's DeepMind announced an advancement in creating an autonomous agent based on deep reinforcement learning (DRL) that could beat a professional player in a series of 49 Atari games. However, the current manifestation of DRL is still immature, and has significant drawbacks. One of DRL's imperfections is its lack of "exploration" during the training process, especially when working with high-dimensional problems. In this paper, we propose a mixed strategy approach that mimics behaviors of human when interacting with environment, and create a "thinking" agent that allows for more efficient exploration in the DRL training process. The simulation results based on the Breakout game show that our scheme achieves a higher probability of obtaining a maximum score than does the baseline DRL algorithm, i.e., the asynchronous advantage actor-critic method. The proposed scheme therefore can be applied effectively to solving a complicated task in a real-world application.

16.3LGMar 8, 2018

A Multi-Objective Deep Reinforcement Learning Framework

Thanh Thi Nguyen, Ngoc Duy Nguyen, Peter Vamplew et al.

This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL) framework based on deep Q-networks. We develop a high-performance MODRL framework that supports both single-policy and multi-policy strategies, as well as both linear and non-linear approaches to action selection. The experimental results on two benchmark problems (two-objective deep sea treasure environment and three-objective Mountain Car problem) indicate that the proposed framework is able to find the Pareto-optimal solutions effectively. The proposed framework is generic and highly modularized, which allows the integration of different deep reinforcement learning algorithms in different complex problem domains. This therefore overcomes many disadvantages involved with standard multi-objective reinforcement learning methods in the current literature. The proposed framework acts as a testbed platform that accelerates the development of MODRL for solving increasingly complicated multi-objective problems.

9.0HCMar 6, 2018

A Review of Situation Awareness Assessment Approaches in Aviation Environments

Thanh Nguyen, Chee Peng Lim, Ngoc Duy Nguyen et al.

Situation awareness (SA) is an important constituent in human information processing and essential in pilots' decision-making processes. Acquiring and maintaining appropriate levels of SA is critical in aviation environments as it affects all decisions and actions taking place in flights and air traffic control. This paper provides an overview of recent measurement models and approaches to establishing and enhancing SA in aviation environments. Many aspects of SA are examined including the classification of SA techniques into six categories, and different theoretical SA models from individual, to shared or team, and to distributed or system levels. Quantitative and qualitative perspectives pertaining to SA methods and issues of SA for unmanned vehicles are also addressed. Furthermore, future research directions regarding SA assessment approaches are raised to deal with shortcomings of the existing state-of-the-art methods in the literature.