Hakaru Tamukoh

RO
h-index11
15papers
32citations
Novelty36%
AI Score45

15 Papers

CVOct 20, 2023Code
ManifoldNeRF: View-dependent Image Feature Supervision for Few-shot Neural Radiance Fields

Daiju Kanaoka, Motoharu Sonogashira, Hakaru Tamukoh et al.

Novel view synthesis has recently made significant progress with the advent of Neural Radiance Fields (NeRF). DietNeRF is an extension of NeRF that aims to achieve this task from only a few images by introducing a new loss function for unknown viewpoints with no input images. The loss function assumes that a pre-trained feature extractor should output the same feature even if input images are captured at different viewpoints since the images contain the same object. However, while that assumption is ideal, in reality, it is known that as viewpoints continuously change, also feature vectors continuously change. Thus, the assumption can harm training. To avoid this harmful training, we propose ManifoldNeRF, a method for supervising feature vectors at unknown viewpoints using interpolated features from neighboring known viewpoints. Since the method provides appropriate supervision for each unknown viewpoint by the interpolated features, the volume representation is learned better than DietNeRF. Experimental results show that the proposed method performs better than others in a complex scene. We also experimented with several subsets of viewpoints from a set of viewpoints and identified an effective set of viewpoints for real environments. This provided a basic policy of viewpoint patterns for real-world application. The code is available at https://github.com/haganelego/ManifoldNeRF_BMVC2023

RONov 30, 2025
Sign Language Recognition using Bidirectional Reservoir Computing

Nitin Kumar Singh, Arie Rachmad Syulistyo, Yuichiro Tanaka et al.

Sign language recognition (SLR) facilitates communication between deaf and hearing individuals. Deep learning is widely used to develop SLR-based systems; however, it is computationally intensive and requires substantial computational resources, making it unsuitable for resource-constrained devices. To address this, we propose an efficient sign language recognition system using MediaPipe and an echo state network (ESN)-based bidirectional reservoir computing (BRC) architecture. MediaPipe extracts hand joint coordinates, which serve as inputs to the ESN-based BRC architecture. The BRC processes these features in both forward and backward directions, efficiently capturing temporal dependencies. The resulting states of BRC are concatenated to form a robust representation for classification. We evaluated our method on the Word-Level American Sign Language (WLASL) video dataset, achieving a competitive accuracy of 57.71% and a significantly lower training time of only 9 seconds, in contrast to the 55 minutes and $38$ seconds required by the deep learning-based Bi-GRU approach. Consequently, the BRC-based SLR system is well-suited for edge devices.

CLDec 29, 2025
Reservoir Computing inspired Matrix Multiplication-free Language Model

Takumi Shiratsuchi, Yuichiro Tanaka, Hakaru Tamukoh

Large language models (LLMs) have achieved state-of-the-art performance in natural language processing; however, their high computational cost remains a major bottleneck. In this study, we target computational efficiency by focusing on a matrix multiplication free language model (MatMul-free LM) and further reducing the training cost through an architecture inspired by reservoir computing. Specifically, we partially fix and share the weights of selected layers in the MatMul-free LM and insert reservoir layers to obtain rich dynamic representations without additional training overhead. Additionally, several operations are combined to reduce memory accesses. Experimental results show that the proposed architecture reduces the number of parameters by up to 19%, training time by 9.9%, and inference time by 8.0%, while maintaining comparable performance to the baseline model.

CVDec 22, 2025
Sign Language Recognition using Parallel Bidirectional Reservoir Computing

Nitin Kumar Singh, Arie Rachmad Syulistyo, Yuichiro Tanaka et al.

Sign language recognition (SLR) facilitates communication between deaf and hearing communities. Deep learning based SLR models are commonly used but require extensive computational resources, making them unsuitable for deployment on edge devices. To address these limitations, we propose a lightweight SLR system that combines parallel bidirectional reservoir computing (PBRC) with MediaPipe. MediaPipe enables real-time hand tracking and precise extraction of hand joint coordinates, which serve as input features for the PBRC architecture. The proposed PBRC architecture consists of two echo state network (ESN) based bidirectional reservoir computing (BRC) modules arranged in parallel to capture temporal dependencies, thereby creating a rich feature representation for classification. We trained our PBRC-based SLR system on the Word-Level American Sign Language (WLASL) video dataset, achieving top-1, top-5, and top-10 accuracies of 60.85%, 85.86%, and 91.74%, respectively. Training time was significantly reduced to 18.67 seconds due to the intrinsic properties of reservoir computing, compared to over 55 minutes for deep learning based methods such as Bi-GRU. This approach offers a lightweight, cost-effective solution for real-time SLR on edge devices.

33.9ARApr 8
CBM-Dual: A 65-nm Fully Connected Chaotic Boltzmann Machine Processor for Dual Function Simulated Annealing and Reservoir Computing

Kanta Yoshioka, Soshi Hirayae, Yuichiro Tanaka et al.

This paper presents CBM-Dual, the first silicon-proven digital chaotic dynamics processor (CDP) supporting both simulated annealing (SA) and reservoir computing (RC). CBM-Dual enables real-time decision-making and lightweight adaptation for autonomous Edge AI, employing the largest-scale fully connected 1024-neuron chaotic Boltzmann machine (CBM). To address the high computational and area costs of digital CDPs, we propose: 1) a CBM-specific scheduler that exploits an inherently low neuron flip rate to reduce multiply-accumulate operations by 99%, and 2) an efficient multiply splitting scheme that reduces the area by 59%. Fabricated in 65nm (12mm$^2$), CBM-Dual achieves simultaneous heterogeneous task execution and state-of-the-art energy efficiency, delivering $\times$25-54 and $\times$4.5 improvements in the SA and RC fields, respectively.

LGFeb 25, 2025
Techniques for Enhancing Memory Capacity of Reservoir Computing

Atsuki Yokota, Ichiro Kawashima, Yohei Saito et al.

Reservoir Computing (RC) is a bio-inspired machine learning framework, and various models have been proposed. RC is a well-suited model for time series data processing, but there is a trade-off between memory capacity and nonlinearity. In this study, we propose methods to improve the memory capacity of reservoir models by modifying their network configuration except for the inside of reservoirs. The Delay method retains past inputs by adding delay node chains to the input layer with the specified number of delay steps. To suppress the effect of input value increase due to the Delay method, we divide the input weights by the number of added delay steps. The Pass through method feeds input values directly to the output layer. The Clustering method divides the input and reservoir nodes into multiple parts and integrates them at the output layer. We applied these methods to an echo state network (ESN), a typical RC model, and the chaotic Boltzmann machine (CBM)-RC, which can be efficiently implemented in integrated circuits. We evaluated their performance on the NARMA task, and measured information processing capacity (IPC) to evaluate the trade-off between memory capacity and nonlinearity.

RODec 18, 2024
Unified Understanding of Environment, Task, and Human for Human-Robot Interaction in Real-World Environments

Yuga Yano, Akinobu Mizutani, Yukiya Fukuda et al.

To facilitate human--robot interaction (HRI) tasks in real-world scenarios, service robots must adapt to dynamic environments and understand the required tasks while effectively communicating with humans. To accomplish HRI in practice, we propose a novel indoor dynamic map, task understanding system, and response generation system. The indoor dynamic map optimizes robot behavior by managing an occupancy grid map and dynamic information, such as furniture and humans, in separate layers. The task understanding system targets tasks that require multiple actions, such as serving ordered items. Task representations that predefine the flow of necessary actions are applied to achieve highly accurate understanding. The response generation system is executed in parallel with task understanding to facilitate smooth HRI by informing humans of the subsequent actions of the robot. In this study, we focused on waiter duties in a restaurant setting as a representative application of HRI in a dynamic environment. We developed an HRI system that could perform tasks such as serving food and cleaning up while communicating with customers. In experiments conducted in a simulated restaurant environment, the proposed HRI system successfully communicated with customers and served ordered food with 90\% accuracy. In a questionnaire administered after the experiment, the HRI system of the robot received 4.2 points out of 5. These outcomes indicated the effectiveness of the proposed method and HRI system in executing waiter tasks in real-world environments.

LGNov 28, 2024
Enhancing Neural Network Robustness Against Fault Injection Through Non-linear Weight Transformations

Ninnart Fuengfusin, Hakaru Tamukoh

Deploying deep neural networks (DNNs) in real-world environments poses challenges due to faults that can manifest in physical hardware from radiation, aging, and temperature fluctuations. To address this, previous works have focused on protecting DNNs via activation range restriction using clipped ReLU and finding the optimal clipping threshold. However, this work instead focuses on constraining DNN weights by applying saturated activation functions (SAFs): Tanh, Arctan, and others. SAFs prevent faults from causing DNN weights to become excessively large, which can lead to model failure. These methods not only enhance the robustness of DNNs against fault injections but also improve DNN performance by a small margin. Before deployment, DNNs are trained with weights constrained by SAFs. During deployment, the weights without applied SAF are written to mediums with faults. When read, weights with faults are applied with SAFs and are used for inference. We demonstrate our proposed method across three datasets (CIFAR10, CIFAR100, ImageNet 2012) and across three datatypes (32-bit floating point (FP32), 16-bit floating point, and 8-bit fixed point). We show that our method enables FP32 ResNet18 with ImageNet 2012 to operate at a bit-error rate of 0.00001 with minor accuracy loss, while without the proposed method, the FP32 DNN only produces random guesses. Furthermore, to accelerate the training process, we demonstrate that an ImageNet 2012 pre-trained ResNet18 can be adapted to SAF by training for a few epochs with a slight improvement in Top-1 accuracy while still ensuring robustness against fault injection.

LGNov 28, 2024
Harden Deep Neural Networks Against Fault Injections Through Weight Scaling

Ninnart Fuengfusin, Hakaru Tamukoh

Deep neural networks (DNNs) have enabled smart applications on hardware devices. However, these hardware devices are vulnerable to unintended faults caused by aging, temperature variance, and write errors. These faults can cause bit-flips in DNN weights and significantly degrade the performance of DNNs. Thus, protection against these faults is crucial for the deployment of DNNs in critical applications. Previous works have proposed error correction codes based methods, however these methods often require high overheads in both memory and computation. In this paper, we propose a simple yet effective method to harden DNN weights by multiplying weights by constants before storing them to fault-prone medium. When used, these weights are divided back by the same constants to restore the original scale. Our method is based on the observation that errors from bit-flips have properties similar to additive noise, therefore by dividing by constants can reduce the absolute error from bit-flips. To demonstrate our method, we conduct experiments across four ImageNet 2012 pre-trained models along with three different data types: 32-bit floating point, 16-bit floating point, and 8-bit fixed point. This method demonstrates that by only multiplying weights with constants, Top-1 Accuracy of 8-bit fixed point ResNet50 is improved by 54.418 at bit-error rate of 0.0001.

ROMay 29, 2020
Hibikino-Musashi@Home 2019 Team Description Paper

Yuichiro Tanaka, Yutaro Ishida, Yushi Abe et al.

Our team, Hibikino-Musashi@Home (HMA), was founded in 2010. It is based in the Kitakyushu Science and Research Park, Japan. Since 2010, we have participated in the RoboCup@Home Japan Open competition open platform league annually. We have also participated in the RoboCup 2017 Nagoya as an open platform league and domestic standard platform league teams, and in the RoboCup 2018 Montreal as a domestic standard platform league team. Currently, we have 23 members from seven different laboratories based in Kyushu Institute of Technology. This paper aims to introduce the activities that are performed by our team and the technologies that we use.

ROMay 29, 2020
Hibikino-Musashi@Home 2020 Team Description Paper

Tomohiro Ono, Yuichiro Tanaka, Yutaro Ishida et al.

Our team, Hibikino-Musashi@Home (HMA), was founded in 2010. It is based in Japan in the Kitakyushu Science and Research Park. Since 2010, we have annually participated in the RoboCup@Home Japan Open competition in the open platform league (OPL). We participated as an open platform league team in the 2017 Nagoya RoboCup competition and as a domestic standard platform league (DSPL) team in the 2017 Nagoya, 2018 Montreal, and 2019 Sydney RoboCup competitions. We also participated in the World Robot Challenge (WRC) 2018 in the service-robotics category of the partner-robot challenge (real space) and won first place. Currently, we have 20 members from eight different laboratories within the Kyushu Institute of Technology. In this paper, we introduce the activities that have been performed by our team and the technologies that we use.

LGNov 14, 2019
An Efficient Hardware-Oriented Dropout Algorithm

Yoeng Jye Yeoh, Takashi Morie, Hakaru Tamukoh

This paper proposes a hardware-oriented dropout algorithm, which is efficient for field programmable gate array (FPGA) implementation. In deep neural networks (DNNs), overfitting occurs when networks are overtrained and adapt too well to training data. Consequently, they fail in predicting unseen data used as test data. Dropout is a common technique that is often applied in DNNs to overcome this problem. In general, implementing such training algorithms of DNNs in embedded systems is difficult due to power and memory constraints. Training DNNs is power-, time-, and memory- intensive; however, embedded systems require low power consumption and real-time processing. An FPGA is suitable for embedded systems for its parallel processing characteristic and low operating power; however, due to its limited memory and different architecture, it is difficult to apply general neural network algorithms. Therefore, we propose a hardware-oriented dropout algorithm that can effectively utilize the characteristics of an FPGA with less memory required. Software program verification demonstrates that the performance of the proposed method is identical to that of conventional dropout, and hardware synthesis demonstrates that it results in significant resource reduction.

LGAug 2, 2019
Network with Sub-Networks

Ninnart Fuengfusin, Hakaru Tamukoh

We introduce network with sub-networks, a neural network which its weight layers could be detached into sub-neural networks during inference. To develop weights and biases which could be inserted in both base and sub-neural networks, firstly, the parameters are copied from sub-model to base-model. Each model is forward-propagated separately. Gradients from a pair of networks are averaged and, used to update both networks. Our base model achieves the test-accuracy which is comparable to the regularly trained models, while the model maintains the ability to detach weight layers.

RONov 15, 2017
Hibikino-Musashi@Home 2017 Team Description Paper

Sansei Hori, Yutaro Ishida, Yuta Kiyama et al.

Our team Hibikino-Musashi@Home was founded in 2010. It is based in Kitakyushu Science and Research Park, Japan. Since 2010, we have participated in the RoboCup@Home Japan open competition open-platform league every year. Currently, the Hibikino-Musashi@Home team has 24 members from seven different laboratories based in the Kyushu Institute of Technology. Our home-service robots are used as platforms for both education and implementation of our research outcomes. In this paper, we introduce our team and the technologies that we have implemented in our robots.

RODec 12, 2016
Depth-Based Visual Servoing Using Low-Accurate Arm

Ludovic Hofer, Michio Tanaka, Hakaru Tamukoh et al.

This paper proposes a visual-servoing method dedicated to grasping of daily-life objects. In order to obtain an affordable solution, we use a low-accurate robotic arm. Our method corrects errors by using an RGB-D sensor. It is based on SURF invariant features which allows us to perform object recognition at a high frame rate. We define regions of interest based on depth segmentation, and we use them to speed-up the recognition and to improve reliability. The system has been tested on a real-world scenario. In spite of the lack of accuracy of all the components and the uncontrolled environment, it grasps objects successfully on more than 95 percents of the trials.