Hideya Ochiai

LG
20 papers
226 citations
Novelty 42%
AI Score 46

20 Papers

LG · May 24, 2022
Wireless Ad Hoc Federated Learning: A Fully Distributed Cooperative Machine Learning

Hideya Ochiai, Yuwei Sun, Qingzhe Jin et al.

Privacy-sensitive data is stored in autonomous vehicles, smart devices, or sensor nodes that move around and make opportunistic contact with each other. Federation among such nodes has mainly been discussed in the context of federated learning with a centralized mechanism. However, because of multi-vendor issues, those nodes do not want to rely on a specific server operated by a third party for this purpose. In this paper, we propose wireless ad hoc federated learning (WAFL), a fully distributed cooperative machine learning scheme organized by physically nearby nodes. WAFL can develop generalized models from Non-IID datasets stored locally in distributed nodes by exchanging and aggregating their models over opportunistic node-to-node contacts. In our benchmark-based evaluation with various opportunistic networks, WAFL achieved an accuracy of 94.8-96.3%, higher than the 84.7% of the self-training case. All our evaluation results show that WAFL can train and converge the model parameters from highly partitioned Non-IID datasets over opportunistic networks without any centralized mechanisms.
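
The abstract does not spell out the aggregation rule, so the following is only a minimal sketch of what one exchange-and-aggregate step at a node could look like. It assumes each model is a flat list of floats and that a node moves its parameters toward the models received from nodes currently in contact. The function name `aggregate_on_contact` and the mixing coefficient `coeff` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def aggregate_on_contact(
    own: Sequence[float],
    neighbors: Sequence[Sequence[float]],
    coeff: float = 1.0,
) -> List[float]:
    """Move `own` toward the models of nodes currently in contact.

    Assumed rule (illustrative only): each parameter is shifted by
    coeff * sum(neighbor - own) / (number_of_neighbors + 1).
    With coeff = 1.0 this is the plain average of own and neighbor models.
    """
    if not neighbors:
        # No contact in this round: the node keeps its local model unchanged.
        return list(own)
    denom = len(neighbors) + 1
    updated = []
    for i, w in enumerate(own):
        diff_sum = sum(nb[i] - w for nb in neighbors)
        updated.append(w + coeff * diff_sum / denom)
    return updated


# Two nodes meet and exchange models; each aggregates independently.
node_a = [0.0, 2.0]
node_b = [4.0, 6.0]
new_a = aggregate_on_contact(node_a, [node_b])
new_b = aggregate_on_contact(node_b, [node_a])
```

No server appears anywhere in this step: each node only needs the models of whichever peers it happens to meet, which is the property the abstract emphasizes. A smaller `coeff` would make a node keep more of its own parameters after each contact.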

LG · Mar 22, 2022
Feature Distribution Matching for Federated Domain Generalization

Yuwei Sun, Ng Chong, Hideya Ochiai

Multi-source domain adaptation has been intensively studied. The distribution shift in features inherent to specific domains causes the negative transfer problem, degrading a model's generality to unseen tasks. In Federated Learning (FL), learned model parameters are shared to train a global model that leverages the underlying knowledge across client models trained on separate data domains. Nonetheless, the data confidentiality of FL hinders the effectiveness of traditional domain adaptation methods that require prior knowledge of different domain data. We propose a new federated domain generalization method called Federated Knowledge Alignment (FedKA). FedKA leverages feature distribution matching in a global workspace such that the global model can learn domain-invariant client features under the constraint of unknown client data. FedKA employs a federated voting mechanism that generates target domain pseudo-labels based on the consensus from clients to facilitate global model fine-tuning. We performed extensive experiments, including an ablation study, to evaluate the effectiveness of the proposed method in both image and text classification tasks using different model architectures. The empirical results show that FedKA achieves performance gains of 8.8% and 3.5% in Digit-Five and Office-Caltech10, respectively, and a gain of 0.7% in Amazon Review with extremely limited training data. Moreover, we studied the effectiveness of FedKA in alleviating the negative transfer of FL based on a new criterion called Group Effect. The results show that FedKA can reduce negative transfer, improving the performance gain via model aggregation by 4 times.
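
As a rough illustration of consensus-based pseudo-labeling, the sketch below takes class predictions from several clients for the same unlabeled target samples and keeps a label only when enough clients agree. This is a generic majority vote written under that assumption, not FedKA's actual voting procedure; `vote_pseudo_labels` and `min_agreement` are hypothetical names.

```python
from collections import Counter
from typing import List, Optional, Sequence


def vote_pseudo_labels(
    client_predictions: Sequence[Sequence[int]],
    min_agreement: float = 0.5,
) -> List[Optional[int]]:
    """Majority-vote one pseudo-label per sample from client predictions.

    client_predictions[c][i] is client c's predicted class for sample i.
    A sample receives a label only if strictly more than `min_agreement`
    of the clients agree on it; otherwise it is left unlabeled (None).
    """
    if not client_predictions:
        return []
    n_clients = len(client_predictions)
    n_samples = len(client_predictions[0])
    labels: List[Optional[int]] = []
    for i in range(n_samples):
        votes = Counter(preds[i] for preds in client_predictions)
        label, count = votes.most_common(1)[0]
        labels.append(label if count / n_clients > min_agreement else None)
    return labels


preds = [
    [0, 1, 2],  # client 1
    [0, 1, 0],  # client 2
    [0, 2, 1],  # client 3
]
pseudo = vote_pseudo_labels(preds)
```

Samples left as `None` would simply be skipped during fine-tuning, so low-consensus predictions do not feed noise into the global model.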

LG · Mar 22, 2022
Semi-Targeted Model Poisoning Attack on Federated Learning via Backward Error Analysis

Yuwei Sun, Hideya Ochiai, Jun Sakuma

Model poisoning attacks on federated learning (FL) intrude into the entire system by compromising an edge model, causing machine learning models to malfunction. Such compromised models are tampered with to perform adversary-desired behaviors. In particular, we consider a semi-targeted situation where the source class is predetermined but the target class is not. The goal is to cause the global classifier to misclassify data of the source class. Though approaches such as label flipping have been adopted to inject poisoned parameters into FL, it has been shown that their performance is usually class-sensitive, varying with the target class applied. Typically, an attack can become less effective when shifting to a different target class. To overcome this challenge, we propose the Attacking Distance-aware Attack (ADA) to enhance a poisoning attack by finding the optimized target class in the feature space. Moreover, we study a more challenging situation where an adversary has limited prior knowledge about a client's data. To tackle this problem, ADA deduces pair-wise distances between different classes in the latent feature space from shared model parameters based on backward error analysis. We performed extensive empirical evaluations of ADA by varying the attacking frequency in three different image classification tasks. As a result, ADA increased the attack performance by 1.8 times in the most challenging case, with an attacking frequency of 0.01.

20.8 · CR · Apr 13
Detection of Anomalous Network Nodes via Hierarchical Prediction and Extreme Value Theory

Sevvandi Kandanaarachchi, Mahdi Abolghasemi, Hideya Ochiai et al.

Continuously evolving cyber-attacks against industrial networks reduce the effectiveness of signature-based detection methods. Once malware has infiltrated a network (for example, entering via an unsecured device), it can infect further network nodes and carry out malicious activity. Infected nodes can exhibit unusual behaviour in their use of Address Resolution Protocol (ARP) calls within the network. In order to detect such anomalous nodes, we propose a two-stage method: (i) modelling of ARP call behaviour via hierarchical time series prediction methods, and (ii) exploiting Extreme Value Theory (EVT) to robustly detect whether deviations from expected behaviour are anomalous. EVT is able to handle the heavy-tailed distributions exhibited by internet traffic. Empirical evaluations on a real-life dataset containing over 10M ARP calls from 362 nodes show that the proposed method results in a considerably reduced number of false positives, addressing the problem of alert fatigue commonly reported by security professionals.
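
To make the EVT stage more concrete, here is a minimal sketch of a generic peaks-over-threshold procedure applied to deviations from predicted behaviour. It is not the authors' implementation: it assumes the deviations arrive as a list of floats, fits a generalized Pareto distribution to the excesses by the method of moments (which presumes a shape parameter below 0.5), and uses the hypothetical names `pot_threshold`, `init_quantile`, and `risk`.

```python
import math
import random
from statistics import mean, pvariance
from typing import Sequence


def pot_threshold(
    values: Sequence[float],
    init_quantile: float = 0.9,
    risk: float = 1e-3,
) -> float:
    """Return an anomaly threshold using peaks-over-threshold.

    Excesses over an initial empirical quantile are fitted with a
    generalized Pareto distribution (method of moments), and the fitted
    tail is extrapolated to the level exceeded with probability `risk`.
    """
    data = sorted(values)
    n = len(data)
    u = data[int(init_quantile * n)]  # initial high threshold
    excesses = [x - u for x in data if x > u]
    if len(excesses) < 2:
        raise ValueError("not enough excesses to fit a tail")
    m, v = mean(excesses), pvariance(excesses)
    if v == 0:
        raise ValueError("excesses have zero variance")
    ratio = m * m / v
    shape = 0.5 * (1.0 - ratio)      # GPD shape parameter (xi)
    scale = 0.5 * m * (1.0 + ratio)  # GPD scale parameter (sigma)
    tail_ratio = risk * n / len(excesses)
    if abs(shape) < 1e-9:
        # Exponential-tail limit of the GPD quantile formula.
        return u - scale * math.log(tail_ratio)
    return u + (scale / shape) * (tail_ratio ** (-shape) - 1.0)


# Synthetic stand-in for prediction residuals (assumption for illustration).
random.seed(0)
residuals = [random.expovariate(1.0) for _ in range(5000)]
threshold = pot_threshold(residuals, risk=1e-3)
stricter = pot_threshold(residuals, risk=1e-4)
```

Because the threshold comes from the fitted tail rather than from a fixed multiple of the standard deviation, lowering `risk` pushes it further out in a way that respects how heavy the tail is, which is how such a scheme can cut false positives.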

LG · Jul 16, 2024
Detection of Global Anomalies on Distributed IoT Edges with Device-to-Device Communication

Hideya Ochiai, Riku Nishihata, Eisuke Tomiyama et al.

Anomaly detection is an important function in IoT applications for finding outliers caused by abnormal events. Anomaly detection sometimes requires high-frequency data sampling, which should be carried out at Edge devices rather than in the Cloud. In this paper, we consider the case where multiple IoT devices are installed at a single remote site and collaboratively detect anomalies from their observations using device-to-device communication. For this, we propose a fully distributed collaborative scheme for training distributed anomaly detectors with Wireless Ad Hoc Federated Learning, namely "WAFL-Autoencoder". We introduce the concept of a Global Anomaly: a sample that is rare not only to the local device but to all the devices in the target domain. We also propose a distributed threshold-finding algorithm for Global Anomaly detection. In our standard benchmark-based evaluation, we confirmed that our scheme trained anomaly detectors perfectly across the devices. We also confirmed that the devices collaboratively found thresholds for Global Anomaly detection with low false positive rates while achieving high true positive rates, with few exceptions.

LG · Nov 7, 2022
Resilience of Wireless Ad Hoc Federated Learning against Model Poisoning Attacks

Naoya Tezuka, Hideya Ochiai, Yuwei Sun et al.

Wireless ad hoc federated learning (WAFL) is a fully decentralized collaborative machine learning framework organized by opportunistically encountered mobile nodes. Compared to conventional federated learning, WAFL performs model training by weakly synchronizing the model parameters with others, and this shows great resilience to a poisoned model injected by an attacker. In this paper, we provide a theoretical analysis of WAFL's resilience against model poisoning attacks by formulating the force balance between the poisoned model and the legitimate model. In our experiments, we confirmed that the nodes that directly encountered the attacker were somewhat compromised by the poisoned model, but other nodes showed great resilience. More importantly, after the attacker left the network, all the nodes eventually found stronger model parameters combined with the poisoned model. Most of the attack-experienced cases achieved higher accuracy than the no-attack-experienced cases.

CV · Apr 2, 2023
Instance-Level Trojan Attacks on Visual Question Answering via Adversarial Learning in Neuron Activation Space

Yuwei Sun, Hideya Ochiai, Jun Sakuma

Trojan attacks embed perturbations in input data leading to malicious behavior in neural network models. A combination of various Trojans in different modalities enables an adversary to mount a sophisticated attack on multimodal learning such as Visual Question Answering (VQA). However, multimodal Trojans in conventional methods are susceptible to parameter adjustment during processes such as fine-tuning. To this end, we propose an instance-level multimodal Trojan attack on VQA that efficiently adapts to fine-tuned models through a dual-modality adversarial learning method. This method compromises two specific neurons in a specific perturbation layer in the pretrained model to produce overly large neuron activations. Then, a malicious correlation between these overactive neurons and the malicious output of a fine-tuned model is established through adversarial learning. Extensive experiments are conducted using the VQA-v2 dataset, based on a wide range of metrics including sample efficiency, stealthiness, and robustness. The proposed attack demonstrates enhanced performance with diverse vision and text Trojans tailored for each sample. We demonstrate that the proposed attack can be efficiently adapted to different fine-tuned models, by injecting only a few shots of Trojan samples. Moreover, we investigate the attack performance under conventional defenses, where the defenses cannot effectively mitigate the attack.

CV · Sep 18, 2024
Logic-Free Building Automation: Learning the Control of Room Facilities with Wall Switches and Ceiling Camera

Hideya Ochiai, Kohki Hashimoto, Takuya Sakamoto et al.

Artificial intelligence enables smarter control in building automation through its ability to learn users' preferences for facility control. Reinforcement learning (RL) has been one approach to this, but it faces many challenges in real-world implementations. We propose a new architecture for logic-free building automation (LFBA) that leverages deep learning (DL) to control room facilities without predefined logic. Our approach differs from RL in that it uses wall switches as supervised signals and a ceiling camera to monitor the environment, allowing the DL model to learn users' preferred controls directly from the scenes and switch states. The LFBA system was tested on our testbed under various conditions and user activities. The results demonstrate its efficacy, achieving 93%-98% control accuracy with VGG and outperforming other DL models such as Vision Transformer and ResNet. This indicates that LFBA can achieve smarter and more user-friendly control by learning from the observable scenes and user interactions.

LG · Sep 22, 2023
Associative Transformer

Yuwei Sun, Hideya Ochiai, Zhirong Wu et al.

Emerging from the pairwise attention in conventional Transformers, there is a growing interest in sparse attention mechanisms that align more closely with localized, contextual learning in the biological brain. Existing studies such as the Coordination method employ iterative cross-attention mechanisms with a bottleneck to enable the sparse association of inputs. However, these methods are parameter-inefficient and fail in more complex relational reasoning tasks. To this end, we propose Associative Transformer (AiT) to enhance the association among sparsely attended input tokens, improving parameter efficiency and performance in various vision tasks such as classification and relational reasoning. AiT leverages a learnable explicit memory comprising specialized priors that guide bottleneck attentions to facilitate the extraction of diverse localized tokens. Moreover, AiT employs an associative memory-based token reconstruction using a Hopfield energy function. Extensive empirical experiments demonstrate that AiT requires significantly fewer parameters and attention layers while outperforming a broad range of sparse Transformer models. Additionally, AiT outperforms the SOTA sparse Transformer models, including the Coordination method, on the Sort-of-CLEVR dataset.

CV · Aug 24, 2022
Bidirectional Contrastive Split Learning for Visual Question Answering

Yuwei Sun, Hideya Ochiai

Visual Question Answering (VQA) based on multi-modal data facilitates real-life applications such as home robots and medical diagnoses. One significant challenge is to devise a robust decentralized learning framework for various client models where centralized data collection is avoided due to confidentiality concerns. This work aims to tackle privacy-preserving VQA by decoupling a multi-modal model into representation modules and a contrastive module and leveraging inter-module gradient sharing and inter-client weight sharing. To this end, we propose Bidirectional Contrastive Split Learning (BiCSL) to train a global multi-modal model on the entire data distribution of decentralized clients. We employ a contrastive loss that enables more efficient self-supervised learning of decentralized modules. Comprehensive experiments are conducted on the VQA-v2 dataset based on five SOTA VQA models, demonstrating the effectiveness of the proposed method. Furthermore, we inspect BiCSL's robustness against a dual-key backdoor attack on VQA. BiCSL shows much better robustness to the multi-modal adversarial attack than the centralized learning method, which makes it a promising approach to decentralized multi-modal learning.

79.9 · CR · Apr 8
FedDetox: Robust Federated SLM Alignment via On-Device Data Sanitization

Shunan Zhu, Jiawei Chen, Yonghao Yu et al.

As high-quality public data becomes scarce, Federated Learning (FL) provides a vital pathway to leverage valuable private user data while preserving privacy. However, real-world client data often contains toxic or unsafe information. This leads to a critical issue we define as unintended data poisoning, which can severely damage the safety alignment of global models during federated alignment. To address this, we propose FedDetox, a robust framework tailored for Small Language Models (SLMs) on resource-constrained edge devices. We first employ knowledge distillation to transfer sophisticated safety alignment capabilities from large-scale safety-aligned teacher models into lightweight student classifiers suitable for resource-constrained edge devices. Then, during federated learning for human preference alignment, the edge client identifies unsafe samples at the source and replaces them with refusal templates, effectively transforming potential poisons into positive safety signals. Experiments demonstrate that our approach preserves model safety at a level comparable to centralized baselines without compromising general utility.

CV · Dec 5, 2025 · Code
University Building Recognition Dataset in Thailand for the mission-oriented IoT sensor system

Takara Taniguchi, Yudai Ueda, Atsuya Muramatsu et al.

Many industrial sectors have been using machine learning in inference mode on edge devices. Future directions suggest that training on edge devices is promising due to improvements in semiconductor performance. Wireless Ad Hoc Federated Learning (WAFL) has been proposed as a promising approach for collaborative learning with device-to-device communication among edges. In particular, WAFL with Vision Transformer (WAFL-ViT) has been tested on image recognition tasks with the UTokyo Building Recognition Dataset (UTBR). Since WAFL-ViT is a mission-oriented sensor system, it is essential to construct a specific dataset for each mission. In our work, we have developed the Chulalongkorn University Building Recognition Dataset (CUBR), which is specialized for Chulalongkorn University as a case study in Thailand. Our results also demonstrate that training in WAFL scenarios achieves better accuracy than in self-training scenarios. The dataset is available at https://github.com/jo2lxq/wafl/.

RO · Aug 10, 2021 · Code
Roadside-assisted Cooperative Planning using Future Path Sharing for Autonomous Driving

Mai Hirata, Manabu Tsukada, Keisuke Okumura et al.

Cooperative intelligent transportation systems (C-ITS) are used by autonomous vehicles to communicate with surrounding autonomous vehicles and roadside units (RSU). Current C-ITS applications focus primarily on real-time information sharing, such as cooperative perception. In addition to real-time information sharing, self-driving cars need to coordinate their action plans to achieve higher safety and efficiency. For this reason, this study defines a vehicle's future action plan/path and designs a cooperative path-planning model at intersections using future path sharing based on the future path information of multiple vehicles. The idea is that when the RSU detects a potential conflict of vehicle paths or an acceleration opportunity according to the shared future paths, it generates a coordinated path update that adjusts the speeds of the vehicles. We implemented the proposed method using the open-source Autoware autonomous driving software and evaluated it with the LGSVL autonomous vehicle simulator. We conducted simulation experiments with two vehicles in a blind intersection scenario, finding that each car can travel safely and more efficiently by planning a path that reflects the action plans of all vehicles involved. With the RSU introduced, the time consumed is 23.0% and 28.1% shorter than in the stand-alone autonomous driving case at the intersection.

CR · Nov 2, 2021
Misbehavior Detection Using Collective Perception under Privacy Considerations

Manabu Tsukada, Shimpei Arii, Hideya Ochiai et al.

In cooperative ITS, security and privacy protection are essential. The Cooperative Awareness Message (CAM) is a basic V2V message standard, and misbehavior detection is critical for protection against malicious CAMs sent from inside the system, in addition to node authentication by Public Key Infrastructure (PKI). However, pseudonym IDs, which have been introduced to protect privacy from tracking, make it challenging to perform misbehavior detection. In this study, we improve the performance of misbehavior detection using observation data from other vehicles, carried in the collective perception message (CPM), which is becoming the new standard in European countries. We experimented with realistic traffic scenarios and succeeded in reducing the rate of rejecting valid CAMs (false positive) by approximately 15 percentage points while maintaining the rate of correctly detecting attacks (true positive).

CR · Oct 12, 2021
Federated Phish Bowl: LSTM-Based Decentralized Phishing Email Detection

Yuwei Sun, Ng Chong, Hideya Ochiai

With increasingly sophisticated phishing campaigns in recent years, phishing emails lure people using more legitimate-looking personal contexts. To tackle this problem, instead of traditional heuristics-based algorithms, more adaptive detection systems such as natural language processing (NLP)-powered approaches are essential to understanding phishing text representations. Nevertheless, concerns surrounding the collection of phishing data that might contain confidential information hinder the effectiveness of model learning. We propose a decentralized phishing email detection framework called Federated Phish Bowl (FedPB), which facilitates collaborative phishing detection with privacy. In particular, we devise a knowledge-sharing mechanism with federated learning (FL). Using long short-term memory (LSTM) for phishing detection, the framework adapts by sharing a global word embedding matrix across the clients, with each client running its local model with Non-IID data. We collected the most recent phishing samples to study the effectiveness of the proposed method using different client numbers and data distributions. The results show that FedPB attains performance competitive with a centralized phishing detector and generalizes to various FL cases, retaining a prediction accuracy of 83%.

LG · Oct 11, 2021
Homogeneous Learning: Self-Attention Decentralized Deep Learning

Yuwei Sun, Hideya Ochiai

Federated learning (FL) has been facilitating privacy-preserving deep learning in many walks of life, such as medical image classification and network intrusion detection. However, it necessitates a central parameter server for model aggregation, which brings about delayed model communication and vulnerability to adversarial attacks. A fully decentralized architecture like Swarm Learning allows peer-to-peer communication among distributed nodes, without the central server. One of the most challenging issues in decentralized deep learning is that the data owned by each node are usually not independent and identically distributed (non-IID), which slows the convergence of model training. To this end, we propose a decentralized learning model called Homogeneous Learning (HL) for tackling non-IID data with a self-attention mechanism. In HL, training is performed on the node selected in each round, and the trained model is sent to the next selected node at the end of the round. Notably, for the selection, the self-attention mechanism leverages reinforcement learning to observe a node's inner state and its surrounding environment's state, and to find out which node should be selected to optimize the training. We evaluate our method in various scenarios for an image classification task. The results suggest that HL achieves better performance than standalone learning and, compared with random policy-based decentralized learning, reduces the total training rounds by 50.8% and the communication cost by 74.6% when training on non-IID data.

CR · Aug 20, 2021
Suspicious ARP Activity Detection and Clustering Based on Autoencoder Neural Networks

Yuwei Sun, Hideya Ochiai, Hiroshi Esaki

The rapidly increasing number of smart devices on the Internet necessitates an efficient inspection system for safeguarding our networks from suspicious activities such as Address Resolution Protocol (ARP) probes. In this research, we analyze sequence data of ARP traffic on LANs based on the numerical count and degree of its packets. A dynamic threshold is employed to detect underlying suspicious activities, which are then converted into feature vectors to train an unsupervised autoencoder neural network. Then, we leverage K-means clustering to separate the latent features of suspicious activities extracted by the autoencoder into various patterns. To evaluate the performance, we collect and adopt a real-world network traffic dataset from five different LANs. Finally, we successfully detect suspicious ARP patterns varying in scale, lifespan, and regularity on the LANs.

LG · Aug 2, 2021
Information Stealing in Federated Learning Systems Based on Generative Adversarial Networks

Yuwei Sun, Ng Chong, Hideya Ochiai

An attack on deep learning systems where intelligent machines collaborate to solve problems could cause a node in the network to make a mistake on a critical judgment. At the same time, the security and privacy concerns of AI have galvanized the attention of experts from multiple disciplines. In this research, we successfully mounted adversarial attacks on a federated learning (FL) environment using three different datasets. The attacks leveraged generative adversarial networks (GANs) to affect the learning process and strove to reconstruct the private data of users by learning hidden features from shared local model parameters. The attack was target-oriented, drawing data with distinct class distributions from CIFAR-10, MNIST, and Fashion-MNIST, respectively. Moreover, by measuring the Euclidean distance between the real data and the reconstructed adversarial samples, we evaluated the performance of the adversary in the learning processes in various scenarios. Finally, we successfully reconstructed the real data of the victim from the shared global model parameters with all the applied datasets.

DC · Jul 30, 2021
Decentralized Deep Learning for Multi-Access Edge Computing: A Survey on Communication Efficiency and Trustworthiness

Yuwei Sun, Hideya Ochiai, Hiroshi Esaki

Achieving wider coverage and lower latency in 5G necessitates its combination with multi-access edge computing (MEC) technology. Decentralized deep learning (DDL), such as federated learning and swarm learning, is a promising solution to privacy-preserving data processing for millions of smart edge devices. It leverages distributed computing of multi-layer neural networks within the network of local clients without disclosing the original local training data. Notably, in industries such as finance and healthcare, where sensitive data such as transactions and personal medical records is cautiously maintained, DDL can facilitate collaboration among these institutes to improve the performance of trained models while protecting the data privacy of participating clients. In this survey paper, we present the technical fundamentals of DDL that benefit many walks of society through decentralized learning. Furthermore, we offer a comprehensive overview of the current state of the art in the field by outlining the challenges of DDL and the most relevant solutions from the novel perspectives of communication efficiency and trustworthiness.

CR · May 6, 2021
Honeyboost: Boosting honeypot performance with data fusion and anomaly detection

Sevvandi Kandanaarachchi, Hideya Ochiai, Asha Rao

With cyber incidents and data breaches becoming increasingly common, being able to predict a cyberattack has never been more crucial. The ability of Network Anomaly Detection Systems (NADS) to identify unusual behavior makes them useful in predicting such attacks. However, NADS often suffer from high false positive rates. In this paper, we introduce a novel framework called Honeyboost that enhances the performance of honeypot-aided NADS. Using data from the LAN Security Monitoring Project, Honeyboost identifies most anomalous nodes before they access the honeypot, aiding early detection and prediction. Furthermore, using extreme value theory, we achieve the highly desirable low false positive rates. Honeyboost is an unsupervised method comprising two approaches: horizontal and vertical. The horizontal approach constructs a time series from the communications of each node, with node-level features encapsulating their behavior over time. The vertical approach finds anomalies in each protocol space. Using a window-based model, which is typically used in online scenarios, the horizontal and vertical approaches are combined to identify anomalies and gain useful insights. Experimental results indicate the efficacy of our framework in identifying suspicious activities of nodes.