LG | Dec 23, 2022
Deep Unfolding-based Weighted Averaging for Federated Learning in Heterogeneous Environments
Ayano Nakai-Kasai, Tadashi Wadayama
Federated learning is a collaborative model training method that alternates between model updates by multiple clients and aggregation of those updates by a central server. Device and statistical heterogeneity among participating clients causes significant performance degradation, so an appropriate aggregation weight should be assigned to each client in the server's aggregation phase. To adjust the aggregation weights, this paper employs deep unfolding, a parameter tuning method that combines the data-driven learning capability of deep learning with domain knowledge. This makes it possible to incorporate the heterogeneity of the target environment directly into the tuning of the aggregation weights. The proposed approach can be combined with various federated learning algorithms. Numerical experiments indicate that the proposed method achieves higher test accuracy on unknown class-balanced data than conventional heuristic weighting methods. With the aid of pretrained models, the proposed method can handle large-scale learning models and thus perform practical real-world tasks. The paper also provides the convergence rate of federated learning algorithms combined with the proposed method.
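The aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes each client model is a flat parameter vector and contrasts a conventional heuristic (weights proportional to local data size) with weights treated as free parameters. The function name `aggregate`, the client values, and the hand-set weights are hypothetical; in the paper the weights are tuned by deep unfolding, which this sketch does not implement.

```python
import numpy as np

def aggregate(client_params, weights):
    """Weighted average of client parameter vectors (one row per client)."""
    client_params = np.asarray(client_params, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return weights @ client_params

# Three hypothetical clients, each holding a 2-parameter model.
client_params = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
num_samples = np.array([10, 30, 60])  # hypothetical local dataset sizes

# Conventional heuristic: weights proportional to local data size.
global_heuristic = aggregate(client_params, num_samples)

# Alternative: weights treated as tunable parameters (set by hand here,
# standing in for values that a training procedure would learn).
global_tuned = aggregate(client_params, [0.2, 0.2, 0.6])

print(global_heuristic, global_tuned)
```

Making the weights an explicit argument is what lets a tuning procedure replace the data-size heuristic without changing the rest of the aggregation.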
35.4 | LG | May 1
Federated Learning with Hypergradient-based Online Update of Aggregation Weights
Ayano Nakai-Kasai, Tadashi Wadayama
Federated learning on mobile and Internet of Things devices requires not only the ability to handle heterogeneity in clients' data distributions but also high adaptability to varying communication environments. We propose FedHAW (Federated Learning with Hypergradient-based update of Aggregation Weights), which updates the aggregation weights online. FedHAW updates the aggregation weights using the hypergradient, that is, the gradient of the objective function with respect to the weights, which can be calculated with low computational overhead. Simulation results show that the proposed method achieves high generalization performance in heterogeneous environments and high robustness to communication errors.
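To make the notion of a hypergradient concrete, here is a minimal sketch under simplifying assumptions: the global model is a weighted sum of client parameter vectors, and the objective is a toy quadratic loss. By the chain rule, the derivative of the loss with respect to weight k is the inner product of the loss gradient at the aggregated model with client k's parameters. The clip-and-renormalize step, the learning rate, and all names are assumptions for illustration, not the FedHAW update rule itself.

```python
import numpy as np

def hypergradient(theta, W, grad_fn):
    """Gradient of the loss w.r.t. the aggregation weights theta.

    W holds one client parameter vector per row; the aggregated model is
    w = theta @ W, so d loss / d theta_k = grad_fn(w) . W[k].
    """
    w = theta @ W
    return W @ grad_fn(w)

def update_weights(theta, W, grad_fn, lr=0.1):
    """One gradient step on theta, then clip and renormalize (an assumption)."""
    theta = theta - lr * hypergradient(theta, W, grad_fn)
    theta = np.clip(theta, 0.0, None)
    return theta / theta.sum()

# Toy setup: two hypothetical clients and a quadratic loss around `target`.
W = np.array([[1.0, 0.0], [0.0, 1.0]])
target = np.array([1.0, 0.0])
loss_fn = lambda w: 0.5 * np.sum((w - target) ** 2)
grad_fn = lambda w: w - target

theta = np.array([0.5, 0.5])
for _ in range(50):
    theta = update_weights(theta, W, grad_fn)

print(theta, loss_fn(theta @ W))
```

In this toy case the first client's parameters match the target, so the weights drift toward that client as the updates proceed.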
8.1 | SP | Apr 22
Computationally Efficient Sparse Signal Recovery via Linear Sketching and Deep Unfolding
Tatsuki Tokumura, Ayano Nakai-Kasai, Tadashi Wadayama
This paper presents DU-PSISTA (Deep Unfolded-Periodic Sketched Iterative Shrinkage-Thresholding Algorithm), a sparse signal recovery algorithm that aims to balance computational efficiency and accuracy when recovering high-dimensional sparse signals, together with a convergence analysis under sufficient conditions. DU-PSISTA introduces a random matrix projection known as sketching to reduce the dimensionality of gradient computations and periodically alternates between the standard ISTA and the sketched variant. This hybrid structure enables flexible control over the trade-off between accuracy and computational complexity through a pre-configurable period parameter. The algorithm has many parameters to tune, such as step sizes and thresholding factors, so we incorporate deep unfolding, which optimizes these parameters through data-driven training and lets the algorithm adaptively improve convergence speed and performance. We show that, with properly selected parameters, the proposed method achieves a linear-type contraction to a neighborhood of the true sparse signal. The analysis offers an interpretation of why the hybrid structure improves recovery accuracy. Numerical experiments confirm that our method achieves recovery performance comparable to conventional deep unfolded ISTA while reducing computational complexity, especially when the period parameter and sketch size are properly selected. The results are also consistent with the theoretical insights.
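The hybrid structure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' DU-PSISTA: a Gaussian sketch matrix `S` is assumed, a full ISTA step is taken on every `period`-th iteration with sketched steps in between, and a single fixed step size and threshold stand in for the per-iteration parameters that deep unfolding would learn. All names and problem sizes are hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Elementwise shrinkage operator used by ISTA."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def periodic_sketched_ista(A, y, S, period, n_iter, step, tau):
    """ISTA with a sketched gradient, plus a full gradient every `period` steps."""
    SA, Sy = S @ A, S @ y  # sketched problem, computed once up front
    x = np.zeros(A.shape[1])
    for t in range(n_iter):
        if (t + 1) % period == 0:
            grad = A.T @ (A @ x - y)      # standard ISTA gradient
        else:
            grad = SA.T @ (SA @ x - Sy)   # lower-dimensional sketched gradient
        x = soft_threshold(x - step * grad, step * tau)
    return x

# Hypothetical problem: 50 measurements, 100 unknowns, 5 nonzeros.
rng = np.random.default_rng(0)
m, n, k = 50, 100, 25
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, size=5, replace=False)] = 1.0
y = A @ x_true
S = rng.standard_normal((k, m)) / np.sqrt(k)  # Gaussian sketch (an assumption)

# Step size chosen from the larger of the two spectral norms.
step = 1.0 / max(np.linalg.norm(A, 2), np.linalg.norm(S @ A, 2)) ** 2
x_hat = periodic_sketched_ista(A, y, S, period=3, n_iter=200, step=step, tau=0.01)
print(np.linalg.norm(x_hat - x_true))
```

Setting `period=1` recovers plain ISTA, while larger values use the sketched gradient more often, which is the accuracy versus complexity trade-off the abstract refers to.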
LG | May 22, 2025
Multi-Output Gaussian Processes for Graph-Structured Data
Ayano Nakai-Kasai, Tadashi Wadayama
Graph-structured data is data associated with a graph structure in which vertices and edges describe some kind of data correlation. This paper proposes a regression method for graph-structured data based on multi-output Gaussian processes (MOGP) that captures both the correlation between vertices and the correlation between the associated data. Because the proposed formulation is built on the definition of MOGP, it can be applied to a wide range of data configurations and scenarios. Moreover, it has high expressive capability owing to its flexibility in kernel design. It includes existing Gaussian process methods for graph-structured data as special cases and makes it possible to remove their restrictions on data configurations, model selection, and inference scenarios. The performance of the extensions enabled by the proposed formulation is evaluated through computer experiments with synthetic and real data.
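One simple way to capture both kinds of correlation is a separable kernel: a graph kernel over vertices combined with an input kernel via a Kronecker product. The sketch below assumes that form, with a regularized Laplacian kernel for the graph and an RBF kernel for the inputs. These choices and all names are assumptions for illustration; the paper's formulation is more general and may differ.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0):
    """Squared-exponential kernel between two sets of input points."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def graph_kernel(adjacency, alpha=1.0):
    """Regularized Laplacian kernel (I + alpha * L)^{-1} over vertices."""
    L = np.diag(adjacency.sum(axis=1)) - adjacency
    return np.linalg.inv(np.eye(len(adjacency)) + alpha * L)

def mogp_posterior_mean(adjacency, X_train, Y_train, X_test, noise=1e-2):
    """Posterior mean per vertex; Y_train has shape (num_vertices, num_inputs)."""
    B = graph_kernel(adjacency)
    K = np.kron(B, rbf_kernel(X_train, X_train))      # joint train covariance
    K_star = np.kron(B, rbf_kernel(X_test, X_train))  # test/train covariance
    y = Y_train.reshape(-1)  # vertex-major order, matching np.kron(B, .)
    coef = np.linalg.solve(K + noise * np.eye(K.shape[0]), y)
    return (K_star @ coef).reshape(len(adjacency), len(X_test))

# Hypothetical path graph with three vertices and three shared inputs.
adjacency = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
X_train = np.array([[0.0], [2.0], [4.0]])
Y_train = np.array([[0.0, 1.0, 0.0], [0.5, 1.5, 0.5], [1.0, 2.0, 1.0]])
X_test = np.array([[1.0], [3.0]])

print(mogp_posterior_mean(adjacency, X_train, Y_train, X_test))
```

The Kronecker structure means observations at one vertex inform predictions at neighboring vertices through `B`, while the RBF factor handles similarity between inputs.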