Hongyi Wu

h-index: 22

4 papers

88 citations

Novelty: 54%

AI Score: 29

Ranked #142,813 of 194,257 authors (top 74%); #3,707 in CR (top 55%)

4 Papers

4.3 | DC | Apr 29, 2022

Energy Minimization for Federated Asynchronous Learning on Battery-Powered Mobile Devices via Application Co-running

Cong Wang, Bin Hu, Hongyi Wu

Energy is an essential but often forgotten aspect of large-scale federated systems. As most research focuses on tackling computational and statistical heterogeneity in the machine learning algorithms, the impact on the mobile system remains unclear. In this paper, we design and implement an online optimization framework that connects asynchronous execution of federated training with application co-running to minimize energy consumption on battery-powered mobile devices. From a series of experiments, we find that co-running the training process in the background with foreground applications gives the system a deep energy discount with negligible performance slowdown. Based on these results, we first study an offline problem, assuming all future occurrences of applications are known, and propose a dynamic programming-based algorithm. We then propose an online algorithm that uses the Lyapunov framework to explore the solution space via the energy-staleness trade-off. Extensive experiments demonstrate that the online optimization framework can save over 60% energy while converging 3 times faster than previous schemes.

8.4 | CV | Jan 7, 2025

Superpixel Boundary Correction for Weakly-Supervised Semantic Segmentation on Histopathology Images

Hongyi Wu, Hong Zhang

With the rapid advancement of deep learning, computational pathology has made significant progress in cancer diagnosis and subtyping. Tissue segmentation is a core challenge, essential for prognosis and treatment decisions. Weakly supervised semantic segmentation (WSSS) reduces the annotation requirement by using image-level labels instead of pixel-level ones. However, Class Activation Map (CAM)-based methods still suffer from low spatial resolution and unclear boundaries. To address these issues, we propose a multi-level superpixel correction algorithm that refines CAM boundaries using superpixel clustering and floodfill. Experimental results show that our method achieves an mIoU of 71.08% on a breast cancer segmentation dataset, significantly improving tumor microenvironment boundary delineation.
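As a rough illustration of the superpixel idea in the abstract above, the sketch below snaps a coarse CAM to superpixel boundaries by averaging the CAM score inside each superpixel and thresholding the result. It is a single-level simplification under stated assumptions, not the authors' multi-level algorithm: the function name `refine_cam_with_superpixels`, the 0.5 threshold, and the toy inputs are all hypothetical, and the flood-fill step is omitted.

```python
import numpy as np

def refine_cam_with_superpixels(cam, superpixels, threshold=0.5):
    """Mark a whole superpixel as foreground when its mean CAM score exceeds `threshold`.

    cam:         2D array of class activation scores in [0, 1] (assumed input).
    superpixels: 2D integer array of the same shape, one label per superpixel.
    """
    cam = np.asarray(cam, dtype=float)
    superpixels = np.asarray(superpixels)
    if cam.shape != superpixels.shape:
        raise ValueError("cam and superpixels must have the same shape")

    # Map arbitrary superpixel labels to consecutive indices 0..K-1.
    labels, inverse = np.unique(superpixels.ravel(), return_inverse=True)

    # Mean CAM score per superpixel.
    sums = np.bincount(inverse, weights=cam.ravel(), minlength=labels.size)
    counts = np.bincount(inverse, minlength=labels.size)
    means = sums / counts

    # Broadcast each superpixel's decision back to its pixels.
    return (means[inverse] > threshold).reshape(cam.shape)

# Toy example: pixel (1, 1) scores 0.6 on its own but sits in a low-scoring superpixel.
toy_cam = [[0.9, 0.8, 0.2],
           [0.7, 0.6, 0.1]]
toy_superpixels = [[0, 0, 1],
                   [0, 1, 1]]
toy_mask = refine_cam_with_superpixels(toy_cam, toy_superpixels)
```

In the toy input, thresholding the raw CAM would mark pixel (1, 1) as foreground, whereas the superpixel average assigns it to the background along with the rest of its region. This is the sense in which superpixels can sharpen a blurry CAM boundary.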

22.9 | CR | May 5, 2021

GALA: Greedy ComputAtion for Linear Algebra in Privacy-Preserved Neural Networks

Qiao Zhang, Chunsheng Xin, Hongyi Wu

Machine Learning as a Service (MLaaS) is enabling a wide range of smart applications on end devices. However, privacy-preserved computation is still expensive. Our investigation has found that the most time-consuming component of the homomorphic encryption (HE)-based linear computation is a series of Permutation (Perm) operations that are imperative for dot product and convolution in privacy-preserved MLaaS. To this end, we propose GALA: Greedy computAtion for Linear Algebra in privacy-preserved neural networks, which views the HE-based linear computation as a series of Homomorphic Add, Mult and Perm operations and chooses the least expensive operation in each linear computation step to reduce the overall cost. GALA makes the following contributions: (1) It introduces a row-wise weight matrix encoding and combines the share generation that is needed for the garbled circuit (GC)-based nonlinear computation, to reduce the Perm operations for the dot product; (2) It designs a first-Add-second-Perm approach (named kernel grouping) to reduce Perm operations for convolution. As such, GALA efficiently reduces the cost of the HE-based linear computation, which is a critical building block in almost all of the recent frameworks for privacy-preserved neural networks, including GAZELLE (Usenix Security'18), DELPHI (Usenix Security'20), and CrypTFlow2 (CCS'20). With its deep optimization of the HE-based linear computation, GALA can be a plug-and-play module integrated into these systems to further boost their efficiency. Our experiments show that it achieves a speedup of up to 700x for the dot product and 14x for the convolution computation under different data dimensions. Meanwhile, GALA demonstrates an encouraging runtime speedup of 2.5x, 2.7x, 3.2x, 8.3x, 7.7x, and 7.5x over GAZELLE and 6.5x, 6x, 5.7x, 4.5x, 4.2x, and 4.1x over CrypTFlow2, on AlexNet, VGG, ResNet-18, ResNet-50, ResNet-101, and ResNet-152, respectively.
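To make the role of Perm operations in a packed dot product more concrete, the sketch below simulates a commonly used rotate-and-sum pattern on plaintext vectors and counts the rotations it needs. It is only an illustration of why Perm counts matter, not GALA's row-wise encoding or kernel grouping: no encryption is performed, `np.roll` stands in for a homomorphic Perm, and the function name `rotate_and_sum_dot` is hypothetical.

```python
import numpy as np

def rotate_and_sum_dot(weights, x):
    """Plaintext simulation of a slot-packed dot product.

    Returns (dot_product, number_of_rotations). Each np.roll call models
    one Perm on a packed ciphertext; the vector length must be a power of two.
    """
    w = np.asarray(weights, dtype=np.int64)
    v = np.asarray(x, dtype=np.int64)
    n = v.size
    if w.shape != v.shape or n == 0 or n & (n - 1):
        raise ValueError("inputs must be equal-length vectors whose length is a power of two")

    acc = w * v                 # one slot-wise Mult
    perms = 0
    shift = n // 2
    while shift >= 1:
        # One Perm (rotation) followed by one Add halves the number of partial sums.
        acc = acc + np.roll(acc, -shift)
        perms += 1
        shift //= 2

    # After log2(n) rounds every slot holds the full sum; read slot 0.
    return int(acc[0]), perms

result, perm_count = rotate_and_sum_dot([1, 2, 3, 4], [5, 6, 7, 8])
```

For a length-4 vector the simulation uses 2 rotations, and in general log2(n) per output value. Schemes that reduce or reorder these rotations target exactly this cost.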

4.1 | LG | Nov 12, 2019

CHEETAH: An Ultra-Fast, Approximation-Free, and Privacy-Preserved Neural Network Framework based on Joint Obscure Linear and Nonlinear Computations

Qiao Zhang, Cong Wang, Chunsheng Xin et al.

Machine Learning as a Service (MLaaS) is enabling a wide range of smart applications on end devices. However, such convenience comes with a cost of privacy because users have to upload their private data to the cloud. This research aims to provide effective and efficient MLaaS such that the cloud server learns nothing about user data and the users cannot infer the proprietary model parameters owned by the server. This work makes the following contributions. First, it unveils the fundamental performance bottleneck of existing schemes due to the heavy permutations in computing linear transformation and the use of communication intensive Garbled Circuits for nonlinear transformation. Second, it introduces an ultra-fast secure MLaaS framework, CHEETAH, which features a carefully crafted secret sharing scheme that runs significantly faster than existing schemes without accuracy loss. Third, CHEETAH is evaluated on the benchmark of well-known, practical deep networks such as AlexNet and VGG-16 on the MNIST and ImageNet datasets. The results demonstrate more than 100x speedup over the fastest GAZELLE (Usenix Security'18), 2000x speedup over MiniONN (ACM CCS'17) and five orders of magnitude speedup over CryptoNets (ICML'16). This significant speedup enables a wide range of practical applications based on privacy-preserved deep neural networks.
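For readers unfamiliar with secret sharing, the sketch below shows generic two-party additive sharing over the integers modulo 2^32. It is a textbook construction used here for illustration only and is not CHEETAH's scheme, whose details the abstract does not give; the modulus and the function names `share`, `reconstruct`, and `add_shares` are assumptions.

```python
import secrets

MODULUS = 2 ** 32  # assumed ring size, chosen only for this illustration

def share(value):
    """Split `value` into two additive shares that sum to it modulo MODULUS."""
    first = secrets.randbelow(MODULUS)       # uniformly random mask
    second = (value - first) % MODULUS
    return first, second

def reconstruct(first, second):
    """Recombine two additive shares into the original value."""
    return (first + second) % MODULUS

def add_shares(a, b):
    """Add two shared values locally: each party adds the shares it holds."""
    return (a[0] + b[0]) % MODULUS, (a[1] + b[1]) % MODULUS

x_shares = share(20)
y_shares = share(22)
total = reconstruct(*add_shares(x_shares, y_shares))
```

Each share on its own is uniformly distributed and so reveals nothing about the value, yet sums can be computed on shares without any communication. This is one reason additive sharing is a common building block in secure inference protocols.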