Zhiwei Chang

51.9 SE Mar 23

Rethinking Software Misconfigurations in the Real World: An Empirical Study and Literature Analysis

Yuhao Liu, Yingnan Zhou, Hanfeng Zhang et al.

Software misconfiguration has consistently been a major cause of software failures. Over the past two decades, much work has been done to detect and diagnose software misconfigurations. However, there is still a gap between real-world misconfigurations and the literature. It is desirable to investigate whether existing taxonomies and tools are applicable to real-world misconfigurations in modern software. In this paper, we conduct an empirical study of 772 real-world misconfiguration issues, based on which we propose a novel classification of the root causes of software misconfigurations: constraint violation, resource unavailability, component-dependency error, and configuration semantic misinterpretation. Then, we systematically review the literature on misconfiguration troubleshooting to study research trends and the practicality of the tools and datasets in this field. We find that research targets have shifted from system and infrastructure software to advanced applications (e.g., cloud services). Meanwhile, research on non-crash misconfigurations has also grown significantly. Despite this progress, a majority of studies lack reproducibility because their tools and evaluation datasets are unavailable. In total, only ten tools and four datasets are publicly available. We analyze the trends of existing literature on misconfiguration troubleshooting, summarize the challenges that users face, and highlight suggestions to mitigate and diagnose software misconfigurations. We release the real-world dataset of misconfiguration issues for follow-up research.

CR Jun 22, 2024

Breaking Secure Aggregation: Label Leakage from Aggregated Gradients in Federated Learning

Zhibo Wang, Zhiwei Chang, Jiahui Hu et al.

Federated Learning (FL) exhibits privacy vulnerabilities under gradient inversion attacks (GIAs), which can extract private information from individual gradients. To enhance privacy, FL incorporates Secure Aggregation (SA) to prevent the server from obtaining individual gradients, thus effectively resisting GIAs. In this paper, we propose a stealthy label inference attack to bypass SA and recover individual clients' private labels. Specifically, we conduct a theoretical analysis of label inference from the aggregated gradients, which are all the server obtains once SA is in place. The analysis reveals that the inputs (embeddings) and outputs (logits) of the final fully connected layer (FCL) contribute to gradient disaggregation and label restoration. To preset the embeddings and logits of the FCL, we craft a fishing model by modifying only the parameters of a single batch normalization (BN) layer in the original model. By distributing client-specific fishing models, the server can derive the individual gradients with respect to the bias of the FCL by solving a linear system with the expected embeddings and the aggregated gradients as coefficients. The labels of each client can then be precisely computed from the preset logits and the gradients of the FCL's bias. Extensive experiments show that our attack achieves large-scale label recovery with 100% accuracy on various datasets and model architectures.
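The disaggregation step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation; it only shows the underlying algebra under simplifying assumptions that the abstract does not state: cross-entropy loss, gradients summed (not averaged) over samples and clients, one known preset embedding per client, identical preset logits for every sample, and known per-client batch sizes. The names `simulate_aggregate` and `recover_label_counts` are hypothetical, and the sketch recovers per-class label counts for each client.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def simulate_aggregate(label_counts, embeddings, preset_logits):
    """Build the summed weight gradient of the final FC layer (assumed setup).

    label_counts: (K, C) true per-client label counts
    embeddings:   (K, d) preset embedding fed to the FC layer for client k
    """
    p = softmax(preset_logits)
    n = label_counts.sum(axis=1)  # samples per client
    # Cross-entropy bias gradient summed over a client's samples: n*p - counts
    bias_grads = n[:, None] * p[None, :] - label_counts  # (K, C)
    # Weight gradient is a sum of outer products (bias gradient x embedding)
    return bias_grads.T @ embeddings  # (C, d)

def recover_label_counts(agg_grad_w, embeddings, preset_logits, batch_sizes):
    """Disaggregate per-client bias gradients, then convert them to label counts."""
    # agg_grad_w.T = embeddings.T @ G, where row k of G is client k's bias gradient.
    # Solvable when the K embeddings are linearly independent (needs d >= K).
    bias_grads, *_ = np.linalg.lstsq(embeddings.T, agg_grad_w.T, rcond=None)
    p = softmax(preset_logits)
    counts = batch_sizes[:, None] * p[None, :] - bias_grads
    return np.rint(counts).astype(int)

# Toy run: 3 clients, 4 classes, 8-dimensional embeddings (all values assumed)
rng = np.random.default_rng(0)
true_counts = np.array([[3, 0, 1, 0], [0, 2, 2, 0], [1, 1, 1, 5]])
embeddings = rng.normal(size=(3, 8))
preset_logits = np.zeros(4)

agg = simulate_aggregate(true_counts, embeddings, preset_logits)
recovered = recover_label_counts(agg, embeddings, preset_logits,
                                 true_counts.sum(axis=1))
```

In this toy setting the server only ever sees `agg`, yet the linear solve separates the clients because each one was assigned a distinct embedding. How the fishing model actually enforces those embeddings and logits through the BN layer is outside the scope of this sketch.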


2 Papers