CR Dec 4, 2023
A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly
Yifan Yao, Jinhao Duan, Kaidi Xu et al.
Large Language Models (LLMs), such as ChatGPT and Bard, have revolutionized natural language understanding and generation. They possess deep language comprehension, human-like text generation capabilities, contextual awareness, and robust problem-solving skills, making them invaluable in various domains (e.g., search engines, customer support, translation). Meanwhile, LLMs have also gained traction in the security community, both revealing security vulnerabilities and showcasing their potential in security-related tasks. This paper explores the intersection of LLMs with security and privacy. Specifically, we investigate how LLMs positively impact security and privacy, the potential risks and threats associated with their use, and the inherent vulnerabilities within LLMs. Through a comprehensive literature review, the paper categorizes the surveyed papers into "The Good" (beneficial LLM applications), "The Bad" (offensive applications), and "The Ugly" (vulnerabilities of LLMs and their defenses). We report several findings. For example, LLMs have proven to enhance code security (code vulnerability detection) and data privacy (data confidentiality protection), outperforming traditional methods. However, they can also be harnessed for various attacks (particularly user-level attacks) due to their human-like reasoning abilities. We also identify areas that require further research. For example, research on model and parameter extraction attacks is limited and often theoretical, hindered by LLM parameter scale and confidentiality. Safe instruction tuning, a recent development, requires more exploration. We hope that our work can shed light on the potential of LLMs to both bolster and jeopardize cybersecurity.
SE Jun 14, 2021
No Free Lunch: Microservice Practices Reconsidered in Industry
Qilin Xiang, Xin Peng, Chuan He et al.
Microservice architecture advocates a number of technologies and practices such as lightweight containers, container orchestration, and DevOps, with the promised benefits of faster delivery, improved scalability, and greater autonomy. However, microservice systems implemented in industry vary a lot in terms of adopted practices and achieved benefits, and differ drastically from what is advocated in the literature. In this article, we conduct an empirical study, including an online survey with 51 responses and 14 interviews with experienced microservice experts, to advance our understanding of microservice practices in industry. As a part of our findings, the empirical study clearly revealed three levels of maturity of microservice systems (from basic to advanced): independent development and deployment, high scalability and availability, and service ecosystem, categorized by the fulfilled benefits of microservices. We also identify 11 practical issues that constrain the microservice capabilities of organizations. For each issue, we summarize the practices that have been explored and adopted in industry, along with the remaining challenges. Our study can help practitioners better position their microservice systems and determine which infrastructures and capabilities are worth investing in. Our study can also help researchers better understand industrial microservice practices and identify useful research problems.
SE Mar 8, 2021
On the Lack of Consensus Among Technical Debt Detection Tools
Jason Lefever, Yuanfang Cai, Humberto Cervantes et al.
A vigorous and growing set of technical debt analysis tools has been developed in recent years -- both research tools and industrial products -- such as Structure 101, SonarQube, and DV8. Each of these tools identifies problematic files using its own definitions and measures. But to what extent do these tools agree with each other in terms of the files that they identify as problematic? If the top-ranked files reported by these tools are largely consistent, then we can be confident in using any of these tools. Otherwise, a problem of accuracy arises. In this paper, we report the results of an empirical study analyzing 10 projects using multiple tools. Our results show that: 1) These tools report very different results even for the most common measures, such as size, complexity, file cycles, and package cycles. 2) These tools also differ dramatically in terms of the set of problematic files they identify, since each implements its own definition of "problematic". After normalizing by size, the most problematic file sets that the tools identify barely overlap. 3) Code-based measures, other than size and complexity, do not even moderately correlate with a file's change-proneness or error-proneness. In contrast, co-change-related measures performed better. Our results suggest that, to identify files with true technical debt -- those that experience excessive changes or bugs -- co-change information must be considered. Code-based measures are largely ineffective at pinpointing true debt. Finally, this study reveals the need for the community to create benchmarks and data sets to assess the accuracy of software analysis tools in terms of commonly used measures.
SE Nov 30, 2018
A Longitudinal Study of Identifying and Paying Down Architectural Debt
Maleknaz Nayebi, Yuanfang Cai, Rick Kazman et al.
Architectural debt is a form of technical debt that derives from the gap between the architectural design of the system as it "should be" and the design "as it is". We measured architectural debt in two ways: 1) in terms of system-wide coupling measures, and 2) in terms of the number and severity of architectural flaws. Recent work has shown that the amount of architectural debt has a huge impact on software maintainability and evolution. Consequently, detecting and reducing the debt is expected to make software more amenable to change. This paper reports on a longitudinal study of a healthcare communications product created by Brightsquid Secure Communications Corp. This start-up company faces the typical trade-off of wanting to respond quickly to change requests while avoiding the ever-increasing effort that the accumulation of quick-and-dirty changes eventually incurs. In the first stage of the study, we analyzed the status of the "before" system, which indicated the impacts of change requests. This initial study motivated a more in-depth analysis of architectural debt. The results of this analysis were used to motivate a comprehensive refactoring of the software system. The third phase of the study was a follow-on architectural debt analysis which quantified the improvements made. Using this quantitative evidence, augmented by qualitative evidence gathered from in-depth interviews with Brightsquid's architects, we present lessons learned about the costs and benefits of paying down architectural debt in practice.