49.9 SE May 8
Prompt Engineering Strategies for LLM-based Qualitative Coding of Psychological Safety in Software Engineering Communities: A Controlled Empirical Study
Moaath Alshaikh, Tasneem Alshaher, Ricardo Vieira et al.
Qualitative analysis plays a pivotal role in understanding the human and social aspects of software engineering. However, it remains a demanding process shaped by the subjective interpretation of individual researchers and sensitive to methodological choices such as prompt design. Recent advancements in Large Language Models (LLMs) offer promising opportunities to support this type of analysis, although their reliability in reproducing human qualitative reasoning under varying prompting conditions remains largely untested. This study presents a controlled empirical evaluation of three LLMs -- Claude Haiku, DeepSeek-Chat, and Gemini 2.5 Flash -- across two prompt engineering strategies (zero-shot and multi-shot closed coding), using Cohen's kappa as the primary agreement metric over ten independent runs per configuration. Results suggest that multi-shot prompting significantly improves agreement for Claude Haiku (Delta kappa = +0.034, Wilcoxon p = 0.004) but not for DeepSeek-Chat or Gemini 2.5 Flash. Intra-model stability varies substantially -- DeepSeek-Chat and Claude Haiku exhibit the lowest variance (SD approx. 0.017), while Gemini 2.5 Flash is the least stable (SD = 0.038). A systematic over-prediction of "Sharing Negative Feedback" is identified across all models (bias ratios up to 5.25x), alongside consistent under-prediction of "Expressing Concerns." Collectively, these findings provide empirical evidence for prompt engineering guidelines in LLM-assisted qualitative coding for software engineering research.
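The agreement metric named in the abstract, Cohen's kappa, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy labels, and in particular the definition of the bias ratio (model count of a code divided by human count of the same code) are assumptions made here for illustration, since the abstract does not define them.

```python
from collections import Counter


def cohen_kappa(human, model):
    """Cohen's kappa between two equally long sequences of categorical codes."""
    if len(human) != len(model) or not human:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(human)
    # Observed agreement: fraction of items given the same code by both coders.
    observed = sum(h == m for h, m in zip(human, model)) / n
    # Chance agreement: product of the two coders' marginal label frequencies.
    human_freq = Counter(human)
    model_freq = Counter(model)
    expected = sum(human_freq[c] * model_freq[c] for c in human_freq) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def bias_ratio(human, model, label):
    """ASSUMED definition: how often the model assigns `label` relative to the human coder."""
    human_count = list(human).count(label)
    if human_count == 0:
        raise ValueError("label never used by the human coder")
    return list(model).count(label) / human_count


# Toy data, invented for illustration only (not taken from the study).
human_codes = ["Expressing Concerns", "Expressing Concerns",
               "Sharing Negative Feedback", "Sharing Negative Feedback"]
model_codes = ["Expressing Concerns", "Sharing Negative Feedback",
               "Sharing Negative Feedback", "Sharing Negative Feedback"]

print(cohen_kappa(human_codes, model_codes))
print(bias_ratio(human_codes, model_codes, "Sharing Negative Feedback"))
```

Under this assumed definition, a bias ratio above 1 means the model assigns a code more often than the human coder does, and a ratio below 1 means it assigns the code less often. Repeating the kappa computation once per run would give the per-configuration samples on which a mean and standard deviation could be reported.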
4.4 SE Apr 30
What Characterizes a Software Leader? Identifying Leadership Practices from Practitioners Social Media
Murilo Coelho, Denivan Campos, Mariana Maia Bezerra et al.
Context: Leadership has been extensively studied in management and agile software development; however, prior research predominantly focuses on formal roles and predefined leadership models, offering limited insight into how leadership is experienced and demonstrated by software practitioners in everyday practice. Objective: Our goal is to identify and categorize leadership practices as perceived and reported by software development practitioners based on their professional experiences. Method: We conducted a content analysis of 116 practitioner-authored articles published on the Dev.to online community. Articles were systematically collected, screened, and coded, resulting in the extraction, correlation analysis, and categorization of leadership practices grounded in practitioners' narratives. Results: We identified 103 practices for software project leaders, distinguishing between recommended and discouraged ones. These practices were organized into five categories: People Management & Development, Processes & Execution, Professional & Personal Growth, Communication & Articulation, and Strategic Vision. The most recurrent recommended practices include Cultivating & Practicing Interpersonal Skills, Managing & Delegating Team Work, and Practicing & Developing Managerial Skills, whereas Micromanagement, Counterproductive Work Patterns, and Counterproductive Communication Styles emerged as the most frequent discouraged practices. We organized all practices into a conceptual map. Conclusion: The findings indicate that software leadership is mainly associated with managerial and interpersonal practices rather than technical expertise. The resulting conceptual map summarizes these practices and can serve as a reference for understanding leadership in software development contexts.