Zixuan Feng

h-index: 9

3 papers

6 citations

Novelty: 42%

AI Score: 43

Ranked #56,379 of 194,257 authors (top 29%); #541 in SE (top 18%)

3 Papers

8.1 | SE | May 18 | Code

Restructure This: Using AI to Restructure Onboarding Documents to Reduce Cognitive Overload

Zixuan Feng, Prashant Tandan, Igor Steinmacher et al.

Onboarding documentation is critical for attracting and retaining newcomers in open source software (OSS). However, it is often dense, inconsistently structured, and fragmented, which makes it difficult to understand and creates cognitive overload that leads to frustration, errors, and abandonment. Here, we investigate how Cognitive Theory of Multimedia Learning (CTML) strategies can be used to restructure OSS documentation. We operationalize these strategies through a GenAI-based pipeline in our prototype, VisDoc. VisDoc segments documentation into task-based units, infers workflows, removes redundancy, and generates multimodal explanations. An expert evaluation (N=4) affirmed VisDoc's completeness, accuracy, and adoptability; a between-subjects evaluation (N=14) with newcomers found that VisDoc participants achieved higher task success, had significantly lower cognitive load, and perceived higher usability. The contributions of this work include a CTML-grounded analysis of onboarding challenges, a GenAI-based documentation restructuring pipeline, and empirical evidence that cognitively informed documentation restructuring reduces cognitive load and improves usability and task performance in OSS.

4.3 | SE | Feb 23, 2022 | Code

Implicit Mentoring: The Unacknowledged Developer Efforts in Open Source

Zixuan Feng, Amreeta Chatterjee, Anita Sarma et al.

Mentoring is traditionally viewed as a dyadic, top-down apprenticeship. This perspective, however, overlooks other forms of informal mentoring that take place in everyday activities, in which developers invest time and effort that remain unacknowledged. Here, we investigate the different flavors of mentoring in Open Source Software (OSS) to define and identify implicit mentoring. We first define implicit mentoring--situations where contributors guide others through instructions and suggestions embedded in everyday OSS activities--through formative interviews with OSS contributors, a literature review, and member-checking. Next, through an empirical investigation of Pull Requests (PRs) in 37 Apache projects, we build a classifier to extract implicit mentoring and characterize it through the dual lenses of experience and gender. Our analysis of 107,895 PRs shows that implicit mentoring occurs (27.41% of all PRs include implicit mentoring) and that it does not follow the traditional dyadic, top-down apprenticeship model. When considering the gender of mentor-mentee pairs, we found pervasive homophily--a preference to mentor those who are of the same gender--in 93.81% of cases. In the cross-gender mentoring instances, women were more likely to mentor men.

4.9 | CL | Feb 6, 2025

Hierarchical Contextual Manifold Alignment for Structuring Latent Representations in Large Language Models

Meiquan Dong, Haoran Liu, Yan Huang et al.

The organization of latent token representations plays a crucial role in determining the stability, generalization, and contextual consistency of language models, yet conventional approaches to embedding refinement often rely on parameter modifications that introduce additional computational overhead. A hierarchical alignment method was introduced to restructure token embeddings without altering core model weights, ensuring that representational distributions maintained coherence across different linguistic contexts. Experimental evaluations demonstrated improvements in rare token retrieval, adversarial robustness, and long-range dependency tracking, highlighting the advantages of hierarchical structuring in mitigating inconsistencies in latent space organization. The comparative analysis against conventional fine-tuning and embedding perturbation methods revealed that hierarchical restructuring maintained computational efficiency while achieving measurable gains in representation quality. Structural refinements introduced through the alignment process resulted in improved contextual stability across varied linguistic tasks, reducing inconsistencies in token proximity relationships and enhancing interpretability in language generation. A detailed computational assessment confirmed that the realignment process introduced minimal inference overhead, ensuring that representational improvements did not compromise model efficiency. The findings reinforced the broader significance of structured representation learning, illustrating that hierarchical embedding modifications could serve as an effective strategy for refining latent space distributions while preserving pre-learned semantic associations.