Jin Guo

h-index: 21

Papers: 16

Citations: 514

Novelty: 33%

AI Score: 44

Ranked #46,339 of 194,257 authors (top 24%) · #414 in SE (top 14%)

16 Papers

7.4 · HC · Apr 27 · Code

What If We Work Together? Fostering Reflections on Designer Inclusion in Open Source Software Through Speculative Design

Rozhan Hozhabri Nezhad, Jin L. C. Guo, Jinghui Cheng

Open source software (OSS) often prioritizes technical functionality over usability and UX design. This imbalance limits OSS adoption among broader, non-technical users. Key underlying factors contributing to this issue are the shortage of design expertise in OSS and a dominant developer-centric mindset. To address these persistent issues, we explore the potential of speculative design as a catalyst for transforming the OSS community's mindset towards a more designer-inclusive environment. Our design was informed by an analysis of online forums, which revealed designers' motivations and challenges when contributing to OSS. Guided by these insights, we created two speculative societies, Husia (collectivist) and Reetar (individualist), in which designers are valued for different reasons and their work incorporated in different ways. Through a user study with 12 OSS practitioners (seven designers and five developers), we found that our speculative societies provoked participants' rich and critical reflections on OSS values, the root causes of challenges, and proposed actions. Our work provides insights into how speculative design can be used in the practical, sociotechnical context of OSS to stimulate critical reflection, improve awareness, and yield recommendations for fostering an equitable, sustainable, and inclusive OSS environment.

5.5 · HC · Apr 6, 2023

Approach Intelligent Writing Assistants Usability with Seven Stages of Action

Avinash Bhat, Disha Shrivastava, Jin L. C. Guo · mila

Despite the potential of Large Language Models (LLMs) as writing assistants, they are plagued by issues like coherence and fluency of the model output, trustworthiness, ownership of the generated content, and predictability of model performance, thereby limiting their usability. In this position paper, we propose to adopt Norman's seven stages of action as a framework to approach the interaction design of intelligent writing assistants. We illustrate the framework's applicability to writing tasks by providing an example of software tutorial authoring. The paper also discusses the framework as a tool to synthesize research on the interaction design of LLM-based tools and presents examples of tools that support the stages of action. Finally, we briefly outline the potential of a framework for human-LLM interaction research.

15.5 · SE · Apr 13, 2022

Aspirations and Practice of Model Documentation: Moving the Needle with Nudging and Traceability

Avinash Bhat, Austin Coursey, Grace Hu et al.

The documentation practice for machine-learned (ML) models often falls short of established practices for traditional software, which impedes model accountability and inadvertently abets inappropriate use or misuse of models. Recently, model cards, a proposal for model documentation, have attracted notable attention, but their impact on actual practice is unclear. In this work, we systematically study model documentation in the field and investigate how to encourage more responsible and accountable documentation practice. Our analysis of publicly available model cards reveals a substantial gap between the proposal and the practice. We then design a tool named DocML aiming to (1) nudge data scientists to comply with the model cards proposal during model development, especially the sections related to ethics, and (2) assess and manage the documentation quality. A lab study reveals the benefit of our tool towards long-term documentation quality and accountability.

3.9 · CV · Apr 18, 2023 · Code

GUILGET: GUI Layout GEneration with Transformer

Andrey Sobolevsky, Guillaume-Alexandre Bilodeau, Jinghui Cheng et al.

Sketching out a Graphical User Interface (GUI) layout is part of the pipeline of designing a GUI and a crucial task for the success of a software application. Arranging all components inside a GUI layout manually is a time-consuming task. In order to assist designers, we developed a method named GUILGET to automatically generate GUI layouts from positional constraints represented as GUI arrangement graphs (GUI-AGs). The goal is to support the initial step of GUI design by producing realistic and diverse GUI layouts. The existing image layout generation techniques often cannot incorporate GUI design constraints. Thus, GUILGET needs to adapt existing techniques to generate GUI layouts that obey constraints specific to GUI designs. GUILGET is based on transformers in order to capture the semantics of the relationships between elements of the GUI-AG. Moreover, the model learns constraints through the minimization of losses responsible for placing each component inside its parent layout, for not letting components overlap if they are inside the same parent, and for component alignment. Our experiments, which are conducted on the CLAY dataset, reveal that our model best captures the relationships in the GUI-AG and achieves the best performance on most evaluation metrics. Therefore, our work contributes to improved GUI layout generation by proposing a novel method that effectively accounts for the constraints on GUI elements and paves the way for a more efficient GUI design pipeline.
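The three constraint losses named in the abstract (containment in the parent, no overlap between siblings, alignment) can be illustrated with plain geometry. The sketch below is not GUILGET's implementation: the `Box` type, the penalty definitions, and the unweighted sum in `layout_penalty` are all assumptions chosen for illustration, and a trainable version would use differentiable tensor operations instead.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle (hypothetical type): top-left corner plus size."""
    x: float
    y: float
    w: float
    h: float


def containment_penalty(child: Box, parent: Box) -> float:
    """Total distance by which the child sticks out of its parent (0 if inside)."""
    left = max(0.0, parent.x - child.x)
    top = max(0.0, parent.y - child.y)
    right = max(0.0, (child.x + child.w) - (parent.x + parent.w))
    bottom = max(0.0, (child.y + child.h) - (parent.y + parent.h))
    return left + top + right + bottom


def overlap_penalty(a: Box, b: Box) -> float:
    """Intersection area of two sibling boxes (0 if they do not overlap)."""
    ow = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    oh = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    return ow * oh


def alignment_penalty(siblings: list[Box]) -> float:
    """For each box, distance from its left edge to the nearest sibling's left edge."""
    total = 0.0
    for i, box in enumerate(siblings):
        others = [o.x for j, o in enumerate(siblings) if j != i]
        if others:
            total += min(abs(box.x - ox) for ox in others)
    return total


def layout_penalty(parent: Box, children: list[Box]) -> float:
    """Unweighted sum of the three penalties (the weighting is an assumption)."""
    inside = sum(containment_penalty(c, parent) for c in children)
    overlap = sum(overlap_penalty(a, b) for a, b in combinations(children, 2))
    return inside + overlap + alignment_penalty(children)
```

A layout whose children sit inside the parent, do not intersect, and share a left edge scores zero under this sketch; any violation raises the score, which is the quantity a learning procedure would try to minimize.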

6.8 · HC · Apr 27

Putting a Face to the Issue: Fostering User Empathy of Open Source Software Developers With PersonaFlow

Boniface Bahati Tadjuidje, Jin L. C. Guo, Jinghui Cheng

Open-source software (OSS) developers often struggle to understand and respond to user context, while existing tools, such as issue trackers (for handling bugs, requests, and feedback), largely focus on technical discussion. Although personas could help, limited resources and UX expertise make them hard to scale. We present PersonaFlow, a tool that generates editable user personas from OSS repository artifacts and integrates them alongside issue reports. In a user study with 13 OSS developers, most reported shifts in how they understood users, and more than half modified their responses by adding empathetic language, tailoring explanations, or raising priority ratings. We found two pathways to this change: some connected emotionally to personas as people, while others used them pragmatically for triaging. Both appeared to lead to more user-centered behavior. We contribute design implications for persona-based tools relevant to OSS and other contexts where efficiency-driven systems or workflows obscure valuable human elements.

5.8 · AI · May 9, 2025 · Code

Opening the Scope of Openness in AI

Tamara Paris, AJung Moon, Jin Guo

The concept of openness in AI has so far been heavily inspired by the definition and community practice of open source software. This positions openness in AI as having positive connotations; it introduces assumptions of certain advantages, such as collaborative innovation and transparency. However, the practices and benefits of open source software are not fully transferable to AI, which has its own challenges. Framing a notion of openness tailored to AI is crucial to addressing its growing societal implications, risks, and capabilities. We argue that considering the fundamental scope of openness in different disciplines will broaden discussions, introduce important perspectives, and reflect on what openness in AI should mean. Toward this goal, we qualitatively analyze 98 concepts of openness discovered from topic modeling, through which we develop a taxonomy of openness. Using this taxonomy as an instrument, we situate the current discussion on AI openness, identify gaps and highlight links with other disciplines. Our work contributes to the recent efforts in framing openness in AI by reflecting principles and practices of openness beyond open source software and calls for a more holistic view of openness in terms of actions, system properties, and ethical objectives.

34.9 · HC · Mar 21, 2024

A Design Space for Intelligent and Interactive Writing Assistants

Mina Lee, Katy Ilonka Gero, John Joon Young Chung et al. · allen-ai, deepmind

In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions (i.e., fundamental components of an aspect) and codes (i.e., potential options for each dimension) by systematically reviewing 115 papers. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the envisioning and design of new writing assistants.

6.4 · SE · Aug 10, 2021 · Code

Issue Link Label Recovery and Prediction for Open Source Software

Alexander Nicholson, Jin L. C. Guo

Modern open source software development relies heavily on issue tracking systems to manage feature requests, bug reports, tasks, and other similar artifacts. Together, those "issues" form a complex network with links to each other. The heterogeneous character of issues inherently results in varied link types and therefore poses a great challenge for users to create and maintain the label of the link manually. Most existing automated issue link construction techniques stop at examining whether links exist between issues. In this work, we focus on the next important question of whether we can assess the type of issue link automatically through a data-driven method. We analyze the links between issues and their labels used in the issue tracking systems of 66 open source projects. Using three projects, we demonstrate promising results when using supervised machine learning classification for the task of link label recovery with careful model selection and tuning, achieving F1 scores between 0.56 and 0.70 for the three studied projects. Further, the performance of our method for future link label prediction is convincing when there is sufficient historical data. Our work represents the first step toward systematically managing and maintaining the issue links encountered in practice.

8.9 · SE · Jul 13, 2020

How Do Open Source Software Contributors Perceive and Address Usability? Valued Factors, Practices, and Challenges

Wenting Wang, Jinghui Cheng, Jin L. C. Guo

Usability is an increasing concern in open source software (OSS). Given the recent changes in the OSS landscape, it is imperative to examine the OSS contributors' current valued factors, practices, and challenges concerning usability. We accumulated this knowledge through a survey with a wide range of contributors to OSS applications. Through analyzing 84 survey responses, we found that many participants recognized the importance of usability. While most relied on issue tracking systems to collect user feedback, a few participants also adopted typical user-centered design methods. However, most participants demonstrated a system-centric rather than a user-centric view. Understanding the diverse needs and consolidating various feedback of end-users posed unique challenges for the OSS contributors when addressing usability in the most recent development context. Our work provided important insights for OSS practitioners and tool designers in exploring ways for promoting a user-centric mindset and improving usability practice in the current OSS communities.

16.5 · SE · Mar 13, 2019 · Code

Activity-Based Analysis of Open Source Software Contributors: Roles and Dynamics

Jinghui Cheng, Jin L. C. Guo

Contributors to open source software (OSS) communities assume diverse roles to take different responsibilities. One major limitation of the current OSS tools and platforms is that they provide a uniform user interface regardless of the activities performed by the various types of contributors. This paper serves as a non-trivial first step towards resolving this challenge by demonstrating a methodology and establishing knowledge to understand how the contributors' roles and their dynamics, reflected in the activities contributors perform, are exhibited in OSS communities. Based on an analysis of user action data from 29 GitHub projects, we extracted six activities that distinguished four Active roles and five Supporting roles of OSS contributors, as well as patterns in role changes. Through the lens of the Activity Theory, these findings provided rich design guidelines for OSS tools to support diverse contributor roles.

20.7 · SE · Feb 19, 2019 · Code

Analysis and Detection of Information Types of Open Source Software Issue Discussions

Deeksha Arya, Wenting Wang, Jin L. C. Guo et al.

Most modern Issue Tracking Systems (ITSs) for open source software (OSS) projects allow users to add comments to issues. Over time, these comments accumulate into discussion threads embedded with rich information about the software project, which can potentially satisfy the diverse needs of OSS stakeholders. However, discovering and retrieving relevant information from the discussion threads is a challenging task, especially when the discussions are lengthy and the number of issues in ITSs is vast. In this paper, we address this challenge by identifying the information types presented in OSS issue discussions. Through qualitative content analysis of 15 complex issue threads across three projects hosted on GitHub, we uncovered 16 information types and created a labeled corpus containing 4656 sentences. Our investigation of supervised, automated classification techniques indicated that, when prior knowledge about the issue is available, Random Forest can effectively detect most sentence types using conversational features such as the sentence length and its position. When classifying sentences from new issues, Logistic Regression can yield satisfactory performance using textual features for certain information types, while falling short on others. Our work represents a nontrivial first step towards tools and techniques for identifying and obtaining the rich information recorded in the ITSs to support various software engineering activities and to satisfy the diverse needs of OSS stakeholders.

20.8 · SE · Apr 6, 2018

Traceability in the Wild: Automatically Augmenting Incomplete Trace Links

Michael Rath, Jacob Rendall, Jin L. C. Guo et al.

Software and systems traceability is widely accepted as an essential element for supporting many software development tasks. Today's version control systems provide inbuilt features that allow developers to tag each commit with one or more issue IDs, thereby providing the building blocks from which project-wide traceability can be established between feature requests, bug fixes, commits, source code, and specific developers. However, our analysis of six open source projects showed that on average only 60% of the commits were linked to specific issues. Without these fundamental links the entire set of project-wide links will be incomplete, and therefore not trustworthy. In this paper we address the fundamental problem of missing links between commits and issues. Our approach leverages a combination of process and text-related features characterizing issues and code changes to train a classifier to identify missing issue tags in commit messages, thereby generating the missing links. We conducted a series of experiments to evaluate our approach against six open source projects and showed that it was able to effectively recommend links for tagging issues at an average of 96% recall and 33% precision. In a related task for augmenting a set of existing trace links, the classifier returned precision at levels greater than 89% in all projects and recall of 50%.
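The two measurements quoted in this abstract, the share of commits tagged with an issue ID and the precision and recall of recommended links, are simple to compute once commit messages are available. The sketch below is illustrative only and is not the paper's classifier: it assumes issue IDs follow a Jira-style key pattern such as `PROJ-123`, and the names `ISSUE_KEY`, `linked_issue_ids`, `link_coverage`, and `precision_recall` are hypothetical.

```python
import re

# Assumption: issue IDs look like Jira keys, e.g. "GROOVY-101".
# Other trackers (e.g. "#123" on GitHub) would need a different pattern,
# and strings such as "UTF-8" would be matched as false positives.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def linked_issue_ids(message: str) -> set[str]:
    """Return the set of issue keys mentioned in one commit message."""
    return set(ISSUE_KEY.findall(message))


def link_coverage(messages: list[str]) -> float:
    """Fraction of commit messages that mention at least one issue key."""
    if not messages:
        return 0.0
    linked = sum(1 for m in messages if linked_issue_ids(m))
    return linked / len(messages)


def precision_recall(
    recommended: set[tuple[str, str]],
    true_links: set[tuple[str, str]],
) -> tuple[float, float]:
    """Precision and recall of recommended (commit, issue) pairs."""
    true_positives = len(recommended & true_links)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(true_links) if true_links else 0.0
    return precision, recall
```

High recall paired with low precision, as in the first reported result, means the recommender recovers nearly every missing link but also proposes many incorrect ones, so a developer would still need to confirm each suggestion.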

6.4 · SE · Oct 25, 2021

Generating GitHub Repository Descriptions: A Comparison of Manual and Automated Approaches

Jazlyn Hellman, Eunbee Jang, Christoph Treude et al.

Given the vast number of repositories hosted on GitHub, project discovery and retrieval have become increasingly important for GitHub users. Repository descriptions serve as one of the first points of contact for users who are accessing a repository. However, repository owners often fail to provide a high-quality description; instead, they use vague terms, the purpose of the repository is poorly explained, or the description is omitted entirely. In this work, we examine the current practice of writing GitHub repository descriptions. Our investigation leads to the proposal of the LSP (Language, Software technology, and Purpose) template to formulate good descriptions for GitHub repositories that are clear, concise, and informative. To understand the extent to which current automated techniques can support generating repository descriptions, we compare the performance of state-of-the-art text summarization methods on this task. Finally, our user study with GitHub users reveals that automated summarization can adequately be used for default description generation for GitHub repositories, while the descriptions which follow the LSP template offer the most effective instrument for communicating with GitHub users.

3.6 · SE · Apr 13, 2021

Science-Software Linkage: The Challenges of Traceability between Scientific Knowledge and Software Artifacts

Hideaki Hata, Jin L. C. Guo, Raula Gaikovina Kula et al.

Although computer science papers are often accompanied by software artifacts, connecting research papers to their software artifacts and vice versa is not always trivial. First of all, there is a lack of well-accepted standards for how such links should be provided. Furthermore, the provided links, if any, often become outdated: they are affected by link rot when pre-prints are removed, when repositories are migrated, or when papers and repositories evolve independently. In this paper, we summarize the state of the practice of linking research papers and associated source code, highlighting the recent efforts towards creating and maintaining such links. We also report on the results of several empirical studies focusing on the relationship between scientific papers and associated software artifacts, and we outline challenges related to traceability and opportunities for overcoming these challenges.

4.2 · LG · Jul 2, 2020

Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs

Kian Ahrabian, Daniel Tarlow, Hehuimin Cheng et al.

We present a multi-relational temporal Knowledge Graph based on the daily interactions between artifacts in GitHub, one of the largest social coding platforms. Such representation enables posing many user-activity and project management questions as link prediction and time queries over the knowledge graph. In particular, we introduce two new datasets for i) interpolated time-conditioned link prediction and ii) extrapolated time-conditioned link/time prediction queries, each with distinguished properties. Our experiments on these datasets highlight the potential of adapting knowledge graphs to answer broad software engineering questions. Meanwhile, they also reveal the unsatisfactory performance of existing temporal models on extrapolated queries and time prediction queries in general. To overcome these shortcomings, we introduce an extension to current temporal models using relative temporal information with regard to past events.

2.7 · SE · Aug 15, 2018

Domain Knowledge Discovery Guided by Software Trace Links

Jin L. C. Guo, Natawut Monaikul, Jane Cleland-Huang

Software-intensive projects are specified and modeled using domain terminology. Knowledge of the domain terminology is necessary for performing many Software Engineering tasks such as impact analysis, compliance verification, and safety certification. However, discovering domain terms and reasoning about their interrelationships for highly technical software and system engineering domains is a complex task which requires significant domain expertise and human effort. In this paper, we present a novel approach for leveraging trace links in software-intensive systems to guide the process of mining facts that contain domain knowledge. The trace links, which drive our mining process, define relationships between artifacts such as regulations and requirements and enable a guided search through high-yield combinations of domain terms. Our proof-of-concept evaluation shows that our approach aids in the discovery of domain facts even in highly complex technical domains. These domain facts can provide support for a variety of Software Engineering activities. As a use case, we demonstrate how the mined facts can facilitate the task of project Q&A.