AI · Sep 30, 2025
TVS Sidekick: Challenges and Practical Insights from Deploying Large Language Models in the Enterprise
Paula Reyero Lobo, Kevin Johnson, Bill Buchanan et al.
Many enterprises are increasingly adopting Artificial Intelligence (AI) to make internal processes more competitive and efficient. In response to public concern and new regulations for the ethical and responsible use of AI, implementing AI governance frameworks could help to integrate AI within organisations and mitigate associated risks. However, rapid technological advances and the lack of shared ethical AI infrastructures create barriers to their practical adoption in businesses. This paper presents a real-world AI application at TVS Supply Chain Solutions, reporting on the experience of developing an AI assistant underpinned by large language models and on the ethical, regulatory, and sociotechnical challenges of deploying it for enterprise use.
SE · Jun 21, 2021
Towards a corpus for credibility assessment in software practitioner blog articles
Ashley Williams, Matthew Shardlow, Austen Rainer
Blogs are a source of grey literature that is widely adopted by software practitioners for disseminating opinion and experience. Analysing such articles can provide useful insights into the state-of-practice for software engineering research. However, there are challenges in identifying higher quality content from the large quantity of articles available. Credibility assessment can help in identifying quality content, though there is a lack of existing corpora. Credibility is typically measured through a series of conceptual criteria, with 'argumentation' and 'evidence' being two important criteria. We create a corpus labelled for argumentation and evidence that can aid the credibility community. The corpus consists of articles from the blog of a single software practitioner and is publicly available. Three annotators label the corpus with a series of conceptual credibility criteria, reaching an agreement of 0.82 (Fleiss' Kappa). We present preliminary analysis of the corpus by using it to investigate the identification of claim sentences (one of our ten labels). We train four systems (BERT, KNN, Decision Tree and SVM) using three feature sets (Bag of Words, Topic Modelling and InferSent), achieving an F1 score of 0.64 using InferSent and a Linear SVM. Our preliminary results are promising, indicating that the corpus can help future studies in detecting the credibility of grey literature. Future research will investigate the degree to which the sentence level annotations can infer the credibility of the overall document.
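For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of how Fleiss' Kappa can be computed for a fixed number of annotators per item. It is not the authors' code: the function name `fleiss_kappa`, the label strings, and the toy data are assumptions made purely for illustration, and the corpus itself is not reproduced here.

```python
from collections import Counter
from typing import Hashable, Sequence


def fleiss_kappa(ratings: Sequence[Sequence[Hashable]]) -> float:
    """Fleiss' kappa for items that each received the same number of ratings.

    ratings[i] holds the labels given to item i, one label per annotator.
    """
    if not ratings:
        raise ValueError("need at least one item")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of annotator pairs that agree on this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += agreeing / (n_raters * (n_raters - 1))

    observed = agreement_sum / len(ratings)
    total = len(ratings) * n_raters
    # Agreement expected by chance, from the overall label proportions.
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: four sentences, three annotators each.
labels = [
    ["claim", "claim", "claim"],
    ["none", "none", "none"],
    ["claim", "claim", "none"],
    ["none", "none", "none"],
]
print(round(fleiss_kappa(labels), 3))
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance, which is the scale on which the reported 0.82 should be read.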
SE · Mar 2, 2021
Practitioner-generated blog posts as evidence for software engineering research: attitudinal survey and preliminary checklist
Austen Rainer, Ashley Williams
Background: Blog posts are frequently used by software practitioners to share information about their practice. Blog posts therefore provide a potential source of evidence for software engineering (SE) research. The use of blog posts as evidence for research appears contentious amongst some SE researchers. Objective: To better understand the actual and perceived value of blog posts as evidence for SE research, and to develop guidance for SE researchers on the use of blog posts as evidence. Method: We further analyse responses from a previously conducted attitudinal survey of 44 software engineering researchers. We conduct a heatmap analysis, simple statistical analysis, and a thematic analysis. Results: We find no clear consensus from respondents on researchers' attitudes to the credibility of blog posts, or on a standard set of criteria to evaluate blog-post credibility. We show that some of the responses to the survey exhibit characteristics similar to the content of blog posts, e.g., asserting prior beliefs as claims, with no citations and little supporting rationale. We illustrate our insights with ~60 qualitative examples from the survey (~40% of the total responses). We complement our quantitative and qualitative analyses with preliminary checklists to guide SE researchers. Conclusion: Blog posts are relevant to research because they are written by software practitioners describing their practice and experience. But evaluating the credibility of blog posts, so as to select the higher-quality content, remains an ongoing challenge. The quantitative and qualitative results, with the proposed checklists, are intended to stimulate reflection and action in the research community on the role of blog posts as evidence in software engineering research. Finally, our findings on researchers' attitudes to blog posts also provide more general insights into researchers' values for SE research.
SE · Sep 30, 2020
Retrieving and mining professional experience of software practice from grey literature: an exploratory review
Austen Rainer, Ashley Williams, Vahid Garousi et al.
Background: Retrieving and mining practitioners' self-reports of their professional experience of software practice could provide valuable evidence for research. We are, however, unaware of any existing reviews of research conducted in this area. Objective: To review and classify previous research, and to identify insights into the challenges research confronts when retrieving and mining practitioners' self-reports of their experience of software practice. Method: We conduct an exploratory review to identify and classify 42 articles. We analyse a selection of those articles for insights on challenges to mining professional experience. Results: We identify only one directly relevant article. Even then, this article concerns the software professional's emotional experiences rather than the professional's reporting of behaviour and events occurring during software practice. We discuss challenges concerning: the prevalence of professional experience; definitions, models and theories; the sparseness of data; units of discourse analysis; annotator agreement; evaluation of the performance of algorithms; and the lack of replications. Conclusion: No directly relevant prior research appears to have been conducted in this area. We discuss the value of reporting negative results in secondary studies. There is a range of research opportunities but also considerable challenges. We formulate a set of guiding questions for further research in this area.