CL · May 5, 2022
Balancing Multi-Domain Corpora Learning for Open-Domain Response Generation · Yujie Xing, Jinglun Cai, Nils Barlaug et al. · amazon-science
Open-domain conversational systems are assumed to generate equally good responses on multiple domains. Previous work achieved good performance on a single corpus, but training and evaluating on multiple corpora from different domains has been less studied. This paper explores methods of generating relevant responses for each of multiple multi-domain corpora. We first examine interleaved learning, which intermingles multiple corpora, as the baseline. We then investigate two multi-domain learning methods, labeled learning and multi-task labeled learning, which encode each corpus through a unique corpus embedding. Furthermore, we propose Domain-specific Frequency (DF), a novel word-level importance weight that measures the relative importance of a word for a specific corpus compared to other corpora. Based on DF, we propose weighted learning, a method that integrates DF into the loss function. We also adopt DF as a new evaluation metric. Extensive experiments show that our methods gain significant improvements on both automatic and human evaluation. We share our code and data for reproducibility.
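As a rough illustration of what a word-level, corpus-relative importance weight can look like, the sketch below divides a word's relative frequency in one corpus by the sum of its relative frequencies across all corpora. This formula is an assumption made for illustration only; the abstract does not give the definition of DF, and the function names and toy corpora here are hypothetical.

```python
from collections import Counter


def relative_frequencies(corpus):
    """Map each word to its share of all tokens in the corpus."""
    counts = Counter(word for sentence in corpus for word in sentence.split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def domain_weight(word, target, corpora):
    """Hypothetical weight in [0, 1], NOT the paper's DF definition.

    It is the share of the word's summed relative frequency
    that falls in the target corpus.
    """
    freqs = {
        name: relative_frequencies(corpus).get(word, 0.0)
        for name, corpus in corpora.items()
    }
    total = sum(freqs.values())
    return freqs[target] / total if total > 0 else 0.0


# Toy corpora, invented for this example.
corpora = {
    "tech": ["restart the server", "the server is down"],
    "chat": ["how is the weather", "the weather is nice"],
}

print(domain_weight("server", "tech", corpora))  # word seen only in "tech"
print(domain_weight("the", "tech", corpora))     # word shared by both corpora
```

Under this assumed formula, a word that occurs in only one corpus gets the maximum weight for that corpus, while a word spread evenly across corpora gets a weight near the reciprocal of the number of corpora.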
CL · Nov 9, 2022
Evaluating and Improving Context Attention Distribution on Multi-Turn Response Generation using Self-Contained Distractions · Yujie Xing, Jon Atle Gulla
Despite the rapid progress of open-domain generation-based conversational agents, most deployed systems treat dialogue contexts as single turns, while systems dealing with multi-turn contexts are less studied. There is no reliable metric for evaluating multi-turn modelling, nor an effective solution for improving it. In this paper, we focus on an essential component of multi-turn generation-based conversational agents: context attention distribution, i.e. how systems distribute their attention over the dialogue context. To evaluate this component, we introduce a novel attention-mechanism-based metric: DAS ratio. To improve performance on this component, we propose an optimization strategy that employs self-contained distractions. Our experiments on the Ubuntu chatlogs dataset show that models with comparable perplexity can be distinguished by their ability to distribute attention over the context. Our proposed optimization strategy improves both non-hierarchical and hierarchical models on the proposed metric by about 10% from baselines.
CL · Dec 12, 2024
The Impact of Copyrighted Material on Large Language Models: A Norwegian Perspective · Javier de la Rosa, Vladislav Mikhailov, Lemei Zhang et al.
The use of copyrighted materials in training language models raises critical legal and ethical questions. This paper presents a framework for and the results of empirically assessing the impact of publisher-controlled copyrighted corpora on the performance of generative large language models (LLMs) for Norwegian. When evaluated on a diverse set of tasks, we found that adding both books and newspapers to the data mixture of LLMs tends to improve their performance, while the addition of fiction works seems to be detrimental. Our experiments could inform the creation of a compensation scheme for authors whose works contribute to AI development.
CL · Dec 3, 2023
NLEBench+NorGLM: A Comprehensive Empirical Analysis and Benchmark Dataset for Generative Language Models in Norwegian · Peng Liu, Lemei Zhang, Terje Farup et al.
Norwegian, spoken by a population of only 5 million, is underrepresented in the most impressive breakthroughs in NLP tasks. To the best of our knowledge, at the time of writing this article there has not yet been a comprehensive evaluation of the existing language models (LMs) on Norwegian generation tasks. To fill this gap, we 1) compiled the existing Norwegian dataset and pre-trained 4 Norwegian Open Language Models varying in parameter scale and architecture, collectively called NorGLM; 2) introduced a comprehensive benchmark, NLEBench, for evaluating natural language generation capabilities in Norwegian, encompassing translation and human annotation. Based on the investigation, we find that: 1) the mainstream, English-dominated LM GPT-3.5 has limited capability in understanding the Norwegian context; 2) increasing model parameter scale has limited impact on the performance of downstream tasks when the pre-training dataset is constrained in size; 3) smaller models also demonstrate reasoning capability through Chain-of-Thought; 4) a multi-task dataset that includes synergy tasks can be used to verify the generalizability of LLMs on natural language understanding and, meanwhile, test the interconnectedness of these NLP tasks. We share our resources and code for reproducibility under a CC BY-NC 4.0 license.
DB · Oct 21, 2020
Neural Networks for Entity Matching: A Survey · Nils Barlaug, Jon Atle Gulla
Entity matching is the problem of identifying which records refer to the same real-world entity. It has been actively researched for decades, and a variety of different approaches have been developed. Even today, it remains a challenging problem, and there is still considerable room for improvement. In recent years we have seen new methods based upon deep learning techniques for natural language processing emerge. In this survey, we present how neural networks have been used for entity matching. Specifically, we identify which steps of the entity matching process existing work has targeted using neural networks, and provide an overview of the different techniques used at each step. We also discuss contributions from deep learning in entity matching compared to traditional methods, and propose a taxonomy of deep neural networks for entity matching.
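To make the problem statement concrete, here is a minimal sketch of pairwise entity matching. It deliberately uses a classical string-similarity score from Python's standard library rather than a neural network, so it illustrates the task, not any method from the survey. The function names, the threshold value, and the sample records are assumptions chosen for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_entities(records, threshold=0.8):
    """Return index pairs of records judged to refer to the same entity.

    The threshold is an arbitrary illustrative value; a real system
    would tune it or replace the score with a learned model.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        if similarity(a, b) >= threshold:
            pairs.append((i, j))
    return pairs


# Toy records, invented for this example.
records = ["Apple Inc.", "apple inc", "Microsoft Corporation"]
print(match_entities(records))
```

Comparing every pair of records scales quadratically with the number of records, which is acceptable for a toy example but not for large datasets.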
CY · Aug 14, 2018
Evaluation of team dynamic in Norwegian projects for IT students · Salah Uddin Ahmed, Ingrid Sundbø, Jon Kvisli et al.
The need for teaching realistic software development in project courses has increased on a global scale. There have always been challenges in incorporating fast-changing software technologies, development methodologies and teamwork. Moreover, such project courses need to be designed in connection with existing theoretical courses. We conducted a large-scale study of student performance in Software Engineering projects at Norwegian universities. This paper investigates four aspects of team dynamics, namely team reflection, leadership, decision making and task assignment, in order to improve student learning. Data was collected from student projects over 4 years at two universities. We found that some leader characteristics are perceived differently for female and male leaders, including the perception of leaders as skilful workers or visionaries. Leadership is still a challenging aspect to teach, and assigned leadership is probably not the best way to learn. Students perform well in task review but need support in task assignment. The results also suggest that task management should be done at a more fine-grained level. It is also important to maintain an open and active discussion to facilitate effective group decision making.
SE · Sep 22, 2017
Female Leadership in Software Projects: A Preliminary Result on Leadership Style and Project Context Factors · Anh Nguyen-Duc, Soudabeh Khodambashi, Jon Atle Gulla et al.
Women have been shown to be effective leaders in many team-based situations. However, it is also well recognized that women are underrepresented in engineering and technology areas, which leads to wasted efforts and a lack of diversity in professional organizations. Although studies about gender and leadership are rich, research focusing on engineering-specific activities is scarce. To address this gap, we explored the experience of female leaders of software development projects and possible context factors that influence leadership effectiveness. The study was conducted as a longitudinal multiple case study. Data was collected from surveys, interviews, observation and project reports. In this work, we report some preliminary findings related to leadership style, team perception of leadership and team-task context factors. We found a strong correlation between perceived team leadership and task management. We also observed a potential association between a human-oriented leading approach in low customer involvement scenarios and a task-oriented leading approach in high customer involvement situations.