Maria Mitrofan

CL
h-index: 10
Papers: 3
Citations: 586
Novelty: 10%
AI Score: 21

3 Papers

CL, Jun 16, 2022
An Open-Domain QA System for e-Governance

Radu Ion, Andrei-Marius Avram, Vasile Păiş et al.

The paper presents an open-domain Question Answering (QA) system for Romanian that answers COVID-19-related questions. The QA pipeline comprises automatic question processing, automatic query generation, a web search for the 10 most relevant documents, and answer extraction with a BERT model fine-tuned for extractive QA on a COVID-19 data set that we created manually. The paper describes the QA system and its integration with the Romanian language technologies portal RELATE, the COVID-19 data set, and several evaluations of QA performance.
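The four pipeline stages named in the abstract can be sketched as below. This is a minimal illustration, not the authors' implementation: every function name, the English stopword list, and the `search` and `extract` callables are hypothetical stand-ins. In the described system, `search` would be a web search and `extract` a fine-tuned BERT extractive QA model for Romanian.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical English stopword list, for illustration only;
# the system in the abstract processes Romanian questions.
STOPWORDS = {"what", "is", "the", "a", "of", "for", "are", "how", "in"}

Answer = Tuple[str, float]  # (answer text, confidence score)


def process_question(question: str) -> List[str]:
    """Stage 1: lowercase, tokenize, and drop stopwords."""
    tokens = re.findall(r"\w+", question.lower())
    return [t for t in tokens if t not in STOPWORDS]


def generate_query(keywords: List[str]) -> str:
    """Stage 2: build a search query from the remaining keywords."""
    return " ".join(keywords)


def answer_question(
    question: str,
    search: Callable[[str], List[str]],         # assumed: query -> ranked documents
    extract: Callable[[str, str], Answer],      # assumed: (question, document) -> answer
    top_k: int = 10,
) -> Answer:
    """Stages 3 and 4: keep the top_k documents, extract one candidate
    answer per document, and return the highest-scoring non-empty one."""
    query = generate_query(process_question(question))
    documents = search(query)[:top_k]
    candidates = [extract(question, doc) for doc in documents]
    candidates = [c for c in candidates if c[0]]
    if not candidates:
        return ("", 0.0)
    return max(candidates, key=lambda c: c[1])
```

Passing the searcher and the extractor as callables keeps the sketch independent of any particular search engine or model library, so either stage can be replaced without touching the rest.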

CL, Oct 29, 2024
RELATE: A Modern Processing Platform for Romanian Language

Vasile Păiş, Radu Ion, Andrei-Marius Avram et al.

This paper presents the design and evolution of the RELATE platform, a high-performance environment for natural language processing built specifically for the Romanian language. Initially developed for text processing, it has recently been updated to integrate audio processing tools. Technical details are provided for the core components. We further present usage scenarios drawn from actual use in national and international research projects, demonstrating that RELATE is a mature, modern, state-of-the-art platform for processing Romanian language corpora. Finally, we present very recent developments, including bimodal (text and audio) features available within the platform.

CL, Nov 22, 2021
Human-Machine Interaction Speech Corpus from the ROBIN project

Vasile Păiş, Radu Ion, Andrei-Marius Avram et al.

This paper introduces a new Romanian speech corpus from the ROBIN project, called the ROBIN Technical Acquisition Speech Corpus (ROBINTASC). Its main purpose was to improve the behaviour of a conversational agent that supports human-machine interaction in the context of purchasing technical equipment. The paper contains a detailed description of the acquisition process, corpus statistics, and an evaluation of the corpus's influence on a low-latency ASR system and on a dialogue component.