P Baudis

h-index: 8

Papers: 5

Citations: 76

Novelty: 28%

AI Score: 24

Ranked #171,085 of 194,257 authors (top 88%); #10,464 in AI (top 83%)

5 Papers

8.5 | CL | Mar 19, 2016 | Code

Sentence Pair Scoring: Towards Unified Framework for Text Comprehension

Petr Baudiš, Jan Pichl, Tomáš Vyskočil et al.

We review the task of Sentence Pair Scoring, popular in the literature in various forms: Answer Sentence Selection, Semantic Text Scoring, Next Utterance Ranking, Recognizing Textual Entailment, Paraphrasing, or as a component of Memory Networks. We argue that all such tasks are similar from the model perspective and propose new baselines by comparing the performance of common IR metrics and popular convolutional, recurrent and attention-based neural models across many Sentence Pair Scoring tasks and datasets. We discuss the problem of evaluating randomized models, propose a statistically grounded methodology, and attempt to improve comparisons by releasing new datasets that are much harder than some of the currently used, well-explored benchmarks. We introduce a unified open source software framework with easily pluggable models and tasks, which enables us to experiment with multi-task reusability of trained sentence models. We set a new state of the art in performance on the Ubuntu Dialogue dataset.
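The point about evaluating randomized models can be illustrated with a small sketch: train the same model several times with different seeds and report the mean score with a confidence interval rather than a single run. The snippet below is a minimal illustration under stated assumptions, not the paper's own code: the helper name `mean_with_ci`, the choice of a 95% Student's t interval, and the example scores are all hypothetical.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_95 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
    11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131,
}


def mean_with_ci(scores):
    """Return (mean, half-width of the 95% t interval) for repeated runs."""
    n = len(scores)
    if not 2 <= n <= 16:
        raise ValueError("this sketch supports between 2 and 16 runs")
    mean = statistics.mean(scores)
    # Standard error of the mean, using the sample standard deviation.
    sem = statistics.stdev(scores) / math.sqrt(n)
    return mean, T_CRIT_95[n - 1] * sem


# Hypothetical scores from five training runs with different random seeds.
runs = [0.712, 0.698, 0.731, 0.705, 0.720]
mean, half_width = mean_with_ci(runs)
print(f"score = {mean:.3f} +/- {half_width:.3f} (95% CI, n={len(runs)})")
```

Two models whose intervals overlap heavily should not be ranked on the basis of one lucky run each.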

11.9 | CL | May 16, 2016 | Code

Joint Learning of Sentence Embeddings for Relevance and Entailment

Petr Baudis, Silvestr Stanko, Jan Sedivy

We consider the problem of Recognizing Textual Entailment within an Information Retrieval context, where we must simultaneously determine the relevance as well as the degree of entailment for individual pieces of evidence to determine a yes/no answer to a binary natural language question. We compare several variants of neural networks for sentence embeddings in a setting of decision-making based on evidence of varying relevance. We propose a basic model to integrate evidence for entailment, show that joint training of the sentence embeddings to model relevance and entailment is feasible even with no explicit per-evidence supervision, and show the importance of evaluating strong baselines. We also demonstrate the benefit of carrying over a text comprehension model trained on an unrelated task for our small datasets. Our research is motivated primarily by a new open dataset we introduce, consisting of binary questions and news-based evidence snippets. We also apply the proposed relevance-entailment model to a similar task of ranking multiple-choice test answers, evaluating it on a preliminary dataset of school test questions as well as the standard MCTest dataset, where we improve the neural model state of the art.
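One simple way to combine evidence of varying relevance into a single yes/no answer is sketched below. It is an illustrative assumption, not necessarily the model proposed in the paper: each evidence snippet is assumed to come with a raw relevance score and a raw entailment score (positive meaning "supports yes"), relevance is turned into weights with a softmax, and the weighted entailment sum is squashed with a sigmoid. The function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of raw scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def answer_probability(relevance, entailment):
    """Probability of a 'yes' answer from per-evidence scores.

    relevance:  raw relevance score for each evidence snippet
    entailment: raw entailment score for each snippet (positive = yes)
    """
    if not relevance or len(relevance) != len(entailment):
        raise ValueError("need one relevance and one entailment score per snippet")
    weights = softmax(relevance)
    # Relevant snippets dominate the pooled entailment score.
    pooled = sum(w * e for w, e in zip(weights, entailment))
    return 1.0 / (1.0 + math.exp(-pooled))


# A highly relevant supporting snippet outweighs an irrelevant contradicting one.
print(answer_probability([10.0, -10.0], [3.0, -3.0]))
```

Because only the final answer enters the loss in such a setup, the relevance weights can in principle be learned without per-evidence labels, which matches the "no explicit per-evidence supervision" setting described above.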

4.0 | SE | May 14, 2014

Current Concepts in Version Control Systems

Petr Baudiš

We give the reader a comprehensive overview of the state of the Version Control software engineering field, describing and analysing the concepts, architectural approaches and methods researched and included in the currently widely used version control systems, and propose some possible future research directions.

5.4 | AI | May 14, 2014 | Code

COCOpf: An Algorithm Portfolio Framework

Petr Baudiš

Algorithm portfolios represent a strategy of composing multiple heuristic algorithms, each suited to a different class of problems, within a single general solver that will choose the best suited algorithm for each input. This approach recently gained popularity especially for solving combinatorial problems, but optimization applications are still emerging. The COCO platform of the BBOB workshop series is the current standard way to measure the performance of continuous black-box optimization algorithms. As an extension to the COCO platform, we present the Python-based COCOpf framework that allows composing portfolios of optimization algorithms and running experiments with different selection strategies. In our framework, we focus on black-box algorithm portfolios and online adaptive selection. As a demonstration, we measure the performance of stock SciPy optimization algorithms and the popular CMA algorithm alone and in a portfolio with two simple selection strategies. We confirm that even a naive selection strategy can provide improved performance across problem classes.
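The portfolio idea can be sketched with a naive round-robin selection strategy. The code below is a toy illustration and does not use the COCOpf or COCO APIs: the two solvers are hypothetical stand-ins for real optimizers such as the SciPy or CMA algorithms, written as generators that yield one function evaluation per step, and `round_robin_portfolio` is an assumed name.

```python
import itertools
import random


def random_search(f, dim, rng):
    """Stand-in solver: sample uniformly in [-5, 5]^dim forever."""
    while True:
        x = [rng.uniform(-5, 5) for _ in range(dim)]
        yield x, f(x)


def local_search(f, dim, rng, step=0.5):
    """Stand-in solver: Gaussian perturbation, keep only improvements."""
    x = [rng.uniform(-5, 5) for _ in range(dim)]
    fx = f(x)
    yield x, fx
    while True:
        cand = [xi + rng.gauss(0, step) for xi in x]
        fc = f(cand)
        if fc < fx:
            x, fx = cand, fc
        yield cand, fc


def round_robin_portfolio(f, dim, solvers, budget, seed=0):
    """Give each solver one evaluation in turn until the budget is spent."""
    rng = random.Random(seed)
    runs = [solver(f, dim, rng) for solver in solvers]
    best_x, best_f = None, float("inf")
    for run in itertools.islice(itertools.cycle(runs), budget):
        x, fx = next(run)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f


def sphere(x):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f = round_robin_portfolio(
    sphere, dim=2, solvers=[random_search, local_search], budget=400
)
print(best_x, best_f)
```

An online adaptive strategy would replace the fixed `itertools.cycle` order with a rule that favors whichever solver has improved the best value most recently.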

7.5 | AI | Sep 24, 2012

On Move Pattern Trends in a Large Go Games Corpus

Petr Baudiš, Josef Moudřík

We process a large corpus of game records of the board game of Go and propose a way of extracting summary information on played moves. We then apply several basic data-mining methods to the summary information to identify its most differentiating features, and discuss their correspondence with traditional Go knowledge. We show statistically significant mappings of the features to player attributes such as playing strength or informally perceived "playing style" (e.g. territoriality or aggressiveness), describe accurate classifiers for these attributes, and propose applications including seeding real-world ranks of internet players, aiding in Go study and tuning of Go-playing programs, or contributing to Go-theoretical discussion on the scope of "playing style".