Tomas Kliegr

AI
3papers
7citations
Novelty32%
AI Score21

3 Papers

AIAug 11, 2024
Neurosymbolic Methods for Rule Mining

Agnieszka Lawrynowicz, Luis Galarraga, Mehwish Alam et al.

In this chapter, we address the problem of rule mining, beginning with essential background information, including measures of rule quality. We then explore various rule mining methodologies, categorized into three groups: inductive logic programming, path sampling and generalization, and linear programming. Following this, we delve into neurosymbolic methods, covering topics such as the integration of deep learning with rules, the use of embeddings for rule learning, and the application of large language models in rule learning.

CLSep 24, 2024
Explaining word embeddings with perfect fidelity: Case study in research impact prediction

Lucie Dvorackova, Marcin P. Joachimiak, Michal Cerny et al.

The best-performing approaches for scholarly document quality prediction are based on embedding models. In addition to their performance when used in classifiers, embedding models can also provide predictions even for words that were not contained in the labelled training data for the classification model, which is important in the context of the ever-evolving research terminology. Although model-agnostic explanation methods, such as Local interpretable model-agnostic explanations, can be applied to explain machine learning classifiers trained on embedding models, these produce results with questionable correspondence to the model. We introduce a new feature importance method, Self-model Rated Entities (SMER), for logistic regression-based classification models trained on word embeddings. We show that SMER has theoretically perfect fidelity with the explained model, as the average of logits of SMER scores for individual words (SMER explanation) exactly corresponds to the logit of the prediction of the explained model. Quantitative and qualitative evaluation is performed through five diverse experiments conducted on 50,000 research articles (papers) from the CORD-19 corpus. Through an AOPC curve analysis, we experimentally demonstrate that SMER produces better explanations than LIME, SHAP and global tree surrogates.

MLNov 28, 2017
QCBA: Improving Rule Classifiers Learned from Quantitative Data by Recovering Information Lost by Discretisation

Tomas Kliegr, Ebroul Izquierdo

A prediscretisation of numerical attributes which is required by some rule learning algorithms is a source of inefficiencies. This paper describes new rule tuning steps that aim to recover lost information in the discretisation and new pruning techniques that may further reduce the size of rule models and improve their accuracy. The proposed QCBA method was initially developed to postprocess quantitative attributes in models generated by the Classification based on associations (CBA) algorithm, but it can also be applied to the results of other rule learning approaches. We demonstrate the effectiveness on the postprocessing of models generated by five association rule classification algorithms (CBA, CMAR, CPAR, IDS, SBRL) and two first-order logic rule learners (FOIL2 and PRM). Benchmarks on 22 datasets from the UCI repository show smaller size and the overall best predictive performance for FOIL2+QCBA compared to all seven baselines. Postoptimised CBA models have a better predictive performance compared to the state-of-the-art rule learner CORELS in this benchmark. The article contains an ablation study for the individual postprocessing steps and a scalability analysis on the KDD'99 Anomaly detection dataset.