Lukasz Czekaj

LG May 21, 2022

On the problem of entity matching and its application in automated settlement of receivables

Lukasz Czekaj, Tomasz Biegus, Robert Kitlowski et al.

This paper covers automated settlement of receivables in non-governmental organizations. We tackle the problem with entity matching techniques. We consider a setup in which a base algorithm produces a preliminary ranking of matches, and we then apply several novel methods to increase the matching quality of the base algorithm: score post-processing, a cascade model, and a chain model. The methods presented here contribute to automated settlement of receivables, entity matching, and multilabel classification in an open-world scenario. We evaluate our approach on real-world operational data from a company that provides settlement of receivables as a service: the proposed methods boost recall from 78% (base model) to >90% at a precision of 99%.
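The headline metric in this abstract, recall at a fixed precision, can be illustrated with a minimal sketch. It assumes each candidate match carries a numeric score and a binary correctness label. The function name `recall_at_precision` and the toy data are hypothetical and are not taken from the paper.

```python
def recall_at_precision(scores, labels, min_precision=0.99):
    """Best recall reachable by thresholding scores with precision >= min_precision.

    scores: match scores, higher means more confident (assumed convention).
    labels: 1 if the candidate match is correct, 0 otherwise.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    # Rank candidates from the highest to the lowest score.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best_recall = 0.0
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Accept every candidate tied at this score, so each step
        # corresponds to one well-defined threshold.
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        if precision >= min_precision:
            best_recall = max(best_recall, tp / total_pos)
    return best_recall


# Toy data (made up for illustration only).
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
strict = recall_at_precision(scores, labels, min_precision=1.0)
relaxed = recall_at_precision(scores, labels, min_precision=0.75)
```

With this toy data, demanding perfect precision accepts only the two top-scored matches, while relaxing the floor to 0.75 lets the threshold drop far enough to recover all three correct matches.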

LG Mar 11, 2019

Labeler-hot Detection of EEG Epileptic Transients

Lukasz Czekaj, Wojciech Ziembla, Pawel Jezierski et al.

Preventing the early progression of epilepsy, and thus reducing the severity of seizures, requires an effective diagnosis. Epileptic transients indicate the ability to develop seizures, but humans overlook such brief events in an electroencephalogram (EEG), which compromises patient treatment. Traditionally, training of EEG event detection algorithms has relied on ground truth labels obtained from the consensus of the majority of labelers. In this work, we go beyond labeler consensus on EEG data. Our event descriptor integrates EEG signal features with a one-hot encoded labeler category, which is key to improved generalization performance. Notably, boosted decision trees take advantage of singly-labeled but more varied training sets. Our quantitative experiments show that the proposed labeler-hot epileptic event detector consistently outperforms a consensus-trained detector and maintains confidence bounds of the detection. The results on our infant EEG recordings suggest that datasets can gain higher event variety faster, and thus better performance, by shifting available human effort from consensus-oriented to separate labeling when labels include both the event and the labeler category.
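The event descriptor described in this abstract, signal features joined with a one-hot labeler category, can be sketched as follows. This is only an illustration of the encoding idea: the function name `labeler_hot_descriptor`, the feature values, and the labeler names are hypothetical, and the abstract does not say which features the authors use or how the labeler category is set at detection time.

```python
def labeler_hot_descriptor(features, labeler_id, labelers):
    """Append a one-hot encoding of the labeler to the signal features.

    features:   numeric features of one EEG event (hypothetical values here).
    labeler_id: identifier of the person who labeled this event.
    labelers:   fixed, ordered list of all known labeler identifiers.
    """
    one_hot = [1.0 if name == labeler_id else 0.0 for name in labelers]
    if sum(one_hot) != 1.0:
        raise ValueError(f"unknown or ambiguous labeler: {labeler_id!r}")
    return list(features) + one_hot


# Three hypothetical labelers; the event below was labeled by "B".
labelers = ["A", "B", "C"]
descriptor = labeler_hot_descriptor([0.12, 3.4], "B", labelers)
```

Each singly-labeled event then becomes one training row for a classifier such as boosted decision trees, with the label supplied by that single labeler rather than by a majority vote.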
