Alexander I. Nesterov

h-index: 12

4 papers

23 citations

Novelty: 53%

AI Score: 27

Ranked #155,363 of 194,257 authors (top 80%); #27,040 in CL (top 88%)

4 Papers

1.4 · CL · Apr 8, 2022

RuBioRoBERTa: a pre-trained biomedical language model for Russian language biomedical text mining

Alexander Yalunin, Alexander Nesterov, Dmitriy Umerenkov

This paper presents several BERT-based models for Russian-language biomedical text mining (RuBioBERT, RuBioRoBERTa). The models are pre-trained on a corpus of freely available texts in the Russian biomedical domain. With this pre-training, our models demonstrate state-of-the-art results on RuMedBench, a Russian medical language understanding benchmark that covers a diverse set of tasks, including text classification, question answering, natural language inference, and named entity recognition.

0.5 · CL · May 5, 2023

Predicting COVID-19 and pneumonia complications from admission texts

Dmitriy Umerenkov, Oleg Cherkashin, Alexander Nesterov et al.

In this paper, we present a novel approach to risk assessment for patients hospitalized with pneumonia or COVID-19 based on their admission reports. We applied a Longformer neural network to admission reports and other textual data available shortly after admission to compute risk scores for the patients. We used patient data from multiple European hospitals to demonstrate that our approach outperforms the Transformer baselines. Our experiments show that the proposed model generalises across institutions and diagnoses. Our method also has several other advantages described in the paper.

0.3 · CL · Jan 25, 2022

Distantly supervised end-to-end medical entity extraction from electronic health records with human-level quality

Alexander Nesterov, Dmitry Umerenkov

Medical entity extraction (EE) is a standard procedure used as a first stage in medical text processing. Usually, medical EE is a two-step process: named entity recognition (NER) followed by named entity normalization (NEN). We propose a novel method of doing medical EE from electronic health records (EHR) as a single-step multi-label classification task by fine-tuning a transformer model pretrained on a large EHR dataset. Our model is trained end-to-end in a distantly supervised manner using targets automatically extracted from a medical knowledge base. We show that our model learns to generalize for entities that are present frequently enough, achieving human-level classification quality for the most frequent entities. Our work demonstrates that medical entity extraction can be done end-to-end without human supervision and with human-level quality, given a large enough amount of unlabeled EHR and a medical knowledge base.
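The abstract does not say how targets are extracted from the knowledge base, so the following is only a minimal sketch of the general idea of distant supervision for multi-label classification. It assumes simple case-insensitive substring matching against a toy knowledge base; `KNOWLEDGE_BASE`, `LABELS`, and `build_targets` are hypothetical names introduced for illustration and are not taken from the paper.

```python
# Hypothetical toy knowledge base: entity label -> surface forms (assumption).
KNOWLEDGE_BASE = {
    "hypertension": ["hypertension", "high blood pressure"],
    "diabetes": ["diabetes", "diabetes mellitus"],
    "cough": ["cough"],
}

# Fixed label order so every record maps to a vector of the same length.
LABELS = ["hypertension", "diabetes", "cough"]


def build_targets(text, knowledge_base, label_order):
    """Return a multi-hot target vector for one record.

    A label is set to 1 when any of its surface forms occurs in the text.
    Substring matching stands in for whatever extraction the authors used.
    """
    text_lower = text.lower()
    found = {
        label
        for label, terms in knowledge_base.items()
        if any(term in text_lower for term in terms)
    }
    return [1 if label in found else 0 for label in label_order]


record = "Patient reports high blood pressure and a persistent cough."
targets = build_targets(record, KNOWLEDGE_BASE, LABELS)
print(targets)  # [1, 0, 1]
```

Vectors like `targets` could then serve as training labels for a multi-label classifier, which is what removes the need for manually annotated entity spans.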

4.3 · IM · Sep 9, 2020 · Code

Deep learning for gravitational-wave data analysis: A resampling white-box approach

Manuel D. Morales, Javier M. Antelis, Claudia Moreno et al.

In this work, we apply Convolutional Neural Networks (CNNs) to detect gravitational wave (GW) signals of compact binary coalescences, using single-interferometer data from LIGO detectors. As a novel contribution, we adopted a resampling white-box approach to advance towards a statistical understanding of the uncertainties intrinsic to CNNs in GW data analysis. Resampling is performed by repeated $k$-fold cross-validation experiments, and for a white-box approach, the behavior of CNNs is mathematically described in detail. Through a Morlet wavelet transform, strain time series are converted to time-frequency images, which in turn are reduced before generating input datasets. Moreover, to reproduce more realistic experimental conditions, we worked only with data containing non-Gaussian noise and hardware injections, removing the freedom to set signal-to-noise ratio (SNR) values in GW templates by hand. After hyperparameter adjustments, we found that resampling smooths the stochasticity of mini-batch stochastic gradient descent, reducing mean accuracy perturbations by a factor of $3.6$. CNNs were quite precise at detecting noise but not sensitive enough to recall GW signals, meaning that CNNs are better for noise reduction than for generation of GW triggers. However, applying a post-analysis, we found that for GW signals of SNR $\geq 21.80$ with H1 data and SNR $\geq 26.80$ with L1 data, CNNs could remain as tentative alternatives for detecting GW signals. In addition, with receiver operating characteristic curves we found that CNNs perform much better than Naive Bayes and Support Vector Machine models and, with a significance level of $5\%$, we estimated that the predictions of CNNs are significantly different from those of a random classifier. Finally, we elucidated that the performance of CNNs is highly class dependent because of the distribution of probabilistic scores output by the softmax layer.
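The resampling step mentioned above, repeated $k$-fold cross-validation, can be sketched roughly as follows. This is a generic illustration and not the authors' code: `kfold_splits`, `repeated_kfold_accuracy`, and the toy majority-class evaluator are hypothetical stand-ins, and a real experiment would train and score a CNN inside `evaluate`.

```python
import random
from statistics import mean


def kfold_splits(n_samples, k, rng):
    """Yield (train_idx, test_idx) pairs for one shuffled k-fold partition."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def repeated_kfold_accuracy(n_samples, k, repeats, evaluate, seed=0):
    """Return one mean accuracy per repetition of k-fold cross-validation.

    `evaluate(train_idx, test_idx)` is a user-supplied callback that fits a
    model on the training indices and returns accuracy on the test indices.
    """
    rng = random.Random(seed)
    per_repeat = []
    for _ in range(repeats):
        scores = [evaluate(tr, te) for tr, te in kfold_splits(n_samples, k, rng)]
        per_repeat.append(mean(scores))
    return per_repeat


# Toy data and a majority-class "model", used only to make the sketch runnable.
LABELS = [0] * 60 + [1] * 40


def majority_accuracy(train_idx, test_idx):
    ones = sum(LABELS[i] for i in train_idx)
    majority = 1 if 2 * ones > len(train_idx) else 0
    return sum(LABELS[i] == majority for i in test_idx) / len(test_idx)


accuracies = repeated_kfold_accuracy(len(LABELS), k=5, repeats=10,
                                     evaluate=majority_accuracy)
print(mean(accuracies))
```

Averaging over folds within each repetition, and then looking at the spread across repetitions, is one way to separate a model's typical accuracy from the run-to-run variation introduced by random data splits and stochastic training.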