Andrey Grabovoy

CL
h-index6
6papers
69citations
Novelty48%
AI Score43

6 Papers

CLOct 18, 2024Code
Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts

German Gritsai, Anastasia Voznyuk, Andrey Grabovoy et al.

The rapid development of autoregressive Large Language Models (LLMs) has significantly improved the quality of generated texts, necessitating reliable machine-generated text detectors. A huge number of detectors and collections with AI fragments have emerged, and several detection methods even showed recognition quality up to 99.9% according to the target metrics in such collections. However, the quality of such detectors tends to drop dramatically in the wild, posing a question: Are detectors actually highly trustworthy or do their high benchmark scores come from the poor quality of evaluation datasets? In this paper, we emphasise the need for robust and qualitative methods for evaluating generated data to be secure against bias and low generalising ability of future model. We present a systematic review of datasets from competitions dedicated to AI-generated content detection and propose methods for evaluating the quality of datasets containing AI-generated fragments. In addition, we discuss the possibility of using high-quality generated data to achieve two goals: improving the training of detection models and improving the training datasets themselves. Our contribution aims to facilitate a better understanding of the dynamics between human and machine text, which will ultimately support the integrity of information in an increasingly automated world. The code is available at https://github.com/Advacheck-OU/ai-dataset-analysing.

LGSep 18, 2024
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes

Nikita Kiselev, Andrey Grabovoy

The loss landscape of neural networks is a critical aspect of their training, and understanding its properties is essential for improving their performance. In this paper, we investigate how the loss surface changes when the sample size increases, a previously unexplored issue. We theoretically analyze the convergence of the loss landscape in a fully connected neural network and derive upper bounds for the difference in loss function values when adding a new object to the sample. Our empirical study confirms these results on various datasets, demonstrating the convergence of the loss function surface for image classification tasks. Our findings provide insights into the local geometry of neural loss landscapes and have implications for the development of sample size determination techniques.

LGApr 16
Curvature-Aligned Probing for Local Loss-Landscape Stabilization

Nikita Kiselev, Andrey Grabovoy

Local loss-landscape stabilization under sample growth is typically measured either pointwise or through isotropic averaging in the full parameter space. Despite practical value, both choices probe directions that contribute little to the dominant local deformation of strongly anisotropic neural landscapes. We recast stabilization as an observational problem and introduce a unified family of criteria parameterized by an aggregation order and a probing distribution; within this family we propose a curvature-aligned criterion $Δ_2^{(D)}$ that probes the loss increment field in the top-$D$ eigenspace of the empirical Hessian near a trained solution. Solely from a local quadratic model, we prove that $Δ_2^{(D)}$ preserves the $O(k^{-2})$ mean-squared rate of the full-space criterion while replacing ambient-dimension curvature dependence with dependence on the subspace dimension $D$; a corollary gives a closed-form spectral expression and a proposition identifies the top-$D$ eigenspace as extremal within the eigenspace-aligned family. We also derive scalable estimators based on Hessian-vector products, subspace Monte Carlo, and a closed-form Gaussian-moment proxy. On a decoder-only transformer, a curvature-aligned probe occupying a tiny fraction of parameter space already reproduces the full-space mean-squared signal to within numerical noise throughout the validated local regime, and the closed-form estimator is orders of magnitude faster than direct Monte Carlo after subspace construction.

CLNov 11, 2024
Multi-head Span-based Detector for AI-generated Fragments in Scientific Papers

German Gritsai, Ildar Khabutdinov, Andrey Grabovoy

This paper describes a system designed to distinguish between AI-generated and human-written scientific excerpts in the DAGPap24 competition hosted within the Fourth Workshop on Scientific Document Processing. In this competition the task is to find artificially generated token-level text fragments in documents of a scientific domain. Our work focuses on the use of a multi-task learning architecture with two heads. The application of this approach is justified by the specificity of the task, where class spans are continuous over several hundred characters. We considered different encoder variations to obtain a state vector for each token in the sequence, as well as a variation in splitting fragments into tokens to further feed into the input of a transform-based encoder. This approach allows us to achieve a 9% quality improvement relative to the baseline solution score on the development set (from 0.86 to 0.95) using the average macro F1-score, as well as a score of 0.96 on a closed test part of the dataset from the competition.

CLNov 18, 2024
Advacheck at GenAI Detection Task 1: AI Detection Powered by Domain-Aware Multi-Tasking

German Gritsai, Anastasia Voznyuk, Ildar Khabutdinov et al.

The paper describes a system designed by Advacheck team to recognise machine-generated and human-written texts in the monolingual subtask of GenAI Detection Task 1 competition. Our developed system is a multi-task architecture with shared Transformer Encoder between several classification heads. One head is responsible for binary classification between human-written and machine-generated texts, while the other heads are auxiliary multiclass classifiers for texts of different domains from particular datasets. As multiclass heads were trained to distinguish the domains presented in the data, they provide a better understanding of the samples. This approach led us to achieve the first place in the official ranking with 83.07% macro F1-score on the test set and bypass the baseline by 10%. We further study obtained system through ablation, error and representation analyses, finding that multi-task learning outperforms single-task mode and simultaneous tasks form a cluster structure in embeddings space.

LGOct 19, 2025
Closing the Curvature Gap: Full Transformer Hessians and Their Implications for Scaling Laws

Egor Petrov, Nikita Kiselev, Vladislav Meshkov et al.

The lack of theoretical results for Layer Normalization and feedforward Hessians has left a gap in the study of Transformer optimization landscapes. We address this by deriving explicit second-order expressions for these components, thereby completing the Hessian characterization of full Transformer blocks. Our results generalize prior self-attention analyses and yield estimations for the role of each sublayer in curvature propagation. We demonstrate how these Hessian structures inform both convergence dynamics and the empirical scaling laws governing large-model performance. Further, we propose a Taylor-expansion-based framework for analyzing loss differences to quantify convergence trajectories. By extending Hessian theory to the full Transformer architecture, this work establishes a new foundation for theoretical and empirical investigations of optimization in large-scale deep learning.