Nikolay Ivanov

CR
h-index8
14papers
99citations
Novelty49%
AI Score47

14 Papers

SDMay 28, 2022
SuperVoice: Text-Independent Speaker Verification Using Ultrasound Energy in Human Speech

Hanqing Guo, Qiben Yan, Nikolay Ivanov et al.

Voice-activated systems are integrated into a variety of desktop, mobile, and Internet-of-Things (IoT) devices. However, voice spoofing attacks, such as impersonation and replay attacks, in which malicious attackers synthesize the voice of a victim or simply replay it, have brought growing security concerns. Existing speaker verification techniques distinguish individual speakers via the spectrographic features extracted from an audible frequency range of voice commands. However, they often have high error rates and/or long delays. In this paper, we explore a new direction of human voice research by scrutinizing the unique characteristics of human speech at the ultrasound frequency band. Our research indicates that the high-frequency ultrasound components (e.g. speech fricatives) from 20 to 48 kHz can significantly enhance the security and accuracy of speaker verification. We propose a speaker verification system, SUPERVOICE that uses a two-stream DNN architecture with a feature fusion mechanism to generate distinctive speaker models. To test the system, we create a speech dataset with 12 hours of audio (8,950 voice samples) from 127 participants. In addition, we create a second spoofed voice dataset to evaluate its security. In order to balance between controlled recordings and real-world applications, the audio recordings are collected from two quiet rooms by 8 different recording devices, including 7 smartphones and an ultrasound microphone. Our evaluation shows that SUPERVOICE achieves 0.58% equal error rate in the speaker verification task, it only takes 120 ms for testing an incoming utterance, outperforming all existing speaker verification systems. Moreover, within 91 ms processing time, SUPERVOICE achieves 0% equal error rate in detecting replay attacks launched by 5 different loudspeakers.

DCJul 16, 2023
DynamicFL: Balancing Communication Dynamics and Client Manipulation for Federated Learning

Bocheng Chen, Nikolay Ivanov, Guangjing Wang et al.

Federated Learning (FL) is a distributed machine learning (ML) paradigm, aiming to train a global model by exploiting the decentralized data across millions of edge devices. Compared with centralized learning, FL preserves the clients' privacy by refraining from explicitly downloading their data. However, given the geo-distributed edge devices (e.g., mobile, car, train, or subway) with highly dynamic networks in the wild, aggregating all the model updates from those participating devices will result in inevitable long-tail delays in FL. This will significantly degrade the efficiency of the training process. To resolve the high system heterogeneity in time-sensitive FL scenarios, we propose a novel FL framework, DynamicFL, by considering the communication dynamics and data quality across massive edge devices with a specially designed client manipulation strategy. \ours actively selects clients for model updating based on the network prediction from its dynamic network conditions and the quality of its training data. Additionally, our long-term greedy strategy in client selection tackles the problem of system performance degradation caused by short-term scheduling in a dynamic network. Lastly, to balance the trade-off between client performance evaluation and client manipulation granularity, we dynamically adjust the length of the observation window in the training process to optimize the long-term system efficiency. Compared with the state-of-the-art client selection scheme in FL, \ours can achieve a better model accuracy while consuming only 18.9\% -- 84.0\% of the wall-clock time. Our component-wise and sensitivity studies further demonstrate the robustness of \ours under various real-life scenarios.

CRMar 11
SoK: Self-Sovereign Digital Identities

Sushanth Ambati, Kainat Adeel, Jack Myers et al.

Self-Sovereign Digital Identity (SSDI) enables individuals to control their own identity assertions and data, rather than relying on centralized or federated systems prone to large-scale data breaches. By eliminating centralized databases maintained by service providers and identity brokers, SSDIs offer enhanced security and privacy. However, adoption remains slow, and research in this area lacks systematization and uniformity. To address these gaps, we present a comprehensive systematization of knowledge on self-sovereign digital identities, with a primary focus on identifying the challenges that impede real-world adoption. We survey 80 academic and non-academic sources and identify six major challenges: (i) binding a single identity to one individual or organization, (ii) the absence of mature cryptographic and communication protocols, (iii) significant usability barriers, (iv) regulatory and oversight gaps, (v) bootstrapping to critical-mass adoption, and (vi) dependence on a permissionless, decentralized, yet singular infrastructure that may expose unforeseen vulnerabilities over time. We then analyze 47 scientific publications and find that the vast majority focus on blockchain-based solutions rather than generalized SSDI architectures. Additionally, we catalog 12 real-world, production-grade SSDI applications. Our evaluation of these solutions reveals that self-sovereignty is, in practice, a spectrum rather than a binary property. Finally, we explore the frontiers of SSDI by identifying major trends, open problems, and opportunities for future research. We hope this systematization will help advance the shift from centralized to self-sovereign digital identities in a disciplined and impactful way.

CLMay 30, 2025Code
Eye of Judgement: Dissecting the Evaluation of Russian-speaking LLMs with POLLUX

Nikita Martynov, Anastasia Mordasheva, Dmitriy Gorbetskiy et al.

We introduce POLLUX, a comprehensive open-source benchmark designed to evaluate the generative capabilities of large language models (LLMs) in Russian. Our main contribution is a novel evaluation methodology that enhances the interpretability of LLM assessment. For each task type, we define a set of detailed criteria and develop a scoring protocol where models evaluate responses and provide justifications for their ratings. This enables transparent, criteria-driven evaluation beyond traditional resource-consuming, side-by-side human comparisons. POLLUX includes a detailed, fine-grained taxonomy of 35 task types covering diverse generative domains such as code generation, creative writing, and practical assistant use cases, totaling 2,100 manually crafted and professionally authored prompts. Each task is categorized by difficulty (easy/medium/hard), with experts constructing the dataset entirely from scratch. We also release a family of LLM-as-a-Judge (7B and 32B) evaluators trained for nuanced assessment of generative outputs. This approach provides scalable, interpretable evaluation and annotation tools for model development, effectively replacing costly and less precise human judgments.

CRMay 1, 2021Code
Targeting the Weakest Link: Social Engineering Attacks in Ethereum Smart Contracts

Nikolay Ivanov, Jianzhi Lou, Ting Chen et al.

Ethereum holds multiple billions of U.S. dollars in the form of Ether cryptocurrency and ERC-20 tokens, with millions of deployed smart contracts algorithmically operating these funds. Unsurprisingly, the security of Ethereum smart contracts has been under rigorous scrutiny. In recent years, numerous defense tools have been developed to detect different types of smart contract code vulnerabilities. When opportunities for exploiting code vulnerabilities diminish, the attackers start resorting to social engineering attacks, which aim to influence humans -- often the weakest link in the system. The only known class of social engineering attacks in Ethereum are honeypots, which plant hidden traps for attackers attempting to exploit existing vulnerabilities, thereby targeting only a small population of potential victims. In this work, we explore the possibility and existence of new social engineering attacks beyond smart contract honeypots. We present two novel classes of Ethereum social engineering attacks - Address Manipulation and Homograph - and develop six zero-day social engineering attacks. To show how the attacks can be used in popular programming patterns, we conduct a case study of five popular smart contracts with combined market capitalization exceeding $29 billion, and integrate our attack patterns in their source codes without altering their existing functionality. Moreover, we show that these attacks remain dormant during the test phase but activate their malicious logic only at the final production deployment. We further analyze 85,656 open-source smart contracts, and discover that 1,027 of them can be used for the proposed social engineering attacks. We conduct a professional opinion survey with experts from seven smart contract auditing firms, corroborating that the exposed social engineering attacks bring a major threat to the smart contract systems.

CLJan 22, 2025
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home

Viktor Moskvoretskii, Maria Lysyuk, Mikhail Salnikov et al.

Retrieval Augmented Generation (RAG) improves correctness of Question Answering (QA) and addresses hallucinations in Large Language Models (LLMs), yet greatly increase computational costs. Besides, RAG is not always needed as may introduce irrelevant information. Recent adaptive retrieval methods integrate LLMs' intrinsic knowledge with external information appealing to LLM self-knowledge, but they often neglect efficiency evaluations and comparisons with uncertainty estimation techniques. We bridge this gap by conducting a comprehensive analysis of 35 adaptive retrieval methods, including 8 recent approaches and 27 uncertainty estimation techniques, across 6 datasets using 10 metrics for QA performance, self-knowledge, and efficiency. Our findings show that uncertainty estimation techniques often outperform complex pipelines in terms of efficiency and self-knowledge, while maintaining comparable QA performance.

CLMay 7, 2025
LLM-Independent Adaptive RAG: Let the Question Speak for Itself

Maria Marina, Nikolay Ivanov, Sergey Pletenev et al.

Large Language Models~(LLMs) are prone to hallucinations, and Retrieval-Augmented Generation (RAG) helps mitigate this, but at a high computational cost while risking misinformation. Adaptive retrieval aims to retrieve only when necessary, but existing approaches rely on LLM-based uncertainty estimation, which remain inefficient and impractical. In this study, we introduce lightweight LLM-independent adaptive retrieval methods based on external information. We investigated 27 features, organized into 7 groups, and their hybrid combinations. We evaluated these methods on 6 QA datasets, assessing the QA performance and efficiency. The results show that our approach matches the performance of complex LLM-based methods while achieving significant efficiency gains, demonstrating the potential of external information for adaptive retrieval.

CRMar 7
SoK: Evolution, Security, and Fundamental Properties of Transactional Systems

Sky Pelletier Waterpeace, Nikolay Ivanov

Transaction processing systems underpin modern commerce, finance, and critical infrastructure, yet their security has never been studied across the full evolutionary arc of these systems. Over five decades, transaction processing has progressed through four distinct generations, from centralized databases, to distributed databases, to blockchain and distributed ledger technologies (DLTs), finally to multi-context systems that span cyber-physical components under real-time constraints. Each generation has introduced new transaction types and new classes of vulnerabilities, yet security research remains fragmented by domain, and the foundational ACID transaction model has not been revisited to reflect the demands of contemporary systems. We classify 163 papers on transaction security by evolutionary generation, security focus, and relevant Common Weakness Enumeration (CWE) entries, and distill a curated set of 41 high-impact or seminal papers spanning all four generations. We make three principal contributions. First, we develop a four-generation evolutionary taxonomy that contextualizes each work within the broader trajectory of transaction processing. Second, we map each paper's security focus to CWE identifiers, providing a systems-oriented vocabulary for analyzing transaction-specific threats across otherwise siloed domains. Third, we demonstrate that the classical ACID properties are insufficient for modern transactional systems and introduce RANCID, extending ACID with Real-timeness (R) and N-many Contexts (N), as a property set for reasoning about the security and correctness of systems that must coordinate across heterogeneous contexts under timing constraints. Our systematization exposes a pronounced bias toward DLT security research at the expense of broader transactional security and identifies concrete open problems for the next generation of transaction processing systems.

CLMay 27, 2025
Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA

Sergey Pletenev, Maria Marina, Nikolay Ivanov et al.

Large Language Models (LLMs) often hallucinate in question answering (QA) tasks. A key yet underexplored factor contributing to this is the temporality of questions -- whether they are evergreen (answers remain stable over time) or mutable (answers change). In this work, we introduce EverGreenQA, the first multilingual QA dataset with evergreen labels, supporting both evaluation and training. Using EverGreenQA, we benchmark 12 modern LLMs to assess whether they encode question temporality explicitly (via verbalized judgments) or implicitly (via uncertainty signals). We also train EG-E5, a lightweight multilingual classifier that achieves SoTA performance on this task. Finally, we demonstrate the practical utility of evergreen classification across three applications: improving self-knowledge estimation, filtering QA datasets, and explaining GPT-4o retrieval behavior.

AIFeb 26, 2024
GigaPevt: Multimodal Medical Assistant

Pavel Blinov, Konstantin Egorov, Ivan Sviridov et al.

Building an intelligent and efficient medical assistant is still a challenging AI problem. The major limitation comes from the data modality scarceness, which reduces comprehensive patient perception. This demo paper presents the GigaPevt, the first multimodal medical assistant that combines the dialog capabilities of large language models with specialized medical models. Such an approach shows immediate advantages in dialog quality and metric performance, with a 1.18% accuracy improvement in the question-answering task.

CRAug 31, 2021
EthClipper: A Clipboard Meddling Attack on Hardware Wallets with Address Verification Evasion

Nikolay Ivanov, Qiben Yan

Hardware wallets are designed to withstand malware attacks by isolating their private keys from the cyberspace, but they are vulnerable to the attacks that fake an address stored in a clipboard. To prevent such attacks, a hardware wallet asks the user to verify the recipient address shown on the wallet display. Since crypto addresses are long sequences of random symbols, their manual verification becomes a difficult task. Consequently, many users of hardware wallets elect to verify only a few symbols in the address, and this can be exploited by an attacker. In this work, we introduce EthClipper, an attack that targets owners of hardware wallets on the Ethereum platform. EthClipper malware queries a distributed database of pre-mined accounts in order to select the address with maximum visual similarity to the original one. We design and implement a EthClipper malware, which we test on Trezor, Ledger, and KeepKey wallets. To deliver computation and storage resources for the attack, we implement a distributed service, ClipperCloud, and test it on different deployment environments. Our evaluation shows that with off-the-shelf PCs and NAS storage, an attacker would be able to mine a database capable of matching 25% of the digits in an address to achieve a 50% chance of finding a fitting fake address. For responsible disclosure, we have contacted the manufactures of the hardware wallets used in the attack evaluation, and they all confirm the danger of EthClipper.

DCJul 18, 2021
System-Wide Security for Offline Payment Terminals

Nikolay Ivanov, Qiben Yan

Most self-service payment terminals require network connectivity for processing electronic payments. The necessity to maintain network connectivity increases costs, introduces cybersecurity risks, and significantly limits the number of places where the terminals can be installed. Leading payment service providers have proposed offline payment solutions that rely on algorithmically generated payment tokens. Existing payment token solutions, however, require complex mechanisms for authentication, transaction management, and most importantly, security risk management. In this paper, we present VolgaPay, a blockchain-based system that allows merchants to deploy secure offline payment terminal infrastructure that does not require collection and storage of any sensitive data. We design a novel payment protocol which mitigates security threats for all the participants of VolgaPay, such that the maximum loss from gaining full access to any component by an adversary incurs only a limited scope of harm. We achieve significant enhancements in security, operation efficiency, and cost reduction via a combination of polynomial multi-hash chain micropayment channels and blockchain grafting for off-chain channel state transition. We implement the VolgaPay payment system, and with thorough evaluation and security analysis, we demonstrate that VolgaPay is capable of delivering a fast, secure, and cost-efficient solution for offline payment terminals.

CRJul 17, 2021
Rectifying Administrated ERC20 Tokens

Nikolay Ivanov, Hanqing Guo, Qiben Yan

The developers of Ethereum smart contracts often implement administrating patterns, such as censoring certain users, creating or destroying balances on demand, destroying smart contracts, or injecting arbitrary code. These routines turn an ERC20 token into an administrated token - the type of Ethereum smart contract that we scrutinize in this research. We discover that many smart contracts are administrated, and the owners of these tokens carry lesser social and legal responsibilities compared to the traditional centralized actors that those tokens intend to disrupt. This entails two major problems: a) the owners of the tokens have the ability to quickly steal all the funds and disappear from the market; and b) if the private key of the owner's account is stolen, all the assets might immediately turn into the property of the attacker. We develop a pattern recognition framework based on 9 syntactic features characterizing administrated ERC20 tokens, which we use to analyze existing smart contracts deployed on Ethereum Mainnet. Our analysis of 84,062 unique Ethereum smart contracts reveals that nearly 58% of them are administrated ERC20 tokens, which accounts for almost 90% of all ERC20 tokens deployed on Ethereum. To protect users from the frivolousness of unregulated token owners without depriving the ability of these owners to properly manage their tokens, we introduce SafelyAdministrated - a library that enforces a responsible ownership and management of ERC20 tokens. The library introduces three mechanisms: deferred maintenance, board of trustees and safe pause. We implement and test SafelyAdministrated in the form of Solidity abstract contract, which is ready to be used by the next generation of safely administrated ERC20 tokens.

STMay 11, 2021
Constraint-Based Inference of Heuristics for Foreign Exchange Trade Model Optimization

Nikolay Ivanov, Qiben Yan

The Foreign Exchange (Forex) is a large decentralized market, on which trading analysis and algorithmic trading are popular. Research efforts have been focusing on proof of efficiency of certain technical indicators. We demonstrate, however, that the values of indicator functions are not reproducible and often reduce the number of trade opportunities, compared to price-action trading. In this work, we develop two dataset-agnostic Forex trading heuristic templates with high rate of trading signals. In order to determine most optimal parameters for the given heuristic prototypes, we perform a machine learning simulation of 10 years of Forex price data over three low-margin instruments and 6 different OHLC granularities. As a result, we develop a specific and reproducible list of most optimal trade parameters found for each instrument-granularity pair, with 118 pips of average daily profit for the optimized configuration.