LG | Dec 2, 2025

Neighborhood density estimation using space-partitioning based hashing schemes

arXiv:2512.03187v1 | h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses two problems, single-cell data analysis in bioinformatics and learning from streaming data in machine learning, and proposes a new method for each. The contribution is assessed as an incremental advance.

The paper tackles anomaly detection in large-scale single-cell RNA sequencing data and concept drift detection in streaming data. FiRE/FiRE.1 is reported to outperform state-of-the-art techniques, and Enhash is reported to be highly competitive in time and accuracy across various drift types.

This work introduces FiRE/FiRE.1, a novel sketching-based algorithm for anomaly detection to quickly identify rare cell sub-populations in large-scale single-cell RNA sequencing data. This method demonstrated superior performance against state-of-the-art techniques. Furthermore, the thesis proposes Enhash, a fast and resource-efficient ensemble learner that uses projection hashing to detect concept drift in streaming data, proving highly competitive in time and accuracy across various drift types.
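As a rough illustration of how hashing can estimate neighborhood density, the sketch below scores each point by how sparsely populated its hash buckets are across several random space partitions. It is a minimal sketch under stated assumptions, not the FiRE/FiRE.1 or Enhash implementation: the function names (`build_hashers`, `hash_point`, `rarity_scores`), the parameters `n_tables` and `n_bits`, the axis-aligned threshold hashing, and the averaged negative-log scoring rule are all hypothetical choices made for this example.

```python
import math
import random
from collections import Counter


def build_hashers(points, n_tables, n_bits, rng):
    """Build n_tables hashers; each is a list of (feature_index, threshold).

    Assumption for this sketch: thresholds are drawn uniformly between the
    observed minimum and maximum of the chosen feature.
    """
    dim = len(points[0])
    lows = [min(p[j] for p in points) for j in range(dim)]
    highs = [max(p[j] for p in points) for j in range(dim)]
    hashers = []
    for _ in range(n_tables):
        hasher = []
        for _ in range(n_bits):
            j = rng.randrange(dim)
            hasher.append((j, rng.uniform(lows[j], highs[j])))
        hashers.append(hasher)
    return hashers


def hash_point(point, hasher):
    """Map a point to a bucket key: one bit per (feature, threshold) test."""
    return tuple(point[j] > t for j, t in hasher)


def rarity_scores(points, n_tables=50, n_bits=4, seed=0):
    """Average negative log bucket occupancy; higher means sparser neighborhood."""
    rng = random.Random(seed)
    hashers = build_hashers(points, n_tables, n_bits, rng)
    n = len(points)
    scores = [0.0] * n
    for hasher in hashers:
        keys = [hash_point(p, hasher) for p in points]
        counts = Counter(keys)  # bucket key -> number of points in bucket
        for i, key in enumerate(keys):
            scores[i] -= math.log(counts[key] / n)
    return [s / n_tables for s in scores]


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Synthetic data: a tight cluster plus one far-away point (index 200).
    cluster = [[data_rng.gauss(0.0, 0.1) for _ in range(5)] for _ in range(200)]
    data = cluster + [[3.0] * 5]
    scores = rarity_scores(data)
    print("highest-scoring index:", scores.index(max(scores)))
```

Points that repeatedly land in near-empty buckets receive high scores, so no pairwise distance computation is needed; the cost grows linearly with the number of points and hash tables.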

