Leveraging I/O Stalls for Efficient Scheduling in ANNS
For ANNS systems handling concurrent queries and updates, LIOS provides a practical method to utilize otherwise wasted I/O stall cycles, significantly improving update throughput with controlled latency impact.
Disk-based ANNS systems waste over 40% of search-thread CPU time stalling on I/O. LIOS exploits these idle cycles to execute index updates, achieving up to 2.68× faster insertions and 2.18× faster deletions while keeping search latency degradation near a user-specified target.
Disk-based graph indexes for approximate nearest neighbor search (ANNS) must serve latency-sensitive queries and throughput-demanding updates concurrently. We observe that over 40% of search-thread CPU time is spent stalling on disk I/O; such idle cycles are invisible to thread-level scheduling yet available for other work. We present LIOS (Leverage I/O Stall), a framework that executes index updates inside search-side I/O stall windows. LIOS introduces three techniques: (i) splitting each update into resumable subtasks small enough to fit within a single stall window; (ii) bounding the expected overrun of update subtasks to a given threshold; and (iii) dynamically adjusting the fraction of idle time devoted to updates to drive end-to-end search latency degradation toward a user-specified target. We integrate LIOS into two update-optimized ANNS systems, FreshDiskANN and OdinANN. LIOS achieves speedups of up to 2.68× in insertion and 2.18× in deletion, with search latency degradation maintained near the user-specified target.
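
To make the scheduling idea concrete, here is a minimal Python sketch of techniques (i) and (iii) under simplifying assumptions. It is not the LIOS implementation: the names `Update`, `fill_stall_window`, and `adjust_share`, the per-subtask cost estimates, and the proportional feedback rule are all hypothetical choices made for illustration. Technique (ii) is reduced here to a strict check that a subtask's estimated cost fits the remaining budget; the expected-overrun bound itself is not modeled.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple


@dataclass
class Update:
    """Hypothetical update split into resumable subtasks.

    step_costs_us holds an assumed cost estimate (microseconds) per subtask;
    next_step records where to resume in a later stall window.
    """
    name: str
    step_costs_us: List[float]
    next_step: int = 0

    @property
    def done(self) -> bool:
        return self.next_step >= len(self.step_costs_us)


def fill_stall_window(
    pending: Deque[Update], stall_us: float, share: float
) -> Tuple[List[Tuple[str, int]], float]:
    """Run subtasks while they fit in share * stall_us (illustrative only).

    Returns the (update name, step index) pairs that ran and the budget used.
    A subtask that does not fit is left for the next window.
    """
    budget = stall_us * share
    used = 0.0
    ran: List[Tuple[str, int]] = []
    while pending:
        upd = pending[0]
        cost = upd.step_costs_us[upd.next_step]
        if used + cost > budget:
            break  # would overrun the window; resume here next time
        used += cost
        ran.append((upd.name, upd.next_step))
        upd.next_step += 1
        if upd.done:
            pending.popleft()
    return ran, used


def adjust_share(
    share: float, observed_degradation: float,
    target_degradation: float, gain: float = 0.5,
) -> float:
    """Assumed proportional controller: raise the share of idle time given to
    updates when search latency degradation is below target, lower it when
    above, and clamp the result to [0, 1]."""
    error = target_degradation - observed_degradation
    return min(1.0, max(0.0, share + gain * error))
```

In this sketch, an update that does not finish within one window keeps its `next_step` position at the head of the queue, so the next stall window resumes it from the first subtask that has not yet run. The share returned by `adjust_share` would be fed back into `fill_stall_window` for subsequent windows.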