Yuanming Ren

2papers

2 Papers

39.3DBMay 19
Leveraging I/O Stalls for Efficient Scheduling in ANNS

Juncheng Zhang, Yuanming Ren, Yongkun Li et al.

Disk-based graph indexes for approximate nearest neighbor search (ANNS) must serve latency-sensitive queries and throughput-demanding updates concurrently. We observe that over 40% of search-thread CPU time is spent stalling on disk I/O; such idle cycles are invisible to thread-level scheduling yet available for other work. We present LIOS(Leverage I/O Stall), a framework that executes index updates inside search-side I/O stall windows. LIOS introduces three techniques: (i) splitting each update into resumable subtasks small enough to fit within a single stall window; (ii) bounding the expected overrun of update subtasks to a given threshold; and (iii) dynamically adjusting the fraction of idle time devoted to updates to drive end-to-end search latency degradation toward a user-specified target. We integrate LIOS into two update-optimized ANNS systems, FreshDiskANN and OdinANN. LIOS achieves speedups of up to 2.68$\times$ in insertion and 2.18$\times$ in deletion, with search latency degradation maintained near the user-specified target.

41.6DBApr 10
Decoupling Vector Data and Index Storage for Space Efficiency

Yuanming Ren, Juncheng Zhang, Yanjing Ren et al.

Managing large-scale vector datasets with disk-based approximate nearest neighbor search (ANNS) systems faces critical efficiency challenges stemming from the co-location of vector data and auxiliary index metadata. Our analysis of state-of-the-art ANNS systems reveals that such co-location incurs substantial storage overhead, generates excessive reads during search queries, and causes severe write amplification during updates. We present DecoupleVS, a decoupled vector storage management framework that enables specialized optimizations for vector data and auxiliary index metadata. DecoupleVS incorporates various design techniques for effective compression, data layouts, search queries, and updates, so as to significantly reduce storage space, while maintaining high search and update performance and high search accuracy. Evaluation on real-world public and proprietary billion-scale datasets shows that DecoupleVS reduces storage space by up to 58.7\%, while delivering competitive or improved search query and update performance, compared to state-of-the-art monolithic disk-based ANNS systems.