Hammer: Towards Efficient Hot-Cold Data Identification via Online Learning
This work addresses storage efficiency in cloud and big data systems, but the contribution is incremental because it builds on existing online learning methods.
The paper tackles the problem of accurately identifying hot and cold data in storage management for big data and cloud computing, reporting 90% classification accuracy with reduced computational and storage overhead.
Efficient management of storage resources in big data and cloud computing environments requires accurately identifying whether data is "hot" or "cold". Traditional methods, such as rule-based algorithms and early AI techniques, often struggle with dynamic workloads, leading to low accuracy, poor adaptability, and high operational overhead. To address these issues, we propose a novel solution based on online learning strategies. Our approach adapts dynamically to changing data access patterns, achieving higher accuracy and lower operational costs. Rigorous testing with both synthetic and real-world datasets demonstrates a significant improvement, with a 90% accuracy rate in hot-cold classification. Additionally, computational and storage overheads are considerably reduced.
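
The abstract does not say which model or features the approach uses, so the following is only a minimal, hypothetical sketch of what online hot-cold classification could look like, not the authors' method. It assumes a logistic regression model updated one observation at a time by stochastic gradient descent. The class name `OnlineHotColdClassifier`, the two features (a normalized recent access rate and a recency score, both in [0, 1]), the learning rate, and the 0.5 threshold are all assumptions made for illustration.

```python
import math


class OnlineHotColdClassifier:
    """Hypothetical online logistic regression for hot/cold labelling.

    Not the method from the paper: a generic sketch of an online learner
    that updates its weights after every labelled observation.
    """

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features  # one weight per feature
        self.b = 0.0                 # bias term
        self.lr = lr                 # learning rate (assumed value)

    def predict_proba(self, x):
        """Return the estimated probability that the item is hot."""
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def is_hot(self, x, threshold=0.5):
        """Classify an item as hot if its probability reaches the threshold."""
        return self.predict_proba(x) >= threshold

    def update(self, x, label):
        """One SGD step on the log loss; label is 1 for hot, 0 for cold."""
        err = self.predict_proba(x) - label
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


# Usage: feed labelled observations as they arrive, then classify.
# Feature vectors are made-up values: [recent_access_rate, recency_score].
clf = OnlineHotColdClassifier(n_features=2)
for _ in range(200):
    clf.update([0.9, 0.8], 1)   # frequently and recently accessed item
    clf.update([0.1, 0.05], 0)  # rarely accessed, stale item

print(clf.is_hot([0.9, 0.8]), clf.is_hot([0.1, 0.05]))
```

Because each update touches only the weight vector, the model needs no stored access history beyond the current features, which is one way an online learner can keep computational and storage overhead low while following shifts in access patterns.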