DB · May 29

Modeling and Optimization for Massive Data Allocation in Database

arXiv:2605.3100262.0
Predicted impact: top 18% in DB · last 90 days
Originality: Incremental advance
AI Analysis

This work is significant for e-commerce and Internet platforms that rely on distributed databases, as it offers an improved data placement scheme to enhance the performance of Online Transaction Processing systems by reducing communication costs and balancing data. This is an incremental improvement over existing methods.

This paper addresses the challenge of data allocation in distributed databases for e-commerce and Internet platforms, aiming to balance data and minimize communication overhead. The authors propose a novel model inspired by normalized cut and solve it using the Bregman proximal gradient method, demonstrating that their algorithm minimizes migration costs and maintains superior balance compared to existing partitioning schemes.

In the era of big data, e-commerce and Internet platforms face the challenge of processing massive amounts of data. Because data are scattered across different machines in a distributed database, gathering the data relevant to a transaction incurs extra communication costs. Without a carefully designed data placement scheme, this cost can severely degrade the performance of Online Transaction Processing systems. To meet industry requirements, algorithms that output a data placement scheme achieving i) data balance and ii) low communication overhead within a fixed period of time are under active investigation. Existing methods do not adequately meet these requirements. In this paper, inspired by the normalized cut of spectral clustering, we introduce a novel model for the data allocation problem. The normalized cut reconciles the inherent conflict between the two objectives. Taking into account the variable characteristics of the model, we formulate the problem as a 0-1 optimization problem and solve the relaxed problem using the Bregman proximal gradient method with guaranteed convergence. Numerical experiments show that the convergent solutions can be smoothly rounded to discrete solutions. Furthermore, our algorithm surpasses both simple and meta-heuristic partitioning schemes, minimizing migration costs while maintaining a superior balance.
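The pipeline the abstract outlines (relax the 0-1 assignment, iterate a Bregman proximal gradient step, round the result) can be illustrated with a minimal sketch. This is not the authors' model or algorithm: the abstract does not give the exact objective, the Bregman kernel, or the step-size rule. The sketch assumes a symmetric co-access weight matrix `W` between data items, the standard k-way normalized-cut objective, a relaxation in which each row of the assignment matrix lies on the probability simplex, and the negative-entropy kernel, under which the Bregman step reduces to a multiplicative (exponentiated-gradient) update. All function names, the fixed step size, and the numerical floor are hypothetical choices made for illustration.

```python
import numpy as np

def ncut_relaxed(W, X):
    """k-way normalized cut, k - sum_j (x_j^T W x_j) / (d^T x_j), for an
    n-by-k assignment matrix X (relaxed or one-hot). Assumes W is symmetric."""
    d = W.sum(axis=1)                          # weighted degree of each item
    assoc = np.einsum("ij,ij->j", X, W @ X)    # x_j^T W x_j per partition
    vol = d @ X                                # d^T x_j per partition
    return X.shape[1] - np.sum(assoc / vol)

def ncut_grad(W, X):
    """Gradient of ncut_relaxed with respect to X (W symmetric)."""
    d = W.sum(axis=1)
    WX = W @ X
    assoc = np.einsum("ij,ij->j", X, WX)
    vol = d @ X
    return -(2.0 * WX / vol - np.outer(d, assoc / vol**2))

def bregman_prox_grad(W, k, step=1.0, iters=200, seed=0):
    """Entropy-kernel Bregman proximal gradient on the row-wise simplex.
    The fixed step size is an assumption; no line search is performed."""
    n = W.shape[0]
    rng = np.random.default_rng(seed)
    # Start near the uniform assignment; the small noise breaks symmetry.
    X = 1.0 / k + 0.01 * rng.standard_normal((n, k))
    X /= X.sum(axis=1, keepdims=True)
    for _ in range(iters):
        G = ncut_grad(W, X)
        G -= G.min(axis=1, keepdims=True)      # row shift; cancels on renormalizing
        X = X * np.exp(-step * G)              # closed-form Bregman (KL) step
        X = np.maximum(X, 1e-12)               # numerical floor (illustrative safeguard)
        X /= X.sum(axis=1, keepdims=True)      # back onto the simplex
    return X

def round_assignment(X):
    """Round the relaxed solution: each item goes to its largest coordinate."""
    return X.argmax(axis=1)
```

Normalizing each partition's cut by its volume is what couples the two goals in this sketch: a partition that absorbs too few or too many items pays through the denominator, while cross-partition weight pays through the numerator.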

