PL · AR · Apr 9

PG-MDP: Profile-Guided Memory Dependence Prediction for Area-Constrained Cores

arXiv:2604.08445 · 46.3
Predicted impact: top 26% in PL · last 90 days
Originality: Incremental advance
AI Analysis

This work targets performance bottlenecks in area-constrained cores, such as those in energy-efficient or edge systems, and offers an incremental improvement over existing methods.

The paper tackles the high rate of false dependencies in memory dependence prediction (MDP) on area-constrained cores. It proposes a profile-guided software co-design that labels consistently memory-independent loads so they can be removed from the MDP working set. On SPEC2017 CPU intspeed, it reduces MDP queries by 79% and false dependencies by 77%, and improves geomean IPC by 1.47% on a small simulated core, which is within 0.5% of a predictor with 16x the entries, at no area cost.
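The profile-guided labeling step can be illustrated with a minimal sketch. It assumes a profiling run yields `(load_pc, had_store_dependence)` pairs and treats "consistently memory-independent" as "never observed to depend on a store"; the function name, the profile format, and the `min_executions` filter are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def label_independent_loads(profile, min_executions=1):
    """Return the load PCs that never showed a store dependence in the profile.

    profile: iterable of (load_pc, had_store_dependence) pairs from a
    hypothetical profiling run. min_executions is an assumed filter that
    ignores loads seen too rarely to judge; it is not from the paper.
    """
    executions = defaultdict(int)
    dependences = defaultdict(int)
    for load_pc, had_dependence in profile:
        executions[load_pc] += 1
        if had_dependence:
            dependences[load_pc] += 1
    return {
        pc
        for pc, count in executions.items()
        if count >= min_executions and dependences[pc] == 0
    }


# Illustrative profile: 0x400 is always independent, 0x408 sometimes
# depends on a store, and 0x410 is independent but executes only once.
profile = [
    (0x400, False), (0x400, False), (0x400, False),
    (0x408, True), (0x408, False),
    (0x410, False),
]
independent = label_independent_loads(profile, min_executions=2)
```

With `min_executions=2`, only `0x400` is labeled. A real flow would then have to encode the label in the load's opcode, which this sketch does not model.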

Memory Dependence Prediction (MDP) is a speculative technique to determine which stores, if any, a given load will depend on. Area-constrained cores are increasingly relevant in applications such as energy-efficient or edge systems, and often have limited space for MDP tables. This leads to a high rate of false dependencies, as memory-independent loads alias with unrelated predictor entries and cause unnecessary pipeline stalls. The conventional remedy is a larger or more complex predictor, but this is unattractive on area-constrained cores. This paper proposes that shrinking the predictor working set is as effective as growing the predictor, and can deliver performance competitive with large predictors while still using very small ones. It introduces profile-guided memory dependence prediction (PG-MDP), a software co-design that labels consistently memory-independent loads via their opcode and removes them from the MDP working set. These loads bypass the MDP query at dispatch and always issue as soon as possible. Across SPEC2017 CPU intspeed, PG-MDP reduces the rate of MDP queries by 79% and false dependencies by 77%, and improves geomean IPC for a small simulated core by 1.47% (to within 0.5% of using 16x the predictor entries), with no area cost and no additional instruction bandwidth.
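The aliasing effect described in the abstract can be shown with a toy model. The sketch below assumes a direct-mapped table of one-bit "predict dependent" entries indexed by `(pc >> 2) % table_entries`. The indexing, the training rule, and the `simulate` function are illustrative assumptions, not the predictor evaluated in the paper.

```python
def simulate(trace, table_entries, labeled_independent=frozenset()):
    """Count MDP queries and false dependencies for a toy predictor.

    trace: list of (load_pc, truly_dependent) pairs.
    labeled_independent: load PCs that skip the predictor entirely,
    standing in for loads labeled via their opcode.
    """
    table = [False] * table_entries  # one "predict dependent" bit per entry
    queries = 0
    false_dependencies = 0
    for load_pc, truly_dependent in trace:
        if load_pc in labeled_independent:
            continue  # labeled load: no query, issues as soon as possible
        queries += 1
        index = (load_pc >> 2) % table_entries  # assumed index function
        if table[index] and not truly_dependent:
            false_dependencies += 1  # stalled because of an aliased entry
        if truly_dependent:
            table[index] = True  # train the entry on a real dependence
    return queries, false_dependencies


# 0x100 truly depends on a store; 0x110 never does, but both map to
# entry 0 of a 4-entry table.
trace = [(0x100, True), (0x110, False)] * 3

small = simulate(trace, table_entries=4)
large = simulate(trace, table_entries=16)
labeled = simulate(trace, table_entries=4, labeled_independent={0x110})
```

In this toy trace, the 4-entry table sees 6 queries and 3 false dependencies. Moving to 16 entries removes the aliasing, and labeling `0x110` removes it on the 4-entry table while also halving the queries. The model does not cover a labeled load that turns out to depend on a store.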

