ARMay 24, 2024Code
Full-stack evaluation of Machine Learning inference workloads for RISC-V systemsDebjyoti Bhattacharjee, Anmol, Tommaso Marinelli et al.
Architectural simulators hold a vital role in RISC-V research, providing a crucial platform for workload evaluation without the need for costly physical prototypes. They serve as a dynamic environment for exploring innovative architectural concepts, enabling swift iteration and thorough analysis of performance metrics. As deep learning algorithms become increasingly pervasive, it is essential to benchmark new architectures with machine learning workloads. The diverse computational kernels used in deep learning algorithms highlight the necessity for a comprehensive compilation toolchain to map to target hardware platforms. This study evaluates the performance of a wide array of machine learning workloads on RISC-V architectures using gem5, an open-source architectural simulator. Leveraging an open-source compilation toolchain based on Multi-Level Intermediate Representation (MLIR), the research presents benchmarking results specifically focused on deep learning inference workloads. Additionally, the study sheds light on current limitations of gem5 when simulating RISC-V architectures, offering insights for future development and refinement.
52.9ARMar 31
CXLRAMSim v1.0: System-Level Exploration of CXL Memory Expander CardsKaran Pathak, David Atienza, Marina Zapater
The growing demands in the training and inference of Large Language Models (LLMs) are accelerating the adoption of scale-up systems that extend server shared memory through the use of Compute Express Link (CXL)-based load/store interconnects. Accurate full-system simulation of such architectures remains challenging, as existing tools (all very recent) rely on simplified or non-compliant architectural models, impacting accuracy and usability. We present CXLRAMSim, the first gem5-integrated, full-system simulator that models CXL devices at their correct position on the I/O bus, enabling the use of unmodified Linux kernels and software stack, realistic latency-bandwidth behavior and true interleaving with system DRAM. Our approach provides high-fidelity CXL.mem characterization and captures key challenges such as cache pollution when accessing CXL memory.
LGApr 8, 2023
Pump It Up: Predict Water Pump Status using Attentive Tabular LearningKaran Pathak, L Shalini
Water crisis is a crucial concern around the globe. Appropriate and timely maintenance of water pumps in drought-hit countries is vital for communities relying on the well. In this paper, we analyze and apply a sequential attentive deep neural architecture, TabNet, for predicting water pump repair status in Tanzania. The model combines the valuable benefits of tree-based algorithms and neural networks, enabling end-to-end training, model interpretability, sparse feature selection, and efficient learning on tabular data. Finally, we compare the performance of TabNet with popular gradient tree-boosting algorithms like XGBoost, LightGBM,CatBoost, and demonstrate how we can further uplift the performance by choosing focal loss as the objective function while training on imbalanced data.