Alex K. Jones

AR
h-index: 15
5 papers
25 citations
Novelty: 52%
AI Score: 44

5 Papers

38.7 DC May 26
Advancing Environmental Sustainability in Data Centers via Carbon Depreciation Models

Shixin Ji, Zhuoping Yang, Xingzhen Chen et al.

Recent improvements in energy efficiency and renewable energy integration have increased the relative importance of embodied carbon in data centers, motivating improved provisioning strategies. Conventional approaches primarily minimize operational energy, but this perspective is increasingly insufficient for sustainability. In this paper, we propose carbon depreciation models to encourage longer hardware lifetimes. Carbon depreciation assigns a larger portion of embodied carbon to newly provisioned servers, discouraging unnecessary deployment of new hardware. As a result, new servers are provisioned mainly for jobs with strict quality-of-service (QoS) constraints, while older servers, whose embodied carbon has largely been recovered, are used for other workloads. We further argue that both embodied carbon and operational carbon from server idle time should be recovered during active jobs, encouraging provisioning strategies that maintain high utilization. We show that prior carbon accounting strategies can be counterproductive: under a greedy scheduler minimizing carbon under QoS constraints, jobs are priced 25% cheaper on new hardware than on older hardware. In contrast, our approach uses a greedy scheduler that prioritizes older hardware through non-linear carbon depreciation, promoting sustainable provisioning. Experimental results show carbon reductions of 28-57%, depending on server lifetime assumptions.
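As a rough illustration of the idea in the abstract above, the sketch below compares straight-line amortization of embodied carbon with a declining-balance schedule that front-loads the charge onto a server's first years. The declining-balance form, the 1000 kg embodied-carbon figure, the 5-year lifetime, and the 50% rate are all assumptions chosen for illustration; the abstract does not specify the paper's actual depreciation model.

```python
def straight_line(embodied_kg, lifetime_years):
    """Charge the same share of embodied carbon to every year of service."""
    return [embodied_kg / lifetime_years] * lifetime_years


def declining_balance(embodied_kg, lifetime_years, rate=0.5):
    """Charge a fixed fraction of the remaining embodied carbon each year.

    The final year absorbs whatever is left, so the schedule always sums
    to the full embodied carbon. The rate is a hypothetical parameter.
    """
    remaining = embodied_kg
    charges = []
    for year in range(lifetime_years):
        is_last_year = year == lifetime_years - 1
        charge = remaining if is_last_year else remaining * rate
        charges.append(charge)
        remaining -= charge
    return charges


linear = straight_line(1000.0, 5)
nonlinear = declining_balance(1000.0, 5)
for year, (flat, front_loaded) in enumerate(zip(linear, nonlinear), start=1):
    print(f"year {year}: straight-line {flat:6.1f} kg, "
          f"declining-balance {front_loaded:6.1f} kg")
```

Under a front-loaded schedule like this one, a job placed on an older server carries a smaller embodied-carbon charge than the same job on a new server, which is the effect that steers a carbon-minimizing scheduler toward older hardware.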

AR Jul 4, 2022
Sustainable AI Processing at the Edge

Sébastien Ollivier, Sheng Li, Yue Tang et al.

Edge computing is a popular target for accelerating machine learning algorithms supporting mobile devices without incurring the communication latency of handling them in the cloud. Edge deployments of machine learning primarily consider traditional concerns such as SWaP (Size, Weight, and Power) constraints for their installations. However, such metrics are not entirely sufficient to consider environmental impacts from computing given the significant contributions from embodied energy and carbon. In this paper, we explore the tradeoffs of convolutional neural network acceleration engines for both inference and on-line training. In particular, we explore the use of processing-in-memory (PIM) approaches, mobile GPU accelerators, and recently released FPGAs, and compare them with novel Racetrack memory PIM. Replacing PIM-enabled DDR3 with Racetrack memory PIM can recover its embodied energy in as little as 1 year. For high activity ratios, mobile GPUs can be more sustainable but have higher embodied energy to overcome compared to PIM-enabled Racetrack memory.
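The notion of "recovering" embodied energy in the abstract above can be read as a break-even calculation: the embodied energy of the replacement device divided by the operational energy it saves per year. The sketch below is a simplified model under that reading; the function name, the parameters, and the choice to ignore idle power are assumptions, and the example numbers are hypothetical rather than taken from the paper.

```python
def payback_years(embodied_mj, old_power_w, new_power_w, activity_ratio):
    """Years until operational savings repay the new device's embodied energy.

    activity_ratio is the fraction of time the device is active (0 to 1).
    Idle power is ignored in this simplified model.
    """
    saved_w = (old_power_w - new_power_w) * activity_ratio
    if saved_w <= 0:
        return float("inf")  # the replacement never pays back
    seconds_per_year = 365 * 24 * 3600
    saved_mj_per_year = saved_w * seconds_per_year / 1e6  # joules to MJ
    return embodied_mj / saved_mj_per_year


# Hypothetical device: 100 MJ embodied, saving 4 W while active half the time.
print(f"{payback_years(100.0, 5.0, 1.0, 0.5):.2f} years")
```

A higher activity ratio shortens the payback time, which is consistent with the abstract's point that the more sustainable choice depends on how heavily the device is used.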

14.3 AR Apr 7
PHAROS: Pipelined Heterogeneous Accelerators for Real-time Safety-critical Systems With Deadline Compliance

Shixin Ji, Jinming Zhuang, Sarah Schultz et al.

Spatially partitioned heterogeneous accelerators (HAs) are increasingly adopted in embedded systems for their performance and flexibility. Yet most existing HA design frameworks optimize primarily for throughput or quality-of-service (QoS) metrics. They often overlook safety-critical real-time requirements, including hardware support for predictable execution, real-time-aware design space exploration (DSE), and rigorous schedulability analysis. These requirements are essential in safety-critical applications such as smart transportation, where schedulability guarantees directly affect system safety. To address this gap, we present PHAROS, a real-time-centric HA design framework. PHAROS introduces preemption mechanisms and scheduler designs for spatially partitioned HAs under first-in-first-out (FIFO) and earliest-deadline-first (EDF) policies. Leveraging modern real-time theory, we further develop a soft real-time (SRT) schedulability-oriented DSE with objectives and constraints tailored to SRT schedulability. Through comprehensive modeling, analysis, and evaluation across diverse applications, we show that PHAROS's DSE discovers more feasible configurations for a broader range of task sets than throughput-oriented DSE baselines while delivering improved real-time performance. We also provide response-time analyses for the supported scheduling algorithms.
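The two scheduling policies named in the abstract above differ only in how the next ready job is chosen. The sketch below shows that difference in isolation; the `Job` type, the job names, and the timing values are hypothetical, and it does not model PHAROS's preemption mechanisms or its accelerator partitions.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    arrival: float   # time the job became ready
    deadline: float  # absolute deadline


def pick_fifo(ready):
    """First-in-first-out: run the job that arrived earliest."""
    return min(ready, key=lambda job: job.arrival)


def pick_edf(ready):
    """Earliest-deadline-first: run the job whose deadline is nearest."""
    return min(ready, key=lambda job: job.deadline)


ready = [
    Job("camera", arrival=0.0, deadline=30.0),
    Job("lidar", arrival=1.0, deadline=10.0),
    Job("planner", arrival=2.0, deadline=20.0),
]
print("FIFO picks:", pick_fifo(ready).name)
print("EDF picks:", pick_edf(ready).name)
```

With these values FIFO selects the earliest arrival while EDF selects the job with the tightest deadline, so the same ready queue can produce different execution orders and different response times.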

LG Jan 30, 2024
EdgeOL: Efficient in-situ Online Learning on Edge Devices

Sheng Li, Geng Yuan, Yue Dai et al.

Emerging applications, such as robot-assisted eldercare and object recognition, generally employ deep neural networks (DNNs) and naturally require: i) handling streaming-in inference requests and ii) adapting to possible deployment scenario changes. Online model fine-tuning is widely adopted to satisfy these needs. However, an inappropriate fine-tuning scheme could involve significant energy consumption, making it challenging to deploy on edge devices. In this paper, we propose EdgeOL, an edge online learning framework that optimizes inference accuracy, fine-tuning execution time, and energy efficiency through both inter-tuning and intra-tuning optimizations. Experimental results show that, on average, EdgeOL reduces overall fine-tuning execution time by 64% and energy consumption by 52%, and improves average inference accuracy by 1.75% over the immediate online learning strategy.

LG Nov 3, 2021
Brain-inspired Cognition in Next Generation Racetrack Memories

Asif Ali Khan, Sebastien Ollivier, Stephen Longofono et al.

Hyperdimensional computing (HDC) is an emerging computational framework inspired by the brain that operates on vectors with thousands of dimensions to emulate cognition. Unlike conventional computational frameworks that operate on numbers, HDC, like the brain, uses high-dimensional random vectors and is capable of one-shot learning. HDC is based on a well-defined set of arithmetic operations and is highly error-resilient. The core operations of HDC manipulate HD vectors in bulk bit-wise fashion, offering many opportunities to leverage parallelism. Unfortunately, on conventional von Neumann architectures, the continuous movement of HD vectors between the processor and the memory can make the cognition task prohibitively slow and energy-intensive. Hardware accelerators only marginally improve related metrics. In contrast, in-memory implementations using emerging memristive devices have reported considerable performance/energy gains, but they implement the HDC framework only partially. This paper presents an architecture based on racetrack memory (RTM) to conduct and accelerate the entire HDC framework within the memory. The proposed solution requires minimal additional CMOS circuitry and uses a read operation across multiple domains in RTMs called transverse read (TR) to realize exclusive-or (XOR) and addition operations. To minimize the overhead of the CMOS circuitry, we propose an RTM nanowire-based counting mechanism that leverages the TR operation and the standard RTM operations. Using language recognition as the use case, the proposed system demonstrates 7.8x and 5.3x reductions in overall runtime and energy consumption, respectively, compared to the FPGA design. Compared to the state-of-the-art in-memory implementation, the proposed HDC system reduces the energy consumption by 8.6x.
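The bulk bit-wise operations the abstract above refers to can be sketched in software. The example below is a minimal illustration, assuming binary hypervectors stored as Python integers, XOR for binding, and a per-bit majority vote (a count followed by a threshold) for bundling. The dimensionality, the function names, and the majority rule are assumptions for illustration and say nothing about how the RTM hardware realizes these steps.

```python
import random

DIM = 10_000  # hypervector dimensionality (assumed for illustration)


def random_hv(rng):
    """Draw a random binary hypervector, stored as a DIM-bit integer."""
    return rng.getrandbits(DIM)


def bind(a, b):
    """Bind two hypervectors with bit-wise XOR; XOR is its own inverse."""
    return a ^ b


def bundle(vectors):
    """Bundle hypervectors with a per-bit majority vote.

    Each bit position is counted across all inputs (the addition step),
    then thresholded at half the number of inputs.
    """
    threshold = len(vectors) / 2
    result = 0
    for bit in range(DIM):
        ones = sum((v >> bit) & 1 for v in vectors)
        if ones > threshold:
            result |= 1 << bit
    return result


def hamming(a, b):
    """Number of bit positions in which two hypervectors differ."""
    return bin(a ^ b).count("1")


rng = random.Random(0)
a, b, c, unrelated = (random_hv(rng) for _ in range(4))
bundled = bundle([a, b, c])
print("binding is reversible:", bind(bind(a, b), b) == a)
print("distance to a bundled member:", hamming(bundled, a))
print("distance to an unrelated vector:", hamming(bundled, unrelated))
```

The bundled vector stays closer in Hamming distance to each of its members than to an unrelated random vector, which is the property that similarity-based classification such as language recognition relies on.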