CV May 19, 2025
MAGI-1: Autoregressive Video Generation at Scale
Sand. ai, Hansi Teng, Hongyu Jia et al.
We present MAGI-1, a world model that generates videos by autoregressively predicting a sequence of video chunks, defined as fixed-length segments of consecutive frames. Trained to denoise per-chunk noise that increases monotonically over time, MAGI-1 enables causal temporal modeling and naturally supports streaming generation. It achieves strong performance on image-to-video (I2V) tasks conditioned on text instructions, providing high temporal consistency and scalability, which are made possible by several algorithmic innovations and a dedicated infrastructure stack. MAGI-1 facilitates controllable generation via chunk-wise prompting and supports real-time, memory-efficient deployment by maintaining constant peak inference cost, regardless of video length. The largest variant of MAGI-1 comprises 24 billion parameters and supports context lengths of up to 4 million tokens, demonstrating the scalability and robustness of our approach. The code and models are available at https://github.com/SandAI-org/MAGI-1 and https://github.com/SandAI-org/MagiAttention. The product can be accessed at https://sand.ai.
CE May 6
How Do Ice Shelves Calve? Peridynamic Modeling of Ice Shelf Fracture Driven by Wave Erosion, Basal Melting, and Buoyancy Flexure
Ying Song, Xuan Hu, Jingrui Xu et al.
An ice shelf is a floating extension of a land-based ice sheet into the ocean. It plays a crucial role in slowing down the flow of land ice into the sea, thus stabilizing the ice sheet. However, this stabilizing effect can be weakened by ice calving, a process in which large fragments of ice detach from the ice shelf. Although ice calving is widely acknowledged as a major contributor to ice mass loss, and its frequency and magnitude are highly sensitive to environmental forcing, the underlying physical mechanisms remain poorly understood, particularly under ocean wave action. In this context, we developed a nonlocal peridynamics (PD) framework to model the ice calving process subjected to wave-induced frontal erosion. The proposed physics-based PD framework enables investigation of the coupled effects of self-weight bending, buoyancy-induced foot loosening, and the ice calving process. To the authors' best knowledge, this work represents the first attempt to employ a physics-based peridynamics framework for simulating ice calving processes. Compared with conventional finite element methods (FEM), the PD framework naturally captures crack initiation, interaction, and propagation without the need for special numerical treatments, thereby providing a robust tool for simulating fracture phenomena under large deformations and long-term environmental loading. To quantitatively resolve fracture processes, we implemented a static first Piola-Kirchhoff virial stress formulation within the PD framework, allowing direct evaluation of stress concentration and energy release at evolving crack tips. Subsequently, the model is rigorously validated through one-to-one comparisons with finite-element stress fields, analytical beam-theory solutions, and recent field observations of wave-driven ice-shelf failure reported by Sartore et al. (2025).
RO Mar 9, 2025
AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
AgiBot-World-Contributors, Qingwen Bu, Jisong Cai et al.
We explore how scalable robot data can address real-world challenges for generalized robotic manipulation. Introducing AgiBot World, a large-scale platform comprising over 1 million trajectories across 217 tasks in five deployment scenarios, we achieve an order-of-magnitude increase in data scale compared to existing datasets. Accelerated by a standardized collection pipeline with human-in-the-loop verification, AgiBot World guarantees high-quality and diverse data distribution. It is extensible from grippers to dexterous hands and visuo-tactile sensors for fine-grained skill acquisition. Building on this data, we introduce Genie Operator-1 (GO-1), a novel generalist policy that leverages latent action representations to maximize data utilization, demonstrating predictable performance scaling with increased data volume. Policies pre-trained on our dataset achieve an average performance improvement of 30% over those trained on Open X-Embodiment, in both in-domain and out-of-distribution scenarios. GO-1 exhibits exceptional capability in real-world dexterous and long-horizon tasks, achieving over 60% success rate on complex tasks and outperforming the prior RDT approach by 32%. By open-sourcing the dataset, tools, and models, we aim to democratize access to large-scale, high-quality robot data, advancing the pursuit of scalable and general-purpose intelligence.
COMP-PH Oct 29, 2024
A Message Passing Neural Network Surrogate Model for Bond-Associated Peridynamic Material Correspondence Formulation
Xuan Hu, Qijun Chen, Nicholas H. Luo et al.
Peridynamics is a non-local continuum mechanics theory that offers unique advantages for modeling problems involving discontinuities and complex deformations. Within the peridynamic framework, various formulations exist, among which the material correspondence formulation stands out for its ability to directly incorporate traditional continuum material models, making it highly applicable to a range of engineering challenges. A notable advancement in this area is the bond-associated correspondence model, which not only resolves issues of material instability but also achieves high computational accuracy. However, the bond-associated model typically incurs a higher computational cost than finite element analysis (FEA), which can limit its practical application. To address this computational challenge, we propose a novel surrogate model based on a message-passing neural network (MPNN) specifically designed for the bond-associated peridynamic material correspondence formulation. Leveraging the similarities between graph structure and the neighborhood connectivity inherent to peridynamics, we construct an MPNN that transfers domain knowledge from peridynamics into a computational graph and shortens the computation time via GPU acceleration. Unlike conventional graph neural networks that focus on node features, our model emphasizes edge-based features, capturing the essential material point interactions in the formulation. A key advantage of this neural network approach is its flexibility: it does not require fixed neighborhood connectivity, making it adaptable across diverse configurations and scalable for complex systems. Furthermore, the model inherently possesses translational and rotational invariance, enabling it to maintain physical objectivity, a critical requirement for accurate mechanical modeling.
NE Jul 5, 2021
High-Speed CMOS-Free Purely Spintronic Asynchronous Recurrent Neural Network
Pranav O. Mathews, Christian B. Duffee, Abel Thayil et al.
Neuromorphic computing systems overcome the limitations of traditional von Neumann computing architectures. These computing systems can be further improved upon by using emerging technologies that are more efficient than CMOS for neural computation. Recent research has demonstrated that memristors and spintronic devices in various neural network designs boost efficiency and speed. This paper presents a biologically inspired, fully spintronic neuron used in a fully spintronic Hopfield RNN. The network is used to solve tasks, and the results are compared against those of current Hopfield neuromorphic architectures that use emerging technologies.
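For readers unfamiliar with Hopfield networks, the following is a minimal software sketch of asynchronous recall in a generic Hopfield RNN. It is not the spintronic design from the paper: the Hebbian training rule, the 8-neuron size, the stored patterns, and the function names `train_hopfield` and `recall` are all assumptions chosen for illustration.

```python
def train_hopfield(patterns):
    """Build Hebbian weights for bipolar (+1/-1) patterns, with a zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def recall(w, state, max_sweeps=10):
    """Asynchronous recall: update one neuron at a time until no neuron changes."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            # Local field seen by neuron i, using the already-updated neighbors.
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            new = 1 if field >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s


# Two orthogonal patterns (hypothetical data, chosen only for this sketch).
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train_hopfield(stored)

# The first pattern with its first bit flipped.
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]
recovered = recall(weights, noisy)
print(recovered == stored[0])
```

Each neuron is updated in turn using the latest state of the others, which is what distinguishes asynchronous recall from a synchronous update of the whole state vector.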
NE Mar 16, 2021
Passive frustrated nanomagnet reservoir computing
Alexander J. Edwards, Dhritiman Bhattacharya, Peng Zhou et al.
Reservoir computing (RC) has received recent interest because reservoir weights do not need to be trained, enabling extremely low-resource consumption implementations, which could have a transformative impact on edge computing and in-situ learning where resources are severely constrained. Ideally, a natural hardware reservoir should be passive, minimal, expressive, and feasible; to date, proposed hardware reservoirs have had difficulty meeting all of these criteria. We therefore propose a reservoir that meets all of these criteria by leveraging the passive interactions of dipole-coupled, frustrated nanomagnets. The frustration significantly increases the number of stable reservoir states, enriching reservoir dynamics, and as such these frustrated nanomagnets fulfill all of the criteria for a natural hardware reservoir. We likewise propose a complete frustrated nanomagnet reservoir computing (NMRC) system with low-power complementary metal-oxide semiconductor (CMOS) circuitry to interface with the reservoir, and initial experimental results demonstrate the reservoir's feasibility. The reservoir is verified with micromagnetic simulations on three separate tasks, demonstrating expressivity. Compared with a CMOS echo-state network (ESN), the proposed system reduces overall resource use by a factor of over 10,000,000, showing that because NMRC is naturally passive and minimal, it has the potential to be extremely resource efficient.
NE Nov 11, 2020
Domain Wall Leaky Integrate-and-Fire Neurons with Shape-Based Configurable Activation Functions
Wesley H. Brigner, Naimul Hassan, Xuan Hu et al.
Complementary metal oxide semiconductor (CMOS) devices display volatile characteristics and are not well suited for analog applications such as neuromorphic computing. Spintronic devices, on the other hand, exhibit both non-volatile and analog features, which are well suited to neuromorphic computing. Consequently, these novel devices are at the forefront of beyond-CMOS artificial intelligence applications. However, many of these artificial neuromorphic devices still require the use of CMOS, which decreases the efficiency of the system. To resolve this, we have previously proposed a number of artificial neurons and synapses that do not require CMOS for operation. Although these devices are a significant improvement over previous renditions, their ability to enable neural network learning and recognition is limited by their intrinsic activation functions. This work proposes modifications to these spintronic neurons that enable configuration of the activation functions through control of the shape of a magnetic domain wall track. Linear and sigmoidal activation functions are demonstrated in this work, which can be extended through a similar approach to enable a wide variety of activation functions.
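As background for the leaky integrate-and-fire (LIF) behavior named in the title, here is a minimal discrete-time sketch of a generic LIF neuron. It does not model the domain-wall device or its shape-based activation functions; the function name `simulate_lif` and the `leak`, `threshold`, and `reset` values are assumptions picked for illustration.

```python
def simulate_lif(inputs, leak=0.9, threshold=1.0, reset=0.0):
    """Simulate a generic leaky integrate-and-fire neuron.

    At each step the stored potential decays by the factor `leak`, the input is
    added (integrate), and a spike is emitted when the potential reaches
    `threshold` (fire), after which the potential returns to `reset`.
    """
    potential = reset
    spikes = []
    for current in inputs:
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = reset
        else:
            spikes.append(0)
    return spikes


# A weak constant input leaks away before reaching threshold: no spikes.
weak = simulate_lif([0.05] * 50)
# A stronger constant input accumulates faster than it leaks: periodic spikes.
strong = simulate_lif([0.4] * 50)
print(sum(weak), sum(strong))
```

With these assumed parameters the weak input settles near a potential of 0.5 and never fires, while the strong input crosses the threshold on every third step.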
NE Feb 3, 2020
CMOS-Free Multilayer Perceptron Enabled by Four-Terminal MTJ Device
Wesley H. Brigner, Naimul Hassan, Xuan Hu et al.
Neuromorphic computing promises revolutionary improvements over conventional systems for applications that process unstructured information. To fully realize this potential, neuromorphic systems should exploit the biomimetic behavior of emerging nanodevices. In particular, exceptional opportunities are provided by the non-volatility and analog capabilities of spintronic devices. While spintronic devices have previously been proposed that emulate neurons and synapses, complementary metal-oxide-semiconductor (CMOS) devices are required to implement multilayer spintronic perceptron crossbars. This work therefore proposes a new spintronic neuron that enables purely spintronic multilayer perceptrons, eliminating the need for CMOS circuitry and simplifying fabrication.
ET Dec 9, 2019
Exploiting Dual-Gate Ambipolar CNFETs for Scalable Machine Learning Classification
Farid Kenarangi, Xuan Hu, Yihan Liu et al.
Ambipolar carbon nanotube based field-effect transistors (AP-CNFETs) exhibit unique electrical characteristics, such as tri-state operation and bi-directionality, enabling systems with complex and reconfigurable computing. In this paper, AP-CNFETs are used to design a mixed-signal machine learning (ML) classifier. The classifier is designed in SPICE with a feature size of 15 nm and operates at 250 MHz. The system is demonstrated on the MNIST digit dataset, yielding 90% accuracy with no accuracy degradation compared with classification of this dataset in Python. The system also exhibits lower power consumption and smaller physical size than state-of-the-art CMOS and memristor-based mixed-signal classifiers.