AI | May 26 | Code
Laguna M.1/XS.2 Technical Report
Julien Abadji, Marah Abdin, Connor Adams et al.
We present Laguna M.1 and Laguna XS.2, two Mixture-of-Experts foundation models built for long-horizon, agentic coding: M.1 has $225.8$B total parameters ($23.4$B activated per token) and XS.2 has $33.4$B total ($3$B activated). Both models were trained from scratch, end-to-end, inside the same internal system, which we refer to as our Model Factory: a tightly integrated stack of versioned data, training, evaluation, and inference components that turns model development into an industrial process. We describe the principles and design choices of the Model Factory and detail the end-to-end training process of our models, covering pre-training data and architecture, post-training stages, evaluation, and quantization. On agentic software engineering and terminal benchmarks (SWE-bench Verified, SWE-bench Multilingual, SWE-Bench Pro, and Terminal-Bench 2.0), M.1 and XS.2 are competitive with state-of-the-art open models in their respective weight classes. Laguna XS.2 weights are released under Apache 2.0 at https://huggingface.co/collections/poolside/laguna-xs2.
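To make the sparsity of the two models concrete, the short sketch below computes the fraction of parameters activated per token from the figures quoted in the abstract. The dictionary layout and the helper name `active_fraction` are illustrative assumptions, not anything from the report.

```python
# Parameter counts in billions, copied from the abstract above.
# The structure and names here are hypothetical, for illustration only.
MODELS = {
    "Laguna M.1": {"total_b": 225.8, "active_b": 23.4},
    "Laguna XS.2": {"total_b": 33.4, "active_b": 3.0},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the total parameters that is activated for each token."""
    return active_b / total_b


for name, params in MODELS.items():
    share = active_fraction(**params)
    print(f"{name}: {share:.1%} of parameters active per token")
```

Under these figures, both models activate roughly a tenth of their parameters per token.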
IR | Jul 7, 2025 | Code
Hierarchical Intent-guided Optimization with Pluggable LLM-Driven Semantics for Session-based Recommendation
Jinpeng Chen, Jianxiang He, Huan Li et al.
Session-based Recommendation (SBR) aims to predict the next item a user will likely engage with, using their interaction sequence within an anonymous session. Existing SBR models often focus only on single-session information, ignoring inter-session relationships and valuable cross-session insights. Some methods try to include inter-session data but struggle with noise and irrelevant information, reducing performance. Additionally, most models rely on item ID co-occurrence and overlook rich semantic details, limiting their ability to capture fine-grained item features. To address these challenges, we propose a novel hierarchical intent-guided optimization approach with pluggable LLM-driven semantic learning for session-based recommendation, called HIPHOP. First, we introduce a pluggable embedding module based on large language models (LLMs) to generate high-quality semantic representations, enhancing item embeddings. Second, HIPHOP utilizes graph neural networks (GNNs) to model item transition relationships and incorporates a dynamic multi-intent capturing module to address users' diverse interests within a session. Third, we design a hierarchical inter-session similarity learning module, guided by user intent, to capture global and local session relationships, effectively exploring users' long-term and short-term interests. To mitigate noise, an intent-guided denoising strategy is applied during inter-session learning. Finally, we enhance the model's discriminative capability by using contrastive learning to optimize session representations. Experiments on multiple datasets show that HIPHOP significantly outperforms existing methods, demonstrating its effectiveness in improving recommendation quality. Our code is available at https://github.com/hjx159/HIPHOP.
NA | Apr 30
Parameterization-driven arbitrary Lagrangian-Eulerian method for large-deformation isogeometric fluid-structure interaction
Jingya Li, Ye Ji, Hugo Verhelst et al.
Body-fitted arbitrary Lagrangian-Eulerian (ALE) methods provide a sharp representation of the fluid-structure interface but rely on mesh-update strategies that incrementally deform a reference configuration. To address this issue, we reformulate the ALE mesh-motion problem in the isogeometric setting as a sequence of independent domain parameterization problems. At each time step, a multi-patch spline parameterization of the fluid domain is constructed from the current interface geometry. Three technical components realize this framework: (i) a barrier-function-based spline parameterization that enforces a strictly positive Jacobian at every time step; (ii) a tangential-slip reparameterization that handles unbounded cumulative rotations of closed domains, where no fixed boundary-to-parameter correspondence is admissible; and (iii) a constant-preserving quasi-interpolation operator for solution transfer between consecutive parameterizations, ensuring that the discrete geometric conservation law holds algebraically. We validate the method on three two-dimensional FSI benchmarks, covering standard and large-rotation regimes, and on a three-dimensional rotor problem. On a rotating-square benchmark, the tangential-slip strategy enables simulations under sustained rotation far beyond the range accessible to classical mesh-update schemes--a regime that is fundamentally inaccessible to any mesh-deformation formulation, not merely numerically difficult. A three-dimensional rotor example further demonstrates that the framework extends naturally to volumetric spline parameterizations. Finally, we show that the per-step spline parameterizations can be used directly within a standard finite element solver.
CV | Nov 28, 2014
Cross-Modal Learning via Pairwise Constraints
Ran He, Man Zhang, Liang Wang et al.
In multimedia applications, the text and image components in a web document form a pairwise constraint that potentially indicates the same semantic concept. This paper studies cross-modal learning via the pairwise constraint, and aims to find the common structure hidden in different modalities. We first propose a compound regularization framework to deal with the pairwise constraint, which can be used as a general platform for developing cross-modal algorithms. For unsupervised learning, we propose a cross-modal subspace clustering method to learn a common structure for different modalities. For supervised learning, to reduce the semantic gap and the outliers in pairwise constraints, we propose a cross-modal matching method based on compound $\ell_{2,1}$ regularization along with an iteratively reweighted algorithm to find the global optimum. Extensive experiments demonstrate the benefits of joint text and image modeling with semantically induced pairwise constraints, and show that the proposed cross-modal methods can further reduce the semantic gap between different modalities and improve the clustering/retrieval accuracy.
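The abstract does not define the $\ell_{2,1}$ norm it regularizes with. As a minimal sketch, assuming the common row-wise convention (the sum of the Euclidean norms of a matrix's rows; some papers group by columns instead), it can be computed as follows. The function name `l21_norm` and the example matrix are illustrative and are not taken from the paper.

```python
import math


def l21_norm(matrix):
    """Row-wise l2,1 norm: sum over rows of each row's Euclidean norm.

    Assumes the row-wise convention; the paper may group differently.
    """
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


# Hypothetical example matrix with one all-zero row.
W = [
    [3.0, 4.0],   # row norm 5
    [0.0, 0.0],   # row norm 0
    [5.0, 12.0],  # row norm 13
]

print(l21_norm(W))  # 5 + 0 + 13 = 18.0
```

Because whole rows are penalized by their unsquared norm, this regularizer tends to drive entire rows to zero and is less sensitive to outlying rows than a squared Frobenius penalty, which is consistent with the abstract's stated aim of handling outliers in the pairwise constraints.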