IRJun 22, 2022
DaisyRec 2.0: Benchmarking Recommendation for Rigorous EvaluationZhu Sun, Hui Fang, Jie Yang et al.
Recently, one critical issue looms large in the field of recommender systems -- there are no effective benchmarks for rigorous evaluation -- which consequently leads to unreproducible evaluation and unfair comparison. We, therefore, conduct studies from the perspectives of practical theory and experiments, aiming at benchmarking recommendation for rigorous evaluation. Regarding the theoretical study, a series of hyper-factors affecting recommendation performance throughout the whole evaluation chain are systematically summarized and analyzed via an exhaustive review on 141 papers published at eight top-tier conferences within 2017-2020. We then classify them into model-independent and model-dependent hyper-factors, and different modes of rigorous evaluation are defined and discussed in-depth accordingly. For the experimental study, we release DaisyRec 2.0 library by integrating these hyper-factors to perform rigorous evaluation, whereby a holistic empirical study is conducted to unveil the impacts of different hyper-factors on recommendation performance. Supported by the theoretical and experimental studies, we finally create benchmarks for rigorous evaluation by proposing standardized procedures and providing performance of ten state-of-the-arts across six evaluation metrics on six datasets as a reference for later study. Overall, our work sheds light on the issues in recommendation evaluation, provides potential solutions for rigorous evaluation, and lays foundation for further investigation.
64.6ITApr 19
About Optimal Prefix Codes over Countably Infinite Alphabets: Probabilistic Intervals for the Codeword Lengths AssignmentHongyang Liu, Wei Yan
For the discrete memoryless sources with a countably infinite alphabet, we prove that for any positive integer $k$, there exists a corresponding probability interval such that if the largest symbol probability $p_{1}$ falls in this interval, the optimal code length for the symbol equals $k$. Furthermore, for infinite sources, we provide a criterion to determine probability distributions whose optimal code length assignment follows the pattern $l^{best}_{i}=i$, for $i\ge 1$. Compared with the existing conclusion for anti-uniform sources, the proposed criterion requires less information for verification.
74.8DSApr 6
Work-Efficient Parallel Counting via SamplingHongyang Liu, Yitong Yin, Yiyao Zhang
A canonical approach to approximating the partition function of a Gibbs distribution via sampling is simulated annealing. This method has led to efficient reductions from counting to sampling, including: $\bullet$ classic non-adaptive (parallel) algorithms with sub-optimal cost (Dyer-Frieze-Kannan '89; Bezáková-Å tefankoviÄ-Vazirani-Vigoda '08); $\bullet$ adaptive (sequential) algorithms with near-optimal cost (Å tefankoviÄ-Vempala-Vigoda '09; Huber '15; Kolmogorov '18; Harris-Kolmogorov '24). We present an algorithm that achieves both near-optimal total work and efficient parallelism, providing a reduction from counting to sampling with logarithmic depth and near-optimal work. As consequences, we obtain work-efficient parallel counting algorithms for several important models, including the hardcore and Ising models within the uniqueness regime.
CLDec 7, 2023
Large Language Models for Intent-Driven Session RecommendationsZhu Sun, Hongyang Liu, Xinghua Qu et al.
Intent-aware session recommendation (ISR) is pivotal in discerning user intents within sessions for precise predictions. Traditional approaches, however, face limitations due to their presumption of a uniform number of intents across all sessions. This assumption overlooks the dynamic nature of user sessions, where the number and type of intentions can significantly vary. In addition, these methods typically operate in latent spaces, thus hinder the model's transparency.Addressing these challenges, we introduce a novel ISR approach, utilizing the advanced reasoning capabilities of large language models (LLMs). First, this approach begins by generating an initial prompt that guides LLMs to predict the next item in a session, based on the varied intents manifested in user sessions. Then, to refine this process, we introduce an innovative prompt optimization mechanism that iteratively self-reflects and adjusts prompts. Furthermore, our prompt selection module, built upon the LLMs' broad adaptability, swiftly selects the most optimized prompts across diverse domains. This new paradigm empowers LLMs to discern diverse user intents at a semantic level, leading to more accurate and interpretable session recommendations. Our extensive experiments on three real-world datasets demonstrate the effectiveness of our method, marking a significant advancement in ISR systems.
SIFeb 2
Cross-Domain Fake News Detection on Unseen Domains via LLM-Based Domain-Aware User ModelingXuankai Yang, Yan Wang, Jiajie Zhu et al.
Cross-domain fake news detection (CD-FND) transfers knowledge from a source domain to a target domain and is crucial for real-world fake news mitigation. This task becomes particularly important yet more challenging when the target domain is previously unseen (e.g., the COVID-19 outbreak or the Russia-Ukraine war). However, existing CD-FND methods overlook such scenarios and consequently suffer from the following two key limitations: (1) insufficient modeling of high-level semantics in news and user engagements; and (2) scarcity of labeled data in unseen domains. Targeting these limitations, we find that large language models (LLMs) offer strong potential for CD-FND on unseen domains, yet their effective use remains non-trivial. Nevertheless, two key challenges arise: (1) how to capture high-level semantics from both news content and user engagements using LLMs; and (2) how to make LLM-generated features more reliable and transferable for CD-FND on unseen domains. To tackle these challenges, we propose DAUD, a novel LLM-Based Domain-Aware framework for fake news detection on Unseen Domains. DAUD employs LLMs to extract high-level semantics from news content. It models users' single- and cross-domain engagements to generate domain-aware behavioral representations. In addition, DAUD captures the relations between original data-driven features and LLM-derived features of news, users, and user engagements. This allows it to extract more reliable domain-shared representations that improve knowledge transfer to unseen domains. Extensive experiments on real-world datasets demonstrate that DAUD outperforms state-of-the-art baselines in both general and unseen-domain CD-FND settings.
61.9DSApr 1
Near-Optimal Parallel Approximate Counting via SamplingDavid G. Harris, Vladimir Kolmogorov, Hongyang Liu et al.
The computational equivalence between approximate counting and sampling is well established for polynomial-time algorithms. The most efficient general reduction from counting to sampling is achieved via simulated annealing, where the counting problem is formulated in terms of estimating the ratio $Q={Z(β_{\max})}/{Z(β_{\min})}$ between partition functions $Z(β)=\sum_{x\in Ω} \exp(βH(x))$ of Gibbs distributions $μ_β$ over $Ω$ with Hamiltonian $H$, given access to a sampling oracle that produces samples from $μ_β$ for $β\in [β_{\min}, β_{\max}]$. The best bound achieved by known annealing algorithms with relative error $\varepsilon$ is $O(q \log h / \varepsilon^2)$, where $q, h$ are parameters which respectively bound $\ln Q$ and $H$. However, all known algorithms attaining this near-optimal complexity are inherently sequential, or *adaptive*: the queried parameters $β$ depend on previous samples. We develop a simple non-adaptive algorithm for approximate counting using $O(q \log^2 h / \varepsilon^2)$ samples, as well as an algorithm that achieves $O(q \log h / \varepsilon^2)$ samples with just two rounds of adaptivity, matching the best sample complexity of sequential algorithms. These algorithms naturally give rise to work-efficient parallel (RNC) counting algorithms. We discuss applications to RNC counting algorithms for several classic models, including the anti-ferromagnetic 2-spin, monomer-dimer and ferromagnetic Ising models.
DSFeb 8, 2025
Approximating the total variation distance between spin systemsWeiming Feng, Hongyang Liu, Minji Yang
Spin systems form an important class of undirected graphical models. For two Gibbs distributions $μ$ and $ν$ induced by two spin systems on the same graph $G = (V, E)$, we study the problem of approximating the total variation distance $d_{TV}(μ,ν)$ with an $ε$-relative error. We propose a new reduction that connects the problem of approximating the TV-distance to sampling and approximate counting. Our applications include the hardcore model and the antiferromagnetic Ising model in the uniqueness regime, the ferromagnetic Ising model, and the general Ising model satisfying the spectral condition. Additionally, we explore the computational complexity of approximating the total variation distance $d_{TV}(μ_S,ν_S)$ between two marginal distributions on an arbitrary subset $S \subseteq V$. We prove that this problem remains hard even when both $μ$ and $ν$ admit polynomial-time sampling and approximate counting algorithms.
COMP-PHMar 7
Full-Scale GPU-Accelerated Transient EM-Thermal-Mechanical Co-Simulation for Early-Stage Design of Advanced PackagesHongyang Liu, Tejas Kulkarni, Ganesh Subbarayan et al.
In the early-stage design of advanced electronic packages, designers face a critical trade-off between simulation fidelity and computational turnaround time. Conventional early-stage methodologies typically achieve speed by relying on steady-state assumptions and structural homogenization. While computationally efficient, these approximations fundamentally fail to capture dynamic thermal events and stress concentrations at fine-grained internal interfaces, effectively masking failure mechanisms driven by transient signal bursts. In this work, we present a GPU-accelerated transient coupled Electromagnetic-Thermal-Mechanical solver that resolves this bottleneck. The proposed solver enables full-scale, non-homogenized, time-domain simulation of large-scale packages with runtimes amenable for rapid design iteration. Simulation of a NEC SX-Aurora TSUBASA package demonstrates that the tool allows for the identification of signal-induced adiabatic stress that is typically invisible to steady-state and homogenized baselines. This capability brings sign-off level physics fidelity to the early design phase, facilitating the prevention of costly late-stage design failures and broader transient thermal performance degradation risks.