Junyi Wen

LG
h-index17
3papers
28citations
Novelty55%
AI Score38

3 Papers

LGOct 30, 2022Code
Mitigating Unfairness via Evolutionary Multi-objective Ensemble Learning

Qingquan Zhang, Jialin Liu, Zeqi Zhang et al.

In the literature of mitigating unfairness in machine learning, many fairness measures are designed to evaluate predictions of learning models and also utilised to guide the training of fair models. It has been theoretically and empirically shown that there exist conflicts and inconsistencies among accuracy and multiple fairness measures. Optimising one or several fairness measures may sacrifice or deteriorate other measures. Two key questions should be considered, how to simultaneously optimise accuracy and multiple fairness measures, and how to optimise all the considered fairness measures more effectively. In this paper, we view the mitigating unfairness problem as a multi-objective learning problem considering the conflicts among fairness measures. A multi-objective evolutionary learning framework is used to simultaneously optimise several metrics (including accuracy and multiple fairness measures) of machine learning models. Then, ensembles are constructed based on the learning models in order to automatically balance different metrics. Empirical results on eight well-known datasets demonstrate that compared with the state-of-the-art approaches for mitigating unfairness, our proposed algorithm can provide decision-makers with better tradeoffs among accuracy and multiple fairness metrics. Furthermore, the high-quality models generated by the framework can be used to construct an ensemble to automatically achieve a better tradeoff among all the considered fairness metrics than other ensemble methods. Our code is publicly available at https://github.com/qingquan63/FairEMOL

CLJul 10, 2025
Krul: Efficient State Restoration for Multi-turn Conversations with Dynamic Cross-layer KV Sharing

Junyi Wen, Junyuan Liang, Zicong Hong et al.

Efficient state restoration in multi-turn conversations with large language models (LLMs) remains a critical challenge, primarily due to the overhead of recomputing or loading full key-value (KV) caches for all historical tokens. To address this, existing approaches compress KV caches across adjacent layers with highly similar attention patterns. However, these methods often apply a fixed compression scheme across all conversations, selecting the same layer pairs for compression without considering conversation-specific attention dynamics. This static strategy overlooks variability in attention pattern similarity across different conversations, which can lead to noticeable accuracy degradation. We present Krul, a multi-turn LLM inference system that enables accurate and efficient KV cache restoration. Krul dynamically selects compression strategies based on attention similarity across layer pairs and uses a recomputation-loading pipeline to restore the KV cache. It introduces three key innovations: 1) a preemptive compression strategy selector to preserve critical context for future conversation turns and selects a customized strategy for the conversation; 2) a token-wise heterogeneous attention similarity estimator to mitigate the attention similarity computation and storage overhead during model generation; 3) a bubble-free restoration scheduler to reduce potential bubbles brought by the imbalance of recomputing and loading stream due to compressed KV caches. Empirical evaluations on real-world tasks demonstrate that Krul achieves a 1.5x-2.68x reduction in time-to-first-token (TTFT) and a 1.33x-2.35x reduction in KV cache storage compared to state-of-the-art methods without compromising generation quality.

LGMay 23, 2023
Constrained Reinforcement Learning for Dynamic Material Handling

Chengpeng Hu, Ziming Wang, Jialin Liu et al.

As one of the core parts of flexible manufacturing systems, material handling involves storage and transportation of materials between workstations with automated vehicles. The improvement in material handling can impulse the overall efficiency of the manufacturing system. However, the occurrence of dynamic events during the optimisation of task arrangements poses a challenge that requires adaptability and effectiveness. In this paper, we aim at the scheduling of automated guided vehicles for dynamic material handling. Motivated by some real-world scenarios, unknown new tasks and unexpected vehicle breakdowns are regarded as dynamic events in our problem. We formulate the problem as a constrained Markov decision process which takes into account tardiness and available vehicles as cumulative and instantaneous constraints, respectively. An adaptive constrained reinforcement learning algorithm that combines Lagrangian relaxation and invalid action masking, named RCPOM, is proposed to address the problem with two hybrid constraints. Moreover, a gym-like dynamic material handling simulator, named DMH-GYM, is developed and equipped with diverse problem instances, which can be used as benchmarks for dynamic material handling. Experimental results on the problem instances demonstrate the outstanding performance of our proposed approach compared with eight state-of-the-art constrained and non-constrained reinforcement learning algorithms, and widely used dispatching rules for material handling.