99.7 AI Jun 3
Agents' Last Exam
Yiyou Sun, Xinyang Han, Weichen Zhang et al.
Recent AI systems have achieved strong results on a wide range of benchmarks, yet these gains have not translated into economically meaningful deployment across many professional domains. We argue that this gap is largely an evaluation problem: widely used benchmarks lack sustained performance measurement on real and economically valuable workflows. This paper introduces Agents' Last Exam (ALE), a benchmark designed to evaluate AI agents on long-horizon, economically valuable, real-world tasks with verifiable outcomes. Developed in collaboration with 250+ industry experts, ALE covers non-physical industries defined with reference to O*NET / SOC 2018 (the U.S. federal occupational taxonomy). It is organized around a task taxonomy with 55 subfields grouped into 13 industry clusters covering 1K+ tasks. Current results show that the hardest tier remains far from saturated: across mainstream harness and backbone configurations, the average full pass rate is 2.6%. ALE is designed as a living benchmark: its task pool grows continuously as new workflows and industries are onboarded. More broadly, ALE is intended not merely as another leaderboard, but as an instrument for closing the gap between benchmark success and GDP-relevant impact.
LG Jul 16, 2024
Dynamic Dimension Wrapping (DDW) Algorithm: A Novel Approach for Efficient Cross-Dimensional Search in Dynamic Multidimensional Spaces
Dongnan Jin, Yali Liu, Qiuzhi Song et al.
To search effectively for the optimal motion template in a dynamic multidimensional space, this paper proposes a novel optimization algorithm, Dynamic Dimension Wrapping (DDW). The algorithm combines Dynamic Time Warping (DTW) and Euclidean distance, and designs a fitness function that adapts to dynamic multidimensional space by establishing a time-data chain mapping across dimensions. This paper also proposes a novel update mechanism, Optimal Dimension Collection (ODC), which, combined with the search strategy of traditional optimization algorithms, enables DDW to adjust both the dimension values and the number of dimensions of the population individuals simultaneously. In this way, DDW significantly reduces computational complexity and improves search accuracy. Experimental results show that DDW performs well in dynamic multidimensional space, outperforming 31 traditional optimization algorithms. The algorithm provides a novel approach to solving dynamic multidimensional optimization problems and demonstrates broad application potential in fields such as motion data analysis.
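The abstract names DTW and Euclidean distance as the two ingredients of the DDW fitness function but does not say how they are combined. As a minimal sketch of the two textbook measures only (not the authors' fitness function, and with function names chosen here for illustration), the following shows why DTW can compare sequences of different lengths while Euclidean distance requires equal lengths:

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences.

    Uses absolute difference as the local cost and allows the sequences
    to have different lengths.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again (step in a only)
                cost[i][j - 1],      # b[j-1] matched again (step in b only)
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def euclidean_distance(a, b):
    """Euclidean distance; only defined for equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length sequences")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Here `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated sample is absorbed by the warping path, whereas `euclidean_distance` rejects the same pair outright.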
LG Feb 18, 2019
An Adaptive Deep Learning Algorithm Based Autoencoder for Interference Channels
Dehao Wu, Maziar Nekovee, Yue Wang
A deep learning (DL) based autoencoder has shown great potential to significantly enhance physical layer performance. In this paper, we present a DL based autoencoder for the interference channel. Based on a characterization of a k-user Gaussian interference channel, in which the interference is classified into levels from weak to very strong according to a coupling parameter α, a DL neural network (NN) based autoencoder is designed to train on the data set and decode the received signals. The performance of this DL autoencoder is studied for different interference scenarios, with α known or partially known; in the latter case we assume that α is predictable but varies by up to 10% at the training stage. The results demonstrate that the DL based approach has a significant capability to mitigate the effects of a poor signal-to-noise ratio (SNR) and a high interference-to-noise ratio (INR). However, the enhancement depends on the knowledge of α as well as on the interference level. The proposed DL approach performs well with an α offset of up to 10% at the weak interference level. For strong and very strong interference channels, the offset of α needs to be constrained to less than 5% and 2%, respectively, to maintain performance similar to that obtained when α is known.
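The abstract does not spell out the channel model, so the following is only an illustrative sketch under stated assumptions: a real-valued, symmetric two-user case in which α is the amplitude gain of the cross link, giving y = x_own + α·x_other + n. Under that assumption SNR = P/σ² and INR = α²P/σ². All function names are hypothetical, and the relative-offset helper is just one reading of "α up to 10% offset".

```python
import math
import random


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB."""
    return 10.0 * math.log10(signal_power / noise_power)


def inr_db(alpha, signal_power, noise_power):
    """Interference-to-noise ratio in dB, assuming alpha is the
    amplitude gain of the interfering link (an assumption, see above)."""
    return 10.0 * math.log10((alpha ** 2) * signal_power / noise_power)


def offset_alpha(alpha, relative_offset):
    """Coupling parameter seen under a relative mismatch, e.g. 0.10 for +10%."""
    return alpha * (1.0 + relative_offset)


def received(x_own, x_other, alpha, noise_std, seed=0):
    """Received samples y = x_own + alpha * x_other + Gaussian noise."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        s + alpha * i + rng.gauss(0.0, noise_std)
        for s, i in zip(x_own, x_other)
    ]
```

For example, a model trained with α = 0.5 and evaluated at `offset_alpha(0.5, 0.10)` sees a cross gain of 0.55, which raises the INR by about 0.83 dB at fixed signal and noise power.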