CL · Jul 7, 2025
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Gheorghe Comanici, Eric Bieber, Mike Schaekermann et al. · amazon-science, baidu
In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal understanding and it is now able to process up to 3 hours of video content. Its unique combination of long context, multimodal and reasoning capabilities can be combined to unlock new agentic workflows. Gemini 2.5 Flash provides excellent reasoning abilities at a fraction of the compute and latency requirements and Gemini 2.0 Flash and Flash-Lite provide high performance at low latency and cost. Taken together, the Gemini 2.X model generation spans the full Pareto frontier of model capability vs cost, allowing users to explore the boundaries of what is possible with complex agentic problem solving.
RO · Oct 8, 2019
Multi-Vehicle Interaction Scenarios Generation with Interpretable Traffic Primitives and Gaussian Process Regression
Weiyang Zhang, Wenshuo Wang, Ding Zhao
Generating multi-vehicle interaction scenarios can benefit motion planning and decision making of autonomous vehicles when on-road data is insufficient. This paper presents an efficient approach to generate varied multi-vehicle interaction scenarios that can both adapt to different road geometries and inherit the key interaction patterns in real-world driving. To this end, the available multi-vehicle interaction scenarios are temporally segmented into several interpretable fundamental building blocks, called traffic primitives, via Bayesian nonparametric learning. Then, the changepoints of traffic primitives are transformed onto the desired road to generate collision-free interaction trajectories through a sampling-based path planning algorithm. Finally, Gaussian process regression is introduced to control the variance and smoothness of the generated multi-vehicle interaction trajectories. Simulation experiments are carried out on three typical multi-vehicle trajectories under different road conditions. The experimental results demonstrate that our proposed method can generate a variety of human-like multi-vehicle interaction trajectories that fit different road conditions while retaining the key interaction patterns of agents in the provided scenarios, which is important to the development of autonomous vehicles.
RO · Sep 19, 2019
How to Evaluate Proving Grounds for Self-Driving? A Quantitative Approach
Rui Chen, Mansur Arief, Weiyang Zhang et al.
Proving grounds have been a critical component in the testing and validation of Connected and Automated Vehicles (CAVs). Although quite a few world-class testing facilities have been under construction over the years, the evaluation of proving grounds themselves as testing approaches has rarely been studied. In this paper, we present the first attempt to systematically evaluate CAV proving grounds and contribute a generative sample-based approach to assessing the representation of traffic scenarios in proving grounds. Leveraging typical use cases extracted from naturalistic driving events, we establish a strong link between proving ground testing results of CAVs and their anticipated public street performance. We present benchmark results of our approach on three world-class CAV testing facilities: Mcity, Almono (Uber ATG), and Kcity. We show the overall evaluation of these proving grounds in terms of their capability to accommodate real-world traffic scenarios. We believe that when the effectiveness of a testing ground itself is validated, the testing results will lend more confidence to CAV public deployment.
LG · Jul 27, 2018
Understanding V2V Driving Scenarios through Traffic Primitives
Wenshuo Wang, Weiyang Zhang, Ding Zhao
Semantically understanding complex driving encounters, wherein two or more vehicles are spatially close to each other, can benefit the decision-making design of autonomous cars. This paper presents a framework for analyzing various encountering behaviors by decomposing driving encounter data into small building blocks, called driving primitives, using nonparametric Bayesian learning (NPBL) approaches, which offer a flexible way to gain insight into complex driving encounters without any prerequisite knowledge. The effectiveness of our proposed primitive-based framework is validated on 976 naturalistic driving encounters, from which more than 4000 driving primitives are learned using NPBL: a sticky HDP-HMM, which combines a hidden Markov model (HMM) with a hierarchical Dirichlet process (HDP). A dynamic time warping method integrated with k-means clustering is then developed to cluster the extracted driving primitives into groups. Experimental results show that 20 kinds of driving primitives are capable of representing the basic components of driving encounters in our database. This primitive-based analysis methodology potentially reveals underlying information of vehicle-vehicle encounters for self-driving applications.
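
The clustering step mentioned in the abstract above pairs dynamic time warping (DTW) with k-means. The following is a minimal sketch of the DTW distance and a k-means-style assignment step, not the authors' implementation. The function names `dtw_distance` and `assign_to_nearest`, the use of 1-D sequences, and the absolute-difference local cost are all assumptions made for illustration; the primitives in the paper are multi-dimensional and may use a different cost.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 1-D sequences.

    Assumption: the local cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def assign_to_nearest(sequence, centroids):
    """Assignment step of a k-means-style loop under the DTW distance.

    Returns the index of the centroid closest to `sequence`.
    """
    distances = [dtw_distance(sequence, c) for c in centroids]
    return distances.index(min(distances))


if __name__ == "__main__":
    # Hypothetical toy data: sequences of different lengths, same shape.
    primitive = [0, 0, 1]
    centroids = [[5, 5, 5], [0, 1, 1]]
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(assign_to_nearest(primitive, centroids))
```

DTW is a natural fit here because extracted primitives generally differ in duration, so a pointwise Euclidean distance between them is not defined. A full k-means loop would also need a centroid update rule for variable-length sequences, which this sketch leaves out.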