Shixin Li

CL
h-index 117
3 papers
3,096 citations
Novelty 35%
AI Score 38

3 Papers

CL · Jul 7, 2025
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Gheorghe Comanici, Eric Bieber, Mike Schaekermann et al. · amazon-science, baidu

In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal understanding and it is now able to process up to 3 hours of video content. Its unique combination of long context, multimodal and reasoning capabilities can be combined to unlock new agentic workflows. Gemini 2.5 Flash provides excellent reasoning abilities at a fraction of the compute and latency requirements and Gemini 2.0 Flash and Flash-Lite provide high performance at low latency and cost. Taken together, the Gemini 2.X model generation spans the full Pareto frontier of model capability vs cost, allowing users to explore the boundaries of what is possible with complex agentic problem solving.

CL · Nov 15, 2024 · Code
Legal Evaluations and Challenges of Large Language Models

Jiaqi Wang, Huan Zhao, Zhenyuan Yang et al.

In this paper, we review legal testing methods based on Large Language Models (LLMs), using the OpenAI o1 model as a case study to evaluate the performance of large models in applying legal provisions. We compare current state-of-the-art LLMs, including open-source, closed-source, and legal-specific models trained specifically for the legal domain. Systematic tests are conducted on English and Chinese legal cases, and the results are analyzed in depth. Through systematic testing of legal cases from common law systems and China, this paper explores the strengths and weaknesses of LLMs in understanding and applying legal texts, reasoning through legal issues, and predicting judgments. The experimental results highlight both the potential and limitations of LLMs in legal applications, particularly in terms of challenges related to the interpretation of legal language and the accuracy of legal reasoning. Finally, the paper provides a comprehensive analysis of the advantages and disadvantages of various types of models, offering valuable insights and references for the future application of AI in the legal field.

LG · Nov 15, 2016
Robust Matrix Regression

Hang Zhang, Fengyuan Zhu, Shixin Li

Modern technologies are producing datasets with complex intrinsic structures, and they can be naturally represented as matrices instead of vectors. To preserve the latent data structures during processing, modern regression approaches incorporate the low-rank property into the model and achieve satisfactory performance for certain applications. These approaches all assume that both predictors and labels for each pair of data within the training set are accurate. However, in real-world applications, it is common to see the training data contaminated by noise, which can affect the robustness of these matrix regression methods. In this paper, we address this issue by introducing a novel robust matrix regression method. We also derive efficient proximal algorithms for model training. To evaluate the performance of our method, we apply it to real-world applications with comparative studies. Our method achieves state-of-the-art performance, which shows its effectiveness and practical value.