Tian–Li Yu

h-index21

3papers

230citations

Novelty50%

AI Score36

Ranked #100,471 of 194,257 authors (top 52%)#18,709 in CL (top 61%)

3 Papers

32.8CVOct 13, 2023

PaLI-3 Vision Language Models: Smaller, Faster, Stronger

Xi Chen, Xiao Wang, Lucas Beyer et al. · deepmind

This paper presents PaLI-3, a smaller, faster, and stronger vision language model (VLM) that compares favorably to similar models that are 10x larger. As part of arriving at this strong performance, we compare Vision Transformer (ViT) models pretrained using classification objectives to contrastively (SigLIP) pretrained ones. We find that, while slightly underperforming on standard image classification benchmarks, SigLIP-based PaLI shows superior performance across various multimodal benchmarks, especially on localization and visually-situated text understanding. We scale the SigLIP image encoder up to 2 billion parameters, and achieves a new state-of-the-art on multilingual cross-modal retrieval. We hope that PaLI-3, at only 5B parameters, rekindles research on fundamental pieces of complex VLMs, and could fuel a new generation of scaled-up models.

8.3CLJul 22, 2025

Towards Compute-Optimal Many-Shot In-Context Learning

Shahriar Golchin, Yanfei Chen, Rujun Han et al.

Long-context large language models (LLMs) are able to process inputs containing up to several million tokens. In the scope of in-context learning (ICL), this translates into using hundreds/thousands of demonstrations in the input prompt, enabling many-shot ICL. In practice, a fixed set of demonstrations is often selected at random in many-shot settings due to (1) high inference costs, (2) the benefits of caching and reusing computations, and (3) the similar performance offered by this strategy compared to others when scaled. In this work, we propose two straightforward strategies for demonstration selection in many-shot ICL that improve performance with minimal computational overhead. Our first method combines a small number of demonstrations, selected based on their similarity to each test sample, with a disproportionately larger set of random demonstrations that are cached. The second strategy improves the first by replacing random demonstrations with those selected using centroids derived from test sample representations via k-means clustering. Our experiments with Gemini Pro and Flash across several datasets indicate that our strategies consistently outperform random selection and surpass or match the most performant selection approach while supporting caching and reducing inference cost by up to an order of magnitude. We also show that adjusting the proportion of demonstrations selected based on different criteria can balance performance and inference cost in many-shot ICL.

12.8NEJul 31, 2018

Optimization by Pairwise Linkage Detection, Incremental Linkage Set, and Restricted / Back Mixing: DSMGA-II

Shih-Huan Hsu, Tian-Li Yu

This paper proposes a new evolutionary algorithm, called DSMGA-II, to efficiently solve optimization problems via exploiting problem substructures. The proposed algorithm adopts pairwise linkage detection and stores the information in the form of dependency structure matrix (DSM). A new linkage model, called the incremental linkage set, is then constructed by using the DSM. Inspired by the idea of optimal mixing, the restricted mixing and the back mixing are proposed. The former aims at efficient exploration under certain constrains. The latter aims at exploitation by refining the DSM so as to reduce unnecessary evaluations. Experimental results show that DSMGA-II outperforms LT-GOMEA and hBOA in terms of number of function evaluations on the concatenated/folded/cyclic trap problems, NK-landscape problems with various degrees of overlapping, 2D Ising spin-glass problems, and MAX-SAT. The investigation of performance comparison with P3 is also included.