Yu Song Meng

h-index 22

6 papers

352 citations

Novelty 24%

AI Score 33

Ranked #118,878 of 194,257 authors (top 61%) · #21,833 in CL (top 71%)

6 Papers

27.7 · CL · Oct 11, 2023 · Code

Evaluating Large Language Models at Evaluating Instruction Following

Zhiyuan Zeng, Jiatong Yu, Tianyu Gao et al. · princeton, uw

As research in large language models (LLMs) continues to accelerate, LLM-based evaluation has emerged as a scalable and cost-effective alternative to human evaluations for comparing the ever-increasing list of models. This paper investigates the efficacy of these "LLM evaluators", particularly in using them to assess instruction following, a metric that gauges how closely generated text adheres to the given instruction. We introduce a challenging meta-evaluation benchmark, LLMBar, designed to test the ability of an LLM evaluator to discern instruction-following outputs. The authors manually curated 419 pairs of outputs, in which one output adheres to the instruction while the other diverges from it yet may possess deceptive qualities that mislead an LLM evaluator, e.g., a more engaging tone. Contrary to existing meta-evaluation, we discover that different evaluators (i.e., combinations of LLMs and prompts) exhibit distinct performance on LLMBar and even the highest-scoring ones have substantial room for improvement. We also present a novel suite of prompting strategies that further close the gap between LLM and human evaluators. With LLMBar, we hope to offer more insight into LLM evaluators and foster future research in developing better instruction-following models.
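As a rough illustration of the meta-evaluation idea described in this abstract, the sketch below scores an evaluator by how often it picks the instruction-following output in a labeled pair. It is not the LLMBar code: the `Pair` fields, the `evaluator_accuracy` helper, the toy length-biased evaluator, and the two example pairs are all hypothetical stand-ins, and a real evaluator would be an LLM combined with a prompt.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Pair:
    """One meta-evaluation item (hypothetical schema, not the LLMBar format)."""
    instruction: str
    output_1: str
    output_2: str
    label: int  # 1 or 2: the output that actually follows the instruction


# An evaluator maps (instruction, output_1, output_2) to a preference, 1 or 2.
Evaluator = Callable[[str, str, str], int]


def evaluator_accuracy(pairs: List[Pair], evaluator: Evaluator) -> float:
    """Fraction of pairs where the evaluator agrees with the gold label."""
    correct = sum(
        1 for p in pairs
        if evaluator(p.instruction, p.output_1, p.output_2) == p.label
    )
    return correct / len(pairs)


def length_biased_evaluator(instruction: str, output_1: str, output_2: str) -> int:
    """Toy stand-in for an LLM evaluator: always prefers the longer output."""
    return 1 if len(output_1) >= len(output_2) else 2


# Two made-up pairs; in the first, the longer, more engaging output
# ignores the "one word" constraint and should lose.
toy_pairs = [
    Pair("Reply with one word: the capital of France.",
         "Paris",
         "Paris is the beautiful capital of France, famous for its cafes!",
         label=1),
    Pair("List three prime numbers.", "2, 3, 5", "No.", label=1),
]

accuracy = evaluator_accuracy(toy_pairs, length_biased_evaluator)
print(f"accuracy = {accuracy:.2f}")
```

The toy evaluator is fooled by the first pair and gets the second right, which is the kind of superficial preference that adversarially curated pairs are meant to expose.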

1.2 · SY · Mar 8, 2018

Verifying nonlinear analog and mixed-signal circuits with inputs

Chuchu Fan, Yu Meng, Jürgen Maier et al.

We present a new technique for verifying nonlinear and hybrid models with inputs. We observe that once an input signal is fixed, the sensitivity analysis of the model can be computed much more precisely. Based on this result, we propose a new simulation-driven verification algorithm and apply it to a suite of nonlinear and hybrid models of CMOS digital circuits under different input signals. The models are low-dimensional but with highly nonlinear ODEs, with nearly hundreds of logarithmic and exponential terms. Some of our experiments analyze the metastability of bistable circuits with very sensitive ODEs and rigorously establish the connection between metastability recovery time and sensitivity.

10.8 · CL · May 23, 2024 · Code

Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast

Chufan Shi, Cheng Yang, Xinyu Zhu et al.

Mixture-of-Experts (MoE) has emerged as a prominent architecture for scaling model size while maintaining computational efficiency. In MoE, each token in the input sequence activates a different subset of experts determined by a routing mechanism. However, the unchosen experts in MoE models do not contribute to the output, potentially leading to underutilization of the model's capacity. In this work, we first conduct exploratory studies to demonstrate that increasing the number of activated experts does not necessarily improve and can even degrade the output quality. Then, we show that output distributions from an MoE model using different routing strategies substantially differ, indicating that different experts do not always act synergistically. Motivated by these findings, we propose Self-Contrast Mixture-of-Experts (SCMoE), a training-free strategy that utilizes unchosen experts in a self-contrast manner during inference. In SCMoE, the next-token probabilities are determined by contrasting the outputs from strong and weak activation using the same MoE model. Our method is conceptually simple and computationally lightweight, as it incurs minimal latency compared to greedy decoding. Experiments on several benchmarks (GSM8K, StrategyQA, MBPP and HumanEval) demonstrate that SCMoE can consistently enhance Mixtral 8x7B's reasoning capability across various domains. For example, it improves the accuracy on GSM8K from 61.79 to 66.94. Moreover, combining SCMoE with self-consistency yields additional gains, increasing major@20 accuracy from 75.59 to 78.31.
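The self-contrast step can be pictured with a small sketch. The abstract only states that next-token probabilities come from contrasting strong and weak activation of the same model; the specific scoring rule below (strong log-probability plus a weighted strong-minus-weak difference, restricted to tokens the strong pass finds plausible) is an assumption in the style of contrastive decoding, and the function names, `beta`, `alpha`, and the example logits are hypothetical.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def self_contrast_next_token(
    strong_logits: List[float],
    weak_logits: List[float],
    beta: float = 0.5,   # assumed contrast strength
    alpha: float = 0.1,  # assumed plausibility threshold
) -> int:
    """Pick the next token by contrasting strong and weak activation.

    strong_logits: logits from the model's default routing (assumption).
    weak_logits:   logits from the same model under a weaker routing
                   strategy (assumption).
    """
    strong_lp = log_softmax(strong_logits)
    weak_lp = log_softmax(weak_logits)

    # Keep only tokens whose strong probability is at least
    # alpha times the most likely token's strong probability.
    cutoff = math.log(alpha) + max(strong_lp)

    best_token, best_score = -1, -math.inf
    for token, (s, w) in enumerate(zip(strong_lp, weak_lp)):
        if s < cutoff:
            continue
        score = s + beta * (s - w)  # reward tokens the weak pass underrates
        if score > best_score:
            best_token, best_score = token, score
    return best_token


# Made-up logits over a 3-token vocabulary.
strong = [2.0, 1.9, -5.0]
weak = [2.0, 0.0, 3.0]

greedy_token = max(range(len(strong)), key=lambda t: strong[t])
contrast_token = self_contrast_next_token(strong, weak)
print(greedy_token, contrast_token)
```

In this toy case greedy decoding on the strong logits picks token 0, while the contrast prefers token 1, which the weak pass rates much lower; token 2 is excluded by the plausibility cutoff even though the weak pass favors it.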

1.2 · IM · Apr 19, 2018 · Code

Analyzing Solar Irradiance Variation From GPS and Cameras

Shilpa Manandhar, Soumyabrata Dev, Yee Hui Lee et al.

The total amount of solar irradiance falling on the earth's surface is an important area of study among photovoltaic (PV) engineers and remote sensing analysts. The received solar irradiance determines the total amount of solar energy generated. However, this generation is often hindered by the high degree of solar irradiance variability. In this paper, we study the main factors behind such variability with the assistance of the Global Positioning System (GPS) and ground-based, high-resolution sky cameras. This analysis will also be helpful for understanding cloud phenomena and other events in the earth's atmosphere.

1.7 · CV · Aug 24, 2017 · Code

Correlating Satellite Cloud Cover with Sky Cameras

Shilpa Manandhar, Soumyabrata Dev, Yee Hui Lee et al.

Clouds play a manifold role in understanding the various events in the atmosphere, and also in studying the radiative balance of the earth. Such cloud analysis is conventionally performed mainly via satellite images. However, because satellite images have low temporal and spatial resolutions, ground-based sky cameras are now becoming popular. In this paper, we study the relation between the cloud cover obtained from MODIS images and the coverage obtained from ground-based sky cameras. This will help us to better understand cloud formation in the atmosphere, both from satellite images and from ground-based observations.

2.4 · CV · Aug 24, 2017 · Code

Analyzing Cloud Optical Properties Using Sky Cameras

Shilpa Manandhar, Soumyabrata Dev, Yee Hui Lee et al.

Clouds play a significant role in the fluctuation of solar radiation received by the earth's surface. It is important to study the various cloud properties, as they impact the total solar irradiance falling on the earth's surface. One such important optical property of clouds is the Cloud Optical Thickness (COT), which is defined in terms of the amount of light that can pass through the clouds. COT values are generally obtained from satellite images. However, satellite images have low temporal and spatial resolutions and are not suitable for applications such as solar energy generation and forecasting. Therefore, ground-based sky cameras are now becoming popular in such fields. In this paper, we analyze the cloud optical thickness value obtained from ground-based sky cameras and provide future research directions.