CV Nov 9, 2025
AesTest: Measuring Aesthetic Intelligence from Perception to Production
Guolong Wang, Heng Huang, Zhiqiang Zhang et al.
Perceiving and producing aesthetic judgments is a fundamental yet underexplored capability for multimodal large language models (MLLMs). However, existing benchmarks for image aesthetic assessment (IAA) are narrow in perceptual scope or lack the diversity needed to evaluate systematic aesthetic production. To address this gap, we introduce AesTest, a comprehensive benchmark for multimodal aesthetic perception and production, distinguished by the following features: 1) It consists of curated multiple-choice questions spanning ten tasks, covering perception, appreciation, creation, and photography. These tasks are grounded in psychological theories of generative learning. 2) It integrates data from diverse sources, including professional editing workflows, photographic composition tutorials, and crowdsourced preferences, ensuring coverage of both expert-level principles and real-world variation. 3) It supports various aesthetic query types, such as attribute-based analysis, emotional resonance, compositional choice, and stylistic reasoning. We evaluate both instruction-tuned IAA MLLMs and general MLLMs on AesTest, revealing significant challenges in building aesthetic intelligence. We will publicly release AesTest to support future research in this area.
CL Mar 5
Non-Zipfian Distribution of Stopwords and Subset Selection Models
Wentian Li, Oscar Fontanelli
Stopwords are words that contribute little information to the content or meaning of a text. Most stopwords are function words, but they can also be common verbs, adjectives, and adverbs. In contrast to the well-known Zipf's law for the rank-frequency plot of all words, the rank-frequency plot for stopwords is best fitted by the Beta Rank Function (BRF). The rank-frequency plots of non-stopwords also deviate from Zipf's law, but are fitted better by a quadratic function of log-token-count over log-rank than by the BRF. Based on the observed ranks of stopwords in the full word list, we propose a stopword (subset) selection model in which the probability of being selected, as a function of the word's rank $r$, is a decreasing Hill function, $1/(1+(r/r_{mid})^\gamma)$, whereas the probability of not being selected is the standard Hill function, $1/(1+(r_{mid}/r)^\gamma)$. We validate this selection probability model by direct estimation from an independent collection of texts. We also show analytically that this model leads to a BRF rank-frequency distribution for stopwords when the original full word list follows Zipf's law, and that it explains the quadratic fitting function for the non-stopwords.
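The two selection probabilities in the abstract above are complementary, which a few lines of code make concrete. The following is a minimal sketch, not the authors' implementation: the function names, the pure `1/r` Zipf weighting, and any parameter values passed in are assumptions made for illustration.

```python
def p_selected(rank, r_mid, gamma):
    """Decreasing Hill function: probability that the word at `rank` is a stopword."""
    return 1.0 / (1.0 + (rank / r_mid) ** gamma)


def p_not_selected(rank, r_mid, gamma):
    """Standard (increasing) Hill function: probability of not being selected."""
    return 1.0 / (1.0 + (r_mid / rank) ** gamma)


def expected_stopword_weight(rank, r_mid, gamma):
    """Expected stopword token weight at a full-list rank.

    Assumes, for illustration only, an ideal Zipf frequency proportional to 1/rank.
    """
    return p_selected(rank, r_mid, gamma) / rank
```

At `rank == r_mid` both probabilities equal 1/2, and `gamma` controls how sharply the selection probability drops around that midpoint.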
SI Jan 13
An Extensive Study of Two-Node McCulloch-Pitts Networks
Wentian Li, Astero Provata, Thomas MacCarthy
Networks with two nodes have previously been grouped into either two classes (mutually interactive, master-slave) or five classes (mutualism, competition, predator-prey, commensalism, amensalism). By allowing self-loops, the number of signed regulatory graphs increases to 39. We provide a complete summary of the dynamical behaviors of the 39 two-node McCulloch-Pitts models when the link weights are constrained to the three values $\{-1, 0, +1\}$ and the node variables are Boolean. We show that the dynamics can differ for the same signed regulatory graph depending on whether the Boolean values are $\{-1, 1\}$ (bipolar) or $\{0, 1\}$ (binary). We demonstrate that slight variations in the McCulloch-Pitts model (called variants) may lead to fundamentally different dynamics. We study the full model space and three kinds of robustness or stability: a) of a rule's overall dynamics against parameter change, b) of the final state reached from a given state against parameter change, and c) of the final state against a change in the initial state. All these stability properties are loosely related to a model's limiting dynamics, with fixed-point rules being more stable under the first two types of robustness but less stable under the third. These analyses pave the way towards a better understanding of a minimal complex system.
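The counting and the binary-versus-bipolar contrast described above can be illustrated with a short enumeration. This is a sketch under stated assumptions, not the paper's code: the counting convention (drop graphs with no link between the two nodes, identify graphs that differ only by relabeling the nodes) is one convention that reproduces 39 and may not be the authors' exact definition, and the update rule (synchronous, a node turns on only when its weighted input is strictly positive) is just one of the possible variants.

```python
from itertools import product

WEIGHTS = (-1, 0, 1)


def canonical(w):
    """Representative of a weight tuple (w11, w12, w21, w22) and its node-swapped twin."""
    w11, w12, w21, w22 = w
    return min(w, (w22, w21, w12, w11))


def connected_graphs():
    """Signed two-node graphs with self-loops, up to relabeling (assumed convention)."""
    graphs = set()
    for w in product(WEIGHTS, repeat=4):
        if w[1] == 0 and w[2] == 0:
            continue  # no link between the two nodes
        graphs.add(canonical(w))
    return sorted(graphs)


def step(state, w, low=0):
    """One synchronous update; `low` is 0 for binary states and -1 for bipolar states."""
    w11, w12, w21, w22 = w
    x1, x2 = state
    h1 = w11 * x1 + w12 * x2  # weighted input to node 1
    h2 = w21 * x1 + w22 * x2  # weighted input to node 2
    return (1 if h1 > 0 else low, 1 if h2 > 0 else low)


def attractor(state, w, low=0):
    """Iterate from `state` until a state repeats; return the fixed point or cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, w, low)
    return seen[seen.index(state):]
```

Under these assumptions, mutual inhibition `w = (0, -1, -1, 0)` leaves the all-low binary state `(0, 0)` as a fixed point, while the all-low bipolar state `(-1, -1)` falls into a cycle of length two, so the same signed graph behaves differently under the two encodings.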
CL Nov 18, 2025
Quadratic Term Correction on Heaps' Law
Oscar Fontanelli, Wentian Li
Heaps' or Herdan's law characterizes the word-type vs. word-token relation by a power-law function, which is concave in linear-linear scale but a straight line in log-log scale. However, it has been observed that even in log-log scale, the type-token curve is still slightly concave, invalidating the power-law relation. At the next order of approximation, we show, using twenty English novels or writings (some translated into English from another language), that quadratic functions in log-log scale fit the type-token data perfectly. Regression analyses of log(type)-log(token) data with both a linear and a quadratic term consistently lead to a linear coefficient slightly larger than 1 and a quadratic coefficient around -0.02. Using the "random drawing of colored balls from a bag with replacement" model, we show that the curvature of the type-token curve in log-log scale is identical to a "pseudo-variance", which is negative. Although a pseudo-variance calculation may encounter numerical instability when the number of tokens is large, due to the large values of the pseudo-weights, this formalism provides a rough estimate of the curvature when the number of tokens is small.
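The regression described above, log(type) against log(token) with a linear and a quadratic term, can be sketched with ordinary least squares. This is an illustrative sketch only: the function names are hypothetical, the natural logarithm is an assumption (the fitted quadratic coefficient depends on the log base, which is not stated here), and the normal equations are solved directly rather than with a numerical library.

```python
import math


def type_token_curve(tokens):
    """Return (number of tokens read, number of distinct types seen) after each token."""
    seen = set()
    curve = []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((n, len(seen)))
    return curve


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= factor * M[col][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        rest = sum(M[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (M[i][3] - rest) / M[i][i]
    return x


def fit_loglog_quadratic(curve):
    """Least-squares fit of ln(types) = a + b*ln(tokens) + c*ln(tokens)**2."""
    xs = [math.log(n) for n, _ in curve]
    ys = [math.log(v) for _, v in curve]
    S = [sum(x ** k for x in xs) for k in range(5)]  # power sums of x
    T = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    A = [[S[0], S[1], S[2]], [S[1], S[2], S[3]], [S[2], S[3], S[4]]]
    a, b, c = solve3(A, T)
    return a, b, c
```

A typical use would be `fit_loglog_quadratic(type_token_curve(tokens))` on a tokenized text; a negative `c` corresponds to the concavity in log-log scale that the abstract describes.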
CL May 8, 2012
Characterizing Ranked Chinese Syllable-to-Character Mapping Spectrum: A Bridge Between the Spoken and Written Chinese Language
Wentian Li
One important aspect of the relationship between spoken and written Chinese is the ranked syllable-to-character mapping spectrum, which is the list of syllables ranked by the number of characters that map to each syllable. Previously, this spectrum was analyzed for more than 400 syllables without distinguishing the four tones. In the current study, the spectrum with 1280 toned syllables is analyzed using a logarithmic function, the Beta rank function, and a piecewise logarithmic function. Of the three fitting functions, the two-piece logarithmic function fits the data best, with both the smallest sum of squared errors (SSE) and the lowest Akaike information criterion (AIC) value. The Beta rank function is a close second. By sampling from a Poisson distribution whose parameter value is chosen from the observed data, we empirically estimate the $p$-value for testing the hypothesis that the two-piece logarithmic function fits better than the Beta rank function to be 0.16. For practical purposes, the piecewise logarithmic function and the Beta rank function can be considered a tie.
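The model comparison above relies on SSE and AIC. The sketch below shows how those two quantities could be computed for the simplest of the three candidates, a single logarithmic function $y = a + b \ln r$. It is not the paper's procedure: the function names are hypothetical, and the AIC expression used is the common least-squares form $n \ln(\mathrm{SSE}/n) + 2k$, which is an assumption about the variant applied.

```python
import math


def fit_log_function(ranks, counts):
    """Least-squares fit of counts = a + b*ln(rank); returns (a, b, sse)."""
    xs = [math.log(r) for r in ranks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, counts))
    return a, b, sse


def aic_least_squares(sse, n, k):
    """AIC for a least-squares fit with k free parameters (assumed form)."""
    return n * math.log(sse / n) + 2 * k
```

With this form, a model with more parameters, such as a two-piece logarithmic function, is preferred only if its reduction in SSE outweighs the `2 * k` penalty.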