Dalin Li

h-index: 1
2 papers

2 Papers

48.9 CV May 27
Qwen-Image-Bench: From Generation to Creation in Text-to-Image Evaluation

Niantong Li, Guangzheng Hu, Weixu Qiao et al.

Text-to-Image generation has evolved from basic image synthesis into a frequently used core capability in professional creative workflows, where simple text-image alignment can no longer satisfy users' pressing demands for faithful real-world reconstruction and genuine creative expression. Existing benchmarks, however, remain anchored in these foundational criteria and do not yet capture the nuanced capabilities that matter in authentic artistic practice, making it difficult to reliably distinguish state-of-the-art T2I models. To address this gap, we introduce Qwen-Image-Bench, a creator-centric benchmark co-designed with professional artists and grounded in real-world creation scenarios. Qwen-Image-Bench enriches conventional evaluation with two application-driven dimensions: Real-world Fidelity and Creative Generation. Drawing on the staged reasoning inherent in professional artistic workflows, we organize the resulting five pillars into a top-down hierarchical taxonomy that further decomposes into 23 second-level sub-capabilities and 56 third-level verifiable rubrics. To ensure broad coverage, we curate 1,000 stratified prompts, with each prompt jointly exercising more than four fine-grained facets across multiple pillars. We train a unified judge model, Q-Judger, based on Qwen3.6-27B and supervised by 80 professional annotators from global art academies under blind labeling and triple-review protocols; it scores every image across all 56 verifiable facets, producing fine-grained, rubric-grounded, and fully attributable diagnostics rather than a single opaque score. Empirically, Qwen-Image-Bench reliably distinguishes leading T2I models, achieving the greatest separation on the two application-driven dimensions of Real-world Fidelity and Creative Generation, where existing benchmarks provide little insight, while also providing a trustworthy optimization signal for production-level T2I development.

LG Nov 21, 2024
An accuracy improving method for advertising click through rate prediction based on enhanced xDeepFM model

Xiaowei Xi, Song Leng, Yuqing Gong et al.

Advertising click-through rate (CTR) prediction aims to forecast the probability that a user will click on an advertisement in a given context, thus providing enterprises with decision support for product ranking and ad placement. However, CTR prediction faces challenges such as data sparsity and class imbalance, which adversely affect model training effectiveness. Moreover, most current CTR prediction models fail to fully explore the associations among user history, interests, and target advertisements from multiple perspectives, neglecting important information at different levels. To address these issues, this paper proposes an improved CTR prediction model based on the xDeepFM architecture. By integrating a multi-head attention mechanism, the model can simultaneously focus on different aspects of feature interactions, enhancing its ability to learn intricate patterns without significantly increasing computational complexity. Furthermore, replacing the linear model with a Factorization Machine (FM) model improves the handling of high-dimensional sparse data by flexibly capturing both first-order and second-order feature interactions. Experimental results on the Criteo dataset demonstrate that the proposed model outperforms other state-of-the-art methods, showing significant improvements in both AUC and Logloss metrics. This enhancement facilitates better mining of implicit relationships between features and improves the accuracy of advertising CTR prediction.
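
The abstract above says the Factorization Machine component captures first-order and second-order feature interactions. As a rough illustration only, the following is a minimal sketch of a generic second-order FM score, not the paper's enhanced xDeepFM model. The function names `fm_logit` and `fm_click_probability`, the toy weights, and the use of a sigmoid to turn the score into a click probability are all assumptions made for this example.

```python
import math


def fm_logit(x, w0, w, v):
    """Generic second-order Factorization Machine score (illustrative sketch).

    x  : list of feature values (mostly zeros for sparse CTR data)
    w0 : global bias
    w  : first-order weights, one per feature
    v  : latent vectors, one list of k floats per feature (hypothetical toy values)
    """
    n = len(x)
    k = len(v[0])

    # First-order term: sum_i w_i * x_i
    first_order = sum(w[i] * x[i] for i in range(n))

    # Second-order term: sum_{i<j} <v_i, v_j> x_i x_j, computed in O(n * k) via
    # 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
    second_order = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        second_order += 0.5 * (s * s - s_sq)

    return w0 + first_order + second_order


def fm_click_probability(x, w0, w, v):
    """Map the FM score to a probability with a sigmoid (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-fm_logit(x, w0, w, v)))


if __name__ == "__main__":
    x = [1.0, 0.0, 1.0]                        # two active features out of three
    w0 = 0.1
    w = [0.2, 0.3, -0.1]
    v = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]  # k = 2 latent factors per feature
    print(fm_logit(x, w0, w, v))
    print(fm_click_probability(x, w0, w, v))
```

Because each pairwise weight is the inner product of two latent vectors, a pair of features that never co-occurs in training can still receive a non-zero interaction weight. This is the usual argument for FMs on high-dimensional sparse inputs.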