CV | Nov 27, 2023 | Code
PKU-I2IQA: An Image-to-Image Quality Assessment Database for AI Generated Images
Jiquan Yuan, Xinyan Cao, Changjin Li et al.
As image generation technology advances, AI-based image generation has been applied in various fields, and Artificial Intelligence Generated Content (AIGC) has garnered widespread attention. However, the development of AI-based image generative models also brings new problems and challenges. A significant challenge is that AI-generated images (AIGIs) may exhibit unique distortions compared to natural images, and not all generated images meet real-world requirements. It is therefore important to evaluate AIGIs more comprehensively. Previous work has established several human perception-based AIGC image quality assessment (AIGCIQA) databases for text-generated images, but AI image generation covers both text-to-image and image-to-image scenarios, so assessing only the images generated by text-to-image models is insufficient. To address this issue, we establish a human perception-based image-to-image AIGCIQA database, named PKU-I2IQA. We conduct a well-organized subjective experiment to collect quality labels for AIGIs and then conduct a comprehensive analysis of the PKU-I2IQA database. Furthermore, we propose two benchmark models: NR-AIGCIQA, based on the no-reference image quality assessment method, and FR-AIGCIQA, based on the full-reference image quality assessment method. Finally, leveraging this database, we conduct benchmark experiments and compare the performance of the proposed benchmark models. The PKU-I2IQA database and benchmarks will be released to facilitate future research at https://github.com/jiquan123/I2IQA.
CL | Nov 14, 2025 | Code
LaoBench: A Large-Scale Multidimensional Lao Benchmark for Large Language Models
Jian Gao, Richeng Xuan, Zhaolu Kang et al.
The rapid advancement of large language models (LLMs) has not been matched by their evaluation in low-resource languages, especially Southeast Asian languages like Lao. To fill this gap, we introduce LaoBench, the first large-scale, high-quality, and multidimensional benchmark dataset dedicated to assessing LLMs' comprehensive language understanding and reasoning abilities in Lao. LaoBench comprises over 17,000 carefully curated samples spanning three core dimensions: knowledge application, K12 foundational education, and bilingual translation among Lao, Chinese, and English. The dataset is divided into open-source and closed-source subsets, with the closed-source portion enabling black-box evaluation on an official platform to ensure fairness and data security. Our data construction pipeline integrates expert human curation with automated agent-assisted verification, ensuring linguistic accuracy, cultural relevance, and educational value. Benchmarking multiple state-of-the-art LLMs on LaoBench reveals that current models still face significant challenges in mastering Lao across diverse tasks. We hope LaoBench will catalyze further research and development of AI technologies for underrepresented Southeast Asian languages.
RO | Feb 10
Sci-VLA: Agentic VLA Inference Plugin for Long-Horizon Tasks in Scientific Experiments
Yiwen Pang, Bo Zhou, Changjin Li et al.
Robotic laboratories play a critical role in autonomous scientific discovery by enabling scalable, continuous experimental execution. Recent vision-language-action (VLA) models offer a promising foundation for robotic laboratories. However, scientific experiments typically involve long-horizon tasks composed of multiple atomic tasks, posing a fundamental challenge to existing VLA models. While VLA models fine-tuned for scientific tasks can reliably execute atomic experimental actions seen during training, they often fail to perform composite tasks formed by reordering and composing these known atomic actions. This limitation arises from a distributional mismatch between training-time atomic tasks and inference-time composite tasks, which prevents VLA models from executing necessary transitional operations between atomic tasks. To address this challenge, we propose an Agentic VLA Inference Plugin for Long-Horizon Tasks in Scientific Experiments. It introduces an LLM-based agentic inference mechanism that intervenes when executing sequential manipulation tasks. By performing explicit transition inference and generating transitional robotic action code, the proposed plugin guides VLA models through missing transitional steps, enabling reliable execution of composite scientific workflows without any additional training. This inference-only intervention makes our method computationally efficient, data-efficient, and well-suited for open-ended and long-horizon robotic laboratory tasks. We build 3D assets of scientific instruments and common scientific operating scenes within an existing simulation environment. In these scenes, we have verified that our method increases the average success rate per atomic task by 42% during inference. Furthermore, we show that our method can be easily transferred from the simulation to real scientific laboratories.
CV | Jun 3, 2020
Deep Learning Methods for Real-time Detection and Analysis of Wagner Ulcer Classification System
Aifu Han, Yongze Zhang, Ajuan Li et al.
At present, the prevailing method for diagnosing the severity of diabetic feet (DF) depends on professional podiatrists. However, in most cases, professional podiatrists have a heavy workload, especially in underdeveloped and developing countries and regions, and there are often insufficient podiatrists to meet the rapidly growing treatment needs of DF patients. It is necessary to develop a medical system that assists in diagnosing DF in order to reduce part of the workload for podiatrists and to provide timely, relevant information to patients with DF. In this paper, we develop a system that can classify and locate Wagner ulcers of the diabetic foot in real time. First, we propose a dataset of 2688 annotated diabetic foot images. Then, to enable the system to detect diabetic foot ulcers accurately and in real time, we build on the YOLOv3 algorithm and combine it with image fusion, label smoothing, and variant learning rate mode technologies to improve the robustness and predictive accuracy of the original algorithm. Finally, the refined YOLOv3 is selected as the optimal algorithm in this paper and deployed on an Android smartphone to predict the class and location of diabetic foot ulcers in real time. The experimental results show that the improved YOLOv3 algorithm achieves a mAP of 91.95% and meets the needs of real-time detection and analysis of diabetic foot Wagner ulcers on mobile devices such as smartphones. This work has the potential to lead to a paradigm shift in the clinical treatment of DF and to provide an effective healthcare solution for DF tissue analysis and healing status.
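The abstract above names label smoothing as one of the refinements but gives no details. As a rough illustration only, the sketch below shows the standard uniform formulation of label smoothing in plain Python. The function name `smooth_labels`, the value `epsilon=0.1`, and the five-class target are assumptions chosen for the example; they are not taken from the paper, which may use a different variant or setting.

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Uniform label smoothing: move epsilon of the probability mass
    from the true class and spread it evenly over all classes.

    one_hot: list of floats, 1.0 at the true class and 0.0 elsewhere.
    epsilon: smoothing strength (assumed value, not from the paper).
    """
    num_classes = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / num_classes for y in one_hot]


# Hypothetical 5-class target whose true class is index 2.
hard_target = [0.0, 0.0, 1.0, 0.0, 0.0]
soft_target = smooth_labels(hard_target, epsilon=0.1)
print(soft_target)
```

The softened target still sums to one, but the true class no longer carries all of the mass, which discourages a classifier from becoming overconfident on noisy annotations.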