CVMar 25Code
VFIG: Vectorizing Complex Figures in SVG with Vision-Language ModelsQijia He, Xunmei Liu, Hammaad Memon et al. · allen-ai
Scalable Vector Graphics (SVG) are an essential format for technical illustration and digital design, offering precise resolution independence and flexible semantic editability. In practice, however, original vector source files are frequently lost or inaccessible, leaving only "flat" rasterized versions (e.g., PNG or JPEG) that are difficult to modify or scale. Manually reconstructing these figures is a prohibitively labor-intensive process, requiring specialized expertise to recover the original geometric intent. To bridge this gap, we propose VFIG, a family of Vision-Language Models trained for complex and high-fidelity figure-to-SVG conversion. While this task is inherently data-driven, existing datasets are typically small-scale and lack the complexity of professional diagrams. We address this by introducing VFIG-DATA, a large-scale dataset of 66K high-quality figure-SVG pairs, curated from a diverse mix of real-world paper figures and procedurally generated diagrams. Recognizing that SVGs are composed of recurring primitives and hierarchical local structures, we introduce a coarse-to-fine training curriculum that begins with supervised fine-tuning (SFT) to learn atomic primitives and transitions to reinforcement learning (RL) refinement to optimize global diagram fidelity, layout consistency, and topological edge cases. Finally, we introduce VFIG-BENCH, a comprehensive evaluation suite with novel metrics designed to measure the structural integrity of complex figures. VFIG achieves state-of-the-art performance among open-source models and performs on par with GPT-5.2, achieving a VLM-Judge score of 0.829 on VFIG-BENCH.
CVJun 16, 2022Code
NCAGC: A Neighborhood Contrast Framework for Attributed Graph ClusteringTong Wang, Guanyu Yang, Qijia He et al.
Attributed graph clustering is one of the most fundamental tasks among graph learning field, the goal of which is to group nodes with similar representations into the same cluster without human annotations. Recent studies based on graph contrastive learning method have achieved remarkable results when exploit graph-structured data. However, most existing methods 1) do not directly address the clustering task, since the representation learning and clustering process are separated; 2) depend too much on data augmentation, which greatly limits the capability of contrastive learning; 3) ignore the contrastive message for clustering tasks, which adversely degenerate the clustering results. In this paper, we propose a Neighborhood Contrast Framework for Attributed Graph Clustering, namely NCAGC, seeking for conquering the aforementioned limitations. Specifically, by leveraging the Neighborhood Contrast Module, the representation of neighbor nodes will be 'push closer' and become clustering-oriented with the neighborhood contrast loss. Moreover, a Contrastive Self-Expression Module is built by minimizing the node representation before and after the self-expression layer to constraint the learning of self-expression matrix. All the modules of NCAGC are optimized in a unified framework, so the learned node representation contains clustering-oriented messages. Extensive experimental results on four attributed graph datasets demonstrate the promising performance of NCAGC compared with 16 state-of-the-art clustering methods. The code is available at https://github.com/wangtong627/NCAGC.
AISep 14, 2025
Rethinking Human Preference Evaluation of LLM RationalesZiang Li, Manasi Ganti, Zixian Ma et al.
Large language models (LLMs) often generate natural language rationales -- free-form explanations that help improve performance on complex reasoning tasks and enhance interpretability for human users. However, evaluating these rationales remains challenging. While recent work has relied on binary preference judgments from humans or LLM judges, such evaluations are often opaque and coarse-grained, offering limited insight into what makes one rationale better than another. In this work, we rethink preference evaluation for LLM-generated rationales by asking: (1) What attributes define good rationales? (2) Can human preferences be explained by these attributes? (3) Can attribute-based evaluation overcome the limitations of binary comparisons? We identify a set of key rationale attributes from prior literature and assess them using automatic metrics, LLM judgments, and human annotations. We then analyze two standard human preference datasets MT Bench and Chatbot Arena using SHAP to identify which attributes best explain human preference outcomes. Finally, we re-evaluate model-generated rationales using attribute-specific ELO scores, revealing more nuanced model comparisons and insights. Our findings suggest that fine-grained attribute evaluations can better characterize rationale quality and guide future research toward more interpretable and reliable evaluation practices.