CV, AI · Mar 8

SketchGraphNet: A Memory-Efficient Hybrid Graph Transformer for Large-Scale Sketch Corpora Recognition

Shilong Chen, Mingyuan Li, Zhaoyang Wang, Zhonglin Ye, Haixing Zhao

arXiv:2603.07521v1

Predicted impact: top 83% in CV · last 90 days
Originality: Incremental advance

AI Analysis

This work contributes a new benchmark and a memory-efficient method for large-scale sketch recognition, an incremental advance for researchers and developers working with sketch data.

This paper tackles large-scale sketch recognition by modeling free-hand sketches directly as structured graphs. The proposed SketchGraphNet achieves Top-1 accuracies of 83.62% on SketchGraph-A and 87.61% on SketchGraph-R, and its memory-efficient attention (MemEffAttn) reduces peak GPU memory by over 40% and training time by more than 30% compared with Performer-based global attention.

This work investigates large-scale sketch recognition from a graph-native perspective, where free-hand sketches are directly modeled as structured graphs rather than raster images or stroke sequences. We propose SketchGraphNet, a hybrid graph neural architecture that integrates local message passing with a memory-efficient global attention mechanism, without relying on auxiliary positional or structural encodings. To support systematic evaluation, we construct SketchGraph, a large-scale benchmark comprising 3.44 million graph-structured sketches across 344 categories, with two variants (A and R) to reflect different noise conditions. Each sketch is represented as a spatiotemporal graph with normalized stroke-order attributes. On SketchGraph-A and SketchGraph-R, SketchGraphNet achieves Top-1 accuracies of 83.62% and 87.61%, respectively, under a unified training configuration. MemEffAttn further reduces peak GPU memory by over 40% and training time by more than 30% compared with Performer-based global attention, while maintaining comparable accuracy.
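To make the graph-native representation concrete, the following is a minimal sketch of how a free-hand sketch could be turned into a graph with a normalized stroke-order attribute. It is not the authors' construction: the function name `sketch_to_graph`, the node feature layout `(x, y, stroke_order)`, the bounding-box normalization, and the choice to connect only consecutive points within a stroke are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Node = Tuple[float, float, float]   # (x, y, normalized stroke order) -- assumed layout
Edge = Tuple[int, int]


def sketch_to_graph(strokes: List[List[Point]]) -> Tuple[List[Node], List[Edge]]:
    """Hypothetical helper: convert a list of strokes into (nodes, edges).

    Assumptions (not taken from the paper):
      * coordinates are rescaled into [0, 1] using the sketch bounding box,
      * stroke order is normalized to [0, 1] by stroke index,
      * edges link consecutive points of the same stroke, in both directions.
    """
    points = [p for stroke in strokes for p in stroke]
    if not points:
        raise ValueError("sketch contains no points")

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # Use one scale for both axes so the aspect ratio is preserved.
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0

    n_strokes = len(strokes)
    nodes: List[Node] = []
    edges: List[Edge] = []
    for s_idx, stroke in enumerate(strokes):
        # Temporal attribute: 0.0 for the first stroke, 1.0 for the last.
        order = s_idx / (n_strokes - 1) if n_strokes > 1 else 0.0
        start = len(nodes)
        for x, y in stroke:
            nodes.append(((x - min_x) / scale, (y - min_y) / scale, order))
        # Spatial structure: connect neighbouring points within this stroke.
        for i in range(start, len(nodes) - 1):
            edges.append((i, i + 1))
            edges.append((i + 1, i))
    return nodes, edges


# Two strokes: a horizontal line, then an L-shaped stroke.
example_strokes = [[(0, 0), (10, 0)], [(10, 10), (0, 10), (0, 5)]]
example_nodes, example_edges = sketch_to_graph(example_strokes)
```

Under these assumptions, the node list and edge list could be passed to any graph learning library as node features and an edge index; strokes remain disconnected from one another, so any cross-stroke interaction would have to come from a global mechanism such as the attention component described above.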
