LG | AI | CV | Jan 9

AGDC: Autoregressive Generation of Variable-Length Sequences with Joint Discrete and Continuous Spaces

arXiv:2601.05680v1 | h-index: 4
Originality: Incremental advance
AI Analysis

This paper addresses precision loss in generating hybrid discrete-continuous sequences for applications such as semiconductor design, where errors can cause functional failure. It is an incremental improvement over existing methods.

The paper tackles the generation of variable-length sequences that mix discrete and continuous values. Discretization-based methods lose precision on this task, which limits them in high-precision domains such as semiconductor circuit design. The paper proposes AGDC, a unified framework that jointly models both value types, and reports that it outperforms discretization-based and fixed-schema baselines at generating high-fidelity hybrid vector representations on semiconductor layouts, graphic layouts, and SVGs.

Transformer-based autoregressive models excel in data generation but are inherently constrained by their reliance on discretized tokens, which limits their ability to represent continuous values with high precision. We analyze the scalability limitations of existing discretization-based approaches for generating hybrid discrete-continuous sequences, particularly in high-precision domains such as semiconductor circuit designs, where precision loss can lead to functional failure. To address the challenge, we propose AGDC, a novel unified framework that jointly models discrete and continuous values for variable-length sequences. AGDC employs a hybrid approach that combines categorical prediction for discrete values with diffusion-based modeling for continuous values, incorporating two key technical components: an end-of-sequence (EOS) logit adjustment mechanism that uses an MLP to dynamically adjust EOS token logits based on sequence context, and a length regularization term integrated into the loss function. Additionally, we present ContLayNet, a large-scale benchmark comprising 334K high-precision semiconductor layout samples with specialized evaluation metrics that capture functional correctness where precision errors significantly impact performance. Experiments on semiconductor layouts (ContLayNet), graphic layouts, and SVGs demonstrate AGDC's superior performance in generating high-fidelity hybrid vector representations compared to discretization-based and fixed-schema baselines, achieving scalable high-precision generation across diverse domains.
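
The EOS logit adjustment described in the abstract can be illustrated with a minimal sketch. The abstract states only that an MLP adjusts the EOS token logit based on sequence context, so everything else here is an assumption made for illustration: the additive form of the adjustment, the two-layer ReLU MLP, the random weights, the toy vocabulary with EOS at index 0, and the helper names `make_mlp`, `eos_offset`, and `adjust_eos_logit`. The length regularization term is not sketched because the abstract does not give its form.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_mlp(in_dim, hidden_dim, seed=0):
    """Build a tiny two-layer MLP with random weights (hypothetical sizes)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [rng.gauss(0, 0.1) for _ in range(hidden_dim)]
    b2 = 0.0
    return w1, b1, w2, b2


def eos_offset(mlp, context):
    """Map a context vector to a scalar offset for the EOS logit."""
    w1, b1, w2, b2 = mlp
    # Hidden layer with ReLU activation.
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, context)) + b)
        for row, b in zip(w1, b1)
    ]
    # Linear output layer producing a single scalar.
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def adjust_eos_logit(logits, eos_index, offset):
    """Return a copy of logits with the offset added to the EOS entry only."""
    adjusted = list(logits)
    adjusted[eos_index] += offset
    return adjusted


# Toy setup: index 0 is EOS (an assumption of this sketch, not from the paper).
EOS = 0
logits = [0.5, 1.0, -0.3, 0.2]   # stand-in for the categorical head's output
context = [0.1, -0.4, 0.7]       # stand-in for the sequence-context vector
mlp = make_mlp(in_dim=3, hidden_dim=8)

offset = eos_offset(mlp, context)
probs_before = softmax(logits)
probs_after = softmax(adjust_eos_logit(logits, EOS, offset))
```

Because only the EOS entry changes, the relative probabilities of the other tokens are preserved. The adjustment shifts probability mass toward or away from ending the sequence and leaves the choice among the remaining tokens untouched.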

