CVJul 1, 2024
Label-free Neural Semantic Image SynthesisJiayi Wang, Kevin Alexander Laube, Yumeng Li et al. · amazon-science
Recent work has shown great progress in integrating spatial conditioning to control large, pre-trained text-to-image diffusion models. Despite these advances, existing methods describe the spatial image content using hand-crafted conditioning inputs, which are either semantically ambiguous (e.g., edges) or require expensive manual annotations (e.g., semantic segmentation). To address these limitations, we propose a new label-free way of conditioning diffusion models to enable fine-grained spatial control. We introduce the concept of neural semantic image synthesis, which uses neural layouts extracted from pre-trained foundation models as conditioning. Neural layouts are advantageous as they provide rich descriptions of the desired image, containing both semantics and detailed geometry of the scene. We experimentally show that images synthesized via neural semantic image synthesis achieve similar or superior pixel-level alignment of semantic classes compared to those created using expensive semantic label maps. At the same time, they capture better semantics, instance separation, and object orientation than other label-free conditioning options, such as edges or depth. Moreover, we show that images generated by neural layout conditioning can effectively augment real data for training various perception tasks.
LGDec 3, 2021Code
The UniNAS framework: combining modules in arbitrarily complex configurations with argument treesKevin Alexander Laube
Designing code to be simplistic yet to offer choice is a tightrope walk. Additional modules such as optimizers and data sets make a framework useful to a broader audience, but the added complexity quickly becomes a problem. Framework parameters may apply only to some modules but not others, be mutually exclusive or depend on each other, often in unclear ways. Even so, many frameworks are limited to a few specific use cases. This paper presents the underlying concept of UniNAS, a framework designed to incorporate a variety of Neural Architecture Search approaches. Since they differ in the number of optimizers and networks, hyper-parameter optimization, network designs, candidate operations, and more, a traditional approach can not solve the task. Instead, every module defines its own hyper-parameters and a local tree structure of module requirements. A configuration file specifies which modules are used, their used parameters, and which other modules they use in turn This concept of argument trees enables combining and reusing modules in complex configurations while avoiding many problems mentioned above. Argument trees can also be configured from a graphical user interface so that designing and changing experiments becomes possible without writing a single line of code. UniNAS is publicly available at https://github.com/cogsys-tuebingen/uninas
LGJun 18, 2019Code
Prune and Replace NASKevin Alexander Laube, Andreas Zell
While recent NAS algorithms are thousands of times faster than the pioneering works, it is often overlooked that they use fewer candidate operations, resulting in a significantly smaller search space. We present PR-DARTS, a NAS algorithm that discovers strong network configurations in a much larger search space and a single day. A small candidate operation pool is used, from which candidates are progressively pruned and replaced with better performing ones. Experiments on CIFAR-10 and CIFAR-100 achieve 2.51% and 15.53% test error, respectively, despite searching in a space where each cell has 150 times as many possible configurations than in the DARTS baseline. Code is available at https://github.com/cogsys-tuebingen/prdarts
36.0CVApr 28
The Surprising Effectiveness of Canonical Knowledge Distillation for Semantic SegmentationMuhammad Ali, Kevin Alexander Laube, Madan Ravi Ganesh et al.
Recent knowledge distillation (KD) methods for semantic segmentation introduce increasingly complex hand-crafted objectives, yet are typically evaluated under fixed iteration schedules. These objectives substantially increase per-iteration cost, meaning equal iteration counts do not correspond to equal training budgets. It is therefore unclear whether reported gains reflect stronger distillation signals or simply greater compute. We show that iteration-based comparisons are misleading: when wall-clock compute is matched, \textit{canonical} logit- and feature-based KD outperform recent segmentation-specific methods. Under extended training, feature-based distillation achieves state-of-the-art ResNet-18 performance on Cityscapes and ADE20K. A PSPNet ResNet-18 student closely approaches its ResNet-101 teacher despite using only one quarter of the parameters, reaching 99\% of the teacher's mIoU on Cityscapes (79.0 vs.\ 79.8) and 92\% on ADE20K. Our results challenge the prevailing assumption that KD for segmentation requires task-specific mechanisms and suggest that scaling, rather than complex hand-crafted objectives, should guide future method design.
CVJun 2, 2025
Improving Knowledge Distillation Under Unknown Covariate Shift Through Confidence-Guided Data AugmentationNiclas Popp, Kevin Alexander Laube, Matthias Hein et al.
Large foundation models trained on extensive datasets demonstrate strong zero-shot capabilities in various domains. To replicate their success when data and model size are constrained, knowledge distillation has become an established tool for transferring knowledge from foundation models to small student networks. However, the effectiveness of distillation is critically limited by the available training data. This work addresses the common practical issue of covariate shift in knowledge distillation, where spurious features appear during training but not at test time. We ask the question: when these spurious features are unknown, yet a robust teacher is available, is it possible for a student to also become robust to them? We address this problem by introducing a novel diffusion-based data augmentation strategy that generates images by maximizing the disagreement between the teacher and the student, effectively creating challenging samples that the student struggles with. Experiments demonstrate that our approach significantly improves worst group and mean group accuracy on CelebA and SpuCo Birds as well as the spurious mAUC on spurious ImageNet under covariate shift, outperforming state-of-the-art diffusion-based data augmentation baselines
LGApr 23, 2021
Conditional super-network weightsKevin Alexander Laube, Andreas Zell
Modern Neural Architecture Search methods have repeatedly broken state-of-the-art results for several disciplines. The super-network, a central component of many such methods, enables quick estimates of accuracy or loss statistics for any architecture in the search space. They incorporate the network weights of all candidate architectures and can thus approximate specific ones by applying the respective operations. However, this design ignores potential dependencies between consecutive operations. We extend super-networks with conditional weights that depend on combinations of choices and analyze their effect. Experiments in NAS-Bench 201 and NAS-Bench-Macro-based search spaces show improvements in the architecture selection and that the resource overhead is nearly negligible for sequential network designs.
LGDec 7, 2018
ShuffleNASNets: Efficient CNN models through modified Efficient Neural Architecture SearchKevin Alexander Laube, Andreas Zell
Neural network architectures found by sophistic search algorithms achieve strikingly good test performance, surpassing most human-crafted network models by significant margins. Although computationally efficient, their design is often very complex, impairing execution speed. Additionally, finding models outside of the search space is not possible by design. While our space is still limited, we implement undiscoverable expert knowledge into the economic search algorithm Efficient Neural Architecture Search (ENAS), guided by the design principles and architecture of ShuffleNet V2. While maintaining baseline-like 2.85% test error on CIFAR-10, our ShuffleNASNets are significantly less complex, require fewer parameters, and are two times faster than the ENAS baseline in a classification task. These models also scale well to a low parameter space, achieving less than 5% test error with little regularization and only 236K parameters.