IV Oct 20, 2022
Reversed Image Signal Processing and RAW Reconstruction. AIM 2022 Challenge Report
Marcos V. Conde, Radu Timofte, Yibin Huang et al.
Cameras capture sensor RAW images and transform them into pleasant RGB images, suitable for the human eye, using their integrated Image Signal Processor (ISP). Numerous low-level vision tasks operate in the RAW domain (e.g. image denoising, white balance) due to its linear relationship with the scene irradiance, wide range of information at 12 bits, and sensor designs. Despite this, RAW image datasets are scarce and more expensive to collect than the already large and public RGB datasets. This paper introduces the AIM 2022 Challenge on Reversed Image Signal Processing and RAW Reconstruction. We aim to recover raw sensor images from the corresponding RGBs without metadata and, by doing this, "reverse" the ISP transformation. The proposed methods and benchmark establish the state-of-the-art for this low-level vision inverse problem, and generating realistic raw sensor readings can potentially benefit other tasks such as denoising and super-resolution.
CV Oct 20, 2022
Overexposure Mask Fusion: Generalizable Reverse ISP Multi-Step Refinement
Jinha Kim, Jun Jiang, Jinwei Gu
With the advent of deep learning methods replacing the ISP in transforming sensor RAW readings into RGB images, numerous methodologies have solidified into real-life applications. Equally potent is the task of inverting this process, which has applications in enhancing computational photography tasks that are conducted in the RAW domain, addressing the lack of available RAW data while reaping the benefits of performing tasks directly on sensor readings. This paper's proposed methodology is a state-of-the-art solution to the task of RAW reconstruction, and the multi-step refinement process integrating an overexposure mask is novel in three ways: instead of training from RGB to Bayer, the pipeline trains from RGB to demosaiced RAW, allowing the use of perceptual loss functions; the multi-step process greatly enhances the performance of the baseline U-Net from start to end; and the pipeline is a generalizable process of refinement that can enhance other high-performance methodologies that support end-to-end learning.
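The abstract does not spell out how the overexposure mask is built or fused, so the following is only a minimal sketch of one plausible reading: flag pixels whose brightest channel is near saturation, then use that mask to choose per pixel between two candidate RAW predictions. The function names, the 0.95 threshold, and the hard per-pixel blend are all assumptions made for illustration, not the authors' pipeline.

```python
def overexposure_mask(rgb, threshold=0.95):
    """Return a 2D mask: 1.0 where any channel is at or above `threshold`.

    `rgb` is a list of rows, each row a list of (r, g, b) tuples in [0, 1].
    The threshold value is a hypothetical choice, not taken from the paper.
    """
    return [[1.0 if max(px) >= threshold else 0.0 for px in row] for row in rgb]


def fuse(pred_normal, pred_over, mask):
    """Blend two predictions per pixel: mask * pred_over + (1 - mask) * pred_normal."""
    return [
        [
            tuple(m * o + (1.0 - m) * n for n, o in zip(px_n, px_o))
            for px_n, px_o, m in zip(row_n, row_o, row_m)
        ]
        for row_n, row_o, row_m in zip(pred_normal, pred_over, mask)
    ]


# Tiny 1x2 image: the second pixel is saturated in the red channel.
rgb = [[(0.2, 0.3, 0.1), (1.0, 0.9, 0.8)]]
mask = overexposure_mask(rgb)

# Stand-ins for the outputs of two hypothetical reconstruction models.
pred_normal = [[(0.1, 0.1, 0.1), (0.1, 0.1, 0.1)]]
pred_over = [[(0.9, 0.9, 0.9), (0.9, 0.9, 0.9)]]
fused = fuse(pred_normal, pred_over, mask)
```

In this toy case the well-exposed pixel keeps the first model's output and the saturated pixel takes the second model's output. A soft mask with values between 0 and 1 would work with the same `fuse` function.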
LG Jul 1, 2024 Code
Swish-T : Enhancing Swish Activation with Tanh Bias for Improved Neural Network Performance
Youngmin Seo, Jinha Kim, Unsang Park
We propose the Swish-T family, an enhancement of the existing non-monotonic activation function Swish. Swish-T is defined by adding a Tanh bias to the original Swish function. This modification creates a family of Swish-T variants, each designed to excel in different tasks, showcasing specific advantages depending on the application context. The Tanh bias allows for broader acceptance of negative values during initial training stages, offering a smoother non-monotonic curve than the original Swish. We ultimately propose the Swish-T$_{\textbf{C}}$ function, while Swish-T and Swish-T$_{\textbf{B}}$, byproducts of Swish-T$_{\textbf{C}}$, also demonstrate satisfactory performance. Furthermore, our ablation study shows that using Swish-T$_{\textbf{C}}$ as a non-parametric function can still achieve high performance. The superiority of the Swish-T family has been empirically demonstrated across various models and benchmark datasets, including MNIST, Fashion MNIST, SVHN, CIFAR-10, and CIFAR-100. The code is publicly available at https://github.com/ictseoyoungmin/Swish-T-pytorch.
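As a rough illustration of "adding a Tanh bias to the original Swish function", the sketch below implements Swish as `x * sigmoid(beta * x)` and adds a term `alpha * tanh(x)`. The parameter names `beta` and `alpha`, their default values, and the exact placement of the bias are assumptions made here; the abstract does not give the formulas for the Swish-T, Swish-T$_{\textbf{B}}$, or Swish-T$_{\textbf{C}}$ variants, so the linked repository should be treated as the reference.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


def swish_t(x, beta=1.0, alpha=0.1):
    """Swish plus a tanh bias term (assumed form): swish(x) + alpha * tanh(x)."""
    return swish(x, beta) + alpha * math.tanh(x)


# Compare the two activations on a few sample inputs.
samples = [-3.0, -1.0, 0.0, 1.0, 3.0]
table = [(x, swish(x), swish_t(x)) for x in samples]
```

With a positive `alpha`, the tanh term pushes outputs further below zero for negative inputs, which is one way to read the abstract's "broader acceptance of negative values". Setting `alpha=0.0` recovers plain Swish.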
LG Apr 11, 2024
Can Contrastive Learning Refine Embeddings
Lihui Liu, Jinha Kim, Vidit Bansal
Recent advancements in contrastive learning have revolutionized self-supervised representation learning and achieved state-of-the-art performance on benchmark tasks. While most existing methods focus on applying contrastive learning to input data modalities such as images, natural language sentences, or networks, they overlook the potential of utilizing outputs from previously trained encoders. In this paper, we introduce SIMSKIP, a novel contrastive learning framework that specifically refines input embeddings for downstream tasks. Unlike traditional unsupervised learning approaches, SIMSKIP takes advantage of the output embeddings of encoder models as its input. Through theoretical analysis, we provide evidence that applying SIMSKIP does not result in larger upper bounds on downstream task errors than those of the original embeddings, which serve as SIMSKIP's input. Experimental results on various open datasets demonstrate that the embeddings produced by SIMSKIP improve performance on downstream tasks.
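To make the idea of contrastive learning over existing embeddings concrete, here is a minimal sketch of an InfoNCE-style loss computed directly on embedding vectors. It is a generic contrastive objective, not SIMSKIP's actual architecture or loss, which the abstract does not describe; the function names, the temperature value, and the way the two views are paired are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(views_a, views_b, temperature=0.5):
    """Mean InfoNCE loss over a batch of embedding pairs.

    views_a[i] and views_b[i] are treated as a positive pair (two views of
    the same item); every other views_b[j] acts as a negative for views_a[i].
    """
    total = 0.0
    for i, anchor in enumerate(views_a):
        logits = [cosine(anchor, other) / temperature for other in views_b]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(views_a)


# Two toy embeddings, as if produced by a previously trained encoder.
embeddings = [[1.0, 0.0], [0.0, 1.0]]
loss_matched = info_nce(embeddings, embeddings)
loss_swapped = info_nce(embeddings, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is low when each embedding is most similar to its own paired view and high when the pairs are mismatched. In a refinement setting, the two views would typically be perturbed copies of the same input embedding passed through a small trainable network.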
LG Jul 25, 2025
WACA-UNet: Weakness-Aware Channel Attention for Static IR Drop Prediction in Integrated Circuit Design
Youngmin Seo, Yunhyeong Kwon, Younghun Park et al.
Accurate spatial prediction of power integrity issues, such as IR drop, is critical for reliable VLSI design. However, traditional simulation-based solvers are computationally expensive and difficult to scale. We address this challenge by reformulating IR drop estimation as a pixel-wise regression task on heterogeneous multi-channel physical maps derived from circuit layouts. Prior learning-based methods treat all input layers (e.g., metal, via, and current maps) equally, ignoring their varying importance to prediction accuracy. To tackle this, we propose a novel Weakness-Aware Channel Attention (WACA) mechanism, which recursively enhances weak feature channels while suppressing over-dominant ones through a two-stage gating strategy. Integrated into a ConvNeXtV2-based attention U-Net, our approach enables adaptive and balanced feature representation. On the public ICCAD-2023 benchmark, our method outperforms the ICCAD-2023 contest winner by reducing mean absolute error by 61.1% and improving F1-score by 71.0%. These results demonstrate that channel-wise heterogeneity is a key inductive bias in physical layout analysis for VLSI.