Jiyin Zhang

2papers

2 Papers

CVFeb 23
InfScene-SR: Spatially Continuous Inference for Arbitrary-Size Image Super-Resolution

Shoukun Sun, Zhe Wang, Xiang Que et al.

Image Super-Resolution (SR) aims to recover high-resolution (HR) details from low-resolution (LR) inputs, a task where Denoising Diffusion Probabilistic Models (DDPMs) have recently shown superior performance compared to Generative Adversarial Networks (GANs) based approaches. However, standard diffusion-based SR models, such as SR3, are typically trained on fixed-size patches and struggle to scale to arbitrary-sized images due to memory constraints. Applying these models via independent patch processing leads to visible seams and inconsistent textures across boundaries. In this paper, we propose InfScene-SR, a framework enabling spatially continuous super-resolution for large, arbitrary scenes. We adapt the iterative refinement process of diffusion models with a novel guided and variance-corrected fusion mechanism, allowing for the seamless generation of large-scale high-resolution imagery without retraining. We validate our approach on remote sensing datasets, demonstrating that InfScene-SR not only reconstructs fine details with high perceptual quality but also eliminates boundary artifacts, benefiting downstream tasks such as semantic segmentation.

LGNov 26, 2025
CTR Prediction on Alibaba's Taobao Advertising Dataset Using Traditional and Deep Learning Models

Hongyu Yang, Chunxi Wen, Jiyin Zhang et al.

Click-through rates prediction is critical in modern advertising systems, where ranking relevance and user engagement directly impact platform efficiency and business value. In this project, we explore how to model CTR more effectively using a large-scale Taobao dataset released by Alibaba. We start with supervised learning models, including logistic regression and Light-GBM, that are trained on static features such as user demographics, ad attributes, and contextual metadata. These models provide fast, interpretable benchmarks, but have limited capabilities to capture patterns of behavior that drive clicks. To better model user intent, we combined behavioral data from hundreds of millions of interactions over a 22-day period. By extracting and encoding user action sequences, we construct representations of user interests over time. We use deep learning models to fuse behavioral embeddings with static features. Among them, multilayer perceptrons (MLPs) have achieved significant performance improvements. To capture temporal dynamics, we designed a Transformer-based architecture that uses a self-attention mechanism to learn contextual dependencies across behavioral sequences, modeling not only what the user interacts with, but also the timing and frequency of interactions. Transformer improves AUC by 2.81 % over the baseline (LR model), with the largest gains observed for users whose interests are diverse or change over time. In addition to modeling, we propose an A/B testing strategy for real-world evaluation. We also think about the broader implications: personalized ad targeting technology can be applied to public health scenarios to achieve precise delivery of health information or behavior guidance. Our research provides a roadmap for advancing click-through rate predictions and extending their value beyond e-commerce.