Shihong Wang

CR
h-index: 11
Papers: 5
Citations: 12
Novelty: 40%
AI Score: 32

5 Papers

LG | Mar 1, 2022
E-LMC: Extended Linear Model of Coregionalization for Spatial Field Prediction

Shihong Wang, Xueying Zhang, Yichen Meng et al.

Physical simulations based on partial differential equations typically generate spatial field results, which are used to calculate specific properties of a system for engineering design and optimization. Because the simulations are computationally intensive, a surrogate model mapping the low-dimensional inputs to the spatial fields is commonly built from a relatively small dataset. To address the challenge of predicting the whole spatial field, the popular linear model of coregionalization (LMC) can disentangle complicated correlations within the high-dimensional spatial field outputs and deliver accurate predictions. However, LMC fails if the spatial field cannot be well approximated by a linear combination of base functions with latent processes. In this paper, we present the Extended Linear Model of Coregionalization (E-LMC), which introduces an invertible neural network to linearize the highly complex and nonlinear spatial fields so that the LMC can easily generalize to nonlinear problems while preserving the traceability and scalability. Several real-world applications demonstrate that E-LMC can exploit spatial correlations effectively, showing a maximum improvement of about 40% over the original LMC and outperforming the other state-of-the-art spatial field models.
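
As a rough illustration of the "linear combination of base functions with latent processes" mentioned in this abstract, the LMC structure and the E-LMC extension can be written as below. The symbols ($\mathbf{b}_k$, $z_k$, $K$, $g_\theta$) are assumed notation for this sketch, not necessarily the paper's own, and the exact way the invertible network enters the model is inferred from the abstract's wording.

```latex
% LMC: the field y at input x is a linear mix of K fixed bases b_k,
% each weighted by an independent latent process z_k(x).
\mathbf{y}(\mathbf{x}) \approx \sum_{k=1}^{K} \mathbf{b}_k \, z_k(\mathbf{x})

% E-LMC (sketch): an invertible network g_\theta maps the nonlinear field
% to a space where the LMC assumption holds; predictions are mapped back
% through the inverse.
g_\theta\bigl(\mathbf{y}(\mathbf{x})\bigr) \approx \sum_{k=1}^{K} \mathbf{b}_k \, z_k(\mathbf{x}),
\qquad
\hat{\mathbf{y}}(\mathbf{x}) = g_\theta^{-1}\Bigl(\sum_{k=1}^{K} \mathbf{b}_k \, \hat{z}_k(\mathbf{x})\Bigr)
```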

CV | Apr 8, 2024
Class Similarity Transition: Decoupling Class Similarities and Imbalance from Generalized Few-shot Segmentation

Shihong Wang, Ruixun Liu, Kaiyu Li et al.

In Generalized Few-shot Segmentation (GFSS), a model is trained with a large corpus of base class samples and then adapted on limited samples of novel classes. This paper focuses on the relevance between base and novel classes, and improves GFSS in two aspects: 1) mining the similarity between base and novel classes to promote the learning of novel classes, and 2) mitigating the class imbalance issue caused by the volume difference between the support set and the training set. Specifically, we first propose a similarity transition matrix to guide the learning of novel classes with base class knowledge. Then, we apply the Label-Distribution-Aware Margin (LDAM) loss and Transductive Inference to the GFSS task to address the problem of class imbalance as well as overfitting to the support set. In addition, by extending the probability transition matrix, the proposed method can mitigate the catastrophic forgetting of base classes when learning novel classes. With a simple training phase, our proposed method can be applied to any segmentation network trained on base classes. We validated our methods on the adapted version of OpenEarthMap. Compared to existing GFSS baselines, our method outperforms them all by 3% to 7% and ranks second in the OpenEarthMap Land Cover Mapping Few-Shot Challenge at the completion of this paper. Code: https://github.com/earth-insights/ClassTrans
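
To make the LDAM idea in this abstract concrete, here is a minimal sketch of the commonly cited LDAM formulation, in which each class gets a margin proportional to n_j^(-1/4) that is subtracted from the target logit before the cross-entropy. The class counts, the `max_margin` value, and the function names are hypothetical; the logit scaling factor used in many implementations is omitted, and how the paper applies the loss per pixel in GFSS is not shown here.

```python
import math

def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j^(-1/4), rescaled so the largest equals max_margin."""
    raw = [1.0 / (n ** 0.25) for n in class_counts]
    scale = max_margin / max(raw)
    return [r * scale for r in raw]

def ldam_loss(logits, target, margins):
    """Cross-entropy computed after subtracting the target class margin from its logit."""
    shifted = list(logits)
    shifted[target] -= margins[target]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(shifted)
    log_sum = m + math.log(sum(math.exp(z - m) for z in shifted))
    return log_sum - shifted[target]

# Hypothetical counts: an abundant base class and a rare novel class.
counts = [10000, 5]
margins = ldam_margins(counts)

# Same logits, with and without the margin on the rare target class.
loss_rare = ldam_loss([1.0, 1.0], target=1, margins=margins)
loss_plain = ldam_loss([1.0, 1.0], target=1, margins=[0.0, 0.0])
```

The rare class receives the larger margin, so the same logits incur a higher loss than plain cross-entropy, which pushes the model toward a wider decision margin for under-represented classes.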

CV | Aug 25, 2025
Annotation-Free Open-Vocabulary Segmentation for Remote-Sensing Images

Kaiyu Li, Xiangyong Cao, Ruixun Liu et al.

Semantic segmentation of remote sensing (RS) images is pivotal for comprehensive Earth observation, but the demand for interpreting new object categories, coupled with the high expense of manual annotation, poses significant challenges. Although open-vocabulary semantic segmentation (OVSS) offers a promising solution, existing frameworks designed for natural images are insufficient for the unique complexities of RS data. They struggle with vast scale variations and fine-grained details, and their adaptation often relies on extensive, costly annotations. To address this critical gap, this paper introduces SegEarth-OV, the first framework for annotation-free open-vocabulary segmentation of RS images. Specifically, we propose SimFeatUp, a universal upsampler that robustly restores high-resolution spatial details from coarse features, correcting distorted target shapes without any task-specific post-training. We also present a simple yet effective Global Bias Alleviation operation that subtracts the inherent global context from patch features, significantly enhancing local semantic fidelity. These components empower SegEarth-OV to effectively harness the rich semantics of pre-trained VLMs, making OVSS possible in optical RS contexts. Furthermore, to extend the framework's universality to other challenging RS modalities such as SAR images, where large-scale VLMs are unavailable and expensive to create, we introduce AlignEarth, a distillation-based strategy that efficiently transfers semantic knowledge from an optical VLM encoder to an SAR encoder, bypassing the need to build SAR foundation models from scratch and enabling universal OVSS across diverse sensor types. Extensive experiments on both optical and SAR datasets validate that SegEarth-OV can achieve dramatic improvements over the SOTA methods, establishing a robust foundation for annotation-free and open-world Earth observation.
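
The Global Bias Alleviation step described in this abstract, subtracting a global context from patch features, might look roughly like the sketch below. The function name, the weighting factor `lam`, and the use of a single global vector (for example a [CLS]-style token) are assumptions made for illustration; the abstract does not specify them.

```python
def alleviate_global_bias(patch_feats, global_feat, lam=0.3):
    """Subtract a scaled global feature vector from every patch feature vector.

    patch_feats: list of per-patch feature vectors (lists of floats).
    global_feat: one feature vector summarizing the whole image (assumed input).
    lam: hypothetical weight controlling how much global context is removed.
    """
    return [
        [p - lam * g for p, g in zip(patch, global_feat)]
        for patch in patch_feats
    ]

# Toy example: both patches share the same global offset [1.0, 1.0].
patches = [[1.5, 1.0], [1.0, 1.5]]
global_vec = [1.0, 1.0]
local_only = alleviate_global_bias(patches, global_vec, lam=1.0)
```

After the subtraction, what remains in each patch vector is the part that differs from the shared global context, which is the local signal the operation is meant to emphasize.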

CR | Dec 31, 2018
Security analysis of a self-embedding fragile image watermark scheme

Xinhui Gong, Feng Yu, Xiaohong Zhao et al.

Recently, a self-embedding fragile watermark scheme based on reference-bits interleaving and adaptive selection of embedding mode was proposed. Reference bits are derived from the scrambled MSB bits of a cover image and are then combined with authentication bits to form the watermark bits for LSB embedding. We find that this algorithm embeds the watermark independently in each block, which makes it vulnerable to a collage attack. In addition, because the generation of authentication bits via hash function operations does not depend on secret keys, we analyze this algorithm with a multiple stego-image attack. We find that the cost of obtaining all the permutation relations of the $l\cdot b^2$ watermark bits of each block (i.e., equivalent permutation keys) is about $(l\cdot b^2)!$ for the embedding mode $(m, l)$, where $m$ MSB layers of a cover image are used for generating reference bits, $l$ LSB layers are used for embedding the watermark, and $b\times b$ is the size of an image block. The simulation results and the statistical results demonstrate that our analysis is effective.
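
To get a feel for the size of the $(l\cdot b^2)!$ cost quoted in this abstract, the following sketch counts its decimal digits using the log-gamma function. The parameter values are hypothetical choices for illustration and are not taken from the paper.

```python
import math

def log10_factorial(n: int) -> float:
    """Return log10(n!) via the log-gamma function, avoiding huge integers."""
    return math.lgamma(n + 1) / math.log(10)

def permutation_cost_digits(l: int, b: int) -> int:
    """Number of decimal digits of (l * b^2)!, the permutation count per block."""
    n = l * b * b
    return math.floor(log10_factorial(n)) + 1

# Hypothetical parameters: l = 1 LSB layer with 2x2 blocks, i.e. 4! = 24.
small = permutation_cost_digits(1, 2)

# Hypothetical parameters: l = 2 LSB layers with 8x8 blocks, i.e. 128!.
large = permutation_cost_digits(2, 8)
```

Because the factorial grows faster than any exponential, even modest values of $l$ and $b$ give a count with hundreds of digits.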

CR | Aug 31, 2017
A secure blind watermarking scheme based on DCT domain of the scrambled image

Lei Chen, Shihong Wang

This paper investigates a secure blind watermarking scheme. The main idea of the scheme is to protect not only the watermark information but also the embedding positions. To achieve a higher level of security, we propose a sub-key generation mechanism based on the singular value decomposition and a hash function, where the sub-keys depend on both the main key and the feature codes of the original image. The different sub-keys ensure that the embedding positions are randomly selected for different original images. The watermark is embedded in the Discrete Cosine Transform (DCT) coefficients of the scrambled original image. Simulation results show that this embedding method resolves the trade-off between imperceptibility and robustness well. Based on the good correlation properties of chaotic sequences, we design a detection method that can accurately compute the parameters of geometric transformations (rotation and translation). The security analysis, including key space analysis, key sensitivity analysis, and cryptanalysis, together with the comparison results, demonstrates that the proposed watermarking scheme also achieves high security.