Yan Ke

h-index: 13

16 papers

693 citations

Novelty: 44%

AI Score: 47

Ranked #55,045 of 201,018 authors (top 27%); #1,146 in CR (top 16%)

16 Papers

CV · Oct 10, 2023

Skeleton Ground Truth Extraction: Methodology, Annotation Tool and Benchmarks

Cong Yang, Bipin Indurkhya, John See et al.

Skeleton Ground Truth (GT) is critical to the success of supervised skeleton extraction methods, especially with the popularity of deep learning techniques. Furthermore, we see skeleton GTs used not only for training skeleton detectors with Convolutional Neural Networks (CNN) but also for evaluating skeleton-related pruning and matching algorithms. However, most existing shape and image datasets suffer from the lack of skeleton GT and inconsistency of GT standards. As a result, it is difficult to evaluate and reproduce CNN-based skeleton detectors and algorithms on a fair basis. In this paper, we present a heuristic strategy for object skeleton GT extraction in binary shapes and natural images. Our strategy is built on an extended theory of diagnosticity hypothesis, which enables encoding human-in-the-loop GT extraction based on clues from the target's context, simplicity, and completeness. Using this strategy, we developed a tool, SkeView, to generate skeleton GT of 17 existing shape and image datasets. The GTs are then structurally evaluated with representative methods to build viable baselines for fair comparisons. Experiments demonstrate that GTs generated by our strategy yield promising quality with respect to standard consistency, and also provide a balance between simplicity and completeness.

AI · May 10

Beyond ESG Scores: Learning Dynamic Constraints for Sequential Portfolio Optimization

Xin Li, Yan Ke, Longbing Cao

ESG-aware portfolio optimization is increasingly important for sustainable capital allocation, yet most learning-based methods still operationalize ESG by appending static scores to the policy observation or reward. This creates a mismatch for sequential control: ESG scores are noisy, provider-dependent, low-frequency, and temporally misaligned with sequential portfolio decisions, while financial evidence suggests that ESG is better treated as a portfolio preference, risk-exposure, or hedge dimension than as a robust alpha factor. We propose to impose ESG constraints without modifying the financial policy's observation or reward, using a Multimodal Action-Conditioned Constraint Field (MACF) that learns mechanism-specific ESG costs from point-in-time multimodal evidence and contemplated portfolio transitions. We then introduce MACF-X, a family of optimizer-specific adapters that converts MACF costs and uncertainties into native constrained-optimization interfaces through a shared slack- and uncertainty-aware pressure layer. Across multiple constraint-integration interfaces, MACF-X reduces tail ESG budget pressure while maintaining competitive financial performance. Ablations show that this improvement depends on dynamic evidence inputs and three-head decomposition, while static ESG-score proxies are nearly indistinguishable from score-shuffled noise baselines.

CR · Aug 20, 2024

NeR-VCP: A Video Content Protection Method Based on Implicit Neural Representation

Yangping Lin, Yan Ke, Ke Niu et al.

With the popularity of video applications, the security of video content has emerged as a pressing issue. Most video content protection methods rely mainly on encryption technology that must be manually designed or implemented in an experience-based manner. To address this problem, we propose an automatic encryption technique for video content protection based on implicit neural representation. We design a key-controllable module, which serves as a key for encryption and decryption. NeR-VCP first pre-distributes the key-controllable module trained by the sender to the recipients. It then uses Implicit Neural Representation (INR) with the pre-distributed key-controllable module to encrypt the plain video as an implicit neural network, and legal recipients use the pre-distributed key-controllable module to decrypt this cipher neural network (the corresponding implicit neural network). Under the guidance of the key-controllable design, our method can improve the security of video content and provide a novel video encryption scheme. Moreover, using model compression techniques, this method can achieve video content protection while effectively reducing the amount of encrypted data transferred. We experimentally find that it has superior performance in terms of visual representation, imperceptibility to illegal users, and security from a cryptographic viewpoint.

CV · Oct 14, 2024

StegaINR4MIH: steganography by implicit neural representation for multi-image hiding

Weina Dong, Jia Liu, Lifeng Chen et al.

Multi-image hiding, which embeds multiple secret images into a cover image and is able to recover these images with high quality, has gradually become a research hotspot in the field of image steganography. However, due to the need to embed a large amount of data in a limited cover image space, issues such as contour shadowing or color distortion often arise, posing significant challenges for multi-image hiding. In this paper, we propose StegaINR4MIH, a novel implicit neural representation steganography framework that enables the hiding of multiple images within a single implicit representation function. In contrast to traditional methods that use multiple encoders to achieve multi-image embedding, our approach leverages the redundancy of implicit representation function parameters and employs magnitude-based weight selection and secret weight substitution on pre-trained cover image functions to effectively hide and independently extract multiple secret images. We conduct experiments on images with a resolution of from three different datasets: CelebA-HQ, COCO, and DIV2K. When hiding two secret images, the PSNR values of both the secret images and the stego images exceed 42. When hiding five secret images, the PSNR values of both the secret images and the stego images exceed 39. Extensive experiments demonstrate the superior performance of the proposed method in terms of visual quality and undetectability.
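The PSNR figures quoted above follow the standard definition, which can be sketched as below. This is only an illustration of the metric, not the authors' evaluation code; the 8-bit peak value of 255 and the toy pixel lists are assumptions.

```python
import math

def psnr(reference, candidate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(candidate):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Toy 4-pixel "images" (hypothetical values, for illustration only).
cover = [100, 120, 140, 160]
stego = [101, 119, 141, 160]
print(round(psnr(cover, stego), 2))
```

Higher values mean the stego (or recovered secret) image is closer to its reference.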

CV · Sep 29, 2025

Dynamic Orchestration of Multi-Agent System for Real-World Multi-Image Agricultural VQA

Yan Ke, Xin Yu, Heming Du et al.

Agricultural visual question answering is essential for providing farmers and researchers with accurate and timely knowledge. However, many existing approaches are predominantly developed for evidence-constrained settings such as text-only queries or single-image cases. This design prevents them from coping with real-world agricultural scenarios that often require multi-image inputs with complementary views across spatial scales and growth stages. Moreover, limited access to up-to-date external agricultural context makes these systems struggle to adapt when evidence is incomplete. In addition, rigid pipelines often lack systematic quality control. To address this gap, we propose a self-reflective and self-improving multi-agent framework that integrates four roles: the Retriever, the Reflector, the Answerer, and the Improver. They collaborate to enable context enrichment, reflective reasoning, answer drafting, and iterative improvement. A Retriever formulates queries and gathers external information, while a Reflector assesses adequacy and triggers sequential reformulation and renewed retrieval. Two Answerers draft candidate responses in parallel to reduce bias. The Improver refines them through iterative checks while ensuring that information from multiple images is effectively aligned and utilized. Experiments on the AgMMU benchmark show that our framework achieves competitive performance on multi-image agricultural QA.
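The control flow among the four roles can be sketched roughly as follows. This is a minimal sketch under stated assumptions: every callable (`retrieve`, `reflect`, the entries of `answerers`, `improve`), the `max_rounds` limit, and all signatures are hypothetical placeholders, and the drafts are produced sequentially here rather than in parallel.

```python
def answer_question(question, images, retrieve, reflect, answerers, improve, max_rounds=3):
    """Run a retrieve/reflect loop, then draft and refine an answer.

    All callables are hypothetical stand-ins for the agents named in the abstract.
    """
    query, context = question, []
    for _ in range(max_rounds):
        # Retriever: gather external information for the current query.
        context.extend(retrieve(query, images))
        # Reflector: judge adequacy and, if needed, reformulate the query.
        adequate, query = reflect(question, context)
        if adequate:
            break
    # Answerers: each drafts a candidate response from the same evidence.
    drafts = [draft(question, images, context) for draft in answerers]
    # Improver: merge and refine the drafts into the final answer.
    return improve(question, images, context, drafts)
```

Passing the agents in as plain callables keeps the orchestration logic independent of whichever models back each role.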

CV · Sep 23, 2025

MoiréNet: A Compact Dual-Domain Network for Image Demoiréing

Shuwei Guo, Simin Luan, Yan Ke et al.

Moiré patterns arise from spectral aliasing between display pixel lattices and camera sensor grids, manifesting as anisotropic, multi-scale artifacts that pose significant challenges for digital image demoiréing. We propose MoiréNet, a convolutional neural U-Net-based framework that synergistically integrates frequency and spatial domain features for effective artifact removal. MoiréNet introduces two key components: a Directional Frequency-Spatial Encoder (DFSE) that discerns moiré orientation via directional difference convolution, and a Frequency-Spatial Adaptive Selector (FSAS) that enables precise, feature-adaptive suppression. Extensive experiments demonstrate that MoiréNet achieves state-of-the-art performance on public and actively used datasets while being highly parameter-efficient. With only 5.513M parameters, representing a 48% reduction compared to ESDNet-L, MoiréNet combines superior restoration quality with parameter efficiency, making it well-suited for resource-constrained applications including smartphone photography, industrial imaging, and augmented reality.

CR · Sep 25, 2020

A Reversible Data hiding Scheme in Encrypted Domain for Secret Image Sharing based on Chinese Remainder Theorem

Yan Ke, Minqing Zhang, Xinpeng Zhang et al.

Reversible data hiding in encrypted domain (RDH-ED) schemes based on symmetric or public key encryption are mainly applied to the security of end-to-end communication. To provide reliable technical support for multi-party security scenarios, a separable RDH-ED scheme for secret image sharing based on the Chinese remainder theorem (CRT) is presented. In (t, n) secret image sharing, the image is first split into n different ciphertext shares. The original image can be reconstructed only when at least t shares are obtained. In our scheme, additional data can be embedded into the image shares. To realize data extraction from the image shares and the reconstructed image separably, two data hiding methods are proposed: one is homomorphic difference expansion in encrypted domain (HDE-ED), which supports data extraction from the reconstructed image by utilizing the additive homomorphism of CRT secret sharing; the other is difference expansion in image shares (DE-IS), which supports data extraction from the marked shares before image reconstruction. Experimental results demonstrate that the proposed scheme not only maintains the security and the threshold function of the secret sharing system, but also obtains better reversibility and efficiency than most existing RDH-ED algorithms. The maximum embedding rate of HDE-ED can reach 0.5000 bits per pixel, and the average embedding rate of DE-IS is 0.0545 bits per bit of ciphertext.
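The additive homomorphism of CRT-based sharing that HDE-ED relies on can be illustrated with a small Asmuth-Bloom-style sketch. It is not the paper's scheme: the moduli, the threshold, the fixed blinding values, and all function names are assumptions chosen for a toy example, and a real scheme would draw the blinding value at random.

```python
from math import prod

def crt(residues, moduli):
    """Solve x = r_i (mod m_i) for pairwise coprime moduli; result lies in [0, prod(moduli))."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # pow(a, -1, m) is the modular inverse (Python 3.8+)
    return x % M

M0 = 256                            # secret space: one 8-bit pixel (assumed)
MODULI = [1009, 1013, 1019, 1021]   # n = 4 pairwise coprime primes (assumed toy values)
T = 2                               # threshold t (assumed)

def share(secret, blind):
    """Blind the secret with a multiple of M0, then reduce it by each modulus."""
    y = secret + blind * M0
    if y >= prod(MODULI[:T]):
        raise ValueError("blinded value exceeds the t-share reconstruction bound")
    return [y % m for m in MODULI]

def reconstruct(shares, moduli):
    """Recover the secret from at least T shares and their moduli."""
    return crt(shares, moduli) % M0

a = share(100, blind=1234)
b = share(57, blind=777)
# Adding shares component-wise yields valid shares of the sum of the secrets.
summed = [(x + y) % m for x, y, m in zip(a, b, MODULI)]
print(reconstruct([summed[1], summed[3]], [MODULI[1], MODULI[3]]))
```

Because share-wise addition corresponds to addition of the underlying secrets, operations such as difference expansion can in principle be carried out on shares without reconstructing the image first.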

CV · Jul 19, 2020

PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments

Zhiming Chen, Kean Chen, Weiyao Lin et al.

Object detection using an oriented bounding box (OBB) can better target rotated objects by reducing the overlap with background areas. Existing OBB approaches are mostly built on horizontal bounding box detectors by introducing an additional angle dimension optimized by a distance loss. However, because the distance loss only minimizes the angle error of the OBB and correlates only loosely with the IoU, it is insensitive to objects with high aspect ratios. Therefore, a novel loss, Pixels-IoU (PIoU) Loss, is formulated to exploit both the angle and IoU for accurate OBB regression. The PIoU loss is derived from the IoU metric in a pixel-wise form, which is simple and suitable for both horizontal and oriented bounding boxes. To demonstrate its effectiveness, we evaluate the PIoU loss on both anchor-based and anchor-free frameworks. The experimental results show that PIoU loss can dramatically improve the performance of OBB detectors, particularly on objects with high aspect ratios and complex backgrounds. In addition, previous evaluation datasets did not include scenarios where the objects have high aspect ratios, so a new dataset, Retail50K, is introduced to encourage the community to adapt OBB detectors for more complex environments.
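The idea of computing IoU pixel by pixel for oriented boxes can be sketched as below. This is a simplified hard-counting version for illustration only; it is not differentiable and is not the PIoU loss itself. The box format `(cx, cy, w, h, angle)` and the function names are assumptions.

```python
import math

def inside(px, py, box):
    """True if point (px, py) lies in the oriented box (cx, cy, w, h, angle in radians)."""
    cx, cy, w, h, angle = box
    dx, dy = px - cx, py - cy
    # Rotate the offset into the box's own axes.
    u = dx * math.cos(angle) + dy * math.sin(angle)
    v = -dx * math.sin(angle) + dy * math.cos(angle)
    return abs(u) <= w / 2 and abs(v) <= h / 2

def pixel_iou(box_a, box_b, width, height):
    """IoU estimated by counting pixel centers that fall inside each box."""
    inter = union = 0
    for y in range(height):
        for x in range(width):
            a = inside(x + 0.5, y + 0.5, box_a)
            b = inside(x + 0.5, y + 0.5, box_b)
            inter += a and b
            union += a or b
    return inter / union if union else 0.0

# The same 15-degree angle error costs far more IoU on a thin box than on a square one.
err = math.radians(15)
thin = pixel_iou((32, 32, 40, 4, 0.0), (32, 32, 40, 4, err), 64, 64)
square = pixel_iou((32, 32, 20, 20, 0.0), (32, 32, 20, 20, err), 64, 64)
print(thin < square)
```

The comparison at the end illustrates why an angle-only distance loss is a poor proxy for IoU on objects with high aspect ratios.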

CR · Jun 18, 2019

Recent Advances of Image Steganography with Generative Adversarial Networks

Jia Liu, Yan Ke, Yu Lei et al.

In the past few years, the Generative Adversarial Network (GAN), which was proposed in 2014, has achieved great success. GAN has produced many research results in the fields of computer vision and natural language processing. Image steganography is dedicated to hiding secret messages in digital images, thereby achieving covert communication. Recently, research on image steganography has demonstrated great potential for using GAN and neural networks. In this paper, we review different strategies for steganography, such as cover modification, cover selection, and cover synthesis by GANs, discuss the characteristics of these methods as well as evaluation metrics, and provide some possible future research directions in image steganography.

CR · Apr 29, 2019

Fully Homomorphic Encryption Encapsulated Difference Expansion for Reversible Data hiding in Encrypted Domain

Yan Ke, Min-qing Zhang, Jia Liu et al.

This paper proposes a fully homomorphic encryption encapsulated difference expansion (FHEE-DE) scheme for reversible data hiding in encrypted domain (RDH-ED). In the proposed scheme, we use key-switching and bootstrapping techniques to control the ciphertext extension and decryption failure. To realize data extraction directly from the encrypted domain without the private key, a key-switching based least-significant-bit (KS-LSB) data hiding method has been designed. In application, the user first encrypts the plaintext and uploads the ciphertext to the server. Then the server performs data hiding by FHEE-DE and KS-LSB to obtain the marked ciphertext. Additional data can be extracted directly from the marked ciphertext by the server without the private key. The user can decrypt the marked ciphertext to obtain the marked plaintext. Then additional data or plaintext can be obtained from the marked plaintext by using the standard DE extraction or recovery. A fidelity constraint of DE is introduced to reduce the distortion of the marked plaintext. FHEE-DE enables the server to implement FHEE-DE recovery or extraction on the marked ciphertext, which returns the ciphertext of the original plaintext or additional data to the user. In addition, we simplified the homomorphic operations of the proposed universal FHEE-DE to obtain an efficient version. The experimental results demonstrate that the embedding capacity, fidelity, and reversibility of the proposed scheme are superior to existing RDH-ED methods, and full separability is achieved without reducing the security of encryption.
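The "standard DE extraction or recovery" mentioned above refers to plaintext-domain difference expansion, which can be sketched on a single pixel pair as follows. This is an illustrative sketch of classic DE only; it contains none of the homomorphic-encryption machinery of FHEE-DE, and the function names and the 8-bit range check are assumptions.

```python
def de_embed(x, y, bit):
    """Embed one bit into the pixel pair (x, y) by expanding their difference."""
    avg, diff = (x + y) // 2, x - y
    expanded = 2 * diff + bit  # shift the difference left and append the bit
    x_marked = avg + (expanded + 1) // 2
    y_marked = avg - expanded // 2
    if not (0 <= x_marked <= 255 and 0 <= y_marked <= 255):
        raise ValueError("pair is not expandable without overflow")
    return x_marked, y_marked

def de_extract(x_marked, y_marked):
    """Return the embedded bit and the restored original pixel pair."""
    avg = (x_marked + y_marked) // 2   # the integer average is preserved by embedding
    expanded = x_marked - y_marked
    bit = expanded & 1
    diff = expanded // 2               # floor division undoes the expansion
    return bit, (avg + (diff + 1) // 2, avg - diff // 2)

marked = de_embed(206, 201, 1)
print(marked, de_extract(*marked))
```

Reversibility follows because the integer average of the pair is unchanged by embedding, so the original difference and both pixels can be recovered exactly.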

MM · Jun 10, 2018

Steganography Security: Principle and Practice

Yan Ke, Jia Liu, Min-qing Zhang et al.

This paper focuses on several theoretical issues and principles in steganography security, and defines four security levels by analyzing the corresponding algorithm instances. In the theoretical analysis, we discuss the differences between steganography security and watermarking security. Two necessary conditions for steganography security are obtained. Under current technical conditions, we then analyze the indistinguishability of the cover and stego-cover, and argue that steganography security should rely on key secrecy with the algorithms open. By specifying the role of the key in steganography, the necessary conditions for a secure steganography algorithm in theory are formally presented. When analyzing the security instances, we classify the steganalysis attacks according to their variable access to the steganography system, and then define the four security levels. The higher the security level, the higher the level of attack that can be resisted. We also present algorithm instances based on current technical conditions, and analyze their data hiding process, security level, and practice requirements.

MM · Apr 26, 2018

Generative Steganography by Sampling

Zhuo Zhang, Jia Liu, Yan Ke et al.

In this paper, a novel data-driven information hiding scheme called generative steganography by sampling (GSS) is proposed. Unlike in traditional modification-based steganography, in our method the stego image is directly sampled by a powerful generator: no explicit cover is used. Both parties share a secret key used for message embedding and extraction. The Jensen-Shannon divergence is introduced as a new criterion for evaluating the security of generative steganography. Based on these principles, we propose a simple practical generative steganography method that uses semantic image inpainting. The message is written in advance to an uncorrupted region that needs to be retained in the corrupted image. Then, the corrupted image with the secret message is fed into a Generator trained by a generative adversarial network (GAN) for semantic completion. Message loss and prior loss terms are proposed for penalizing message extraction error and unrealistic stego image. In our design, we first train a generator whose training target is the generation of new data samples from the same distribution as that of existing training data. Next, for the trained generator, backpropagation to the message and prior loss are introduced to optimize the coding of the input noise data for the generator. The presented experiments demonstrate the potential of the proposed framework based on both qualitative and quantitative evaluations of the generated stego images.
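The Jensen-Shannon divergence used above as a security criterion has a simple closed form for discrete distributions, sketched below. The example distributions are made up for illustration, and how the paper estimates the distributions of real and generated images is not shown here.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: the average KL divergence to the mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical histograms of some feature for cover-like and stego images.
cover_hist = [0.25, 0.25, 0.25, 0.25]
stego_hist = [0.30, 0.20, 0.25, 0.25]
print(js_divergence(cover_hist, stego_hist))
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (disjoint support), and unlike the KL divergence it is symmetric in its arguments.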

CR · Apr 18, 2018

The Reincarnation of Grille Cipher: A Generative Approach

Jia Liu, Yan Ke, Yu Lei et al.

To keep data secret, various techniques have been implemented to encrypt and decrypt it. Cryptography is committed to the security of content, i.e., the content cannot be recovered from a given ciphertext. Steganography aims to hide the existence of a communication channel within a stego. However, it has been difficult to construct a cipher (cypher) that simultaneously satisfies both channel and content security for secure communication. Inspired by the Cardan grille, this paper presents a new generative framework for grille cipher. A digital Cardan grille is used for message encryption and decryption. The ciphertext is directly sampled by a powerful generator without an explicit cover. Message loss and prior loss are proposed for penalizing message extraction error and unrealistic ciphertext. Jensen-Shannon divergence is introduced as a new criterion for channel security. A simple practical data-driven grille cipher is proposed using semantic image inpainting and a generative adversarial network. Experimental results demonstrate the promise of the proposed method.

MM · Mar 25, 2018

Digital Cardan Grille: A Modern Approach for Information Hiding

Jia Liu, Tanping Zhou, Zhuo Zhang et al.

In this paper, a new framework for the construction of a Cardan grille for information hiding is proposed. Based on the semantic image inpainting technique, the stego image is driven directly by the secret messages. A mask called the Digital Cardan Grille (DCG), which determines the hidden location, is introduced to hide the message. The message is written in advance to the corrupted region that needs to be filled in the corrupted image. Then the corrupted image with the secret message is fed into a Generative Adversarial Network (GAN) for semantic completion. The adversarial game not only reconstructs the corrupted image, but also generates a stego image that preserves the logical rationality of the image content. The experimental results verify the feasibility of the proposed method.
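The role of the grille as a shared location mask can be sketched as follows. This illustrates only the mask-driven write and read steps; the GAN-based semantic completion is omitted entirely, and the flat pixel list, the parity encoding of each bit, and the function names are all assumptions rather than details from the paper.

```python
def embed(pixels, grille, bits):
    """Write message bits into the pixel positions where the grille mask is 1."""
    slots = [i for i, g in enumerate(grille) if g == 1]
    if len(bits) > len(slots):
        raise ValueError("message is longer than the number of grille openings")
    marked = list(pixels)
    for slot, bit in zip(slots, bits):
        # Assumed encoding: store the bit as the parity of the pixel value.
        marked[slot] = (marked[slot] & ~1) | bit
    return marked

def extract(pixels, grille, n_bits):
    """Read the message back from the positions selected by the same grille."""
    slots = [i for i, g in enumerate(grille) if g == 1]
    return [pixels[i] & 1 for i in slots[:n_bits]]

image = [12, 200, 37, 90, 155, 64, 18, 243]   # toy flattened image
grille = [0, 1, 0, 0, 1, 1, 0, 1]             # mask shared by sender and receiver
stego = embed(image, grille, [1, 0, 1])
print(extract(stego, grille, 3))
```

Without the grille, a reader does not know which positions carry the message, which is the property the physical Cardan grille provides.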

CR · Dec 18, 2017

Coverless Information Hiding Based on Generative adversarial networks

Ming-ming Liu, Min-qing Zhang, Jia Liu et al.

Traditional image steganography modifies the content of the image to some degree, so it is hard to resist detection by image steganalysis tools. To address this problem, a novel generative coverless information hiding method based on generative adversarial networks is proposed in this paper. The main idea of the method is that the class label of the generative adversarial network is replaced with the secret information as a driver to generate the hidden image directly, and the secret information is then extracted from the hidden image through the discriminator. This is the first time that coverless information hiding has been achieved by generative adversarial networks. Compared with traditional image steganography, this method does not modify the content of the original image. Therefore, this method can resist image steganalysis tools effectively. The experiment shows that this hiding algorithm performs well in terms of steganographic capacity, anti-steganalysis, safety, and reliability.

MM · Nov 14, 2017

Generative Steganography with Kerckhoffs' Principle

Yan Ke, Minqing Zhang, Jia Liu et al.

The distortion in steganography, which usually comes from modifying or recoding the cover image during the embedding process, leaves the steganalyzer with the possibility of discriminating. Faced with such a risk, we propose generative steganography with Kerckhoffs' principle (GSK) in this letter. In GSK, the secret messages are generated by a cover image using a generator rather than embedded into the cover, thus resulting in no modifications in the cover. To ensure security, the generators are trained to meet Kerckhoffs' principle based on generative adversarial networks (GAN). Everything about the GSK system, except the extraction key, is public knowledge for the receivers. The secret messages can be output by the generator if and only if the extraction key and the cover image are both input. In the generator training procedures, there are two GANs, Message-GAN and Cover-GAN, designed to work jointly so that the generated results are under the control of the extraction key and the cover image. We provide experimental results on the training process and give an example of the working process by adopting a generator trained on MNIST, which demonstrate that GSK can use a cover image without any modification to generate messages, and that without the extraction key or the cover image, only meaningless results would be obtained.