CVApr 16, 2022
Privacy-Preserving Image Classification Using Isotropic NetworkAprilPyone MaungMaung, Hitoshi Kiya
In this paper, we propose a privacy-preserving image classification method that uses encrypted images and an isotropic network such as the vision transformer. The proposed method allows us not only to apply images without visual information to deep neural networks (DNNs) for both training and testing but also to maintain a high classification accuracy. In addition, compressible encrypted images, called encryption-then-compression (EtC) images, can be used for both training and testing without any adaptation network. Previously, to classify EtC images, an adaptation network was required before a classification network, so methods with an adaptation network have been only tested on small images. To the best of our knowledge, previous privacy-preserving image classification methods have never considered image compressibility and patch embedding-based isotropic networks. In an experiment, the proposed privacy-preserving image classification was demonstrated to outperform state-of-the-art methods even when EtC images were used in terms of classification accuracy and robustness against various attacks under the use of two isotropic networks: vision transformer and ConvMixer.
CVJul 12, 2022
Image and Model Transformation with Secret Key for Vision TransformerHitoshi Kiya, Ryota Iijima, MaungMaung Aprilpyone et al.
In this paper, we propose a combined use of transformed images and vision transformer (ViT) models transformed with a secret key. We show for the first time that models trained with plain images can be directly transformed to models trained with encrypted images on the basis of the ViT architecture, and the performance of the transformed models is the same as models trained with plain images when using test images encrypted with the key. In addition, the proposed scheme does not require any specially prepared data for training models or network modification, so it also allows us to easily update the secret key. In an experiment, the effectiveness of the proposed scheme is evaluated in terms of performance degradation and model protection performance in an image classification task on the CIFAR-10 dataset.
CVMay 24, 2022
Privacy-Preserving Image Classification Using Vision TransformerZheng Qi, AprilPyone MaungMaung, Yuma Kinoshita et al.
In this paper, we propose a privacy-preserving image classification method that is based on the combined use of encrypted images and the vision transformer (ViT). The proposed method allows us not only to apply images without visual information to ViT models for both training and testing but to also maintain a high classification accuracy. ViT utilizes patch embedding and position embedding for image patches, so this architecture is shown to reduce the influence of block-wise image transformation. In an experiment, the proposed method for privacy-preserving image classification is demonstrated to outperform state-of-the-art methods in terms of classification accuracy and robustness against various attacks.
CVMar 9, 2023
Generative Model-Based Attack on Learnable Image Encryption for Privacy-Preserving Deep LearningAprilPyone MaungMaung, Hitoshi Kiya
In this paper, we propose a novel generative model-based attack on learnable image encryption methods proposed for privacy-preserving deep learning. Various learnable encryption methods have been studied to protect the sensitive visual information of plain images, and some of them have been investigated to be robust enough against all existing attacks. However, previous attacks on image encryption focus only on traditional cryptanalytic attacks or reverse translation models, so these attacks cannot recover any visual information if a block-scrambling encryption step, which effectively destroys global information, is applied. Accordingly, in this paper, generative models are explored to evaluate whether such models can restore sensitive visual information from encrypted images for the first time. We first point out that encrypted images have some similarity with plain images in the embedding space. By taking advantage of leaked information from encrypted images, we propose a guided generative model as an attack on learnable image encryption to recover personally identifiable visual information. We implement the proposed attack in two ways by utilizing two state-of-the-art generative models: a StyleGAN-based model and latent diffusion-based one. Experiments were carried out on the CelebA-HQ and ImageNet datasets. Results show that images reconstructed by the proposed method have perceptual similarities to plain images.
CVAug 4, 2022
Privacy-Preserving Image Classification Using ConvMixer with Adaptive Permutation MatrixZheng Qi, AprilPyone MaungMaung, Hitoshi Kiya
In this paper, we propose a privacy-preserving image classification method using encrypted images under the use of the ConvMixer structure. Block-wise scrambled images, which are robust enough against various attacks, have been used for privacy-preserving image classification tasks, but the combined use of a classification network and an adaptation network is needed to reduce the influence of image encryption. However, images with a large size cannot be applied to the conventional method with an adaptation network because the adaptation network has so many parameters. Accordingly, we propose a novel method, which allows us not only to apply block-wise scrambled images to ConvMixer for both training and testing without the adaptation network, but also to provide a higher classification accuracy than conventional methods.
CVJun 11, 2022
Access Control of Semantic Segmentation Models Using Encrypted Feature MapsHiroki Ito, AprilPyone MaungMaung, Sayaka Shiota et al.
In this paper, we propose an access control method with a secret key for semantic segmentation models for the first time so that unauthorized users without a secret key cannot benefit from the performance of trained models. The method enables us not only to provide a high segmentation performance to authorized users but to also degrade the performance for unauthorized users. We first point out that, for the application of semantic segmentation, conventional access control methods which use encrypted images for classification tasks are not directly applicable due to performance degradation. Accordingly, in this paper, selected feature maps are encrypted with a secret key for training and testing models, instead of input images. In an experiment, the protected models allowed authorized users to obtain almost the same performance as that of non-protected models but also with robustness against unauthorized access without a key.
CRJul 26, 2023
Enhanced Security against Adversarial Examples Using a Random Ensemble of Encrypted Vision Transformer ModelsRyota Iijima, Miki Tanaka, Sayaka Shiota et al.
Deep neural networks (DNNs) are well known to be vulnerable to adversarial examples (AEs). In addition, AEs have adversarial transferability, which means AEs generated for a source model can fool another black-box model (target model) with a non-trivial probability. In previous studies, it was confirmed that the vision transformer (ViT) is more robust against the property of adversarial transferability than convolutional neural network (CNN) models such as ConvMixer, and moreover encrypted ViT is more robust than ViT without any encryption. In this article, we propose a random ensemble of encrypted ViT models to achieve much more robust models. In experiments, the proposed scheme is verified to be more robust against not only black-box attacks but also white-box ones than convention methods.
CVJan 23, 2023
Combined Use of Federated Learning and Image Encryption for Privacy-Preserving Image Classification with Vision TransformerTeru Nagamori, Hitoshi Kiya
In recent years, privacy-preserving methods for deep learning have become an urgent problem. Accordingly, we propose the combined use of federated learning (FL) and encrypted images for privacy-preserving image classification under the use of the vision transformer (ViT). The proposed method allows us not only to train models over multiple participants without directly sharing their raw data but to also protect the privacy of test (query) images for the first time. In addition, it can also maintain the same accuracy as normally trained models. In an experiment, the proposed method was demonstrated to well work without any performance degradation on the CIFAR-10 and CIFAR-100 datasets.
CVSep 16, 2022
StyleGAN Encoder-Based Attack for Block Scrambled Face ImagesAprilPyone MaungMaung, Hitoshi Kiya
In this paper, we propose an attack method to block scrambled face images, particularly Encryption-then-Compression (EtC) applied images by utilizing the existing powerful StyleGAN encoder and decoder for the first time. Instead of reconstructing identical images as plain ones from encrypted images, we focus on recovering styles that can reveal identifiable information from the encrypted images. The proposed method trains an encoder by using plain and encrypted image pairs with a particular training strategy. While state-of-the-art attack methods cannot recover any perceptual information from EtC images, the proposed method discloses personally identifiable information such as hair color, skin color, eyeglasses, gender, etc. Experiments were carried out on the CelebA dataset, and results show that reconstructed images have some perceptual similarities compared to plain images.
CVSep 7, 2022
On the Transferability of Adversarial Examples between Encrypted ModelsMiki Tanaka, Isao Echizen, Hitoshi Kiya
Deep neural networks (DNNs) are well known to be vulnerable to adversarial examples (AEs). In addition, AEs have adversarial transferability, namely, AEs generated for a source model fool other (target) models. In this paper, we investigate the transferability of models encrypted for adversarially robust defense for the first time. To objectively verify the property of transferability, the robustness of models is evaluated by using a benchmark attack method, called AutoAttack. In an image-classification experiment, the use of encrypted models is confirmed not only to be robust against AEs but to also reduce the influence of AEs in terms of the transferability of models.
CRJan 12, 2023
Color-NeuraCrypt: Privacy-Preserving Color-Image Classification Using Extended Random Neural NetworksZheng Qi, AprilPyone MaungMaung, Hitoshi Kiya
In recent years, with the development of cloud computing platforms, privacy-preserving methods for deep learning have become an urgent problem. NeuraCrypt is a private random neural network for privacy-preserving that allows data owners to encrypt the medical data before the data uploading, and data owners can train and then test their models in a cloud server with the encrypted data directly. However, we point out that the performance of NeuraCrypt is heavily degraded when using color images. In this paper, we propose a Color-NeuraCrypt to solve this problem. Experiment results show that our proposed Color-NeuraCrypt can achieve a better classification accuracy than the original one and other privacy-preserving methods.
CVSep 29, 2022
Access Control with Encrypted Feature Maps for Object Detection ModelsTeru Nagamori, Hiroki Ito, AprilPyone MaungMaung et al.
In this paper, we propose an access control method with a secret key for object detection models for the first time so that unauthorized users without a secret key cannot benefit from the performance of trained models. The method enables us not only to provide a high detection performance to authorized users but to also degrade the performance for unauthorized users. The use of transformed images was proposed for the access control of image classification models, but these images cannot be used for object detection models due to performance degradation. Accordingly, in this paper, selected feature maps are encrypted with a secret key for training and testing models, instead of input images. In an experiment, the protected models allowed authorized users to obtain almost the same performance as that of non-protected models but also with robustness against unauthorized access without a key.
LGSep 19, 2022
On the Adversarial Transferability of ConvMixer ModelsRyota Iijima, Miki Tanaka, Isao Echizen et al.
Deep neural networks (DNNs) are well known to be vulnerable to adversarial examples (AEs). In addition, AEs have adversarial transferability, which means AEs generated for a source model can fool another black-box model (target model) with a non-trivial probability. In this paper, we investigate the property of adversarial transferability between models including ConvMixer, which is an isotropic network, for the first time. To objectively verify the property of transferability, the robustness of models is evaluated by using a benchmark attack method called AutoAttack. In an image classification experiment, ConvMixer is confirmed to be weak to adversarial transferability.
CRJul 25, 2022
An Encryption Method of ConvMixer Models without Performance DegradationRyota Iijima, Hitoshi Kiya
In this paper, we propose an encryption method for ConvMixer models with a secret key. Encryption methods for DNN models have been studied to achieve adversarial defense, model protection and privacy-preserving image classification. However, the use of conventional encryption methods degrades the performance of models compared with that of plain models. Accordingly, we propose a novel method for encrypting ConvMixer models. The method is carried out on the basis of an embedding architecture that ConvMixer has, and models encrypted with the method can have the same performance as models trained with plain images only when using test images encrypted with a secret key. In addition, the proposed method does not require any specially prepared data for model training or network modification. In an experiment, the effectiveness of the proposed method is evaluated in terms of classification accuracy and model protection in an image classification task on the CIFAR10 dataset.
44.0CRMay 2
FLRSP: Privacy-Preserving Federated Learning Using Randomly Selected Model ParametersHiroto Sawada, Shoko Imaizumi, Hitoshi Kiya
In this paper, we propose a method for privacy-preserving federated learning that uses randomly selected model parameters to update global models. High-quality deep neural networks (DNN) models require a huge amount of training data in general, but model training raises privacy concerns when dealing with sensitive or personal information. Federated learning is a distributed machine learning framework in which multiple clients and a server train a model collaboratively. However, if the shared updates are compromised, an attacker may reconstruct the original training data. In addition, previous methods for improving robustness generally reduce the accuracy. To overcome these issues, in our method called federated learning using randomly selected model parameters (FLRSP), model parameters computed in each local server are randomly selected and shared to update a global model in a central server. In experiments, image classification tasks were carried out on the ResNet34 architecture and the Vision Transformer (ViT) under the use of Federated Stochastic Gradient Descent (FedSGD) and Federated Averaging (FedAvg), and the results demonstrated our method's effectiveness in terms of image classification accuracy and robustness against state-of-the-art attacks compared with previous methods.
CVJan 10, 2023
A Privacy Preserving Method with a Random Orthogonal Matrix for ConvMixer ModelsRei Aso, Tatsuya Chuman, Hitoshi Kiya
In this paper, a privacy preserving image classification method is proposed under the use of ConvMixer models. To protect the visual information of test images, a test image is divided into blocks, and then every block is encrypted by using a random orthogonal matrix. Moreover, a ConvMixer model trained with plain images is transformed by the random orthogonal matrix used for encrypting test images, on the basis of the embedding structure of ConvMixer. The proposed method allows us not only to use the same classification accuracy as that of ConvMixer models without considering privacy protection but to also enhance robustness against various attacks compared to conventional privacy-preserving learning.
CVAug 28, 2022
An Access Control Method with Secret Key for Semantic Segmentation ModelsTeru Nagamori, Ryota Iijima, Hitoshi Kiya
A novel method for access control with a secret key is proposed to protect models from unauthorized access in this paper. We focus on semantic segmentation models with the vision transformer (ViT), called segmentation transformer (SETR). Most existing access control methods focus on image classification tasks, or they are limited to CNNs. By using a patch embedding structure that ViT has, trained models and test images can be efficiently encrypted with a secret key, and then semantic segmentation tasks are carried out in the encrypted domain. In an experiment, the method is confirmed to provide the same accuracy as that of using plain images without any encryption to authorized users with a correct key and also to provide an extremely degraded accuracy to unauthorized users.
CVAug 10, 2022
A Detection Method of Temporally Operated Videos Using Robust HashingShoko Niwa, Miki Tanaka, Hitoshi Kiya
SNS providers are known to carry out the recompression and resizing of uploaded videos/images, but most conventional methods for detecting tampered videos/images are not robust enough against such operations. In addition, videos are temporally operated such as the insertion of new frames and the permutation of frames, of which operations are difficult to be detected by using conventional methods. Accordingly, in this paper, we propose a novel method with a robust hashing algorithm for detecting temporally operated videos even when applying resizing and compression to the videos.
CVAug 3, 2022
Template matching with white balance adjustment under multiple illuminantsTeruaki Akazawa, Yuma Kinoshita, Hitoshi Kiya
In this paper, we propose a novel template matching method with a white balancing adjustment, called N-white balancing, which was proposed for multi-illuminant scenes. To reduce the influence of lighting effects, N-white balancing is applied to images for multi-illumination color constancy, and then a template matching method is carried out by using adjusted images. In experiments, the effectiveness of the proposed method is demonstrated to be effective in object detection tasks under various illumination conditions.
CVSep 5, 2023
Domain Adaptation for Efficiently Fine-tuning Vision Transformer with Encrypted ImagesTeru Nagamori, Sayaka Shiota, Hitoshi Kiya
In recent years, deep neural networks (DNNs) trained with transformed data have been applied to various applications such as privacy-preserving learning, access control, and adversarial defenses. However, the use of transformed data decreases the performance of models. Accordingly, in this paper, we propose a novel method for fine-tuning models with transformed images under the use of the vision transformer (ViT). The proposed domain adaptation method does not cause the accuracy degradation of models, and it is carried out on the basis of the embedding structure of ViT. In experiments, we confirmed that the proposed method prevents accuracy degradation even when using encrypted images with the CIFAR-10 and CIFAR-100 datasets.
15.0CVApr 16
Privacy-Preserving Semantic Segmentation without Key ManagementMare Hirose, Shoko Imaizumi, Hitoshi Kiya
This paper proposes a novel privacy-preserving semantic segmentation method that can use independent keys for each client and image. In the proposed method, the model creator and each client encrypt images using locally generated keys, and model training and inference are conducted on the encrypted images. To mitigate performance degradation, an image encryption method is applied to model training in addition to the generation of test images. In experiments, the effectiveness of the proposed method is confirmed on the Cityscapes dataset under the use of a vision transformer-based model, called SETR.
23.2CVMay 7
CFE-PPAR: Compression-friendly encryption for privacy-preserving action recognition leveraging video transformersHaiwei Lin, Shoko Imaizumi, Hitoshi Kiya
Privacy-preserving action recognition (PPAR) enables machines to understand human activities in videos without revealing sensitive visual content. Among the various strategies for PPAR, encryption-based methods achieve strong privacy protection while maintaining high recognition performance. However, these methods lead to a catastrophic decrease in recognition performance and visual quality when the encrypted videos are compressed. That is, the previous methods are not compression-friendly. To address these issues, in this paper, we propose the first compression-friendly encryption method for PPAR, called CFE-PPAR. In CFE-PPAR, videos encrypted with secret keys can be directly recognized by a video transformer, which uses parameters transformed by the same keys as those used for video encryption. In experiments, it is verified that CFE-PPAR outperforms previous methods on the UCF101 and HMDB51 datasets under Motion-JPEG and H.264 compression.
CVNov 28, 2023
Efficient Key-Based Adversarial Defense for ImageNet by Using Pre-trained ModelAprilPyone MaungMaung, Isao Echizen, Hitoshi Kiya
In this paper, we propose key-based defense model proliferation by leveraging pre-trained models and utilizing recent efficient fine-tuning techniques on ImageNet-1k classification. First, we stress that deploying key-based models on edge devices is feasible with the latest model deployment advancements, such as Apple CoreML, although the mainstream enterprise edge artificial intelligence (Edge AI) has been focused on the Cloud. Then, we point out that the previous key-based defense on on-device image classification is impractical for two reasons: (1) training many classifiers from scratch is not feasible, and (2) key-based defenses still need to be thoroughly tested on large datasets like ImageNet. To this end, we propose to leverage pre-trained models and utilize efficient fine-tuning techniques to proliferate key-based models even on limited computing resources. Experiments were carried out on the ImageNet-1k dataset using adaptive and non-adaptive attacks. The results show that our proposed fine-tuned key-based models achieve a superior classification accuracy (more than 10% increase) compared to the previous key-based models on classifying clean and adversarial examples.
CVAug 16, 2024
Privacy-Preserving Vision Transformer Using Images Encrypted with Restricted Random Permutation MatricesKouki Horio, Kiyoshi Nishikawa, Hitoshi Kiya
We propose a novel method for privacy-preserving fine-tuning vision transformers (ViTs) with encrypted images. Conventional methods using encrypted images degrade model performance compared with that of using plain images due to the influence of image encryption. In contrast, the proposed encryption method using restricted random permutation matrices can provide a higher performance than the conventional ones.
7.6CVApr 29
Privacy-Preserving Clothing Classification using Vision Transformer for Thermal Comfort EstimationTatsuya Chuman, Yousuke Udagawa, Hitoshi Kiya
A privacy-preserving clothing classification scheme is presented to enable secure occupant-centric control (OCC) systems. Although the utilization of camera images for HVAC control has been widely studied to optimize thermal comfort, privacy protection of occupant images has not been considered in prior works. While various privacy-preserving methods have been proposed for image classification, applying conventional schemes results in severe accuracy degradation. In this paper, we introduce a privacy-preserving classification method using Vision Transformer (ViT) applied to clothing insulation estimation. In an experiment using the DeepFashion dataset categorized by clothing insulation, while the conventional pixel-based method suffers a severe accuracy drop, our scheme maintains a high accuracy on encrypted images, showing no degradation from plain images across all categories.
CVFeb 13, 2024
Fine-Tuning Text-To-Image Diffusion Models for Class-Wise Spurious Feature GenerationAprilPyone MaungMaung, Huy H. Nguyen, Hitoshi Kiya et al.
We propose a method for generating spurious features by leveraging large-scale text-to-image diffusion models. Although the previous work detects spurious features in a large-scale dataset like ImageNet and introduces Spurious ImageNet, we found that not all spurious images are spurious across different classifiers. Although spurious images help measure the reliance of a classifier, filtering many images from the Internet to find more spurious features is time-consuming. To this end, we utilize an existing approach of personalizing large-scale text-to-image diffusion models with available discovered spurious images and propose a new spurious feature similarity loss based on neural features of an adversarially robust model. Precisely, we fine-tune Stable Diffusion with several reference images from Spurious ImageNet with a modified objective incorporating the proposed spurious-feature similarity loss. Experiment results show that our method can generate spurious images that are consistently spurious across different classifiers. Moreover, the generated spurious images are visually similar to reference images from Spurious ImageNet.
AIFeb 11, 2024
A Random Ensemble of Encrypted Vision Transformers for Adversarially Robust DefenseRyota Iijima, Sayaka Shiota, Hitoshi Kiya
Deep neural networks (DNNs) are well known to be vulnerable to adversarial examples (AEs). In previous studies, the use of models encrypted with a secret key was demonstrated to be robust against white-box attacks, but not against black-box ones. In this paper, we propose a novel method using the vision transformer (ViT) that is a random ensemble of encrypted models for enhancing robustness against both white-box and black-box attacks. In addition, a benchmark attack method, called AutoAttack, is applied to models to test adversarial robustness objectively. In experiments, the method was demonstrated to be robust against not only white-box attacks but also black-box ones in an image classification task on the CIFAR-10 and ImageNet datasets. The method was also compared with the state-of-the-art in a standardized benchmark for adversarial robustness, RobustBench, and it was verified to outperform conventional defenses in terms of clean accuracy and robust accuracy.
CVJan 10, 2024
Efficient Fine-Tuning with Domain Adaptation for Privacy-Preserving Vision TransformerTeru Nagamori, Sayaka Shiota, Hitoshi Kiya
We propose a novel method for privacy-preserving deep neural networks (DNNs) with the Vision Transformer (ViT). The method allows us not only to train models and test with visually protected images but to also avoid the performance degradation caused from the use of encrypted images, whereas conventional methods cannot avoid the influence of image encryption. A domain adaptation method is used to efficiently fine-tune ViT with encrypted images. In experiments, the method is demonstrated to outperform conventional methods in an image classification task on the CIFAR-10 and ImageNet datasets in terms of classification accuracy.
CVSep 18, 2025
Scale and Rotation Estimation of Similarity-Transformed Images via Cross-Correlation Maximization Based on Auxiliary Function MethodShinji Yamashita, Yuma Kinoshita, Hitoshi Kiya
This paper introduces a highly efficient algorithm capable of jointly estimating scale and rotation between two images with sub-pixel precision. Image alignment serves as a critical process for spatially registering images captured from different viewpoints, and finds extensive use in domains such as medical imaging and computer vision. Traditional phase-correlation techniques are effective in determining translational shifts; however, they are inadequate when addressing scale and rotation changes, which often arise due to camera zooming or rotational movements. In this paper, we propose a novel algorithm that integrates scale and rotation estimation based on the Fourier transform in log-polar coordinates with a cross-correlation maximization strategy, leveraging the auxiliary function method. By incorporating sub-pixel-level cross-correlation our method enables precise estimation of both scale and rotation. Experimental results demonstrate that the proposed method achieves lower mean estimation errors for scale and rotation than conventional Fourier transform-based techniques that rely on discrete cross-correlation.
CVJul 17, 2025
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation TechniqueHomare Sueyoshi, Kiyoshi Nishikawa, Hitoshi Kiya
We propose a privacy-preserving semantic-segmentation method for applying perceptual encryption to images used for model training in addition to test images. This method also provides almost the same accuracy as models without any encryption. The above performance is achieved using a domain-adaptation technique on the embedding structure of the Vision Transformer (ViT). The effectiveness of the proposed method was experimentally confirmed in terms of the accuracy of semantic segmentation when using a powerful semantic-segmentation model with ViT called Segmentation Transformer.
CRJul 16, 2025
Effective Fine-Tuning of Vision Transformers with Low-Rank Adaptation for Privacy-Preserving Image ClassificationHaiwei Lin, Shoko Imaizumi, Hitoshi Kiya
We propose a low-rank adaptation method for training privacy-preserving vision transformer (ViT) models that efficiently freezes pre-trained ViT model weights. In the proposed method, trainable rank decomposition matrices are injected into each layer of the ViT architecture, and moreover, the patch embedding layer is not frozen, unlike in the case of the conventional low-rank adaptation methods. The proposed method allows us not only to reduce the number of trainable parameters but to also maintain almost the same accuracy as that of full-time tuning.
CVOct 21, 2024
Scene-Segmentation-Based Exposure Compensation for Tone Mapping of High Dynamic Range ScenesYuma Kinoshita, Hitoshi Kiya
We propose a novel scene-segmentation-based exposure compensation method for multi-exposure image fusion (MEF) based tone mapping. The aim of MEF-based tone mapping is to display high dynamic range (HDR) images on devices with limited dynamic range. To achieve this, this method generates a stack of differently exposed images from an input HDR image and fuses them into a single image. Our approach addresses the limitations of MEF-based tone mapping with existing segmentation-based exposure compensation, which often result in visually unappealing outcomes due to inappropriate exposure value selection. The proposed exposure compensation method first segments the input HDR image into subregions based on luminance values of pixels. It then determines exposure values for multi-exposure images to maximize contrast between regions while preserving relative luminance relationships. This approach contrasts with conventional methods that may invert luminance relationships or compromise contrast between regions. Additionally, we present an improved technique for calculating fusion weights to better reflect the effects of exposure compensation in the final fused image. In a simulation experiment to evaluate the quality of tone-mapped images, the MEF-based tone mapping with the proposed method outperforms three typical tone mapping methods including conventional MEF-based one, in terms of the tone mapped image quality index (TMQI).
CRJan 5, 2024
A Random Ensemble of Encrypted models for Enhancing Robustness against Adversarial ExamplesRyota Iijima, Sayaka Shiota, Hitoshi Kiya
Deep neural networks (DNNs) are well known to be vulnerable to adversarial examples (AEs). In addition, AEs have adversarial transferability, which means AEs generated for a source model can fool another black-box model (target model) with a non-trivial probability. In previous studies, it was confirmed that the vision transformer (ViT) is more robust against the property of adversarial transferability than convolutional neural network (CNN) models such as ConvMixer, and moreover encrypted ViT is more robust than ViT without any encryption. In this article, we propose a random ensemble of encrypted ViT models to achieve much more robust models. In experiments, the proposed scheme is verified to be more robust against not only black-box attacks but also white-box ones than convention methods.
CVSep 4, 2023
Hindering Adversarial Attacks with Multiple Encrypted Patch EmbeddingsAprilPyone MaungMaung, Isao Echizen, Hitoshi Kiya
In this paper, we propose a new key-based defense focusing on both efficiency and robustness. Although the previous key-based defense seems effective in defending against adversarial examples, carefully designed adaptive attacks can bypass the previous defense, and it is difficult to train the previous defense on large datasets like ImageNet. We build upon the previous defense with two major improvements: (1) efficient training and (2) optional randomization. The proposed defense utilizes one or more secret patch embeddings and classifier heads with a pre-trained isotropic network. When more than one secret embeddings are used, the proposed defense enables randomization on inference. Experiments were carried out on the ImageNet dataset, and the proposed defense was evaluated against an arsenal of state-of-the-art attacks, including adaptive ones. The results show that the proposed defense achieves a high robust accuracy and a comparable clean accuracy compared to the previous key-based defense.
CVFeb 5, 2022
On the predictability in reversible steganographyChing-Chun Chang, Xu Wang, Sisheng Chen et al.
Artificial neural networks have advanced the frontiers of reversible steganography. The core strength of neural networks is the ability to render accurate predictions for a bewildering variety of data. Residual modulation is recognised as the most advanced reversible steganographic algorithm for digital images. The pivot of this algorithm is predictive analytics in which pixel intensities are predicted given some pixel-wise contextual information. This task can be perceived as a low-level vision problem and hence neural networks for addressing a similar class of problems can be deployed. On top of the prior art, this paper investigates predictability of pixel intensities based on supervised and unsupervised learning frameworks. Predictability analysis enables adaptive data embedding, which in turn leads to a better trade-off between capacity and imperceptibility. While conventional methods estimate predictability by the statistics of local image patterns, learning-based frameworks consider further the degree to which correct predictions can be made by a designated predictor. Not only should the image patterns be taken into account but also the predictor in use. Experimental results show that steganographic performance can be significantly improved by incorporating the learning-based predictability analysers into a reversible steganographic system.
CVFeb 5, 2022
Adversarial Detector with Robust ClassifierTakayuki Osakabe, Maungmaung Aprilpyone, Sayaka Shiota et al.
Deep neural network (DNN) models are wellknown to easily misclassify prediction results by using input images with small perturbations, called adversarial examples. In this paper, we propose a novel adversarial detector, which consists of a robust classifier and a plain one, to highly detect adversarial examples. The proposed adversarial detector is carried out in accordance with the logits of plain and robust classifiers. In an experiment, the proposed detector is demonstrated to outperform a state-of-the-art detector without any robust classifier.
CRFeb 1, 2022
Security Evaluation of Block-based Image Encryption for Vision Transformer against Jigsaw Puzzle Solver AttackTatsuya Chuman, Hitoshi Kiya
The aim of this paper is to evaluate the security of a block-based image encryption for the vision transformer against jigsaw puzzle solver attacks. The vision transformer, a model for image classification based on the transformer architecture, is carried out by dividing an image into a grid of square patches. Some encryption schemes for the vision transformer have been proposed by applying block-based image encryption such as block scrambling and rotating to patches of the image. On the other hand, the security of encryption scheme for the vision transformer has never evaluated. In this paper, jigsaw puzzle solver attacks are utilized to evaluate the security of encrypted images by regarding the divided patches as pieces of a jigsaw puzzle. In experiments, an image is resized and divided into patches to apply block scrambling-based image encryption, and then the security of encrypted images for the vision transformer against jigsaw puzzle solver attacks is evaluated.
IVFeb 1, 2022
A Privacy-Preserving Image Retrieval Scheme with a Mixture of Plain and EtC ImagesKenta Iida, Hitoshi Kiya
In this paper, we propose a novel content-based image-retrieval scheme that allows us to use a mixture of plain images and compressible encrypted ones called "encryption-then-compression (EtC) images." In the proposed scheme, extended SIMPLE descriptors are extracted from EtC images as well as from plain ones, so the mixed use of plain and encrypted images is available for image retrieval. In an experiment, the proposed scheme was demonstrated to have almost the same retrieval performance as that for plain images, even with a mixture of plain and encrypted images.
CVFeb 1, 2022
Access Control of Object Detection Models Using Encrypted Feature MapsTeru Nagamori, Hiroki Ito, April Pyone Maung Maung et al.
In this paper, we propose an access control method for object detection models. The use of encrypted images or encrypted feature maps has been demonstrated to be effective in access control of models from unauthorized access. However, the effectiveness of the approach has been confirmed in only image classification models and semantic segmentation models, but not in object detection models. In this paper, the use of encrypted feature maps is shown to be effective in access control of object detection models for the first time.
CVJan 26, 2022
An Overview of Compressible and Learnable Image Transformation with Secret Key and Its ApplicationsHitoshi Kiya, AprilPyone MaungMaung, Yuma Kinoshita et al.
This article presents an overview of image transformation with a secret key and its applications. Image transformation with a secret key enables us not only to protect visual information on plain images but also to embed unique features controlled with a key into images. In addition, numerous encryption methods can generate encrypted images that are compressible and learnable for machine learning. Various applications of such transformation have been developed by using these properties. In this paper, we focus on a class of image transformation referred to as learnable image encryption, which is applicable to privacy-preserving machine learning and adversarially robust defense. Detailed descriptions of both transformation algorithms and performances are provided. Moreover, we discuss robustness against various attacks.
CVNov 17, 2021
Protection of SVM Model with Secret Key from Unauthorized AccessRyota Iijima, AprilPyone MaungMaung, Hitoshi Kiya
In this paper, we propose a block-wise image transformation method with a secret key for support vector machine (SVM) models. Models trained by using transformed images offer a poor performance to unauthorized users without a key, while they can offer a high performance to authorized users with a key. The proposed method is demonstrated to be robust enough against unauthorized access even under the use of kernel functions in a facial recognition experiment.
CVNov 5, 2021
Self-Supervised Intrinsic Image Decomposition Network Considering Reflectance ConsistencyYuma Kinoshita, Hitoshi Kiya
We propose a novel intrinsic image decomposition network considering reflectance consistency. Intrinsic image decomposition aims to decompose an image into illumination-invariant and illumination-variant components, referred to as ``reflectance'' and ``shading,'' respectively. Although there are three consistencies that the reflectance and shading should satisfy, most conventional work does not sufficiently account for consistency with respect to reflectance, owing to the use of a white-illuminant decomposition model and the lack of training images capturing the same objects under various illumination-brightness and -color conditions. For this reason, the three consistencies are considered in the proposed network by using a color-illuminant model and training the network with losses calculated from images taken under various illumination conditions. In addition, the proposed network can be trained in a self-supervised manner because various illumination conditions can easily be simulated. Experimental results show that our network can decompose images into reflectance and shading components.
IVSep 4, 2021
A Privacy-Preserving Image Retrieval Scheme Using A Codebook Generated From Independent Plain-Image DatasetKenta Iida, Hitoshi Kiya
In this paper, we propose a privacy-preserving image-retrieval scheme using a codebook generated by using a plain-image dataset. Encryption-then-compression (EtC) images, which were proposed for EtC systems, have been used in conventional privacy-preserving image-retrieval schemes, in which a codebook is generated from EtC images uploaded by image owners, and extended SIMPLE descriptors are then calculated as image descriptors by using the codebook. In contrast, in the proposed scheme, a codebook is generated from a dataset independent of uploaded images. The use of an independent dataset enables us not only to use a codebook that does not require recalculation but also to constantly provide a high retrieval accuracy. In an experiment, the proposed scheme is demonstrated to maintain a high retrieval performance, even if codebooks are generated from a plain image dataset independent of image owners' encrypted images.
CVSep 3, 2021
Spatially varying white balancing for mixed and non-uniform illuminantsTeruaki Akazawa, Yuma Kinoshita, Hitoshi Kiya
In this paper, we propose a novel white balance adjustment, called "spatially varying white balancing," for single, mixed, and non-uniform illuminants. By using n diagonal matrices along with a weight, the proposed method can reduce lighting effects on all spatially varying colors in an image under such illumination conditions. In contrast, conventional white balance adjustments do not consider the correcting of all colors except under a single illuminant. Also, multi-color balance adjustments can map multiple colors into corresponding ground truth colors, although they may cause the rank deficiency problem to occur as a non-diagonal matrix is used, unlike white balancing. In an experiment, the effectiveness of the proposed method is shown under mixed and non-uniform illuminants, compared with conventional white and multi-color balancing. Moreover, under a single illuminant, the proposed method has almost the same performance as the conventional white balancing.
CVSep 3, 2021
Access Control Using Spatially Invariant Permutation of Feature Maps for Semantic Segmentation ModelsHiroki Ito, MaungMaung AprilPyone, Hitoshi Kiya
In this paper, we propose an access control method that uses the spatially invariant permutation of feature maps with a secret key for protecting semantic segmentation models. Segmentation models are trained and tested by permuting selected feature maps with a secret key. The proposed method allows rightful users with the correct key not only to access a model to full capacity but also to degrade the performance for unauthorized users. Conventional access control methods have focused only on image classification tasks, and these methods have never been applied to semantic segmentation tasks. In an experiment, the protected models were demonstrated to allow rightful users to obtain almost the same performance as that of non-protected models but also to be robust against access by unauthorized users without a key. In addition, a conventional method with block-wise transformations was also verified to have degraded performance under semantic segmentation models.
CVSep 1, 2021
A Protection Method of Trained CNN Model Using Feature Maps Transformed With Secret Key From Unauthorized AccessMaungMaung AprilPyone, Hitoshi Kiya
In this paper, we propose a model protection method for convolutional neural networks (CNNs) with a secret key so that authorized users get a high classification accuracy, and unauthorized users get a low classification accuracy. The proposed method applies a block-wise transformation with a secret key to feature maps in the network. Conventional key-based model protection methods cannot maintain a high accuracy when a large key space is selected. In contrast, the proposed method not only maintains almost the same accuracy as non-protected accuracy, but also has a larger key space. Experiments were carried out on the CIFAR-10 dataset, and results show that the proposed model protection method outperformed the previous key-based model protection methods in terms of classification accuracy, key space, and robustness against key estimation attacks and fine-tuning attacks.
CVAug 4, 2021
A universal detector of CNN-generated images using properties of checkerboard artifacts in the frequency domainMiki Tanaka, Sayaka Shiota, Hitoshi Kiya
We propose a novel universal detector for detecting images generated by using CNNs. In this paper, properties of checkerboard artifacts in CNN-generated images are considered, and the spectrum of images is enhanced in accordance with the properties. Next, a classifier is trained by using the enhanced spectrums to judge a query image to be a CNN-generated ones or not. In addition, an ensemble of the proposed detector with emphasized spectrums and a conventional detector is proposed to improve the performance of these methods. In an experiment, the proposed ensemble is demonstrated to outperform a state-of-the-art method under some conditions.
IVJul 20, 2021
Protecting Semantic Segmentation Models by Using Block-wise Image Encryption with Secret Key from Unauthorized AccessHiroki Ito, MaungMaung AprilPyone, Hitoshi Kiya
Since production-level trained deep neural networks (DNNs) are of a great business value, protecting such DNN models against copyright infringement and unauthorized access is in a rising demand. However, conventional model protection methods focused only the image classification task, and these protection methods were never applied to semantic segmentation although it has an increasing number of applications. In this paper, we propose to protect semantic segmentation models from unauthorized access by utilizing block-wise transformation with a secret key for the first time. Protected models are trained by using transformed images. Experiment results show that the proposed protection method allows rightful users with the correct key to access the model to full capacity and deteriorate the performance for unauthorized users. However, protected models slightly drop the segmentation performance compared to non-protected models.
IVJun 1, 2021
Separated-Spectral-Distribution Estimation Based on Bayesian Inference with Single RGB CameraYuma Kinoshita, Hitoshi Kiya
In this paper, we propose a novel method for separately estimating spectral distributions from images captured by a typical RGB camera. The proposed method allows us to separately estimate a spectral distribution of illumination, reflectance, or camera sensitivity, while recent hyperspectral cameras are limited to capturing a joint spectral distribution from a scene. In addition, the use of Bayesian inference makes it possible to take into account prior information of both spectral distributions and image noise as probability distributions. As a result, the proposed method can estimate spectral distributions in a unified way, and it can enhance the robustness of the estimation against noise, which conventional spectral-distribution estimation methods cannot. The use of Bayesian inference also enables us to obtain the confidence of estimation results. In an experiment, the proposed method is shown not only to outperform conventional estimation methods in terms of RMSE but also to be robust against noise.
CVMay 31, 2021
A Protection Method of Trained CNN Model with Secret Key from Unauthorized AccessAprilPyone MaungMaung, Hitoshi Kiya
In this paper, we propose a novel method for protecting convolutional neural network (CNN) models with a secret key set so that unauthorized users without the correct key set cannot access trained models. The method enables us to protect not only from copyright infringement but also the functionality of a model from unauthorized access without any noticeable overhead. We introduce three block-wise transformations with a secret key set to generate learnable transformed images: pixel shuffling, negative/positive transformation, and FFX encryption. Protected models are trained by using transformed images. The results of experiments with the CIFAR and ImageNet datasets show that the performance of a protected model was close to that of non-protected models when the key set was correct, while the accuracy severely dropped when an incorrect key set was given. The protected model was also demonstrated to be robust against various attacks. Compared with the state-of-the-art model protection with passports, the proposed method does not have any additional layers in the network, and therefore, there is no overhead during training and inference processes.