Masayuki Tanaka

h-index29

29papers

754citations

Novelty43%

AI Score46

Ranked #62,211 of 205,806 authors (top 30%)#22,411 in CV (top 38%)

29 Papers

LGJul 5, 2022Code

PoF: Post-Training of Feature Extractor for Improving Generalization

Ikuro Sato, Ryota Yamada, Masayuki Tanaka et al.

It has been intensively investigated that the local shape, especially flatness, of the loss landscape near a minimum plays an important role for generalization of deep models. We developed a training algorithm called PoF: Post-Training of Feature Extractor that updates the feature extractor part of an already-trained deep model to search a flatter minimum. The characteristics are two-fold: 1) Feature extractor is trained under parameter perturbations in the higher-layer parameter space, based on observations that suggest flattening higher-layer parameter space, and 2) the perturbation range is determined in a data-driven manner aiming to reduce a part of test loss caused by the positive loss curvature. We provide a theoretical analysis that shows the proposed algorithm implicitly reduces the target Hessian components as well as the loss. Experimental results show that PoF improved model performance against baseline methods on both CIFAR-10 and CIFAR-100 datasets for only 10-epoch post-training, and on SVHN dataset for 50-epoch post-training. Source code is available at: \url{https://github.com/DensoITLab/PoF-v1

IVSep 13, 2022

Two-Step Color-Polarization Demosaicking Network

Vy Nguyen, Masayuki Tanaka, Yusuke Monno et al.

Polarization information of light in a scene is valuable for various image processing and computer vision tasks. A division-of-focal-plane polarimeter is a promising approach to capture the polarization images of different orientations in one shot, while it requires color-polarization demosaicking. In this paper, we propose a two-step color-polarization demosaicking network~(TCPDNet), which consists of two sub-tasks of color demosaicking and polarization demosaicking. We also introduce a reconstruction loss in the YCbCr color space to improve the performance of TCPDNet. Experimental comparisons demonstrate that TCPDNet outperforms existing methods in terms of the image quality of polarization images and the accuracy of Stokes parameters.

16.5CVMay 26

Joint 2D-3D Segmentation and Association in Street-level Imaging

Amir Melnikov, Masayuki Tanaka, Yusuke Monno et al.

Accurate interpretation of street-level imagery is essential for large-scale urban mapping and the creation of Spatial Digital Twin (SDT) environments. This work presents a unified framework for joint 2D-3D segmentation and association that integrates visual semantics with multi-view geometric reasoning. Unlike conventional approaches that rely heavily on sequential frames for temporal tracking, our method leverages zero-shot detection and segmentation together with structure-from-motion reconstruction to establish stable cross-view correspondences. A 3D-driven association mechanism replaces traditional 2D multi-object tracking, using geometric consistency to guide identity preservation across wide-baseline viewpoints and varying imaging conditions. By combining 2D texture cues with global 3D context, the proposed pipeline is well-suited for scalable street-level processing and can be used for a variety of object types. Experiments demonstrate substantially improved coverage of ground-truth sequences and more robust identity retention compared to state-of-the-art 2D-only tracking methods, achieving a 22% performance gain in challenging urban scenarios.

LGJun 20, 2024Code

Adaptive Adversarial Cross-Entropy Loss for Sharpness-Aware Minimization

Tanapat Ratchatorn, Masayuki Tanaka

Recent advancements in learning algorithms have demonstrated that the sharpness of the loss surface is an effective measure for improving the generalization gap. Building upon this concept, Sharpness-Aware Minimization (SAM) was proposed to enhance model generalization and achieved state-of-the-art performance. SAM consists of two main steps, the weight perturbation step and the weight updating step. However, the perturbation in SAM is determined by only the gradient of the training loss, or cross-entropy loss. As the model approaches a stationary point, this gradient becomes small and oscillates, leading to inconsistent perturbation directions and also has a chance of diminishing the gradient. Our research introduces an innovative approach to further enhancing model generalization. We propose the Adaptive Adversarial Cross-Entropy (AACE) loss function to replace standard cross-entropy loss for SAM's perturbation. AACE loss and its gradient uniquely increase as the model nears convergence, ensuring consistent perturbation direction and addressing the gradient diminishing issue. Additionally, a novel perturbation-generating function utilizing AACE loss without normalization is proposed, enhancing the model's exploratory capabilities in near-optimum stages. Empirical testing confirms the effectiveness of AACE, with experiments demonstrating improved performance in image classification tasks using Wide ResNet and PyramidNet across various datasets. The reproduction code is available online

CVMay 23, 2024

Hyperspectral Image Dataset for Individual Penguin Identification

Youta Noboru, Yuko Ozasa, Masayuki Tanaka

Remote individual animal identification is important for food safety, sport, and animal conservation. Numerous existing remote individual animal identification studies have focused on RGB images. In this paper, we tackle individual penguin identification using hyperspectral (HS) images. To the best of our knowledge, it is the first work to analyze spectral differences between penguin individuals using an HS camera. We have constructed a novel penguin HS image dataset, including 990 hyperspectral images of 27 penguins. We experimentally demonstrate that the spectral information of HS image pixels can be used for individual penguin identification. The experimental results show the effectiveness of using HS images for individual penguin identification. The dataset and source code are available here: https://033labcodes.github.io/igrass24_penguin/

LGFeb 19, 2025

Rectified Lagrangian for Out-of-Distribution Detection in Modern Hopfield Networks

Ryo Moriai, Nakamasa Inoue, Masayuki Tanaka et al.

Modern Hopfield networks (MHNs) have recently gained significant attention in the field of artificial intelligence because they can store and retrieve a large set of patterns with an exponentially large memory capacity. A MHN is generally a dynamical system defined with Lagrangians of memory and feature neurons, where memories associated with in-distribution (ID) samples are represented by attractors in the feature space. One major problem in existing MHNs lies in managing out-of-distribution (OOD) samples because it was originally assumed that all samples are ID samples. To address this, we propose the rectified Lagrangian (RegLag), a new Lagrangian for memory neurons that explicitly incorporates an attractor for OOD samples in the dynamical system of MHNs. RecLag creates a trivial point attractor for any interaction matrix, enabling OOD detection by identifying samples that fall into this attractor as OOD. The interaction matrix is optimized so that the probability densities can be estimated to identify ID/OOD. We demonstrate the effectiveness of RecLag-based MHNs compared to energy-based OOD detection methods, including those using state-of-the-art Hopfield energies, across nine image datasets.

IVSep 12, 2025

Polarization Denoising and Demosaicking: Dataset and Baseline Method

Muhamad Daniel Ariff Bin Abdul Rahman, Yusuke Monno, Masayuki Tanaka et al.

A division-of-focal-plane (DoFP) polarimeter enables us to acquire images with multiple polarization orientations in one shot and thus it is valuable for many applications using polarimetric information. The image processing pipeline for a DoFP polarimeter entails two crucial tasks: denoising and demosaicking. While polarization demosaicking for a noise-free case has increasingly been studied, the research for the joint task of polarization denoising and demosaicking is scarce due to the lack of a suitable evaluation dataset and a solid baseline method. In this paper, we propose a novel dataset and method for polarization denoising and demosaicking. Our dataset contains 40 real-world scenes and three noise-level conditions, consisting of pairs of noisy mosaic inputs and noise-free full images. Our method takes a denoising-then-demosaicking approach based on well-accepted signal processing components to offer a reproducible method. Experimental results demonstrate that our method exhibits higher image reconstruction performance than other alternative methods, offering a solid baseline.

CVJul 23, 2021

Multi-Modal Pedestrian Detection with Large Misalignment Based on Modal-Wise Regression and Multi-Modal IoU

Napat Wanchaitanawong, Masayuki Tanaka, Takashi Shibata et al.

The combined use of multiple modalities enables accurate pedestrian detection under poor lighting conditions by using the high visibility areas from these modalities together. The vital assumption for the combination use is that there is no or only a weak misalignment between the two modalities. In general, however, this assumption often breaks in actual situations. Due to this assumption's breakdown, the position of the bounding boxes does not match between the two modalities, resulting in a significant decrease in detection accuracy, especially in regions where the amount of misalignment is large. In this paper, we propose a multi-modal Faster-RCNN that is robust against large misalignment. The keys are 1) modal-wise regression and 2) multi-modal IoU for mini-batch sampling. To deal with large misalignment, we perform bounding box regression for both the RPN and detection-head with both modalities. We also propose a new sampling strategy called "multi-modal mini-batch sampling" that integrates the IoU for both modalities. We demonstrate that the proposed method's performance is much better than that of the state-of-the-art methods for data with large misalignment through actual image experiments.

CVJul 22, 2021

Geometric Data Augmentation Based on Feature Map Ensemble

Takashi Shibata, Masayuki Tanaka, Masatoshi Okutomi

Deep convolutional networks have become the mainstream in computer vision applications. Although CNNs have been successful in many computer vision tasks, it is not free from drawbacks. The performance of CNN is dramatically degraded by geometric transformation, such as large rotations. In this paper, we propose a novel CNN architecture that can improve the robustness against geometric transformations without modifying the existing backbones of their CNNs. The key is to enclose the existing backbone with a geometric transformation (and the corresponding reverse transformation) and a feature map ensemble. The proposed method can inherit the strengths of existing CNNs that have been presented so far. Furthermore, the proposed method can be employed in combination with state-of-the-art data augmentation algorithms to improve their performance. We demonstrate the effectiveness of the proposed method using standard datasets such as CIFAR, CUB-200, and Mnist-rot-12k.

CVNov 20, 2020

Deep Snapshot HDR Imaging Using Multi-Exposure Color Filter Array

Takeru Suda, Masayuki Tanaka, Yusuke Monno et al.

In this paper, we propose a deep snapshot high dynamic range (HDR) imaging framework that can effectively reconstruct an HDR image from the RAW data captured using a multi-exposure color filter array (ME-CFA), which consists of a mosaic pattern of RGB filters with different exposure levels. To effectively learn the HDR image reconstruction network, we introduce the idea of luminance normalization that simultaneously enables effective loss computation and input data normalization by considering relative local contrasts in the "normalized-by-luminance" HDR domain. This idea makes it possible to equally handle the errors in both bright and dark areas regardless of absolute luminance levels, which significantly improves the visual image quality in a tone-mapped domain. Experimental results using two public HDR image datasets demonstrate that our framework outperforms other snapshot methods and produces high-quality HDR images with fewer visual artifacts.

CVNov 13, 2020

Adaptive Future Frame Prediction with Ensemble Network

Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi et al.

Future frame prediction in videos is a challenging problem because videos include complicated movements and large appearance changes. Learning-based future frame prediction approaches have been proposed in kinds of literature. A common limitation of the existing learning-based approaches is a mismatch of training data and test data. In the future frame prediction task, we can obtain the ground truth data by just waiting for a few frames. It means we can update the prediction model online in the test phase. Then, we propose an adaptive update framework for the future frame prediction task. The proposed adaptive updating framework consists of a pre-trained prediction network, a continuous-updating prediction network, and a weight estimation network. We also show that our pre-trained prediction model achieves comparable performance to the existing state-of-the-art approaches. We demonstrate that our approach outperforms existing methods especially for dynamically changing scenes.

CVOct 16, 2020

Human Segmentation with Dynamic LiDAR Data

Tao Zhong, Wonjik Kim, Masayuki Tanaka et al.

Consecutive LiDAR scans compose dynamic 3D sequences, which contain more abundant information than a single frame. Similar to the development history of image and video perception, dynamic 3D sequence perception starts to come into sight after inspiring research on static 3D data perception. This work proposes a spatio-temporal neural network for human segmentation with the dynamic LiDAR point clouds. It takes a sequence of depth images as input. It has a two-branch structure, i.e., the spatial segmentation branch and the temporal velocity estimation branch. The velocity estimation branch is designed to capture motion cues from the input sequence and then propagates them to the other branch. So that the segmentation branch segments humans according to both spatial and temporal features. These two branches are jointly learned on a generated dynamic point cloud dataset for human recognition. Our works fill in the blank of dynamic point cloud perception with the spherical representation of point cloud and achieves high accuracy. The experiments indicate that the introduction of temporal feature benefits the segmentation of dynamic point cloud.

IVJul 28, 2020

Monochrome and Color Polarization Demosaicking Using Edge-Aware Residual Interpolation

Miki Morimatsu, Yusuke Monno, Masayuki Tanaka et al.

A division-of-focal-plane or microgrid image polarimeter enables us to acquire a set of polarization images in one shot. Since the polarimeter consists of an image sensor equipped with a monochrome or color polarization filter array (MPFA or CPFA), the demosaicking process to interpolate missing pixel values plays a crucial role in obtaining high-quality polarization images. In this paper, we propose a novel MPFA demosaicking method based on edge-aware residual interpolation (EARI) and also extend it to CPFA demosaicking. The key of EARI is a new edge detector for generating an effective guide image used to interpolate the missing pixel values. We also present a newly constructed full color-polarization image dataset captured using a 3-CCD camera and a rotating polarizer. Using the dataset, we experimentally demonstrate that our EARI-based method outperforms existing methods in MPFA and CPFA demosaicking.

CVJul 20, 2020

Unsupervised Learning of Image Segmentation Based on Differentiable Feature Clustering

Wonjik Kim, Asako Kanezaki, Masayuki Tanaka

The usage of convolutional neural networks (CNNs) for unsupervised image segmentation was investigated in this study. In the proposed approach, label prediction and network parameter learning are alternately iterated to meet the following criteria: (a) pixels of similar features should be assigned the same label, (b) spatially continuous pixels should be assigned the same label, and (c) the number of unique labels should be large. Although these criteria are incompatible, the proposed approach minimizes the combination of similarity loss and spatial continuity loss to find a plausible solution of label assignment that balances the aforementioned criteria well. The contributions of this study are four-fold. First, we propose a novel end-to-end network of unsupervised image segmentation that consists of normalization and an argmax function for differentiable clustering. Second, we introduce a spatial continuity loss function that mitigates the limitations of fixed segment boundaries possessed by previous work. Third, we present an extension of the proposed method for segmentation with scribbles as user input, which showed better accuracy than existing methods while maintaining efficiency. Finally, we introduce another extension of the proposed method: unseen image segmentation by using networks pre-trained with a few reference images without re-training the networks. The effectiveness of the proposed approach was examined on several benchmark datasets of image segmentation.

CVJun 15, 2020

Classifying degraded images over various levels of degradation

Kazuki Endo, Masayuki Tanaka, Masatoshi Okutomi

Classification for degraded images having various levels of degradation is very important in practical applications. This paper proposes a convolutional neural network to classify degraded images by using a restoration network and an ensemble learning. The results demonstrate that the proposed network can classify degraded images over various levels of degradation well. This paper also reveals how the image-quality of training data for a classification network affects the classification performance of degraded images.

CVMar 11, 2020

Learning-Based Human Segmentation and Velocity Estimation Using Automatic Labeled LiDAR Sequence for Training

Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi et al.

In this paper, we propose an automatic labeled sequential data generation pipeline for human segmentation and velocity estimation with point clouds. Considering the impact of deep neural networks, state-of-the-art network architectures have been proposed for human recognition using point clouds captured by Light Detection and Ranging (LiDAR). However, one disadvantage is that legacy datasets may only cover the image domain without providing important label information and this limitation has disturbed the progress of research to date. Therefore, we develop an automatic labeled sequential data generation pipeline, in which we can control any parameter or data generation environment with pixel-wise and per-frame ground truth segmentation and pixel-wise velocity information for human recognition. Our approach uses a precise human model and reproduces a precise motion to generate realistic artificial data. We present more than 7K video sequences which consist of 32 frames generated by the proposed pipeline. With the proposed sequence generator, we confirm that human segmentation performance is improved when using the video domain compared to when using the image domain. We also evaluate our data by comparing with data generated under different conditions. In addition, we estimate pedestrian velocity with LiDAR by only utilizing data generated by the proposed pipeline.

CVJan 21, 2020

Block-wise Scrambled Image Recognition Using Adaptation Network

Koki Madono, Masayuki Tanaka, Masaki Onishi et al.

In this study, a perceptually hidden object-recognition method is investigated to generate secure images recognizable by humans but not machines. Hence, both the perceptual information hiding and the corresponding object recognition methods should be developed. Block-wise image scrambling is introduced to hide perceptual information from a third party. In addition, an adaptation network is proposed to recognize those scrambled images. Experimental comparisons conducted using CIFAR datasets demonstrated that the proposed adaptation network performed well in incorporating simple perceptual information hiding into DNN-based image classification.

LGSep 12, 2019

New Perspective of Interpretability of Deep Neural Networks

Masanari Kimura, Masayuki Tanaka

Deep neural networks (DNNs) are known as black-box models. In other words, it is difficult to interpret the internal state of the model. Improving the interpretability of DNNs is one of the hot research topics. However, at present, the definition of interpretability for DNNs is vague, and the question of what is a highly explanatory model is still controversial. To address this issue, we provide the definition of the human predictability of the model, as a part of the interpretability of the DNNs. The human predictability proposed in this paper is defined by easiness to predict the change of the inference when perturbating the model of the DNNs. In addition, we introduce one example of high human-predictable DNNs. We discuss that our definition will help to the research of the interpretability of the DNNs considering various types of applications.

LGJun 4, 2019

Breaking Inter-Layer Co-Adaptation by Classifier Anonymization

Ikuro Sato, Kohta Ishikawa, Guoqing Liu et al.

This study addresses an issue of co-adaptation between a feature extractor and a classifier in a neural network. A naive joint optimization of a feature extractor and a classifier often brings situations in which an excessively complex feature distribution adapted to a very specific classifier degrades the test performance. We introduce a method called Feature-extractor Optimization through Classifier Anonymization (FOCA), which is designed to avoid an explicit co-adaptation between a feature extractor and a particular classifier by using many randomly-generated, weak classifiers during optimization. We put forth a mathematical proposition that states the FOCA features form a point-like distribution within the same class in a class-separable fashion under special conditions. Real-data experiments under more general conditions provide supportive evidences.

CVMay 7, 2019

Intentional Attention Mask Transformation for Robust CNN Classification

Masanari Kimura, Masayuki Tanaka

Convolutional Neural Networks have achieved impressive results in various tasks, but interpreting the internal mechanism is a challenging problem. To tackle this problem, we exploit a multi-channel attention mechanism in feature space. Our network architecture allows us to obtain an attention mask for each feature while existing CNN visualization methods provide only a common attention mask for all features. We apply the proposed multi-channel attention mechanism to multi-attribute recognition task. We can obtain different attention mask for each feature and for each attribute. Those analyses give us deeper insight into the feature space of CNNs. Furthermore, our proposed attention mechanism naturally derives a method for improving the robustness of CNNs. From the observation of feature space based on the proposed attention mask, we demonstrate that we can obtain robust CNNs by intentionally emphasizing features that are important for attributes. The experimental results for the benchmark dataset show that the proposed method gives high human interpretability while accurately grasping the attributes of the data, and improves network robustness.

CVApr 30, 2019

Interpretation of Feature Space using Multi-Channel Attentional Sub-Networks

Masanari Kimura, Masayuki Tanaka

Convolutional Neural Networks have achieved impressive results in various tasks, but interpreting the internal mechanism is a challenging problem. To tackle this problem, we exploit a multi-channel attention mechanism in feature space. Our network architecture allows us to obtain an attention mask for each feature while existing CNN visualization methods provide only a common attention mask for all features. We apply the proposed multi-channel attention mechanism to multi-attribute recognition task. We can obtain different attention mask for each feature and for each attribute. Those analyses give us deeper insight into the feature space of CNNs. The experimental results for the benchmark dataset show that the proposed method gives high interpretability to humans while accurately grasping the attributes of the data.

LGMar 13, 2019

Improving Transparency of Deep Neural Inference Process

Hiroshi Kuwajima, Masayuki Tanaka, Masatoshi Okutomi

Deep learning techniques are rapidly advanced recently, and becoming a necessity component for widespread systems. However, the inference process of deep learning is black-box, and not very suitable to safety-critical systems which must exhibit high transparency. In this paper, to address this black-box limitation, we develop a simple analysis method which consists of 1) structural feature analysis: lists of the features contributing to inference process, 2) linguistic feature analysis: lists of the natural language labels describing the visual attributes for each feature contributing to inference process, and 3) consistency analysis: measuring consistency among input data, inference (label), and the result of our structural and linguistic feature analysis. Our analysis is simplified to reflect the actual inference process for high transparency, whereas it does not include any additional black-box mechanisms such as LSTM for highly human readable results. We conduct experiments and discuss the results of our analysis qualitatively and quantitatively, and come to believe that our work improves the transparency of neural networks. Evaluated through 12,800 human tasks, 75% workers answer that input data and result of our feature analysis are consistent, and 70% workers answer that inference (label) and result of our feature analysis are consistent. In addition to the evaluation of the proposed analysis, we find that our analysis also provide suggestions, or possible next actions such as expanding neural network complexity or collecting training data to improve a neural network.

CVFeb 14, 2019

Automatic Labeled LiDAR Data Generation based on Precise Human Model

Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi et al.

Following improvements in deep neural networks, state-of-the-art networks have been proposed for human recognition using point clouds captured by LiDAR. However, the performance of these networks strongly depends on the training data. An issue with collecting training data is labeling. Labeling by humans is necessary to obtain the ground truth label; however, labeling requires huge costs. Therefore, we propose an automatic labeled data generation pipeline, for which we can change any parameters or data generation environments. Our approach uses a human model named Dhaiba and a background of Miraikan and consequently generated realistic artificial data. We present 500k+ data generated by the proposed pipeline. This paper also describes the specification of the pipeline and data details with evaluations of various approaches.

CVDec 23, 2018

Estimation and Restoration of Compositional Degradation Using Convolutional Neural Networks

Kazutaka Uchida, Masayuki Tanaka, Masatoshi Okutomi

Image restoration from a single image degradation type, such as blurring, hazing, random noise, and compression has been investigated for decades. However, image degradations in practice are often a mixture of several types of degradation. Such compositional degradations complicate restoration because they require the differentiation of different degradation types and levels. In this paper, we propose a convolutional neural network (CNN) model for estimating the degradation properties of a given degraded image. Furthermore, we introduce an image restoration CNN model that adopts the estimated degradation properties as its input. Experimental results show that the proposed degradation estimation model can successfully infer the degradation properties of compositionally degraded images. The proposed restoration model can restore degraded images by exploiting the estimated degradation properties and can achieve both blind and nonblind image restorations.

CVOct 3, 2018

Weighted Sigmoid Gate Unit for an Activation Function of Deep Neural Network

Masayuki Tanaka

An activation function has crucial role in a deep neural network. A simple rectified linear unit (ReLU) are widely used for the activation function. In this paper, a weighted sigmoid gate unit (WiG) is proposed as the activation function. The proposed WiG consists of a multiplication of inputs and the weighted sigmoid gate. It is shown that the WiG includes the ReLU and same activation functions as a special case. Many activation functions have been proposed to overcome the performance of the ReLU. In the literature, the performance is mainly evaluated with an object recognition task. The proposed WiG is evaluated with the object recognition task and the image restoration task. Then, the expeirmental comparisons demonstrate the proposed WiG overcomes the existing activation functions including the ReLU.

CVSep 25, 2018

Gradient-Based Low-Light Image Enhancement

Masayuki Tanaka, Takashi Shibata, Masatoshi Okutomi

A low-light image enhancement is a highly demanded image processing technique, especially for consumer digital cameras and cameras on mobile phones. In this paper, a gradient-based low-light image enhancement algorithm is proposed. The key is to enhance the gradients of dark region, because the gradients are more sensitive for human visual system than absolute values. In addition, we involve the intensity-range constraints for the image integration. By using the intensity-range constraints, we can integrate the output image with enhanced gradients preserving the given gradient information while enforcing the intensity range of the output image within a certain intensity range. Experiments demonstrate that the proposed gradient-based low-light image enhancement can effectively enhance the low-light images.

CVSep 11, 2018

Non-blind Image Restoration Based on Convolutional Neural Network

Kazutaka Uchida, Masayuki Tanaka, Masatoshi Okutomi

Blind image restoration processors based on convolutional neural network (CNN) are intensively researched because of their high performance. However, they are too sensitive to the perturbation of the degradation model. They easily fail to restore the image whose degradation model is slightly different from the trained degradation model. In this paper, we propose a non-blind CNN-based image restoration processor, aiming to be robust against a perturbation of the degradation model compared to the blind restoration processor. Experimental comparisons demonstrate that the proposed non-blind CNN-based image restoration processor can robustly restore images compared to existing blind CNN-based image restoration processors.

CVMar 19, 2018

Learnable Image Encryption

Masayuki Tanaka

The network-based machine learning algorithm is very powerful tools. However, it requires huge training dataset. Researchers often meet privacy issues when they collect image dataset especially for surveillance applications. A learnable image encryption scheme is introduced. The key idea of this scheme is to encrypt images, so that human cannot understand images but the network can be train with encrypted images. This scheme allows us to train the network without the privacy issues. In this paper, a simple learnable image encryption algorithm is proposed. Then, the proposed algorithm is validated with cifar dataset.

CVDec 7, 2017

Network Analysis for Explanation

Hiroshi Kuwajima, Masayuki Tanaka

Safety critical systems strongly require the quality aspects of artificial intelligence including explainability. In this paper, we analyzed a trained network to extract features which mainly contribute the inference. Based on the analysis, we developed a simple solution to generate explanations of the inference processes.