Dogancan Temel

h-index: 12

21 papers

541 citations

Novelty: 40%

AI Score: 26

Ranked #158,551 of 194,257 authors (top 82%); #51,355 in CV (top 87%)

21 Papers

9.4 | CV | Aug 29, 2019 | Code

Traffic Sign Detection under Challenging Conditions: A Deeper Look Into Performance Variations and Spectral Characteristics

Dogancan Temel, Min-Hung Chen, Ghassan AlRegib

Traffic signs are critical for maintaining the safety and efficiency of our roads. Therefore, we need to carefully assess the capabilities and limitations of automated traffic sign detection systems. Existing traffic sign datasets are limited in terms of type and severity of challenging conditions. Metadata corresponding to these conditions are unavailable, and it is not possible to investigate the effect of a single factor because of simultaneous changes in numerous conditions. To overcome the shortcomings in existing datasets, we introduced the CURE-TSD-Real dataset, which is based on simulated challenging conditions that correspond to adversaries that can occur in real-world environments and systems. We test the performance of two benchmark algorithms and show that severe conditions can result in an average performance degradation of 29% in precision and 68% in recall. We investigate the effect of challenging conditions through spectral analysis and show that challenging conditions can lead to distinct magnitude spectrum characteristics. Moreover, we show that the mean magnitude spectrum of changes in video sequences under challenging conditions can be an indicator of detection performance. The CURE-TSD-Real dataset is available online at https://github.com/olivesgatech/CURE-TSD.

5.4 | CV | Feb 19, 2019 | Code

Challenging Environments for Traffic Sign Detection: Reliability Assessment under Inclement Conditions

Dogancan Temel, Tariq Alshawi, Min-Hung Chen et al.

State-of-the-art algorithms successfully localize and recognize traffic signs over existing datasets, which are limited in terms of challenging condition type and severity. Therefore, it is not possible to estimate the performance of traffic sign detection algorithms under overlooked challenging conditions. Another shortcoming of existing datasets is the limited utilization of temporal information and the unavailability of consecutive frames and annotations. To overcome these shortcomings, we generated the CURE-TSD video dataset and hosted the first IEEE Video and Image Processing (VIP) Cup within the IEEE Signal Processing Society. In this paper, we provide a detailed description of the CURE-TSD dataset, analyze the characteristics of the top performing algorithms, and provide a performance benchmark. Moreover, we investigate the robustness of the benchmarked algorithms with respect to sign size, challenge type and severity. Benchmarked algorithms are based on state-of-the-art and custom convolutional neural networks that achieved a precision of 0.55 and a recall of 0.32, F0.5 score of 0.48 and F2 score of 0.35. Experimental results show that benchmarked algorithms are highly sensitive to tested challenging conditions, which result in an average performance drop of 0.17 in terms of precision and a performance drop of 0.28 in recall under severe conditions. The dataset is publicly available at https://github.com/olivesgatech/CURE-TSD.
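The F-scores quoted in this abstract can be checked against the quoted precision and recall. The sketch below assumes the standard F-beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); the function name `f_beta` is illustrative and does not come from the paper.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)

# Precision and recall as quoted in the abstract above.
p, r = 0.55, 0.32
f_half = f_beta(p, r, 0.5)  # beta < 1 weights precision more heavily
f_two = f_beta(p, r, 2.0)   # beta > 1 weights recall more heavily
print(round(f_half, 2), round(f_two, 2))
```

Under that assumption, both values round to the F0.5 and F2 scores stated in the abstract, which suggests the four numbers were computed from the same operating point.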

3.3 | CV | Sep 2, 2020 | Code

On the Structures of Representation for the Robustness of Semantic Segmentation to Input Corruption

Charles Lehman, Dogancan Temel, Ghassan AlRegib

Semantic segmentation is a scene understanding task at the heart of safety-critical applications where robustness to corrupted inputs is essential. Implicit Background Estimation (IBE) has been shown to be a promising technique for improving the robustness of semantic segmentation models to out-of-distribution inputs at little to no cost. In this paper, we analyze and compare the structures learned as a result of optimization objectives that use Softmax, IBE, and Sigmoid in order to improve the understanding of their relationship to robustness. As a result of this analysis, we propose combining Sigmoid with IBE (SCrIBE) to improve robustness. Finally, we demonstrate that SCrIBE exhibits superior segmentation performance aggregated across all corruptions and severity levels, with a mIOU of 42.1 compared to 40.3 for IBE and 37.5 for the Softmax baseline.
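For readers unfamiliar with the mIOU figure reported here, the following is a minimal sketch of how mean intersection-over-union is commonly computed from a confusion matrix. The helper name `mean_iou` and the toy counts are assumptions for illustration; the paper's exact evaluation protocol (for example, how absent classes are handled) is not stated in the abstract.

```python
def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    confusion[i][j] counts pixels with true class i predicted as class j.
    Classes absent from both labels and predictions are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true other
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0

# Toy 3-class confusion matrix (made-up counts, not from the paper).
toy = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 0, 10],
]
print(round(100 * mean_iou(toy), 1))
```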

11.6 | CV | Aug 13, 2020

Novelty Detection Through Model-Based Characterization of Neural Networks

Gukyeong Kwon, Mohit Prabhushankar, Dogancan Temel et al.

In this paper, we propose a model-based characterization of neural networks to detect novel input types and conditions. Novelty detection is crucial to identify abnormal inputs that can significantly degrade the performance of machine learning algorithms. The majority of existing studies have focused on activation-based representations to detect abnormal inputs, which limits the characterization of abnormality to a data perspective. However, a model perspective can also be informative in terms of novelties and abnormalities. To articulate the significance of the model perspective in novelty detection, we utilize backpropagated gradients. We conduct a comprehensive analysis to compare the representation capability of gradients with that of activations and show that gradients outperform activations in novel class and condition detection. We validate our approach using four image recognition datasets: MNIST, Fashion-MNIST, CIFAR-10, and CURE-TSR. We achieve a significant improvement on all four datasets, with an average AUROC of 0.953, 0.918, 0.582, and 0.746, respectively.
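The AUROC values above summarize how well a novelty score separates novel from known inputs. As a rough illustration, AUROC can be computed as the probability that a randomly chosen novel sample receives a higher score than a randomly chosen known sample, with ties counted as one half. The function name `auroc` and the toy scores below are assumptions, not data from the paper.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison; label 1 marks the novel (positive) class."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Toy novelty scores (made up): higher should mean "more novel".
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
print(auroc(toy_scores, toy_labels))
```

A value of 0.5 corresponds to chance-level separation, which helps put the lower figures in the abstract in context.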

15.7 | CV | Aug 1, 2020 | Code

Contrastive Explanations in Neural Networks

Mohit Prabhushankar, Gukyeong Kwon, Dogancan Temel et al.

Visual explanations are logical arguments based on visual features that justify the predictions made by neural networks. Current modes of visual explanations answer questions of the form 'Why P?'. These 'Why' questions operate under broad contexts, thereby providing answers that are irrelevant in some cases. We propose to constrain these 'Why' questions based on some context Q so that our explanations answer contrastive questions of the form 'Why P, rather than Q?'. In this paper, we formalize the structure of contrastive visual explanations for neural networks. We define contrast based on neural networks and propose a methodology to extract defined contrasts. We then use the extracted contrasts as a plug-in on top of existing 'Why P?' techniques, specifically Grad-CAM. We demonstrate their value in analyzing both networks and data in applications of large-scale recognition, fine-grained recognition, subsurface seismic analysis, and image quality assessment.

20.8 | CV | Jul 18, 2020 | Code

Backpropagated Gradient Representations for Anomaly Detection

Gukyeong Kwon, Mohit Prabhushankar, Dogancan Temel et al.

Learning representations that clearly distinguish between normal and abnormal data is key to the success of anomaly detection. Most existing anomaly detection algorithms use activation representations from forward propagation and do not exploit gradients from backpropagation to characterize data. Gradients capture the model updates required to represent data. Anomalies require more drastic model updates to fully represent them compared to normal data. Hence, we propose the utilization of backpropagated gradients as representations to characterize model behavior on anomalies and, consequently, detect such anomalies. We show that the proposed method using gradient-based representations achieves state-of-the-art anomaly detection performance on benchmark image recognition datasets. We also highlight the computational efficiency and simplicity of the proposed method in comparison with other state-of-the-art methods relying on adversarial networks or autoregressive models, which require at least 27 times more model parameters than the proposed method.

9.8 | CV | Aug 27, 2019

Distorted Representation Space Characterization Through Backpropagated Gradients

Gukyeong Kwon, Mohit Prabhushankar, Dogancan Temel et al.

In this paper, we utilize weight gradients from backpropagation to characterize the representation space learned by deep learning algorithms. We demonstrate the utility of such gradients in applications including perceptual image quality assessment and out-of-distribution classification. The applications are chosen to validate the effectiveness of gradients as features when the test image distribution is distorted from the train image distribution. In both applications, the proposed gradient-based features outperform activation features. In image quality assessment, the proposed approach is compared with other state-of-the-art approaches and is generally the top performing method on the TID 2013 and MULTI-LIVE databases in terms of accuracy, consistency, linearity, and monotonic behavior. Finally, we analyze the effect of regularization on gradients using the CURE-TSR dataset for out-of-distribution classification.

2.6 | CV | May 23, 2019 | Code

Implicit Background Estimation for Semantic Segmentation

Charles Lehman, Dogancan Temel, Ghassan AlRegib

Scene understanding and semantic segmentation are at the core of many computer vision tasks, many of which involve interacting with humans in potentially dangerous ways. It is therefore paramount that techniques for the principled design of robust models be developed. In this paper, we provide analytic and empirical evidence that correcting potentially errant non-distinct mappings that result from the softmax function can improve the robustness characteristics of a state-of-the-art semantic segmentation model with minimal impact on performance and minimal changes to the code base.

2.6 | CV | Feb 18, 2019

Object Recognition under Multifarious Conditions: A Reliability Analysis and A Feature Similarity-based Performance Estimation

Dogancan Temel, Jinsol Lee, Ghassan AlRegib

In this paper, we investigate the reliability of online recognition platforms, Amazon Rekognition and Microsoft Azure, with respect to changes in background, acquisition device, and object orientation. We focus on platforms that are commonly used by the public to better understand their real-world performances. To assess the variation in recognition performance, we perform a controlled experiment by changing the acquisition conditions one at a time. We use three smartphones, one DSLR, and one webcam to capture side views and overhead views of objects in a living room, an office, and photo studio setups. Moreover, we introduce a framework to estimate the recognition performance with respect to backgrounds and orientations. In this framework, we utilize both handcrafted features based on color, texture, and shape characteristics and data-driven features obtained from deep neural networks. Experimental results show that deep learning-based image representations can estimate the recognition performance variation with a Spearman's rank-order correlation of 0.94 under multifarious acquisition conditions.

3.4 | CV | Feb 17, 2019

Semantically Interpretable and Controllable Filter Sets

Mohit Prabhushankar, Gukyeong Kwon, Dogancan Temel et al.

In this paper, we generate and control semantically interpretable filters that are directly learned from natural images in an unsupervised fashion. Each semantic filter learns a visually interpretable local structure in conjunction with other filters. The significance of learning these interpretable filter sets is demonstrated on two contrasting applications. The first application is image recognition under progressive decolorization, in which recognition algorithms should be color-insensitive to achieve a robust performance. The second application is image quality assessment where objective methods should be sensitive to color degradations. In the proposed work, the sensitivity and lack thereof are controlled by weighing the semantic filters based on the local structures they represent. To validate the proposed approach, we utilize the CURE-TSR dataset for image recognition and the TID 2013 dataset for image quality assessment. We show that the proposed semantic filter set achieves state-of-the-art performances in both datasets while maintaining its robustness across progressive distortions.

3.7 | IV | Nov 22, 2018

Image Quality Assessment and Color Difference

Dogancan Temel, Ghassan AlRegib

An average healthy person does not perceive the world in just black and white. Moreover, the perceived world is not composed of pixels; through vision, humans perceive structures. However, acquisition and display systems discretize the world. Therefore, we need to consider pixels, structures, and colors to model the quality of experience. Quality assessment methods use pixel-wise and structural metrics, whereas color science approaches use patch-based color differences. In this work, we combine these approaches by extending the CIEDE2000 formula with perceptual color difference to assess image quality. We examine how the perceptual color difference-based metric (PCDM) performs compared to PSNR, CIEDE2000, SSIM, MS-SSIM, and CW-SSIM on the LIVE database. In terms of linear correlation, PCDM obtains compatible results under white noise (97.9%), JPEG (95.9%), and JP2K (95.6%), with an overall correlation of 92.7%. We also show that PCDM captures color-based artifacts that cannot be captured by structure-based metrics.

2.0 | IV | Nov 21, 2018

A Comparative Study of Quality and Content-Based Spatial Pooling Strategies in Image Quality Assessment

Dogancan Temel, Ghassan AlRegib

The process of quantifying image quality consists of engineering the quality features and pooling these features to obtain a value or a map. There has been significant research interest in designing quality features, but pooling is usually overlooked compared to feature design. In this work, we compare state-of-the-art quality- and content-based spatial pooling strategies and show that although features are the key in any image quality assessment, pooling also matters. We also propose a quality-based spatial pooling strategy that is based on linearly weighted percentile pooling (WPP). Pooling strategies are analyzed for squared error, SSIM, and PerSIM on the LIVE, multiply distorted LIVE, and TID2013 image databases.
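To illustrate why pooling matters, the sketch below contrasts plain mean pooling with basic percentile pooling, which averages only the lowest-scoring fraction of a local quality map. This is not the paper's WPP: the abstract does not specify the linear weighting, so it is not reproduced here. The function names, the `fraction` parameter, and the toy map are all assumptions.

```python
def mean_pool(quality_map):
    """Average of all local quality scores."""
    flat = [v for row in quality_map for v in row]
    return sum(flat) / len(flat)

def percentile_pool(quality_map, fraction=0.1):
    """Average of the lowest `fraction` of local quality scores.

    Emphasizes the worst regions, on the assumption that severe local
    distortions dominate perceived quality.
    """
    flat = sorted(v for row in quality_map for v in row)
    k = max(1, int(round(fraction * len(flat))))
    return sum(flat[:k]) / k

# Toy local quality map (made up): mostly high quality, two bad regions.
toy_map = [
    [0.90, 0.95, 0.20, 0.92, 0.97],
    [0.93, 0.91, 0.30, 0.96, 0.94],
]
print(mean_pool(toy_map), percentile_pool(toy_map, fraction=0.2))
```

On the toy map, the two localized distortions barely move the mean but dominate the percentile-pooled score, which is the behavior quality-based pooling is designed to capture.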

2.0 | IV | Nov 21, 2018

Boosting in Image Quality Assessment

Dogancan Temel, Ghassan AlRegib

In this paper, we analyze the effect of boosting in image quality assessment through multi-method fusion. Existing multi-method studies focus on proposing a single quality estimator. In contrast, we investigate the generalizability of multi-method fusion as a framework. In addition to the support vector machines that are commonly used in multi-method fusion, we propose using neural networks for boosting. To span different types of image quality assessment algorithms, we use quality estimators based on fidelity, perceptually-extended fidelity, structural similarity, spectral similarity, color, and learning. In the experiments, we perform k-fold cross validation using the LIVE, the multiply distorted LIVE, and the TID 2013 databases, and the performance of image quality assessment algorithms is measured via accuracy-, linearity-, and ranking-based metrics. Based on the experiments, we show that boosting methods generally improve the performance of image quality assessment and that the level of improvement depends on the type of boosting algorithm. Our experimental results also indicate that boosting the worst performing quality estimator with two or more additional methods leads to statistically significant performance enhancements independent of the boosting technique, and that neural network-based boosting outperforms support vector machine-based boosting when two or more methods are fused.

2.0 | IV | Nov 21, 2018

Coding of 3D Videos Based on Visual Discomfort

Dogancan Temel, Ghassan AlRegib

We propose a rate-distortion optimization method for 3D videos based on visual discomfort estimation. We calculate visual discomfort in the encoded depth maps using two indexes: temporal outliers (TO) and spatial outliers (SO). These two indexes are used to measure the difference between the processed depth map and the ground truth depth map. These indexes implicitly depend on the amount of edge information within a frame and on the amount of motion between frames. Moreover, we fuse these indexes considering the temporal and spatial complexities of the content. We test the proposed method on a number of videos and compare the results with the default rate-distortion algorithms in the H.264/AVC codec. We evaluate rate-distortion algorithms by comparing achieved bit-rates, visual degradations in the depth sequences and the fidelity of the depth videos measured by SSIM and PSNR.

15.8 | IV | Nov 18, 2018

PerSIM: Multi-resolution Image Quality Assessment in the Perceptually Uniform Color Domain

Dogancan Temel, Ghassan AlRegib

An average observer perceives the world in color instead of black and white. Moreover, the visual system focuses on structures and segments instead of individual pixels. Based on these observations, we propose a full-reference objective image quality metric modeling visual system characteristics and chroma similarity in the perceptually uniform color domain (Lab). Laplacian of Gaussian features are obtained in the L channel to model the retinal ganglion cells in the human visual system, and color similarity is calculated over the a and b channels. In the proposed perceptual similarity index (PerSIM), a multi-resolution approach is followed to mimic the hierarchical nature of the human visual system. The LIVE and TID2013 databases are used in the validation, and PerSIM outperforms all the compared metrics in the overall databases in terms of ranking, monotonic behavior, and linearity.

7.6 | IV | Nov 14, 2018

ReSIFT: Reliability-Weighted SIFT-based Image Quality Assessment

Dogancan Temel, Ghassan AlRegib

This paper presents a full-reference image quality estimator based on SIFT descriptor matching over reliability-weighted feature maps. Reliability assignment includes a smoothing operation, a transformation to perceptual color domain, a local normalization stage, and a spectral residual computation with global normalization. The proposed method ReSIFT is tested on the LIVE and the LIVE Multiply Distorted databases and compared with 11 state-of-the-art full-reference quality estimators. In terms of the Pearson and the Spearman correlation, ReSIFT is the best performing quality estimator in the overall databases. Moreover, ReSIFT is the best performing quality estimator in at least one distortion group in compression, noise, and blur category.

11.1 | CV | Oct 18, 2018

CURE-OR: Challenging Unreal and Real Environments for Object Recognition

Dogancan Temel, Jinsol Lee, Ghassan AlRegib

In this paper, we introduce a large-scale, controlled, and multi-platform object recognition dataset denoted as Challenging Unreal and Real Environments for Object Recognition (CURE-OR). In this dataset, there are 1,000,000 images of 100 objects with varying size, color, and texture that are positioned in five different orientations and captured using five devices including a webcam, a DSLR, and three smartphone cameras in real-world (real) and studio (unreal) environments. The controlled challenging conditions include underexposure, overexposure, blur, contrast, dirty lens, image noise, resizing, and loss of color information. We utilize the CURE-OR dataset to test two recognition APIs (Amazon Rekognition and Microsoft Azure Computer Vision) and show that their performance significantly degrades under challenging conditions. Moreover, we investigate the relationship between object recognition and image quality and show that objective quality algorithms can estimate recognition performance under certain photometric challenging conditions. The dataset is publicly available at https://ghassanalregib.com/cure-or/.

8.7 | CV | Oct 15, 2018

CSV: Image Quality Assessment Based on Color, Structure, and Visual System

Dogancan Temel, Ghassan AlRegib

This paper presents a full-reference image quality estimator based on color, structure, and visual system characteristics denoted as CSV. In contrast to the majority of existing methods, we quantify perceptual color degradations rather than absolute pixel-wise changes. We use the CIEDE2000 color difference formulation to quantify low-level color degradations and the Earth Mover's Distance between color name descriptors to measure significant color degradations. In addition to the perceptual color difference, CSV also contains structural and perceptual differences. Structural feature maps are obtained by mean subtraction and divisive normalization, and perceptual feature maps are obtained from contrast sensitivity formulations of retinal ganglion cells. The proposed quality estimator CSV is tested on the LIVE, the Multiply Distorted LIVE, and the TID 2013 databases, and it is always among the top two performing quality estimators in terms of at least ranking, monotonic behavior or linearity.

7.8 | CV | Oct 15, 2018

Traffic Signs in the Wild: Highlights from the IEEE Video and Image Processing Cup 2017 Student Competition [SP Competitions]

Dogancan Temel, Ghassan AlRegib

Robust and reliable traffic sign detection is necessary to bring autonomous vehicles onto our roads. State-of-the-art algorithms successfully perform traffic sign detection over existing databases that mostly lack severe challenging conditions. The VIP Cup 2017 competition focused on detecting traffic signs under such challenging conditions. To facilitate this task and competition, we introduced a video dataset denoted as CURE-TSD that includes a variety of challenging conditions. The goal of this challenge was to implement traffic sign detection algorithms that can perform robustly under such challenging conditions. In this article, we share an overview of the VIP Cup 2017 experience including competition setup, teams, technical approaches, participation statistics, and the competition experience through the eyes of finalist team members and organizers.

6.8 | CV | Oct 14, 2018

Perceptual Image Quality Assessment through Spectral Analysis of Error Representations

Dogancan Temel, Ghassan AlRegib

In this paper, we analyze the statistics of error signals to assess the perceived quality of images. Specifically, we focus on the magnitude spectrum of error images obtained from the difference of reference and distorted images. Analyzing spectral statistics over grayscale images partially models interference in spatial harmonic distortion exhibited by the visual system, but it overlooks color information and the selective and hierarchical nature of the visual system. To overcome these shortcomings, we introduce an image quality assessment algorithm based on the Spectral Understanding of Multi-scale and Multi-channel Error Representations, denoted as SUMMER. We validate the quality assessment performance over 3 databases with around 30 distortion types. These distortion types are grouped into 7 main categories: compression artifact, image noise, color artifact, communication error, blur, global distortions, and local distortions. In total, we benchmark the performance of 17 algorithms along with the proposed algorithm using 5 performance metrics that measure linearity, monotonicity, accuracy, and consistency. In addition to experiments with standard performance metrics, we analyze the distribution of objective and subjective scores with histogram difference metrics and scatter plots. Moreover, we analyze the classification performance of quality assessment algorithms along with their statistical significance tests. Based on our experiments, SUMMER significantly outperforms the majority of the compared methods in all benchmark categories.
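The starting point described above, the magnitude spectrum of an error image, can be sketched in a few lines. The code below is not SUMMER: it covers only a single grayscale scale, and summarizing the spectrum by its mean is an assumption made for illustration. The function names are hypothetical, and the naive transform is only practical for very small images.

```python
import cmath

def dft2_magnitude(img):
    """Magnitude of the 2-D discrete Fourier transform (naive O(N^4) version)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * cmath.pi * (u * y / h + v * x / w)
                    acc += img[y][x] * cmath.exp(1j * angle)
            mag[u][v] = abs(acc)
    return mag

def error_spectrum_mean(reference, distorted):
    """Mean magnitude spectrum of the error image (reference minus distorted)."""
    err = [[r - d for r, d in zip(ref_row, dist_row)]
           for ref_row, dist_row in zip(reference, distorted)]
    mag = dft2_magnitude(err)
    return sum(sum(row) for row in mag) / (len(mag) * len(mag[0]))

# Toy 4x4 grayscale images (made up): a uniform brightness shift of 0.5.
ref = [[1.0] * 4 for _ in range(4)]
dist = [[0.5] * 4 for _ in range(4)]
print(error_spectrum_mean(ref, dist))
```

For a uniform shift, all of the error energy sits in the zero-frequency bin; structured distortions such as blur or noise spread it across other frequencies, which is the kind of difference spectral statistics can pick up.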

16.7 | CV | Dec 7, 2017

CURE-TSR: Challenging Unreal and Real Environments for Traffic Sign Recognition

Dogancan Temel, Gukyeong Kwon, Mohit Prabhushankar et al.

In this paper, we investigate the robustness of traffic sign recognition algorithms under challenging conditions. Existing datasets are limited in terms of their size and challenging condition coverage, which motivated us to generate the Challenging Unreal and Real Environments for Traffic Sign Recognition (CURE-TSR) dataset. It includes more than two million traffic sign images that are based on real-world and simulator data. We benchmark the performance of existing solutions in real-world scenarios and analyze the performance variation with respect to challenging conditions. We show that challenging conditions can decrease the performance of baseline methods significantly, especially if these challenging conditions result in loss or misplacement of spatial information. We also investigate the effect of data augmentation and show that utilizing simulator data along with real-world data enhances the average recognition performance in real-world scenarios. The dataset is publicly available at https://ghassanalregib.com/cure-tsr/.