CVAug 29, 2019Code
Traffic Sign Detection under Challenging Conditions: A Deeper Look Into Performance Variations and Spectral CharacteristicsDogancan Temel, Min-Hung Chen, Ghassan AlRegib
Traffic signs are critical for maintaining the safety and efficiency of our roads. Therefore, we need to carefully assess the capabilities and limitations of automated traffic sign detection systems. Existing traffic sign datasets are limited in terms of type and severity of challenging conditions. Metadata corresponding to these conditions are unavailable and it is not possible to investigate the effect of a single factor because of simultaneous changes in numerous conditions. To overcome the shortcomings in existing datasets, we introduced the CURE-TSD-Real dataset, which is based on simulated challenging conditions that correspond to adversaries that can occur in real-world environments and systems. We test the performance of two benchmark algorithms and show that severe conditions can result in an average performance degradation of 29% in precision and 68% in recall. We investigate the effect of challenging conditions through spectral analysis and show that challenging conditions can lead to distinct magnitude spectrum characteristics. Moreover, we show that mean magnitude spectrum of changes in video sequences under challenging conditions can be an indicator of detection performance. CURE-TSD-Real dataset is available online at https://github.com/olivesgatech/CURE-TSD.
CVFeb 19, 2019Code
Challenging Environments for Traffic Sign Detection: Reliability Assessment under Inclement ConditionsDogancan Temel, Tariq Alshawi, Min-Hung Chen et al.
State-of-the-art algorithms successfully localize and recognize traffic signs over existing datasets, which are limited in terms of challenging condition type and severity. Therefore, it is not possible to estimate the performance of traffic sign detection algorithms under overlooked challenging conditions. Another shortcoming of existing datasets is the limited utilization of temporal information and the unavailability of consecutive frames and annotations. To overcome these shortcomings, we generated the CURE-TSD video dataset and hosted the first IEEE Video and Image Processing (VIP) Cup within the IEEE Signal Processing Society. In this paper, we provide a detailed description of the CURE-TSD dataset, analyze the characteristics of the top performing algorithms, and provide a performance benchmark. Moreover, we investigate the robustness of the benchmarked algorithms with respect to sign size, challenge type and severity. Benchmarked algorithms are based on state-of-the-art and custom convolutional neural networks that achieved a precision of 0.55 and a recall of 0.32, F0.5 score of 0.48 and F2 score of 0.35. Experimental results show that benchmarked algorithms are highly sensitive to tested challenging conditions, which result in an average performance drop of 0.17 in terms of precision and a performance drop of 0.28 in recall under severe conditions. The dataset is publicly available at https://github.com/olivesgatech/CURE-TSD.
CVSep 2, 2020
On the Structures of Representation for the Robustness of Semantic Segmentation to Input CorruptionCharles Lehman, Dogancan Temel, Ghassan AlRegib
Semantic segmentation is a scene understanding task at the heart of safety-critical applications where robustness to corrupted inputs is essential. Implicit Background Estimation (IBE) has demonstrated to be a promising technique to improve the robustness to out-of-distribution inputs for semantic segmentation models for little to no cost. In this paper, we provide analysis comparing the structures learned as a result of optimization objectives that use Softmax, IBE, and Sigmoid in order to improve understanding their relationship to robustness. As a result of this analysis, we propose combining Sigmoid with IBE (SCrIBE) to improve robustness. Finally, we demonstrate that SCrIBE exhibits superior segmentation performance aggregated across all corruptions and severity levels with a mIOU of 42.1 compared to both IBE 40.3 and the Softmax Baseline 37.5.
CVAug 13, 2020
Novelty Detection Through Model-Based Characterization of Neural NetworksGukyeong Kwon, Mohit Prabhushankar, Dogancan Temel et al.
In this paper, we propose a model-based characterization of neural networks to detect novel input types and conditions. Novelty detection is crucial to identify abnormal inputs that can significantly degrade the performance of machine learning algorithms. Majority of existing studies have focused on activation-based representations to detect abnormal inputs, which limits the characterization of abnormality from a data perspective. However, a model perspective can also be informative in terms of the novelties and abnormalities. To articulate the significance of the model perspective in novelty detection, we utilize backpropagated gradients. We conduct a comprehensive analysis to compare the representation capability of gradients with that of activation and show that the gradients outperform the activation in novel class and condition detection. We validate our approach using four image recognition datasets including MNIST, Fashion-MNIST, CIFAR-10, and CURE-TSR. We achieve a significant improvement on all four datasets with an average AUROC of 0.953, 0.918, 0.582, and 0.746, respectively.
CVAug 1, 2020
Contrastive Explanations in Neural NetworksMohit Prabhushankar, Gukyeong Kwon, Dogancan Temel et al.
Visual explanations are logical arguments based on visual features that justify the predictions made by neural networks. Current modes of visual explanations answer questions of the form $`Why \text{ } P?'$. These $Why$ questions operate under broad contexts thereby providing answers that are irrelevant in some cases. We propose to constrain these $Why$ questions based on some context $Q$ so that our explanations answer contrastive questions of the form $`Why \text{ } P, \text{} rather \text{ } than \text{ } Q?'$. In this paper, we formalize the structure of contrastive visual explanations for neural networks. We define contrast based on neural networks and propose a methodology to extract defined contrasts. We then use the extracted contrasts as a plug-in on top of existing $`Why \text{ } P?'$ techniques, specifically Grad-CAM. We demonstrate their value in analyzing both networks and data in applications of large-scale recognition, fine-grained recognition, subsurface seismic analysis, and image quality assessment.
CVJul 18, 2020
Backpropagated Gradient Representations for Anomaly DetectionGukyeong Kwon, Mohit Prabhushankar, Dogancan Temel et al.
Learning representations that clearly distinguish between normal and abnormal data is key to the success of anomaly detection. Most of existing anomaly detection algorithms use activation representations from forward propagation while not exploiting gradients from backpropagation to characterize data. Gradients capture model updates required to represent data. Anomalies require more drastic model updates to fully represent them compared to normal data. Hence, we propose the utilization of backpropagated gradients as representations to characterize model behavior on anomalies and, consequently, detect such anomalies. We show that the proposed method using gradient-based representations achieves state-of-the-art anomaly detection performance in benchmark image recognition datasets. Also, we highlight the computational efficiency and the simplicity of the proposed method in comparison with other state-of-the-art methods relying on adversarial networks or autoregressive models, which require at least 27 times more model parameters than the proposed method.
CVAug 27, 2019
Distorted Representation Space Characterization Through Backpropagated GradientsGukyeong Kwon, Mohit Prabhushankar, Dogancan Temel et al.
In this paper, we utilize weight gradients from backpropagation to characterize the representation space learned by deep learning algorithms. We demonstrate the utility of such gradients in applications including perceptual image quality assessment and out-of-distribution classification. The applications are chosen to validate the effectiveness of gradients as features when the test image distribution is distorted from the train image distribution. In both applications, the proposed gradient based features outperform activation features. In image quality assessment, the proposed approach is compared with other state of the art approaches and is generally the top performing method on TID 2013 and MULTI-LIVE databases in terms of accuracy, consistency, linearity, and monotonic behavior. Finally, we analyze the effect of regularization on gradients using CURE-TSR dataset for out-of-distribution classification.
CVAug 6, 2019
Relative Afferent Pupillary Defect Screening through Transfer LearningDogancan Temel, Melvin J. Mathew, Ghassan AlRegib et al.
Abnormalities in pupillary light reflex can indicate optic nerve disorders that may lead to permanent visual loss if not diagnosed in an early stage. In this study, we focus on relative afferent pupillary defect (RAPD), which is based on the difference between the reactions of the eyes when they are exposed to light stimuli. Incumbent RAPD assessment methods are based on subjective practices that can lead to unreliable measurements. To eliminate subjectivity and obtain reliable measurements, we introduced an automated framework to detect RAPD. For validation, we conducted a clinical study with lab-on-a-headset, which can perform automated light reflex test. In addition to benchmarking handcrafted algorithms, we proposed a transfer learning-based approach that transformed a deep learning-based generic object recognition algorithm into a pupil detector. Based on the conducted experiments, proposed algorithm RAPDNet can achieve a sensitivity and a specificity of 90.6% over 64 test cases in a balanced set, which corresponds to an AUC of 0.929 in ROC analysis. According to our benchmark with three handcrafted algorithms and nine performance metrics, RAPDNet outperforms all other algorithms in every performance category.
CVMay 23, 2019
Implicit Background Estimation for Semantic SegmentationCharles Lehman, Dogancan Temel, Ghassan AlRegib
Scene understanding and semantic segmentation are at the core of many computer vision tasks, many of which, involve interacting with humans in potentially dangerous ways. It is therefore paramount that techniques for principled design of robust models be developed. In this paper, we provide analytic and empirical evidence that correcting potentially errant non-distinct mappings that result from the softmax function can result in improving robustness characteristics on a state-of-the-art semantic segmentation model with minimal impact to performance and minimal changes to the code base.
IVMay 21, 2019
Automated Pupillary Light Reflex Test on a Portable PlatformDogancan Temel, Melvin J. Mathew, Ghassan AlRegib et al.
In this paper, we introduce a portable eye imaging device denoted as lab-on-a-headset, which can automatically perform a swinging flashlight test. We utilized this device in a clinical study to obtain high-resolution recordings of eyes while they are exposed to a varying light stimuli. Half of the participants had relative afferent pupillary defect (RAPD) while the other half was a control group. In case of positive RAPD, patients pupils constrict less or do not constrict when light stimuli swings from the unaffected eye to the affected eye. To automatically diagnose RAPD, we propose an algorithm based on pupil localization, pupil size measurement, and pupil size comparison of right and left eye during the light reflex test. We validate the algorithmic performance over a dataset obtained from 22 subjects and show that proposed algorithm can achieve a sensitivity of 93.8% and a specificity of 87.5%.
CVFeb 18, 2019
Object Recognition under Multifarious Conditions: A Reliability Analysis and A Feature Similarity-based Performance EstimationDogancan Temel, Jinsol Lee, Ghassan AlRegib
In this paper, we investigate the reliability of online recognition platforms, Amazon Rekognition and Microsoft Azure, with respect to changes in background, acquisition device, and object orientation. We focus on platforms that are commonly used by the public to better understand their real-world performances. To assess the variation in recognition performance, we perform a controlled experiment by changing the acquisition conditions one at a time. We use three smartphones, one DSLR, and one webcam to capture side views and overhead views of objects in a living room, an office, and photo studio setups. Moreover, we introduce a framework to estimate the recognition performance with respect to backgrounds and orientations. In this framework, we utilize both handcrafted features based on color, texture, and shape characteristics and data-driven features obtained from deep neural networks. Experimental results show that deep learning-based image representations can estimate the recognition performance variation with a Spearman's rank-order correlation of 0.94 under multifarious acquisition conditions.
CVFeb 17, 2019
Semantically Interpretable and Controllable Filter SetsMohit Prabhushankar, Gukyeong Kwon, Dogancan Temel et al.
In this paper, we generate and control semantically interpretable filters that are directly learned from natural images in an unsupervised fashion. Each semantic filter learns a visually interpretable local structure in conjunction with other filters. The significance of learning these interpretable filter sets is demonstrated on two contrasting applications. The first application is image recognition under progressive decolorization, in which recognition algorithms should be color-insensitive to achieve a robust performance. The second application is image quality assessment where objective methods should be sensitive to color degradations. In the proposed work, the sensitivity and lack thereof are controlled by weighing the semantic filters based on the local structures they represent. To validate the proposed approach, we utilize the CURE-TSR dataset for image recognition and the TID 2013 dataset for image quality assessment. We show that the proposed semantic filter set achieves state-of-the-art performances in both datasets while maintaining its robustness across progressive distortions.
IVNov 22, 2018
Image Quality Assessment and Color DifferenceDogancan Temel, Ghassan AlRegib
An average healthy person does not perceive the world in just black and white. Moreover, the perceived world is not composed of pixels and through vision humans perceive structures. However, the acquisition and display systems discretize the world. Therefore, we need to consider pixels, structures and colors to model the quality of experience. Quality assessment methods use the pixel-wise and structural metrics whereas color science approaches use the patch-based color differences. In this work, we combine these approaches by extending CIEDE2000 formula with perceptual color difference to assess image quality. We examine how perceptual color difference-based metric (PCDM) performs compared to PSNR, CIEDE2000, SSIM, MS-SSIM and CW-SSIM on the LIVE database. In terms of linear correlation, PCDM obtains compatible results under white noise (97.9%), Jpeg (95.9%) and Jp2k (95.6%) with an overall correlation of 92.7%. We also show that PCDM captures color-based artifacts that can not be captured by structure-based metrics.
IVNov 21, 2018
MS-UNIQUE: Multi-model and Sharpness-weighted Unsupervised Image Quality EstimationMohit Prabhushankar, Dogancan Temel, Ghassan AlRegib
In this paper, we train independent linear decoder models to estimate the perceived quality of images. More specifically, we calculate the responses of individual non-overlapping image patches to each of the decoders and scale these responses based on the sharpness characteristics of filter set. We use multiple linear decoders to capture different abstraction levels of the image patches. Training each model is carried out on 100,000 image patches from the ImageNet database in an unsupervised fashion. Color space selection and ZCA Whitening are performed over these patches to enhance the descriptiveness of the data. The proposed quality estimator is tested on the LIVE and the TID 2013 image quality assessment databases. Performance of the proposed method is compared against eleven other state of the art methods in terms of accuracy, consistency, linearity, and monotonic behavior. Based on experimental results, the proposed method is generally among the top performing quality estimators in all categories.
IVNov 21, 2018
Generating Adaptive and Robust Filter Sets Using an Unsupervised Learning FrameworkMohit Prabhushankar, Dogancan Temel, Ghassan AlRegib
In this paper, we introduce an adaptive unsupervised learning framework, which utilizes natural images to train filter sets. The applicability of these filter sets is demonstrated by evaluating their performance in two contrasting applications - image quality assessment and texture retrieval. While assessing image quality, the filters need to capture perceptual differences based on dissimilarities between a reference image and its distorted version. In texture retrieval, the filters need to assess similarity between texture images to retrieve closest matching textures. Based on experiments, we show that the filter responses span a set in which a monotonicity-based metric can measure both the perceptual dissimilarity of natural images and the similarity of texture images. In addition, we corrupt the images in the test set and demonstrate that the proposed method leads to robust and reliable retrieval performance compared to existing methods.
IVNov 21, 2018
A Comparative Study of Quality and Content-Based Spatial Pooling Strategies in Image Quality AssessmentDogancan Temel, Ghassan AlRegib
The process of quantifying image quality consists of engineering the quality features and pooling these features to obtain a value or a map. There has been a significant research interest in designing the quality features but pooling is usually overlooked compared to feature design. In this work, we compare the state of the art quality and content-based spatial pooling strategies and show that although features are the key in any image quality assessment, pooling also matters. We also propose a quality-based spatial pooling strategy that is based on linearly weighted percentile pooling (WPP). Pooling strategies are analyzed for squared error, SSIM and PerSIM in LIVE, multiply distorted LIVE and TID2013 image databases.
IVNov 21, 2018
Boosting in Image Quality AssessmentDogancan Temel, Ghassan AlRegib
In this paper, we analyze the effect of boosting in image quality assessment through multi-method fusion. Existing multi-method studies focus on proposing a single quality estimator. On the contrary, we investigate the generalizability of multi-method fusion as a framework. In addition to support vector machines that are commonly used in the multi-method fusion, we propose using neural networks in the boosting. To span different types of image quality assessment algorithms, we use quality estimators based on fidelity, perceptually-extended fidelity, structural similarity, spectral similarity, color, and learning. In the experiments, we perform k-fold cross validation using the LIVE, the multiply distorted LIVE, and the TID 2013 databases and the performance of image quality assessment algorithms are measured via accuracy-, linearity-, and ranking-based metrics. Based on the experiments, we show that boosting methods generally improve the performance of image quality assessment and the level of improvement depends on the type of the boosting algorithm. Our experimental results also indicate that boosting the worst performing quality estimator with two or more additional methods leads to statistically significant performance enhancements independent of the boosting technique and neural network-based boosting outperforms support vector machine-based boosting when two or more methods are fused.
IVNov 21, 2018
Coding of 3D Videos Based on Visual DiscomfortDogancan Temel, Ghassan AlRegib
We propose a rate-distortion optimization method for 3D videos based on visual discomfort estimation. We calculate visual discomfort in the encoded depth maps using two indexes: temporal outliers (TO) and spatial outliers (SO). These two indexes are used to measure the difference between the processed depth map and the ground truth depth map. These indexes implicitly depend on the amount of edge information within a frame and on the amount of motion between frames. Moreover, we fuse these indexes considering the temporal and spatial complexities of the content. We test the proposed method on a number of videos and compare the results with the default rate-distortion algorithms in the H.264/AVC codec. We evaluate rate-distortion algorithms by comparing achieved bit-rates, visual degradations in the depth sequences and the fidelity of the depth videos measured by SSIM and PSNR.
IVNov 21, 2018
Effectiveness of 3VQM in Capturing Depth InconsistenciesDogancan Temel, Ghassan AlRegib
The 3D video quality metric (3VQM) was proposed to evaluate the temporal and spatial variation of the depth errors for the depth values that would lead to inconsistencies between left and right views, fast changing disparities, and geometric distortions. Previously, we evaluated 3VQM against subjective scores. In this paper, we show the effectiveness of 3VQM in capturing errors and inconsistencies that exist in the rendered depth-based 3D videos. We further investigate how 3VQM could measure excessive disparities, fast changing disparities, geometric distortions, temporal flickering and/or spatial noise in the form of depth cues inconsistency. Results show that 3VQM best captures the depth inconsistencies based on errors in the reference views. However, the metric is not sensitive to depth map mild errors such as those resulting from blur. We also performed a subjective quality test and showed that 3VQM performs better than PSNR, weighted PSNR and SSIM in terms of accuracy, coherency and consistency.
IVNov 19, 2018
A Comparative Study of Computational AestheticsDogancan Temel, Ghassan AlRegib
Objective metrics model image quality by quantifying image degradations or estimating perceived image quality. However, image quality metrics do not model what makes an image more appealing or beautiful. In order to quantify the aesthetics of an image, we need to take it one step further and model the perception of aesthetics. In this paper, we examine computational aesthetics models that use hand-crafted, generic and hybrid descriptors. We show that generic descriptors can perform as well as state of the art hand-crafted aesthetics models that use global features. However, neither generic nor hand-crafted features is sufficient to model aesthetics when we only use global features without considering spatial composition or distribution. We also follow a visual dictionary approach similar to state of the art methods and show that it performs poorly without the spatial pyramid step.
IVNov 18, 2018
PerSIM: Multi-resolution Image Quality Assessment in the Perceptually Uniform Color DomainDogancan Temel, Ghassan AlRegib
An average observer perceives the world in color instead of black and white. Moreover, the visual system focuses on structures and segments instead of individual pixels. Based on these observations, we propose a full reference objective image quality metric modeling visual system characteristics and chroma similarity in the perceptually uniform color domain (Lab). Laplacian of Gaussian features are obtained in the L channel to model the retinal ganglion cells in human visual system and color similarity is calculated over the a and b channels. In the proposed perceptual similarity index (PerSIM), a multi-resolution approach is followed to mimic the hierarchical nature of human visual system. LIVE and TID2013 databases are used in the validation and PerSIM outperforms all the compared metrics in the overall databases in terms of ranking, monotonic behavior and linearity.
IVNov 16, 2018
BLeSS: Bio-inspired Low-level Spatiochromatic Similarity Assisted Image Quality AssessmentDogancan Temel, Ghassan AlRegib
This paper proposes a biologically-inspired low-level spatiochromatic-model-based similarity method (BLeSS) to assist full-reference image-quality estimators that originally oversimplify color perception processes. More specifically, the spatiochromatic model is based on spatial frequency, spatial orientation, and surround contrast effects. The assistant similarity method is used to complement image-quality estimators based on phase congruency, gradient magnitude, and spectral residual. The effectiveness of BLeSS is validated using FSIM, FSIMc and SR-SIM methods on LIVE, Multiply Distorted LIVE, and TID 2013 databases. In terms of Spearman correlation, BLeSS enhances the performance of all quality estimators in color-based degradations and the enhancement is at 100% for both feature- and spectral residual-based similarity methods. Moreover, BleSS significantly enhances the performance of SR-SIM and FSIM in the full TID 2013 database.
IVNov 14, 2018
ReSIFT: Reliability-Weighted SIFT-based Image Quality AssessmentDogancan Temel, Ghassan AlRegib
This paper presents a full-reference image quality estimator based on SIFT descriptor matching over reliability-weighted feature maps. Reliability assignment includes a smoothing operation, a transformation to perceptual color domain, a local normalization stage, and a spectral residual computation with global normalization. The proposed method ReSIFT is tested on the LIVE and the LIVE Multiply Distorted databases and compared with 11 state-of-the-art full-reference quality estimators. In terms of the Pearson and the Spearman correlation, ReSIFT is the best performing quality estimator in the overall databases. Moreover, ReSIFT is the best performing quality estimator in at least one distortion group in compression, noise, and blur category.
CVOct 18, 2018
CURE-OR: Challenging Unreal and Real Environments for Object RecognitionDogancan Temel, Jinsol Lee, Ghassan AlRegib
In this paper, we introduce a large-scale, controlled, and multi-platform object recognition dataset denoted as Challenging Unreal and Real Environments for Object Recognition (CURE-OR). In this dataset, there are 1,000,000 images of 100 objects with varying size, color, and texture that are positioned in five different orientations and captured using five devices including a webcam, a DSLR, and three smartphone cameras in real-world (real) and studio (unreal) environments. The controlled challenging conditions include underexposure, overexposure, blur, contrast, dirty lens, image noise, resizing, and loss of color information. We utilize CURE-OR dataset to test recognition APIs-Amazon Rekognition and Microsoft Azure Computer Vision- and show that their performance significantly degrades under challenging conditions. Moreover, we investigate the relationship between object recognition and image quality and show that objective quality algorithms can estimate recognition performance under certain photometric challenging conditions. The dataset is publicly available at https://ghassanalregib.com/cure-or/.
CVOct 15, 2018
CSV: Image Quality Assessment Based on Color, Structure, and Visual SystemDogancan Temel, Ghassan AlRegib
This paper presents a full-reference image quality estimator based on color, structure, and visual system characteristics denoted as CSV. In contrast to the majority of existing methods, we quantify perceptual color degradations rather than absolute pixel-wise changes. We use the CIEDE2000 color difference formulation to quantify low-level color degradations and the Earth Mover's Distance between color name descriptors to measure significant color degradations. In addition to the perceptual color difference, CSV also contains structural and perceptual differences. Structural feature maps are obtained by mean subtraction and divisive normalization, and perceptual feature maps are obtained from contrast sensitivity formulations of retinal ganglion cells. The proposed quality estimator CSV is tested on the LIVE, the Multiply Distorted LIVE, and the TID 2013 databases, and it is always among the top two performing quality estimators in terms of at least ranking, monotonic behavior or linearity.
CVOct 15, 2018
Traffic Signs in the Wild: Highlights from the IEEE Video and Image Processing Cup 2017 Student Competition [SP Competitions]Dogancan Temel, Ghassan AlRegib
Robust and reliable traffic sign detection is necessary to bring autonomous vehicles onto our roads. State-of-the-art algorithms successfully perform traffic sign detection over existing databases that mostly lack severe challenging conditions. VIP Cup 2017 competition focused on detecting such traffic signs under challenging conditions. To facilitate such task and competition, we introduced a video dataset denoted as CURE-TSD that includes a variety of challenging conditions. The goal of this challenge was to implement traffic sign detection algorithms that can robustly perform under such challenging conditions. In this article, we share an overview of the VIP Cup 2017 experience including competition setup, teams, technical approaches, participation statistics, and competition experience through finalist teams members' and organizers' eyes.
CVOct 14, 2018
Perceptual Image Quality Assessment through Spectral Analysis of Error RepresentationsDogancan Temel, Ghassan AlRegib
In this paper, we analyze the statistics of error signals to assess the perceived quality of images. Specifically, we focus on the magnitude spectrum of error images obtained from the difference of reference and distorted images. Analyzing spectral statistics over grayscale images partially models interference in spatial harmonic distortion exhibited by the visual system but it overlooks color information, selective and hierarchical nature of visual system. To overcome these shortcomings, we introduce an image quality assessment algorithm based on the Spectral Understanding of Multi-scale and Multi-channel Error Representations, denoted as SUMMER. We validate the quality assessment performance over 3 databases with around 30 distortion types. These distortion types are grouped into 7 main categories as compression artifact, image noise, color artifact, communication error, blur, global and local distortions. In total, we benchmark the performance of 17 algorithms along with the proposed algorithm using 5 performance metrics that measure linearity, monotonicity, accuracy, and consistency. In addition to experiments with standard performance metrics, we analyze the distribution of objective and subjective scores with histogram difference metrics and scatter plots. Moreover, we analyze the classification performance of quality assessment algorithms along with their statistical significance tests. Based on our experiments, SUMMER significantly outperforms majority of the compared methods in all benchmark categories
CVDec 7, 2017
CURE-TSR: Challenging Unreal and Real Environments for Traffic Sign RecognitionDogancan Temel, Gukyeong Kwon, Mohit Prabhushankar et al.
In this paper, we investigate the robustness of traffic sign recognition algorithms under challenging conditions. Existing datasets are limited in terms of their size and challenging condition coverage, which motivated us to generate the Challenging Unreal and Real Environments for Traffic Sign Recognition (CURE-TSR) dataset. It includes more than two million traffic sign images that are based on real-world and simulator data. We benchmark the performance of existing solutions in real-world scenarios and analyze the performance variation with respect to challenging conditions. We show that challenging conditions can decrease the performance of baseline methods significantly, especially if these challenging conditions result in loss or misplacement of spatial information. We also investigate the effect of data augmentation and show that utilization of simulator data along with real-world data enhance the average recognition performance in real-world scenarios. The dataset is publicly available at https://ghassanalregib.com/cure-tsr/.