NEJul 27, 2023
A Survey on Reservoir Computing and its Interdisciplinary Applications Beyond Traditional Machine LearningHeng Zhang, Danilo Vasconcellos Vargas
Reservoir computing (RC), first applied to temporal signal processing, is a recurrent neural network in which neurons are randomly connected. Once initialized, the connection strengths remain unchanged. Such a simple structure turns RC into a non-linear dynamical system that maps low-dimensional inputs into a high-dimensional space. The model's rich dynamics, linear separability, and memory capacity then enable a simple linear readout to generate adequate responses for various applications. RC spans areas far beyond machine learning, since it has been shown that the complex dynamics can be realized in various physical hardware implementations and biological devices. This yields greater flexibility and shorter computation time. Moreover, the neuronal responses triggered by the model's dynamics shed light on understanding brain mechanisms that also exploit similar dynamical processes. While the literature on RC is vast and fragmented, here we conduct a unified review of RC's recent developments from machine learning to physics, biology, and neuroscience. We first review the early RC models, and then survey the state-of-the-art models and their applications. We further introduce studies on modeling the brain's mechanisms by RC. Finally, we offer new perspectives on RC development, including reservoir design, coding frameworks unification, physical RC implementations, and interaction between RC, cognitive neuroscience and evolution.
LGOct 16, 2023
Symmetrical SyncMap for Imbalanced General Chunking ProblemsHeng Zhang, Danilo Vasconcellos Vargas
Recently, SyncMap pioneered an approach to learn complex structures from sequences as well as adapt to any changes in underlying structures. This is achieved by using only nonlinear dynamical equations inspired by neuron group behaviors, i.e., without loss functions. Here we propose Symmetrical SyncMap that goes beyond the original work to show how to create dynamical equations and attractor-repeller points which are stable over the long run, even dealing with imbalanced continual general chunking problems (CGCPs). The main idea is to apply equal updates from negative and positive feedback loops by symmetrical activation. We then introduce the concept of memory window to allow for more positive updates. Our algorithm surpasses or ties other unsupervised state-of-the-art baselines in all 12 imbalanced CGCPs with various difficulties, including dynamically changing ones. To verify its performance in real-world scenarios, we conduct experiments on several well-studied structure learning problems. The proposed method surpasses substantially other methods in 3 out of 4 scenarios, suggesting that symmetrical activation plays a critical role in uncovering topological structures and even hierarchies encoded in temporal data.
NEJun 19, 2023
Generating Oscillation Activity with Echo State Network to Mimic the Behavior of a Simple Central Pattern GeneratorTham Yik Foong, Danilo Vasconcellos Vargas
This paper presents a method for reproducing a simple central pattern generator (CPG) using a modified Echo State Network (ESN). Conventionally, the dynamical reservoir needs to be damped to stabilize and preserve memory. However, we find that a reservoir that develops oscillatory activity without any external excitation can mimic the behaviour of a simple CPG in biological systems. We define the specific neuron ensemble required for generating oscillations in the reservoir and demonstrate how adjustments to the leaking rate, spectral radius, topology, and population size can increase the probability of reproducing these oscillations. The results of the experiments, conducted on the time series simulation tasks, demonstrate that the ESN is able to generate the desired waveform without any input. This approach offers a promising solution for the development of bio-inspired controllers for robotic systems.
CVNov 1, 2023
Improving Robustness for Vision Transformer with a Simple Dynamic Scanning AugmentationShashank Kotyan, Danilo Vasconcellos Vargas
Vision Transformer (ViT) has demonstrated promising performance in computer vision tasks, comparable to state-of-the-art neural networks. Yet, this new type of deep neural network architecture is vulnerable to adversarial attacks limiting its capabilities in terms of robustness. This article presents a novel contribution aimed at further improving the accuracy and robustness of ViT, particularly in the face of adversarial attacks. We propose an augmentation technique called `Dynamic Scanning Augmentation' that leverages dynamic input sequences to adaptively focus on different patches, thereby maintaining performance and robustness. Our detailed investigations reveal that this adaptability to the input sequence induces significant changes in the attention mechanism of ViT, even for the same image. We introduce four variations of Dynamic Scanning Augmentation, outperforming ViT in terms of both robustness to adversarial attacks and accuracy against natural images, with one variant showing comparable results. By integrating our augmentation technique, we observe a substantial increase in ViT's robustness, improving it from $17\%$ to $92\%$ measured across different types of adversarial attacks. These findings, together with other comprehensive tests, indicate that Dynamic Scanning Augmentation enhances accuracy and robustness by promoting a more adaptive type of attention. In conclusion, this work contributes to the ongoing research on Vision Transformers by introducing Dynamic Scanning Augmentation as a technique for improving the accuracy and robustness of ViT. The observed results highlight the potential of this approach in advancing computer vision tasks and merit further exploration in future studies.
CVNov 24, 2023
Synthetic Shifts to Initial Seed Vector Exposes the Brittle Nature of Latent-Based Diffusion ModelsMao Po-Yuan, Shashank Kotyan, Tham Yik Foong et al.
Recent advances in Conditional Diffusion Models have led to substantial capabilities in various domains. However, understanding the impact of variations in the initial seed vector remains an underexplored area of concern. Particularly, latent-based diffusion models display inconsistencies in image generation under standard conditions when initialized with suboptimal initial seed vectors. To understand the impact of the initial seed vector on generated samples, we propose a reliability evaluation framework that evaluates the generated samples of a diffusion model when the initial seed vector is subjected to various synthetic shifts. Our results indicate that slight manipulations to the initial seed vector of the state-of-the-art Stable Diffusion (Rombach et al., 2022) can lead to significant disturbances in the generated samples, consequently creating images without the effect of conditioning variables. In contrast, GLIDE (Nichol et al., 2022) stands out in generating reliable samples even when the initial seed vector is transformed. Thus, our study sheds light on the importance of the selection and the impact of the initial seed vector in the latent-based diffusion model.
AIFeb 4, 2023
Dynamical Equations With Bottom-up Self-Organizing Properties Learn Accurate Dynamical Hierarchies Without Any Loss FunctionDanilo Vasconcellos Vargas, Tham Yik Foong, Heng Zhang
Self-organization is ubiquitous in nature and mind. However, machine learning and theories of cognition still barely touch the subject. The hurdle is that general patterns are difficult to define in terms of dynamical equations and designing a system that could learn by reordering itself is still to be seen. Here, we propose a learning system, where patterns are defined within the realm of nonlinear dynamics with positive and negative feedback loops, allowing attractor-repeller pairs to emerge for each pattern observed. Experiments reveal that such a system can map temporal to spatial correlation, enabling hierarchical structures to be learned from sequential data. The results are accurate enough to surpass state-of-the-art unsupervised learning algorithms in seven out of eight experiments as well as two real-world problems. Interestingly, the dynamic nature of the system makes it inherently adaptive, giving rise to phenomena similar to phase transitions in chemistry/thermodynamics when the input structure changes. Thus, the work here sheds light on how self-organization can allow for pattern recognition and hints at how intelligent behavior might emerge from simple dynamic equations without any objective/loss function.
AIMar 10
LCA: Local Classifier Alignment for Continual LearningTung Tran, Danilo Vasconcellos Vargas, Khoat Than
A fundamental requirement for intelligent systems is the ability to learn continuously under changing environments. However, models trained in this regime often suffer from catastrophic forgetting. Leveraging pre-trained models has recently emerged as a promising solution, since their generalized feature extractors enable faster and more robust adaptation. While some earlier works mitigate forgetting by fine-tuning only on the first task, this approach quickly deteriorates as the number of tasks grows and the data distributions diverge. More recent research instead seeks to consolidate task knowledge into a unified backbone, or adapting the backbone as new tasks arrive. However, such approaches may create a (potential) \textit{mismatch} between task-specific classifiers and the adapted backbone. To address this issue, we propose a novel \textit{Local Classifier Alignment} (LCA) loss to better align the classifier with backbone. Theoretically, we show that this LCA loss can enable the classifier to not only generalize well for all observed tasks, but also improve robustness. Furthermore, we develop a complete solution for continual learning, following the model merging approach and using LCA. Extensive experiments on several standard benchmarks demonstrate that our method often achieves leading performance, sometimes surpasses the state-of-the-art methods with a large margin.
CVNov 22, 2023
The Challenges of Image Generation Models in Generating Multi-Component ImagesTham Yik Foong, Shashank Kotyan, Po Yuan Mao et al.
Recent advances in text-to-image generators have led to substantial capabilities in image generation. However, the complexity of prompts acts as a bottleneck in the quality of images generated. A particular under-explored facet is the ability of generative models to create high-quality images comprising multiple components given as a prior. In this paper, we propose and validate a metric called Components Inclusion Score (CIS) to evaluate the extent to which a model can correctly generate multiple components. Our results reveal that the evaluated models struggle to incorporate all the visual elements from prompts with multiple components (8.53% drop in CIS per component for all evaluated models). We also identify a significant decline in the quality of the images and context awareness within an image as the number of components increased (15.91% decrease in inception Score and 9.62% increase in Frechet Inception Distance). To remedy this issue, we fine-tuned Stable Diffusion V2 on a custom-created test dataset with multiple components, outperforming its vanilla counterpart. To conclude, these findings reveal a critical limitation in existing text-to-image generators, shedding light on the challenge of generating multiple components within a single image using a complex prompt.
CLJul 13, 2024
Beyond KV Caching: Shared Attention for Efficient LLMsBingli Liao, Danilo Vasconcellos Vargas
The efficiency of large language models (LLMs) remains a critical challenge, particularly in contexts where computational resources are limited. Traditional attention mechanisms in these models, while powerful, require significant computational and memory resources due to the necessity of recalculating and storing attention weights across different layers. This paper introduces a novel Shared Attention (SA) mechanism, designed to enhance the efficiency of LLMs by directly sharing computed attention weights across multiple layers. Unlike previous methods that focus on sharing intermediate Key-Value (KV) caches, our approach utilizes the isotropic tendencies of attention distributions observed in advanced LLMs post-pretraining to reduce both the computational flops and the size of the KV cache required during inference. We empirically demonstrate that implementing SA across various LLMs results in minimal accuracy loss on standard benchmarks. Our findings suggest that SA not only conserves computational resources but also maintains robust model performance, thereby facilitating the deployment of more efficient LLMs in resource-constrained environments.
CVAug 17, 2024
Linking Robustness and Generalization: A k* Distribution Analysis of Concept Clustering in Latent Space for Vision ModelsShashank Kotyan, Pin-Yu Chen, Danilo Vasconcellos Vargas
Most evaluations of vision models use indirect methods to assess latent space quality. These methods often involve adding extra layers to project the latent space into a new one. This projection makes it difficult to analyze and compare the original latent space. This article uses the k* Distribution, a local neighborhood analysis method, to examine the learned latent space at the level of individual concepts, which can be extended to examine the entire latent space. We introduce skewness-based true and approximate metrics for interpreting individual concepts to assess the overall quality of vision models' latent space. Our findings indicate that current vision models frequently fracture the distributions of individual concepts within the latent space. Nevertheless, as these models improve in generalization across multiple datasets, the degree of fracturing diminishes. A similar trend is observed in robust vision models, where increased robustness correlates with reduced fracturing. Ultimately, this approach enables a direct interpretation and comparison of the latent spaces of different vision models and reveals a relationship between a model's generalizability and robustness. Results show that as a model becomes more general and robust, it tends to learn features that result in better clustering of concepts. Project Website is available online at https://shashankkotyan.github.io/k-Distribution/
LGNov 16, 2023
Towards Improving Robustness Against Common Corruptions using Mixture of Class Specific ExpertsShashank Kotyan, Danilo Vasconcellos Vargas
Neural networks have demonstrated significant accuracy across various domains, yet their vulnerability to subtle input alterations remains a persistent challenge. Conventional methods like data augmentation, while effective to some extent, fall short in addressing unforeseen corruptions, limiting the adaptability of neural networks in real-world scenarios. In response, this paper introduces a novel paradigm known as the Mixture of Class-Specific Expert Architecture. The approach involves disentangling feature learning for individual classes, offering a nuanced enhancement in scalability and overall performance. By training dedicated network segments for each class and subsequently aggregating their outputs, the proposed architecture aims to mitigate vulnerabilities associated with common neural network structures. The study underscores the importance of comprehensive evaluation methodologies, advocating for the incorporation of benchmarks like the common corruptions benchmark. This inclusion provides nuanced insights into the vulnerabilities of neural networks, especially concerning their generalization capabilities and robustness to unforeseen distortions. The research aligns with the broader objective of advancing the development of highly robust learning systems capable of nuanced reasoning across diverse and challenging real-world scenarios. Through this contribution, the paper aims to foster a deeper understanding of neural network limitations and proposes a practical approach to enhance their resilience in the face of evolving and unpredictable conditions.
CVNov 14, 2023
Towards Improving Robustness Against Common Corruptions in Object Detectors Using Adversarial Contrastive LearningShashank Kotyan, Danilo Vasconcellos Vargas
Neural networks have revolutionized various domains, exhibiting remarkable accuracy in tasks like natural language processing and computer vision. However, their vulnerability to slight alterations in input samples poses challenges, particularly in safety-critical applications like autonomous driving. Current approaches, such as introducing distortions during training, fall short in addressing unforeseen corruptions. This paper proposes an innovative adversarial contrastive learning framework to enhance neural network robustness simultaneously against adversarial attacks and common corruptions. By generating instance-wise adversarial examples and optimizing contrastive loss, our method fosters representations that resist adversarial perturbations and remain robust in real-world scenarios. Subsequent contrastive learning then strengthens the similarity between clean samples and their adversarial counterparts, fostering representations resistant to both adversarial attacks and common distortions. By focusing on improving performance under adversarial and real-world conditions, our approach aims to bolster the robustness of neural networks in safety-critical applications, such as autonomous vehicles navigating unpredictable weather conditions. We anticipate that this framework will contribute to advancing the reliability of neural networks in challenging environments, facilitating their widespread adoption in mission-critical scenarios.
AIJun 14, 2020Code
Continual General Chunking Problem and SyncMapDanilo Vasconcellos Vargas, Toshitake Asabuki
Humans possess an inherent ability to chunk sequences into their constituent parts. In fact, this ability is thought to bootstrap language skills and learning of image patterns which might be a key to a more animal-like type of intelligence. Here, we propose a continual generalization of the chunking problem (an unsupervised problem), encompassing fixed and probabilistic chunks, discovery of temporal and causal structures and their continual variations. Additionally, we propose an algorithm called SyncMap that can learn and adapt to changes in the problem by creating a dynamic map which preserves the correlation between variables. Results of SyncMap suggest that the proposed algorithm learn near optimal solutions, despite the presence of many types of structures and their continual variation. When compared to Word2vec, PARSER and MRIL, SyncMap surpasses or ties with the best algorithm on $66\%$ of the scenarios while being the second best in the remaining $34\%$. SyncMap's model-free simple dynamics and the absence of loss functions reveal that, perhaps surprisingly, much can be done with self-organization alone. Code available at https://github.com/zweifel/SyncMap.
NEJun 27, 2019Code
Evolving Robust Neural Architectures to Defend from Adversarial AttacksShashank Kotyan, Danilo Vasconcellos Vargas
Neural networks are prone to misclassify slightly modified input images. Recently, many defences have been proposed, but none have improved the robustness of neural networks consistently. Here, we propose to use adversarial attacks as a function evaluation to search for neural architectures that can resist such attacks automatically. Experiments on neural architecture search algorithms from the literature show that although accurate, they are not able to find robust architectures. A significant reason for this lies in their limited search space. By creating a novel neural architecture search with options for dense layers to connect with convolution layers and vice-versa as well as the addition of concatenation layers in the search, we were able to evolve an architecture that is inherently accurate on adversarial samples. Interestingly, this inherent robustness of the evolved architecture rivals state-of-the-art defences such as adversarial training while being trained only on the non-adversarial samples. Moreover, the evolved architecture makes use of some peculiar traits which might be useful for developing even more robust ones. Thus, the results here confirm that more robust architectures exist as well as opens up a new realm of feasibilities for the development and exploration of neural networks. Code available at http://bit.ly/RobustArchitectureSearch.
CVJun 15, 2019Code
Representation Quality Of Neural Networks Links To Adversarial Attacks and DefencesShashank Kotyan, Danilo Vasconcellos Vargas, Moe Matsuki
Neural networks have been shown vulnerable to a variety of adversarial algorithms. A crucial step to understanding the rationale for this lack of robustness is to assess the potential of the neural networks' representation to encode the existing features. Here, we propose a method to understand the representation quality of the neural networks using a novel test based on Zero-Shot Learning, entitled Raw Zero-Shot. The principal idea is that, if an algorithm learns rich features, such features should be able to interpret "unknown" classes as an aggregate of previously learned features. This is because unknown classes usually share several regular features with recognised classes, given the features learned are general enough. We further introduce two metrics to assess these learned features to interpret unknown classes. One is based on inter-cluster validation technique (Davies-Bouldin Index), and the other is based on the distance to an approximated ground-truth. Experiments suggest that adversarial defences improve the representation of the classifiers, further suggesting that to improve the robustness of the classifiers, one has to improve the representation quality also. Experiments also reveal a strong association (a high Pearson Correlation and low p-value) between the metrics and adversarial attacks. Interestingly, the results indicate that dynamic routing networks such as CapsNet have better representation while current deeper neural networks are trading off representation quality for accuracy. Code available at http://bit.ly/RepresentationMetrics.
LGJun 14, 2019Code
Adversarial Robustness Assessment: Why both $L_0$ and $L_\infty$ Attacks Are NecessaryShashank Kotyan, Danilo Vasconcellos Vargas
There exists a vast number of adversarial attacks and defences for machine learning algorithms of various types which makes assessing the robustness of algorithms a daunting task. To make matters worse, there is an intrinsic bias in these adversarial algorithms. Here, we organise the problems faced: a) Model Dependence, b) Insufficient Evaluation, c) False Adversarial Samples, and d) Perturbation Dependent Results). Based on this, we propose a model agnostic dual quality assessment method, together with the concept of robustness levels to tackle them. We validate the dual quality assessment on state-of-the-art neural networks (WideResNet, ResNet, AllConv, DenseNet, NIN, LeNet and CapsNet) as well as adversarial defences for image classification problem. We further show that current networks and defences are vulnerable at all levels of robustness. The proposed robustness assessment reveals that depending on the metric used (i.e., $L_0$ or $L_\infty$), the robustness may vary significantly. Hence, the duality should be taken into account for a correct evaluation. Moreover, a mathematical derivation, as well as a counter-example, suggest that $L_1$ and $L_2$ metrics alone are not sufficient to avoid spurious adversarial samples. Interestingly, the threshold attack of the proposed assessment is a novel $L_\infty$ black-box adversarial method which requires even less perturbation than the One-Pixel Attack (only $12\%$ of One-Pixel Attack's amount of perturbation) to achieve similar results. Code is available at http://bit.ly/DualQualityAssessment.
NEJan 6, 2019Code
Spectrum-Diverse Neuroevolution with Unified Neural ModelsDanilo Vasconcellos Vargas, Junichi Murata
Learning algorithms are being increasingly adopted in various applications. However, further expansion will require methods that work more automatically. To enable this level of automation, a more powerful solution representation is needed. However, by increasing the representation complexity a second problem arises. The search space becomes huge and therefore an associated scalable and efficient searching algorithm is also required. To solve both problems, first a powerful representation is proposed that unifies most of the neural networks features from the literature into one representation. Secondly, a new diversity preserving method called Spectrum Diversity is created based on the new concept of chromosome spectrum that creates a spectrum out of the characteristics and frequency of alleles in a chromosome. The combination of Spectrum Diversity with a unified neuron representation enables the algorithm to either surpass or equal NeuroEvolution of Augmenting Topologies (NEAT) on all of the five classes of problems tested. Ablation tests justifies the good results, showing the importance of added new features in the unified neuron representation. Part of the success is attributed to the novelty-focused evolution and good scalability with chromosome size provided by Spectrum Diversity. Thus, this study sheds light on a new representation and diversity preserving mechanism that should impact algorithms and applications to come. To download the code please access the following https://github.com/zweifel/Physis-Shard.
NEJan 2, 2019Code
General Subpopulation Framework and Taming the Conflict Inside PopulationsDanilo Vasconcellos Vargas, Junichi Murata, Hirotaka Takano et al.
Structured evolutionary algorithms have been investigated for some time. However, they have been under-explored specially in the field of multi-objective optimization. Despite their good results, the use of complex dynamics and structures make their understanding and adoption rate low. Here, we propose the general subpopulation framework that has the capability of integrating optimization algorithms without restrictions as well as aid the design of structured algorithms. The proposed framework is capable of generalizing most of the structured evolutionary algorithms, such as cellular algorithms, island models, spatial predator-prey and restricted mating based algorithms under its formalization. Moreover, we propose two algorithms based on the general subpopulation framework, demonstrating that with the simple addition of a number of single-objective differential evolution algorithms for each objective the results improve greatly, even when the combined algorithms behave poorly when evaluated alone at the tests. Most importantly, the comparison between the subpopulation algorithms and their related panmictic algorithms suggests that the competition between different strategies inside one population can have deleterious consequences for an algorithm and reveal a strong benefit of using the subpopulation framework. The code for SAN, the proposed multi-objective algorithm which has the current best results in the hardest benchmark, is available at the following https://github.com/zweifel/zweifel
LGDec 7, 2023
k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood AnalysisShashank Kotyan, Tatsuya Ueda, Danilo Vasconcellos Vargas
Most examinations of neural networks' learned latent spaces typically employ dimensionality reduction techniques such as t-SNE or UMAP. These methods distort the local neighborhood in the visualization, making it hard to distinguish the structure of a subset of samples in the latent space. In response to this challenge, we introduce the {k*~distribution} and its corresponding visualization technique This method uses local neighborhood analysis to guarantee the preservation of the structure of sample distributions for individual classes within the subset of the learned latent space. This facilitates easy comparison of different k*~distributions, enabling analysis of how various classes are processed by the same neural network. Our study reveals three distinct distributions of samples within the learned latent space subset: a) Fractured, b) Overlapped, and c) Clustered, providing a more profound understanding of existing contemporary visualizations. Experiments show that the distribution of samples within the network's learned latent space significantly varies depending on the class. Furthermore, we illustrate that our analysis can be applied to explore the latent space of diverse neural network architectures, various layers within neural networks, transformations applied to input samples, and the distribution of training and testing data for neural networks. Thus, the k* distribution should aid in visualizing the structure inside neural networks and further foster their understanding. Project Website is available online at https://shashankkotyan.github.io/k-Distribution/.
CLMar 22, 2024
Extending Token Computation for LLM ReasoningBingli Liao, Danilo Vasconcellos Vargas
Large Language Models (LLMs) are pivotal in advancing natural language processing but often struggle with complex reasoning tasks due to inefficient attention distributions. In this paper, we explore the effect of increased computed tokens on LLM performance and introduce a novel method for extending computed tokens in the Chain-of-Thought (CoT) process, utilizing attention mechanism optimization. By fine-tuning an LLM on a domain-specific, highly structured dataset, we analyze attention patterns across layers, identifying inefficiencies caused by non-semantic tokens with outlier high attention scores. To address this, we propose an algorithm that emulates early layer attention patterns across downstream layers to re-balance skewed attention distributions and enhance knowledge abstraction. Our findings demonstrate that our approach not only facilitates a deeper understanding of the internal dynamics of LLMs but also significantly improves their reasoning capabilities, particularly in non-STEM domains. Our study lays the groundwork for further innovations in LLM design, aiming to create more powerful, versatile, and responsible models capable of tackling a broad range of real-world applications.
CVJun 19, 2025
SyncMapV2: Robust and Adaptive Unsupervised SegmentationHeng Zhang, Zikang Wan, Danilo Vasconcellos Vargas
Human vision excels at segmenting visual cues without the need for explicit training, and it remains remarkably robust even as noise severity increases. In contrast, existing AI algorithms struggle to maintain accuracy under similar conditions. Here, we present SyncMapV2, the first to solve unsupervised segmentation with state-of-the-art robustness. SyncMapV2 exhibits a minimal drop in mIoU, only 0.01%, under digital corruption, compared to a 23.8% drop observed in SOTA methods. This superior performance extends across various types of corruption: noise (7.3% vs. 37.7%), weather (7.5% vs. 33.8%), and blur (7.0% vs. 29.5%). Notably, SyncMapV2 accomplishes this without any robust training, supervision, or loss functions. It is based on a learning paradigm that uses self-organizing dynamical equations combined with concepts from random networks. Moreover, unlike conventional methods that require re-initialization for each new input, SyncMapV2 adapts online, mimicking the continuous adaptability of human vision. Thus, we go beyond the accurate and robust results, and present the first algorithm that can do all the above online, adapting to input rather than re-initializing. In adaptability tests, SyncMapV2 demonstrates near-zero performance degradation, which motivates and fosters a new generation of robust and adaptive intelligence in the near future.
LGApr 8, 2024
Adapting to Covariate Shift in Real-time by Encoding Trees with Motion EquationsTham Yik Foong, Heng Zhang, Mao Po Yuan et al.
Input distribution shift presents a significant problem in many real-world systems. Here we present Xenovert, an adaptive algorithm that can dynamically adapt to changes in input distribution. It is a perfect binary tree that adaptively divides a continuous input space into several intervals of uniform density while receiving a continuous stream of input. This process indirectly maps the source distribution to the shifted target distribution, preserving the data's relationship with the downstream decoder/operation, even after the shift occurs. In this paper, we demonstrated how a neural network integrated with Xenovert achieved better results in 4 out of 5 shifted datasets, saving the hurdle of retraining a machine learning model. We anticipate that Xenovert can be applied to many more applications that require adaptation to unforeseen input distribution shifts, even when the distribution shift is drastic.
CVFeb 7, 2024
Breaking Free: How to Hack Safety Guardrails in Black-Box Diffusion Models!Shashank Kotyan, Po-Yuan Mao, Pin-Yu Chen et al.
Deep neural networks can be exploited using natural adversarial samples, which do not impact human perception. Current approaches often rely on deep neural networks' white-box nature to generate these adversarial samples or synthetically alter the distribution of adversarial samples compared to the training distribution. In contrast, we propose EvoSeed, a novel evolutionary strategy-based algorithmic framework for generating photo-realistic natural adversarial samples. Our EvoSeed framework uses auxiliary Conditional Diffusion and Classifier models to operate in a black-box setting. We employ CMA-ES to optimize the search for an initial seed vector, which, when processed by the Conditional Diffusion Model, results in the natural adversarial sample misclassified by the Classifier Model. Experiments show that generated adversarial images are of high image quality, raising concerns about generating harmful content bypassing safety classifiers. Our research opens new avenues to understanding the limitations of current safety mechanisms and the risk of plausible attacks against classifier systems using image generation. Project Website can be accessed at: https://shashankkotyan.github.io/EvoSeed.
CVJun 10, 2021
Deep neural network loses attention to adversarial imagesShashank Kotyan, Danilo Vasconcellos Vargas
Adversarial algorithms have shown to be effective against neural networks for a variety of tasks. Some adversarial algorithms perturb all the pixels in the image minimally for the image classification task in image classification. In contrast, some algorithms perturb few pixels strongly. However, very little information is available regarding why these adversarial samples so diverse from each other exist. Recently, Vargas et al. showed that the existence of these adversarial samples might be due to conflicting saliency within the neural network. We test this hypothesis of conflicting saliency by analysing the Saliency Maps (SM) and Gradient-weighted Class Activation Maps (Grad-CAM) of original and few different types of adversarial samples. We also analyse how different adversarial samples distort the attention of the neural network compared to original samples. We show that in the case of Pixel Attack, perturbed pixels either calls the network attention to themselves or divert the attention from them. Simultaneously, the Projected Gradient Descent Attack perturbs pixels so that intermediate layers inside the neural network lose attention for the correct class. We also show that both attacks affect the saliency map and activation maps differently. Thus, shedding light on why some defences successful against some attacks remain vulnerable against other attacks. We hope that this analysis will improve understanding of the existence and the effect of adversarial samples and enable the community to develop more robust neural networks.
CVSep 2, 2020
Perceptual Deep Neural Networks: Adversarial Robustness through Input RecreationDanilo Vasconcellos Vargas, Bingli Liao, Takahiro Kanzaki
Adversarial examples have shown that albeit highly accurate, models learned by machines, differently from humans, have many weaknesses. However, humans' perception is also fundamentally different from machines, because we do not see the signals which arrive at the retina but a rather complex recreation of them. In this paper, we explore how machines could recreate the input as well as investigate the benefits of such an augmented perception. In this regard, we propose Perceptual Deep Neural Networks ($\varphi$DNN) which also recreate their own input before further processing. The concept is formalized mathematically and two variations of it are developed (one based on inpainting the whole image and the other based on a noisy resized super resolution recreation). Experiments reveal that $\varphi$DNNs and their adversarial training variations can increase the robustness substantially, surpassing both state-of-the-art defenses and pre-processing types of defenses in 100% of the tests. $\varphi$DNNs are shown to scale well to bigger image sizes, keeping a similar high accuracy throughout; while the state-of-the-art worsen up to 35%. Moreover, the recreation process intentionally corrupts the input image. Interestingly, we show by ablation tests that corrupting the input is, although counter-intuitive, beneficial. Thus, $\varphi$DNNs reveal that input recreation has strong benefits for artificial neural networks similar to biological ones, shedding light into the importance of purposely corrupting the input as well as pioneering an area of perception models based on GANs and autoencoders for robust recognition in artificial intelligence.
ROApr 26, 2019
Self Training Autonomous Driving AgentShashank Kotyan, Danilo Vasconcellos Vargas, Venkanna U
Intrinsically, driving is a Markov Decision Process which suits well the reinforcement learning paradigm. In this paper, we propose a novel agent which learns to drive a vehicle without any human assistance. We use the concept of reinforcement learning and evolutionary strategies to train our agent in a 2D simulation environment. Our model's architecture goes beyond the World Model's by introducing difference images in the auto encoder. This novel involvement of difference images in the auto-encoder gives better representation of the latent space with respect to the motion of vehicle and helps an autonomous agent to learn more efficiently how to drive a vehicle. Results show that our method requires fewer (96% less) total agents, (87.5% less) agents per generations, (70% less) generations and (90% less) rollouts than the original architecture while achieving the same accuracy of the original.
NEApr 18, 2019
Batch Tournament Selection for Genetic ProgrammingVinicius V. Melo, Danilo Vasconcellos Vargas, Wolfgang Banzhaf
Lexicase selection achieves very good solution quality by introducing ordered test cases. However, the computational complexity of lexicase selection can prohibit its use in many applications. In this paper, we introduce Batch Tournament Selection (BTS), a hybrid of tournament and lexicase selection which is approximately one order of magnitude faster than lexicase selection while achieving a competitive quality of solutions. Tests on a number of regression datasets show that BTS compares well with lexicase selection in terms of mean absolute error while having a speed-up of up to 25 times. Surprisingly, BTS and lexicase selection have almost no difference in both diversity and performance. This reveals that batches and ordered test cases are completely different mechanisms which share the same general principle fostering the specialization of individuals. This work introduces an efficient algorithm that sheds light onto the main principles behind the success of lexicase, potentially opening up a new range of possibilities for algorithms to come.
CYMar 6, 2019
Tackling Unit Commitment and Load Dispatch Problems Considering All Constraints with Evolutionary ComputationDanilo Vasconcellos Vargas, Junichi Murata, Hirotaka Takano
Unit commitment and load dispatch problems are important and complex problems in power system operations that have being traditionally solved separately. In this paper, both problems are solved together without approximations or simplifications. In fact, the problem solved has a massive amount of grid-connected photovoltaic units, four pump-storage hydro plants as energy storage units and ten thermal power plants, each with its own set of operation requirements that need to be satisfied. To face such a complex constrained optimization problem an adaptive repair method is proposed. By including a given repair method itself as a parameter to be optimized, the proposed adaptive repair method avoid any bias in repair choices. Moreover, this results in a repair method that adapt to the problem and will improve together with the solution during optimization. Experiments are conducted revealing that the proposed method is capable of surpassing exact method solutions on a simplified version of the problem with approximations as well as solve the otherwise intractable complete problem without simplifications. Moreover, since the proposed approach can be applied to other problems in general and it may not be obvious how to choose the constraint handling for a certain constraint, a guideline is provided explaining the reasoning behind. Thus, this paper open further possibilities to deal with the ever changing types of generation units and other similarly complex operation/schedule optimization problems with many difficult constraints.
LGFeb 8, 2019
Understanding the One-Pixel Attack: Propagation Maps and Locality AnalysisDanilo Vasconcellos Vargas, Jiawei Su
Deep neural networks were shown to be vulnerable to single pixel modifications. However, the reason behind such phenomena has never been elucidated. Here, we propose Propagation Maps which show the influence of the perturbation in each layer of the network. Propagation Maps reveal that even in extremely deep networks such as Resnet, modification in one pixel easily propagates until the last layer. In fact, this initial local perturbation is also shown to spread becoming a global one and reaching absolute difference values that are close to the maximum value of the original feature maps in a given layer. Moreover, we do a locality analysis in which we demonstrate that nearby pixels of the perturbed one in the one-pixel attack tend to share the same vulnerability, revealing that the main vulnerability lies in neither neurons nor pixels but receptive fields. Hopefully, the analysis conducted in this work together with a new technique called propagation maps shall shed light into the inner workings of other adversarial samples and be the basis of new defense systems to come.
LGJan 22, 2019
Universal Rules for Fooling Deep Neural Networks based Text ClassificationDi Li, Danilo Vasconcellos Vargas, Sakurai Kouichi
Recently, deep learning based natural language processing techniques are being extensively used to deal with spam mail, censorship evaluation in social networks, among others. However, there is only a couple of works evaluating the vulnerabilities of such deep neural networks. Here, we go beyond attacks to investigate, for the first time, universal rules, i.e., rules that are sample agnostic and therefore could turn any text sample in an adversarial one. In fact, the universal rules do not use any information from the method itself (no information from the method, gradient information or training dataset information is used), making them black-box universal attacks. In other words, the universal rules are sample and method agnostic. By proposing a coevolutionary optimization algorithm we show that it is possible to create universal rules that can automatically craft imperceptible adversarial samples (only less than five perturbations which are close to misspelling are inserted in the text sample). A comparison with a random search algorithm further justifies the strength of the method. Thus, universal rules for fooling networks are here shown to exist. Hopefully, the results from this work will impact the development of yet more sample and model agnostic attacks as well as their defenses, culminating in perhaps a new age for artificial intelligence.
NENov 20, 2018
Self Organizing Classifiers and Niched FitnessDanilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata
Learning classifier systems are adaptive learning systems which have been widely applied in a multitude of application domains. However, there are still some generalization problems unsolved. The hurdle is that fitness and niching pressures are difficult to balance. Here, a new algorithm called Self Organizing Classifiers is proposed which faces this problem from a different perspective. Instead of balancing the pressures, both pressures are separated and no balance is necessary. In fact, the proposed algorithm possesses a dynamical population structure that self-organizes itself to better project the input space into a map. The niched fitness concept is defined along with its dynamical population structure, both are indispensable for the understanding of the proposed method. Promising results are shown on two continuous multi-step problems. One of which is yet more challenging than previous problems of this class in the literature.
NENov 20, 2018
Self Organizing Classifiers: First Steps in Structured Evolutionary Machine LearningDanilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata
Learning classifier systems (LCSs) are evolutionary machine learning algorithms, flexible enough to be applied to reinforcement, supervised and unsupervised learning problems with good performance. Recently, self organizing classifiers were proposed which are similar to LCSs but have the advantage that in its structured population no balance between niching and fitness pressure is necessary. However, more tests and analysis are required to verify its benefits. Here, a variation of the first algorithm is proposed which uses a parameterless self organizing map (SOM). This algorithm is applied in challenging problems such as big, noisy as well as dynamically changing continuous input-action mazes (growing and compressing mazes are included) with good performance. Moreover, a genetic operator is proposed which utilizes the topological information of the SOM's population structure, improving the results. Thus, the first steps in structured evolutionary machine learning are shown, nonetheless, the problems faced are more difficult than the state-of-art continuous input-action multi-step ones.
LGNov 20, 2018
Contingency TrainingDanilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata
When applied to high-dimensional datasets, feature selection algorithms might still leave dozens of irrelevant variables in the dataset. Therefore, even after feature selection has been applied, classifiers must be prepared to the presence of irrelevant variables. This paper investigates a new training method called Contingency Training which increases the accuracy as well as the robustness against irrelevant attributes. Contingency training is classifier independent. By subsampling and removing information from each sample, it creates a set of constraints. These constraints aid the method to automatically find proper importance weights of the dataset's features. Experiments are conducted with the contingency training applied to neural networks over traditional datasets as well as datasets with additional irrelevant variables. For all of the tests, contingency training surpassed the unmodified training on datasets with irrelevant variables and even outperformed slightly when only a few or no irrelevant variables were present.
AISep 19, 2018
Novelty-organizing team of classifiers in noisy and dynamic environmentsDanilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata
In the real world, the environment is constantly changing with the input variables under the effect of noise. However, few algorithms were shown to be able to work under those circumstances. Here, Novelty-Organizing Team of Classifiers (NOTC) is applied to the continuous action mountain car as well as two variations of it: a noisy mountain car and an unstable weather mountain car. These problems take respectively noise and change of problem dynamics into account. Moreover, NOTC is compared with NeuroEvolution of Augmenting Topologies (NEAT) in these problems, revealing a trade-off between the approaches. While NOTC achieves the best performance in all of the problems, NEAT needs less trials to converge. It is demonstrated that NOTC achieves better performance because of its division of the input space (creating easier problems). Unfortunately, this division of input space also requires a bit of time to bootstrap.
CVApr 19, 2018
Attacking Convolutional Neural Network using Differential EvolutionJiawei Su, Danilo Vasconcellos Vargas, Kouichi Sakurai
The output of Convolutional Neural Networks (CNN) has been shown to be discontinuous which can make the CNN image classifier vulnerable to small well-tuned artificial perturbations. That is, images modified by adding such perturbations(i.e. adversarial perturbations) that make little difference to human eyes, can completely alter the CNN classification results. In this paper, we propose a practical attack using differential evolution(DE) for generating effective adversarial perturbations. We comprehensively evaluate the effectiveness of different types of DEs for conducting the attack on different network structures. The proposed method is a black-box attack which only requires the miracle feedback of the target CNN systems. The results show that under strict constraints which simultaneously control the number of pixels changed and overall perturbation strength, attacking can achieve 72.29%, 78.24% and 61.28% non-targeted attack success rates, with 88.68%, 99.85% and 73.07% confidence on average, on three common types of CNNs. The attack only requires modifying 5 pixels with 20.44, 14.76 and 22.98 pixel values distortion. Thus, the result shows that the current DNNs are also vulnerable to such simpler black-box attacks even under very limited attack conditions.
CRFeb 11, 2018
Lightweight Classification of IoT Malware based on Image RecognitionJiawei Su, Danilo Vasconcellos Vargas, Sanjiva Prasad et al.
The Internet of Things (IoT) is an extension of the traditional Internet, which allows a very large number of smart devices, such as home appliances, network cameras, sensors and controllers to connect to one another to share information and improve user experiences. Current IoT devices are typically micro-computers for domain-specific computations rather than traditional functionspecific embedded devices. Therefore, many existing attacks, targeted at traditional computers connected to the Internet, may also be directed at IoT devices. For example, DDoS attacks have become very common in IoT environments, as these environments currently lack basic security monitoring and protection mechanisms, as shown by the recent Mirai and Brickerbot IoT botnets. In this paper, we propose a novel light-weight approach for detecting DDos malware in IoT environments.We firstly extract one-channel gray-scale images converted from binaries, and then utilize a lightweight convolutional neural network for classifying IoT malware families. The experimental results show that the proposed system can achieve 94.0% accuracy for the classification of goodware and DDoS malware, and 81.8% accuracy for the classification of goodware and two main malware families.
LGOct 24, 2017
One pixel attack for fooling deep neural networksJiawei Su, Danilo Vasconcellos Vargas, Sakurai Kouichi
Recent research has revealed that the output of Deep Neural Networks (DNN) can be easily altered by adding relatively small perturbations to the input vector. In this paper, we analyze an attack in an extremely limited scenario where only one pixel can be modified. For that we propose a novel method for generating one-pixel adversarial perturbations based on differential evolution (DE). It requires less adversarial information (a black-box attack) and can fool more types of networks due to the inherent features of DE. The results show that 67.97% of the natural images in Kaggle CIFAR-10 test dataset and 16.04% of the ImageNet (ILSVRC 2012) test images can be perturbed to at least one target class by modifying just one pixel with 74.03% and 22.91% confidence on average. We also show the same vulnerability on the original CIFAR-10 dataset. Thus, the proposed attack explores a different take on adversarial machine learning in an extreme limited scenario, showing that current DNNs are also vulnerable to such low dimension attacks. Besides, we also illustrate an important application of DE (or broadly speaking, evolutionary computation) in the domain of adversarial machine learning: creating tools that can effectively generate low-cost adversarial attacks against neural networks for evaluating robustness.