SYDec 14, 2017
Nonlinear Bayesian Estimation: From Kalman Filtering to a Broader HorizonHuazhen Fang, Ning Tian, Yebin Wang et al.
This article presents an up-to-date tutorial review of nonlinear Bayesian estimation. State estimation for nonlinear systems has been a challenge encountered in a wide range of engineering fields, attracting decades of research effort. To date, one of the most promising and popular approaches is to view and address the problem from a Bayesian probabilistic perspective, which enables estimation of the unknown state variables by tracking their probabilistic distribution or statistics (e.g., mean and covariance) conditioned on the system's measurement data. This article offers a systematic introduction of the Bayesian state estimation framework and reviews various Kalman filtering (KF) techniques, progressively from the standard KF for linear systems to extended KF, unscented KF and ensemble KF for nonlinear systems. It also overviews other prominent or emerging Bayesian estimation methods including the Gaussian filtering, Gaussian-sum filtering, particle filtering and moving horizon estimation and extends the discussion of state estimation forward to more complicated problems such as simultaneous state and parameter/input estimation.
NEJul 22, 2024
A Pairwise Comparison Relation-assisted Multi-objective Evolutionary Neural Architecture Search Method with Multi-population MechanismYu Xue, Pengcheng Jiang, Chenchen Zhu et al.
Neural architecture search (NAS) has emerged as a powerful paradigm that enables researchers to automatically explore vast search spaces and discover efficient neural networks. However, NAS suffers from a critical bottleneck, i.e. the evaluation of numerous architectures during the search process demands substantial computing resources and time. In order to improve the efficiency of NAS, a series of methods have been proposed to reduce the evaluation time of neural architectures. However, they are not efficient enough and still only focus on the accuracy of architectures. Beyond classification accuracy, real-world applications increasingly demand more efficient and compact network architectures that balance multiple performance criteria. To address these challenges, we propose the SMEMNAS, a pairwise comparison relation-assisted multi-objective evolutionary algorithm based on a multi-population mechanism. In the SMEMNAS, a surrogate model is constructed based on pairwise comparison relations to predict the accuracy ranking of architectures, rather than the absolute accuracy. Moreover, two populations cooperate with each other in the search process, i.e. a main population that guides the evolutionary process, while a vice population that enhances search diversity. Our method aims to discover high-performance models that simultaneously optimize multiple objectives. We conduct comprehensive experiments on CIFAR-10, CIFAR-100 and ImageNet datasets to validate the effectiveness of our approach. With only a single GPU searching for 0.17 days, competitive architectures can be found by SMEMNAS which achieves 78.91% accuracy with the MAdds of 570M on the ImageNet. This work makes a significant advancement in the field of NAS.
LGJul 11, 2023
Transaction Fraud Detection via an Adaptive Graph Neural NetworkYue Tian, Guanjun Liu, Jiacun Wang et al.
Many machine learning methods have been proposed to achieve accurate transaction fraud detection, which is essential to the financial security of individuals and banks. However, most existing methods leverage original features only or require manual feature engineering. They lack the ability to learn discriminative representations from transaction data. Moreover, criminals often commit fraud by imitating cardholders' behaviors, which causes the poor performance of existing detection models. In this paper, we propose an Adaptive Sampling and Aggregation-based Graph Neural Network (ASA-GNN) that learns discriminative representations to improve the performance of transaction fraud detection. A neighbor sampling strategy is performed to filter noisy nodes and supplement information for fraudulent nodes. Specifically, we leverage cosine similarity and edge weights to adaptively select neighbors with similar behavior patterns for target nodes and then find multi-hop neighbors for fraudulent nodes. A neighbor diversity metric is designed by calculating the entropy among neighbors to tackle the camouflage issue of fraudsters and explicitly alleviate the over-smoothing phenomena. Extensive experiments on three real financial datasets demonstrate that the proposed method ASA-GNN outperforms state-of-the-art ones.
LGMay 14
Edge-AI-Driven Learning-to-Rank for Decentralized Task Allocation in Circular Smart ManufacturingMohammadhossein Ghahramani, Yan Qiao, Mengchu Zhou
Task allocation in smart manufacturing systems needs to operate under decentralized decision-making, dynamic workloads, and shared resource constraints. In circular manufacturing settings, these challenges are further intensified by the need to balance operational efficiency with resource and energy sustainability. While learning-based approaches have been explored, many focus on predicting absolute performance metrics that do not necessarily translate into improved allocation outcomes, since decentralized assignment is governed by the relative ordering of candidate machines. This work proposes an Edge-AI-driven decentralized task allocation framework based on ranking-aware negotiation, where lightweight decision intelligence is embedded at the machine level to enable low-latency coordination without centralized control. The framework is developed progressively: a resource-aware heuristic first establishes the decentralized bidding structure, an Edge-AI-based regression model then provides learned local bid approximation, and a ranking-aware formulation finally reshapes the learning objective to align with the ordering-based nature of winner selection. Each machine evaluates incoming tasks using local information, including processing capability, queue state, and resource contention. The framework is evaluated via discrete-event simulation under high-load and tight-deadline scenarios using delay, deadline violations, throughput, and energy consumption. Results show improved delay and deadline adherence under high load, and enhanced energy efficiency under tighter constraints, leading to more resource-efficient operation aligned with circular manufacturing objectives. These findings demonstrate that aligning learning objectives with decentralized decision structures is critical for effective negotiation-driven task allocation.
LGMar 7Code
Combining Adam and its Inverse Counterpart to Enhance Generalization of Deep Learning OptimizersTao Shi, Liangming Chen, Long Jin et al.
In the training of neural networks, adaptive moment estimation (Adam) typically converges fast but exhibits suboptimal generalization performance. A widely accepted explanation for its defect in generalization is that it often tends to converge to sharp minima. To enhance its ability to find flat minima, we propose its new variant named inverse Adam (InvAdam). The key improvement of InvAdam lies in its parameter update mechanism, which is opposite to that of Adam. Specifically, it computes element-wise multiplication of the first-order and second-order moments, while Adam computes the element-wise division of these two moments. This modification aims to increase the step size of the parameter update when the elements in the second-order moments are large and vice versa, which helps the parameter escape sharp minima and stay at flat ones. However, InvAdam's update mechanism may face challenges in convergence. To address this challenge, we propose dual Adam (DualAdam), which integrates the update mechanisms of both Adam and InvAdam, ensuring convergence while enhancing generalization performance. Additionally, we introduce the diffusion theory to mathematically demonstrate InvAdam's ability to escape sharp minima. Extensive experiments are conducted on image classification tasks and large language model (LLM) fine-tuning. The results validate that DualAdam outperforms Adam and its state-of-the-art variants in terms of generalization performance. The code is publicly available at https://github.com/LongJin-lab/DualAdam.
CVMay 26, 2019Code
Learning Smooth Representation for Unsupervised Domain AdaptationGuanyu Cai, Lianghua He, Mengchu Zhou et al.
Typical adversarial-training-based unsupervised domain adaptation methods are vulnerable when the source and target datasets are highly-complex or exhibit a large discrepancy between their data distributions. Recently, several Lipschitz-constraint-based methods have been explored. The satisfaction of Lipschitz continuity guarantees a remarkable performance on a target domain. However, they lack a mathematical analysis of why a Lipschitz constraint is beneficial to unsupervised domain adaptation and usually perform poorly on large-scale datasets. In this paper, we take the principle of utilizing a Lipschitz constraint further by discussing how it affects the error bound of unsupervised domain adaptation. A connection between them is built and an illustration of how Lipschitzness reduces the error bound is presented. A \textbf{local smooth discrepancy} is defined to measure Lipschitzness of a target distribution in a pointwise way. When constructing a deep end-to-end model, to ensure the effectiveness and stability of unsupervised domain adaptation, three critical factors are considered in our proposed optimization strategy, i.e., the sample amount of a target domain, dimension and batchsize of samples. Experimental results demonstrate that our model performs well on several standard benchmarks. Our ablation study shows that the sample amount of a target domain, the dimension and batchsize of samples indeed greatly impact Lipschitz-constraint-based methods' ability to handle large-scale datasets. Code is available at https://github.com/CuthbertCai/SRDA.
CLJul 22, 2024
Compensate Quantization Errors+: Quantized Models Are Inquisitive LearnersYifei Gao, Jie Ou, Lei Wang et al.
The quantization of large language models (LLMs) has been a prominent research area aimed at enabling their lightweight deployment in practice. Existing research about LLM's quantization has mainly explored the interplay between weights and activations, or employing auxiliary components while neglecting the necessity of adjusting weights during quantization. Consequently, original weight distributions frequently fail to yield desired results after round-to-nearest (RTN) quantization. Even though incorporating techniques such as mixed precision and low-rank error approximation in LLM's quantization can yield improved results, they inevitably introduce additional computational overhead. On the other hand, traditional techniques for weight quantization, such as Generative Post-Training Quantization, rely on manually tweaking weight distributions to minimize local errors, but they fall short of achieving globally optimal outcomes. Although the recently proposed Learnable Singular-value Increment improves global weight quantization by modifying weight distributions, it disrupts the original distribution considerably. This introduces pronounced bias toward the training data and can degrade downstream task performance. In this paper, we introduce Singular-value Diagonal Expansion, a more nuanced approach to refining weight distributions to achieve better quantization alignment. Furthermore, we introduce Cross-layer Learning that improves overall quantization outcomes by distributing errors more evenly across layers. Our plug-and-play weight-quantization methods demonstrate substantial performance improvements over state-of-the-art approaches, including OmniQuant, DuQuant, and PrefixQuant.
QUANT-PHJan 13, 2024
Quantum Generative Diffusion Model: A Fully Quantum-Mechanical Model for Generating Quantum State EnsembleChuangtao Chen, Qinglin Zhao, MengChu Zhou et al.
Classical diffusion models have shown superior generative results. Exploring them in the quantum domain can advance the field of quantum generative learning. This work introduces Quantum Generative Diffusion Model (QGDM) as their simple and elegant quantum counterpart. Through a non-unitary forward process, any target quantum state can be transformed into a completely mixed state that has the highest entropy and maximum uncertainty about the system. A trainable backward process is used to recover the former from the latter. The design requirements for its backward process includes non-unitarity and small parameter count. We introduce partial trace operations to enforce non-unitary and reduce the number of trainable parameters by using a parameter-sharing strategy and incorporating temporal information as an input in the backward process. We present QGDM's resource-efficient version to reduce auxiliary qubits while preserving generative capabilities. QGDM exhibits faster convergence than Quantum Generative Adversarial Network (QGAN) because its adopted convex-based optimization can result in better convergence. The results of comparing it with QGAN demonstrate its effectiveness in generating both pure and mixed quantum states. It can achieve 53.02% higher fidelity in mixed-state generation than QGAN. The results highlight its great potential to tackle challenging quantum generation tasks.
CVApr 3, 2025
HQViT: Hybrid Quantum Vision Transformer for Image ClassificationHui Zhang, Qinglin Zhao, Mengchu Zhou et al.
Transformer-based architectures have revolutionized the landscape of deep learning. In computer vision domain, Vision Transformer demonstrates remarkable performance on par with or even surpassing that of convolutional neural networks. However, the quadratic computational complexity of its self-attention mechanism poses challenges for classical computing, making model training with high-dimensional input data, e.g., images, particularly expensive. To address such limitations, we propose a Hybrid Quantum Vision Transformer (HQViT), that leverages the principles of quantum computing to accelerate model training while enhancing model performance. HQViT introduces whole-image processing with amplitude encoding to better preserve global image information without additional positional encoding. By leveraging quantum computation on the most critical steps and selectively handling other components in a classical way, we lower the cost of quantum resources for HQViT. The qubit requirement is minimized to $O(log_2N)$ and the number of parameterized quantum gates is only $O(log_2d)$, making it well-suited for Noisy Intermediate-Scale Quantum devices. By offloading the computationally intensive attention coefficient matrix calculation to the quantum framework, HQViT reduces the classical computational load by $O(T^2d)$. Extensive experiments across various computer vision datasets demonstrate that HQViT outperforms existing models, achieving a maximum improvement of up to $10.9\%$ (on the MNIST 10-classification task) over the state of the art. This work highlights the great potential to combine quantum and classical computing to cope with complex image classification tasks.
QUANT-PHMay 8, 2025
Overcoming Dimensional Factorization Limits in Discrete Diffusion Models through Quantum Joint Distribution LearningChuangtao Chen, Qinglin Zhao, MengChu Zhou et al.
Discrete diffusion models represent a significant advance in generative modeling, demonstrating remarkable success in synthesizing complex, high-quality discrete data. However, to avoid exponential computational costs, they typically rely on calculating per-dimension transition probabilities when learning high-dimensional distributions. In this study, we rigorously prove that this approach leads to a worst-case linear scaling of Kullback-Leibler (KL) divergence with data dimension. To address this, we propose a Quantum Discrete Denoising Diffusion Probabilistic Model (QD3PM), which enables joint probability learning through diffusion and denoising in exponentially large Hilbert spaces, offering a theoretical pathway to faithfully capture the true joint distribution. By deriving posterior states through quantum Bayes' theorem, similar to the crucial role of posterior probabilities in classical diffusion models, and by learning the joint probability, we establish a solid theoretical foundation for quantum-enhanced diffusion models. For denoising, we design a quantum circuit that utilizes temporal information for parameter sharing and incorporates learnable classical-data-controlled rotations for encoding. Exploiting joint distribution learning, our approach enables single-step sampling from pure noise, eliminating iterative requirements of existing models. Simulations demonstrate the proposed model's superior accuracy in modeling complex distributions compared to factorization methods. Hence, this paper establishes a new theoretical paradigm in generative models by leveraging the quantum advantage in joint distribution learning.
LGMar 6
CLAIRE: Compressed Latent Autoencoder for Industrial Representation and Evaluation -- A Deep Learning Framework for Smart ManufacturingMohammadhossein Ghahramani, Mengchu Zhou
Accurate fault detection in high-dimensional industrial environments remains a major challenge due to the inherent complexity, noise, and redundancy in sensor data. This paper introduces CLAIRE, i.e., a hybrid end-to-end learning framework that integrates unsupervised deep representation learning with supervised classification for intelligent quality control in smart manufacturing systems. It employs an optimized deep autoencoder to transform raw input into a compact latent space, effectively capturing the intrinsic data structure while suppressing irrelevant or noisy features. The learned representations are then fed into a downstream classifier to perform binary fault prediction. Experimental results on a high-dimensional dataset demonstrate that CLAIRE significantly outperforms conventional classifiers trained directly on raw features. Moreover, the framework incorporates a post hoc phase, using a game-theory-based interpretability technique, to analyze the latent space and identify the most informative input features contributing to fault predictions. The proposed framework highlights the potential of integrating explainable AI with feature-aware regularization for robust fault detection. The modular and interpretable nature of the proposed framework makes it highly adaptable, offering promising applications in other domains characterized by complex, high-dimensional data, such as healthcare, finance, and environmental monitoring.
LGFeb 1
Adaptive Dual-Weighting Framework for Federated Learning via Out-of-Distribution DetectionZhiwei Ling, Hailiang Zhao, Chao Zhang et al.
Federated Learning (FL) enables collaborative model training across large-scale distributed service nodes while preserving data privacy, making it a cornerstone of intelligent service systems in edge-cloud environments. However, in real-world service-oriented deployments, data generated by heterogeneous users, devices, and application scenarios are inherently non-IID. This severe data heterogeneity critically undermines the convergence stability, generalization ability, and ultimately the quality of service delivered by the global model. To address this challenge, we propose FLood, a novel FL framework inspired by out-of-distribution (OOD) detection. FLood dynamically counteracts the adverse effects of heterogeneity through a dual-weighting mechanism that jointly governs local training and global aggregation. At the client level, it adaptively reweights the supervised loss by upweighting pseudo-OOD samples, thereby encouraging more robust learning from distributionally misaligned or challenging data. At the server level, it refines model aggregation by weighting client contributions according to their OOD confidence scores, prioritizing updates from clients with higher in-distribution consistency and enhancing the global model's robustness and convergence stability. Extensive experiments across multiple benchmarks under diverse non-IID settings demonstrate that FLood consistently outperforms state-of-the-art FL methods in both accuracy and generalization. Furthermore, FLood functions as an orthogonal plug-in module: it seamlessly integrates with existing FL algorithms to boost their performance under heterogeneity without modifying their core optimization logic. These properties make FLood a practical and scalable solution for deploying reliable intelligent services in real-world federated environments.
SDSep 28, 2025
Generalizable Speech Deepfake Detection via Information Bottleneck Enhanced Adversarial AlignmentPu Huang, Shouguang Wang, Siya Yao et al.
Neural speech synthesis techniques have enabled highly realistic speech deepfakes, posing major security risks. Speech deepfake detection is challenging due to distribution shifts across spoofing methods and variability in speakers, channels, and recording conditions. We explore learning shared discriminative features as a path to robust detection and propose Information Bottleneck enhanced Confidence-Aware Adversarial Network (IB-CAAN). Confidence-guided adversarial alignment adaptively suppresses attack-specific artifacts without erasing discriminative cues, while the information bottleneck removes nuisance variability to preserve transferable features. Experiments on ASVspoof 2019/2021, ASVspoof 5, and In-the-Wild demonstrate that IB-CAAN consistently outperforms baseline and achieves state-of-the-art performance on many benchmarks.
CVJan 10, 2025
SeMi: When Imbalanced Semi-Supervised Learning Meets Mining Hard ExamplesYin Wang, Zixuan Wang, Hao Lu et al.
Semi-Supervised Learning (SSL) can leverage abundant unlabeled data to boost model performance. However, the class-imbalanced data distribution in real-world scenarios poses great challenges to SSL, resulting in performance degradation. Existing class-imbalanced semi-supervised learning (CISSL) methods mainly focus on rebalancing datasets but ignore the potential of using hard examples to enhance performance, making it difficult to fully harness the power of unlabeled data even with sophisticated algorithms. To address this issue, we propose a method that enhances the performance of Imbalanced Semi-Supervised Learning by Mining Hard Examples (SeMi). This method distinguishes the entropy differences among logits of hard and easy examples, thereby identifying hard examples and increasing the utility of unlabeled data, better addressing the imbalance problem in CISSL. In addition, we maintain a class-balanced memory bank with confidence decay for storing high-confidence embeddings to enhance the pseudo-labels' reliability. Although our method is simple, it is effective and seamlessly integrates with existing approaches. We perform comprehensive experiments on standard CISSL benchmarks and experimentally demonstrate that our proposed SeMi outperforms existing state-of-the-art methods on multiple benchmarks, especially in reversed scenarios, where our best result shows approximately a 54.8\% improvement over the baseline methods.
AIDec 14, 2023
A Sparse Cross Attention-based Graph Convolution Network with Auxiliary Information Awareness for Traffic Flow PredictionLingqiang Chen, Qinglin Zhao, Guanghui Li et al.
Deep graph convolution networks (GCNs) have recently shown excellent performance in traffic prediction tasks. However, they face some challenges. First, few existing models consider the influence of auxiliary information, i.e., weather and holidays, which may result in a poor grasp of spatial-temporal dynamics of traffic data. Second, both the construction of a dynamic adjacent matrix and regular graph convolution operations have quadratic computation complexity, which restricts the scalability of GCN-based models. To address such challenges, this work proposes a deep encoder-decoder model entitled AIMSAN. It contains an auxiliary information-aware module (AIM) and sparse cross attention-based graph convolution network (SAN). The former learns multi-attribute auxiliary information and obtains its embedded presentation of different time-window sizes. The latter uses a cross-attention mechanism to construct dynamic adjacent matrices by fusing traffic data and embedded auxiliary data. Then, SAN applies diffusion GCN on traffic data to mine rich spatial-temporal dynamics. Furthermore, AIMSAN considers and uses the spatial sparseness of traffic nodes to reduce the quadratic computation complexity. Experimental results on three public traffic datasets demonstrate that the proposed method outperforms other counterparts in terms of various performance indices. Specifically, the proposed method has competitive performance with the state-of-the-art algorithms but saves 35.74% of GPU memory usage, 42.25% of training time, and 45.51% of validation time on average.
LGJan 20, 2022
Transfer Learning for Fault Diagnosis of Transmission LinesFatemeh Mohammadi Shakiba, Milad Shojaee, S. Mohsen Azizi et al.
Recent artificial intelligence-based methods have shown great promise in the use of neural networks for real-time sensing and detection of transmission line faults and estimation of their locations. The expansion of power systems including transmission lines with various lengths have made a fault detection, classification, and location estimation process more challenging. Transmission line datasets are stream data which are continuously collected by various sensors and hence, require generalized and fast fault diagnosis approaches. Newly collected datasets including voltages and currents might not have enough and accurate labels (fault and no fault) that are useful to train neural networks. In this paper, a novel transfer learning framework based on a pre-trained LeNet-5 convolutional neural network is proposed. This method is able to diagnose faults for different transmission line lengths and impedances by transferring the knowledge from a source convolutional neural network to predict a dissimilar target dataset. By transferring this knowledge, faults from various transmission lines, without having enough labels, can be diagnosed faster and more efficiently compared to the existing methods. To prove the feasibility and effectiveness of this methodology, seven different datasets that include various lengths of transmission lines are used. The robustness of the proposed methodology against generator voltage fluctuation, variation in fault distance, fault inception angle, fault resistance, and phase difference between the two generators are well shown, thus proving its practical values in the fault diagnosis of transmission lines.
NEJan 18, 2022
Frequent Itemset-driven Search for Finding Minimum Node Separators in Complex NetworksYangming Zhou, Xiaze Zhang, Na Geng et al.
Finding an optimal set of critical nodes in a complex network has been a long-standing problem in the fields of both artificial intelligence and operations research. Potential applications include epidemic control, network security, carbon emission monitoring, emergence response, drug design, and vulnerability assessment. In this work, we consider the problem of finding a minimal node separator whose removal separates a graph into multiple different connected components with fewer than a limited number of vertices in each component. To solve it, we propose a frequent itemset-driven search approach, which integrates the concept of frequent itemset mining in data mining into the well-known memetic search framework. Starting from a high-quality population built by the solution construction and population repair procedures, it iteratively employs the frequent itemset recombination operator (to generate promising offspring solution based on itemsets that frequently occur in high-quality solutions), tabu search-based simulated annealing (to find high-quality local optima), population repair procedure (to modify the population), and rank-based population management strategy (to guarantee a healthy population). Extensive evaluations on 50 widely used benchmark instances show that it significantly outperforms state-of-the-art algorithms. In particular, it discovers 29 new upper bounds and matches 18 previous best-known bounds. Finally, experimental analyses are performed to confirm the effectiveness of key algorithmic modules of the proposed method.
AIJan 1, 2022
IoT-based Route Recommendation for an Intelligent Waste Management SystemMohammadhossein Ghahramani, Mengchu Zhou, Anna Molter et al.
The Internet of Things (IoT) is a paradigm characterized by a network of embedded sensors and services. These sensors are incorporated to collect various information, track physical conditions, e.g., waste bins' status, and exchange data with different centralized platforms. The need for such sensors is increasing; however, proliferation of technologies comes with various challenges. For example, how can IoT and its associated data be used to enhance waste management? In smart cities, an efficient waste management system is crucial. Artificial Intelligence (AI) and IoT-enabled approaches can empower cities to manage the waste collection. This work proposes an intelligent approach to route recommendation in an IoT-enabled waste management system given spatial constraints. It performs a thorough analysis based on AI-based methods and compares their corresponding results. Our solution is based on a multiple-level decision-making process in which bins' status and coordinates are taken into account to address the routing problem. Such AI-based models can help engineers design a sustainable infrastructure system.
LGDec 3, 2021
Learning to Detect Critical Nodes in Sparse Graphs via Feature Importance AwarenessXuwei Tan, Yangming Zhou, MengChu Zhou et al.
Detecting critical nodes in sparse graphs is important in a variety of application domains, such as network vulnerability assessment, epidemic control, and drug design. The critical node problem (CNP) aims to find a set of critical nodes from a network whose deletion maximally degrades the pairwise connectivity of the residual network. Due to its general NP-hard nature, state-of-the-art CNP solutions are based on heuristic approaches. Domain knowledge and trial-and-error are usually required when designing such approaches, thus consuming considerable effort and time. This work proposes a feature importance-aware graph attention network for node representation and combines it with dueling double deep Q-network to create an end-to-end algorithm to solve CNP for the first time. It does not need any problem-specific knowledge or labeled datasets as required by most of existing methods. Once the model is trained, it can be generalized to cope with various types of CNPs (with different sizes and topological structures) without re-training. Computational experiments on 28 real-world networks show that the proposed method is highly comparable to state-of-the-art methods. It does not require any problem-specific knowledge and, hence, can be applicable to many applications including those impossible ones by using the existing approaches. It can be combined with some local search methods to further improve its solution quality. Extensive comparison results are given to show its effectiveness in solving CNP.
CVMar 4, 2021
MOGAN: Morphologic-structure-aware Generative Learning from a Single ImageJinshu Chen, Qihui Xu, Qi Kang et al.
In most interactive image generation tasks, given regions of interest (ROI) by users, the generated results are expected to have adequate diversities in appearance while maintaining correct and reasonable structures in original images. Such tasks become more challenging if only limited data is available. Recently proposed generative models complete training based on only one image. They pay much attention to the monolithic feature of the sample while ignoring the actual semantic information of different objects inside the sample. As a result, for ROI-based generation tasks, they may produce inappropriate samples with excessive randomicity and without maintaining the related objects' correct structures. To address this issue, this work introduces a MOrphologic-structure-aware Generative Adversarial Network named MOGAN that produces random samples with diverse appearances and reliable structures based on only one image. For training for ROI, we propose to utilize the data coming from the original image being augmented and bring in a novel module to transform such augmented data into knowledge containing both structures and appearances, thus enhancing the model's comprehension of the sample. To learn the rest areas other than ROI, we employ binary masks to ensure the generation isolated from ROI. Finally, we set parallel and hierarchical branches of the mentioned learning process. Compared with other single image GAN schemes, our approach focuses on internal features including the maintenance of rational structures and variation on appearance. Experiments confirm a better capacity of our model on ROI-based image generation tasks than its competitive peers.
AIFeb 11, 2021
On the Philosophical, Cognitive and Mathematical Foundations of Symbiotic Autonomous Systems (SAS)Yingxu Wang, Fakhri Karray, Sam Kwong et al.
Symbiotic Autonomous Systems (SAS) are advanced intelligent and cognitive systems exhibiting autonomous collective intelligence enabled by coherent symbiosis of human-machine interactions in hybrid societies. Basic research in the emerging field of SAS has triggered advanced general AI technologies functioning without human intervention or hybrid symbiotic systems synergizing humans and intelligent machines into coherent cognitive systems. This work presents a theoretical framework of SAS underpinned by the latest advances in intelligence, cognition, computer, and system sciences. SAS are characterized by the composition of autonomous and symbiotic systems that adopt bio-brain-social-inspired and heterogeneously synergized structures and autonomous behaviors. This paper explores their cognitive and mathematical foundations. The challenge to seamless human-machine interactions in a hybrid environment is addressed. SAS-based collective intelligence is explored in order to augment human capability by autonomous machine intelligence towards the next generation of general AI, autonomous computers, and trustworthy mission-critical intelligent systems. Emerging paradigms and engineering applications of SAS are elaborated via an autonomous knowledge learning system that symbiotically works between humans and cognitive robots.
SEJan 15, 2021
Variable Petri Nets for MobilityZhijun Ding, Ru Yang, Puwen Cui et al.
Mobile computing systems, service-based systems and some other systems with mobile interacting components have recently received much attention. However, because of their characteristics such as mobility and disconnection, it is difficult to model and analyze them by using a structure-fixed model. This work proposes a new Petri net model called Variable Petri Net (VPN) for modeling and analyzing these systems. The definition, firing rule, and related analysis technology of VPN are introduced in detail. In a VPN, the possible interaction interfaces are abstracted as a new kind of places called virtual places, and the occurrences of (dis)connections are described by new functions, which makes it appropriate to describe the component collaboration in systems and realize the scalability and pluggability of systems. Moreover, to overcome the shortcoming that markings cannot reflect link capability of a system, VPNs add a constraint function along with a marking to represent a complete system configuration. Several examples are used to demonstrate the newly proposed model and method.
CYAug 29, 2020
Urban Sensing based on Mobile Phone Data: Approaches, Applications and ChallengesMohammadhossein Ghahramani, MengChu Zhou, Gang Wang
Data volume grows explosively with the proliferation of powerful smartphones and innovative mobile applications. The ability to accurately and extensively monitor and analyze these data is necessary. Much concern in mobile data analysis is related to human beings and their behaviours. Due to the potential value that lies behind these massive data, there have been different proposed approaches for understanding corresponding patterns. To that end, monitoring people's interactions, whether counting them at fixed locations or tracking them by generating origin-destination matrices is crucial. The former can be used to determine the utilization of assets like roads and city attractions. The latter is valuable when planning transport infrastructure. Such insights allow a government to predict the adoption of new roads, new public transport routes, modification of existing infrastructure, and detection of congestion zones, resulting in more efficient designs and improvement. Smartphone data exploration can help research in various fields, e.g., urban planning, transportation, health care, and business marketing. It can also help organizations in decision making, policy implementation, monitoring and evaluation at all levels. This work aims to review the methods and techniques that have been implemented to discover knowledge from mobile phone data. We classify these existing methods and present a taxonomy of the related work by discussing their pros and cons.
LGAug 29, 2020
AI-based Modeling and Data-driven Evaluation for Smart Manufacturing ProcessesMohammadhossein Ghahramani, Yan Qiao, MengChu Zhou et al.
Smart Manufacturing refers to optimization techniques that are implemented in production operations by utilizing advanced analytics approaches. With the widespread increase in deploying Industrial Internet of Things (IIoT) sensors in manufacturing processes, there is a progressive need for optimal and effective approaches to data management. Embracing Machine Learning and Artificial Intelligence to take advantage of manufacturing data can lead to efficient and intelligent automation. In this paper, we conduct a comprehensive analysis based on Evolutionary Computing and Deep Learning algorithms toward making semiconductor manufacturing smart. We propose a dynamic algorithm for gaining useful insights about semiconductor manufacturing processes and to address various challenges. We elaborate on the utilization of a Genetic Algorithm and Neural Network to propose an intelligent feature selection algorithm. Our objective is to provide an advanced solution for controlling manufacturing processes and to gain perspective on various dimensions that enable manufacturers to access effective predictive technologies.
CVJun 20, 2020
G-image Segmentation: Similarity-preserving Fuzzy C-Means with Spatial Information Constraint in Wavelet SpaceCong Wang, Witold Pedrycz, ZhiWu Li et al.
G-images refer to image data defined on irregular graph domains. This work elaborates a similarity-preserving Fuzzy C-Means (FCM) algorithm for G-image segmentation and aims to develop techniques and tools for segmenting G-images. To preserve the membership similarity between an arbitrary image pixel and its neighbors, a Kullback-Leibler divergence term on membership partition is introduced as a part of FCM. As a result, similarity-preserving FCM is developed by considering spatial information of image pixels for its robustness enhancement. Due to superior characteristics of a wavelet space, the proposed FCM is performed in this space rather than Euclidean one used in conventional FCM to secure its high robustness. Experiments on synthetic and real-world G-images demonstrate that it indeed achieves higher robustness and performance than the state-of-the-art FCM algorithms. Moreover, it requires less computation than most of them.
IVApr 15, 2020
Residual-driven Fuzzy C-Means Clustering for Image SegmentationCong Wang, Witold Pedrycz, ZhiWu Li et al.
Due to its inferior characteristics, an observed (noisy) image's direct use gives rise to poor segmentation results. Intuitively, using its noise-free image can favorably impact image segmentation. Hence, the accurate estimation of the residual between observed and noise-free images is an important task. To do so, we elaborate on residual-driven Fuzzy C-Means (FCM) for image segmentation, which is the first approach that realizes accurate residual estimation and leads noise-free image to participate in clustering. We propose a residual-driven FCM framework by integrating into FCM a residual-related fidelity term derived from the distribution of different types of noise. Built on this framework, we present a weighted $\ell_{2}$-norm fidelity term by weighting mixed noise distribution, thus resulting in a universal residual-driven FCM algorithm in presence of mixed or unknown noise. Besides, with the constraint of spatial information, the residual estimation becomes more reliable than that only considering an observed image itself. Supporting experiments on synthetic, medical, and real-world images are conducted. The results demonstrate the superior effectiveness and efficiency of the proposed algorithm over existing FCM-related algorithms.
CVFeb 21, 2020
Kullback-Leibler Divergence-Based Fuzzy $C$-Means Clustering Incorporating Morphological Reconstruction and Wavelet Frames for Image SegmentationCong Wang, Witold Pedrycz, ZhiWu Li et al.
Although spatial information of images usually enhance the robustness of the Fuzzy C-Means (FCM) algorithm, it greatly increases the computational costs for image segmentation. To achieve a sound trade-off between the segmentation performance and the speed of clustering, we come up with a Kullback-Leibler (KL) divergence-based FCM algorithm by incorporating a tight wavelet frame transform and a morphological reconstruction operation. To enhance FCM's robustness, an observed image is first filtered by using the morphological reconstruction. A tight wavelet frame system is employed to decompose the observed and filtered images so as to form their feature sets. Considering these feature sets as data of clustering, an modified FCM algorithm is proposed, which introduces a KL divergence term in the partition matrix into its objective function. The KL divergence term aims to make membership degrees of each image pixel closer to those of its neighbors, which brings that the membership partition becomes more suitable and the parameter setting of FCM becomes simplified. On the basis of the obtained partition matrix and prototypes, the segmented feature set is reconstructed by minimizing the inverse process of the modified objective function. To modify abnormal features produced in the reconstruction process, each reconstructed feature is reassigned to the closest prototype. As a result, the segmentation accuracy of KL divergence-based FCM is further improved. What's more, the segmented image is reconstructed by using a tight wavelet frame reconstruction operation. Finally, supporting experiments coping with synthetic, medical and color images are reported. Experimental results exhibit that the proposed algorithm works well and comes with better segmentation performance than other comparative algorithms. Moreover, the proposed algorithm requires less time than most of the FCM-related algorithms.
CVFeb 14, 2020
Residual-Sparse Fuzzy $C$-Means Clustering Incorporating Morphological Reconstruction and Wavelet framesCong Wang, Witold Pedrycz, ZhiWu Li et al.
Instead of directly utilizing an observed image including some outliers, noise or intensity inhomogeneity, the use of its ideal value (e.g. noise-free image) has a favorable impact on clustering. Hence, the accurate estimation of the residual (e.g. unknown noise) between the observed image and its ideal value is an important task. To do so, we propose an $\ell_0$ regularization-based Fuzzy $C$-Means (FCM) algorithm incorporating a morphological reconstruction operation and a tight wavelet frame transform. To achieve a sound trade-off between detail preservation and noise suppression, morphological reconstruction is used to filter an observed image. By combining the observed and filtered images, a weighted sum image is generated. Since a tight wavelet frame system has sparse representations of an image, it is employed to decompose the weighted sum image, thus forming its corresponding feature set. Taking it as data for clustering, we present an improved FCM algorithm by imposing an $\ell_0$ regularization term on the residual between the feature set and its ideal value, which implies that the favorable estimation of the residual is obtained and the ideal value participates in clustering. Spatial information is also introduced into clustering since it is naturally encountered in image segmentation. Furthermore, it makes the estimation of the residual more reliable. To further enhance the segmentation effects of the improved FCM algorithm, we also employ the morphological reconstruction to smoothen the labels generated by clustering. Finally, based on the prototypes and smoothed labels, the segmented image is reconstructed by using a tight wavelet frame reconstruction operation. Experimental results reported for synthetic, medical, and color images show that the proposed algorithm is effective and efficient, and outperforms other algorithms.
CVApr 25, 2018
Unsupervised Domain Adaptation with Adversarial Residual Transform NetworksGuanyu Cai, Yuqin Wang, Mengchu Zhou et al.
Domain adaptation is widely used in learning problems lacking labels. Recent studies show that deep adversarial domain adaptation models can make markable improvements in performance, which include symmetric and asymmetric architectures. However, the former has poor generalization ability whereas the latter is very hard to train. In this paper, we propose a novel adversarial domain adaptation method named Adversarial Residual Transform Networks (ARTNs) to improve the generalization ability, which directly transforms the source features into the space of target features. In this model, residual connections are used to share features and adversarial loss is reconstructed, thus making the model more generalized and easier to train. Moreover, a special regularization term is added to the loss function to alleviate a vanishing gradient problem, which enables its training process stable. A series of experiments based on Amazon review dataset, digits datasets and Office-31 image datasets are conducted to show that the proposed ARTN can be comparable with the methods of the state-of-the-art.