Daniel T. Chang

LG
h-index: 1
Papers: 23
Citations: 110
Novelty: 34%
AI Score: 31

23 Papers

QUANT-PH | Sep 28, 2022
Parameterized Quantum Circuits with Quantum Kernels for Machine Learning: A Hybrid Quantum-Classical Approach

Daniel T. Chang

Quantum machine learning (QML) is the use of quantum computing for the computation of machine learning algorithms. With the prevalence and importance of classical data, a hybrid quantum-classical approach to QML is called for. Parameterized Quantum Circuits (PQCs), and particularly Quantum Kernel PQCs, are generally used in the hybrid approach to QML. In this paper we discuss some important aspects of PQCs with quantum kernels including PQCs, quantum kernels, quantum kernels with quantum advantage, and the trainability of quantum kernels. We conclude that quantum kernels with hybrid kernel methods, a.k.a. quantum kernel methods, offer distinct advantages as a hybrid approach to QML. Not only do they apply to Noisy Intermediate-Scale Quantum (NISQ) devices, but they also can be used to solve all types of machine learning problems including regression, classification, clustering, and dimension reduction. Furthermore, beyond quantum utility, quantum advantage can be attained if the quantum kernels, i.e., the quantum feature encodings, are classically intractable.
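As a rough illustration of the quantum kernel idea in this abstract, the sketch below classically simulates a fidelity kernel k(x, y) = |<phi(x)|phi(y)>|^2 and builds the Gram matrix that a classical kernel method would consume. The product-state angle encoding and the names `feature_state`, `quantum_kernel` and `kernel_matrix` are assumptions made for illustration, not taken from the paper. This particular encoding is classically tractable, so it shows only the mechanics of the hybrid approach, not quantum advantage.

```python
import numpy as np

def feature_state(x):
    """Assumed encoding: qubit i is RY(x_i)|0> = [cos(x_i/2), sin(x_i/2)]."""
    state = np.array([1.0])
    for xi in x:
        # Tensor product builds the full 2^n-dimensional product state.
        state = np.kron(state, np.array([np.cos(xi / 2.0), np.sin(xi / 2.0)]))
    return state

def quantum_kernel(x, y):
    """Fidelity kernel k(x, y) = |<phi(x)|phi(y)>|^2 (amplitudes are real here)."""
    return float(np.dot(feature_state(x), feature_state(y)) ** 2)

def kernel_matrix(X):
    """Gram matrix that a classical kernel method (e.g. an SVM) would take as input."""
    return np.array([[quantum_kernel(a, b) for b in X] for a in X])

X = np.array([[0.1, 0.5], [1.2, -0.3], [2.0, 0.7]])
K = kernel_matrix(X)
print(np.round(K, 3))
```

On real hardware the overlap would be estimated from circuit measurements rather than computed from state vectors; the classical half of the pipeline stays the same.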

LG | Mar 1, 2022
Dual Embodied-Symbolic Concept Representations for Deep Learning

Daniel T. Chang

Motivated by recent findings from cognitive neural science, we advocate the use of a dual-level model for concept representations: the embodied level consists of concept-oriented feature representations, and the symbolic level consists of concept graphs. Embodied concept representations are modality specific and exist in the form of feature vectors in a feature space. Symbolic concept representations, on the other hand, are amodal and language specific, and exist in the form of word / knowledge-graph embeddings in a concept / knowledge space. The human conceptual system comprises both embodied representations and symbolic representations, which typically interact to drive conceptual processing. As such, we further advocate the use of dual embodied-symbolic concept representations for deep learning. To demonstrate their usage and value, we discuss two important use cases: embodied-symbolic knowledge distillation for few-shot class incremental learning, and embodied-symbolic fused representation for image-text matching. Dual embodied-symbolic concept representations are the foundation for deep learning and symbolic AI integration. We discuss two important examples of such integration: scene graph generation with knowledge graph bridging, and multimodal knowledge graphs.

LG | Jun 29, 2023
Concept-Oriented Deep Learning with Large Language Models

Daniel T. Chang

Large Language Models (LLMs) have been successfully used in many natural-language tasks and applications including text generation and AI chatbots. They also are a promising new technology for concept-oriented deep learning (CODL). However, the prerequisite is that LLMs understand concepts and ensure conceptual consistency. We discuss these in this paper, as well as major uses of LLMs for CODL including concept extraction from text, concept graph extraction from text, and concept learning. Human knowledge consists of both symbolic (conceptual) knowledge and embodied (sensory) knowledge. Text-only LLMs, however, can represent only symbolic (conceptual) knowledge. Multimodal LLMs, on the other hand, are capable of representing the full range (conceptual and sensory) of human knowledge. We discuss conceptual understanding in visual-language LLMs, the most important multimodal LLMs, and major uses of them for CODL including concept extraction from image, concept graph extraction from image, and concept learning. While uses of LLMs for CODL are valuable standalone, they are particularly valuable as part of LLM applications such as AI chatbots.

CL | Mar 4, 2023
Variational Quantum Classifiers for Natural-Language Text

Daniel T. Chang

As part of the recent research effort on quantum natural language processing (QNLP), variational quantum sentence classifiers (VQSCs) have been implemented and supported in lambeq / DisCoPy, based on the DisCoCat model of sentence meaning. We discuss VQSCs in some detail, including category theory, DisCoCat for modeling a sentence as a string diagram, and DisCoPy for encoding a string diagram as a parameterized quantum circuit. Many NLP tasks, however, require the handling of text consisting of multiple sentences, which is not supported in lambeq / DisCoPy. A good example is sentiment classification of customer feedback or product reviews. We discuss three potential approaches to variational quantum text classifiers (VQTCs), in line with VQSCs. The first is a weighted bag-of-sentences approach which treats text as a group of independent sentences with task-specific sentence weighting. The second is a coreference resolution approach which treats text as a consolidation of its member sentences with coreferences among them resolved. Both approaches are based on the DisCoCat model and should be implementable in lambeq / DisCoPy. The third approach, on the other hand, is based on the DisCoCirc model, which considers both the ordering of sentences and the interaction of words in composing text meaning from word and sentence meanings. DisCoCirc fundamentally modifies DisCoCat, since a sentence in DisCoCirc updates the meanings of words, whereas all meanings are static in DisCoCat. It is not clear whether DisCoCirc can be implemented in lambeq / DisCoPy without breaking DisCoCat.

QUANT-PH | Nov 8, 2022
Variational Quantum Kernels with Task-Specific Quantum Metric Learning

Daniel T. Chang

Quantum kernel methods, i.e., kernel methods with quantum kernels, offer distinct advantages as a hybrid quantum-classical approach to quantum machine learning (QML), including applicability to Noisy Intermediate-Scale Quantum (NISQ) devices and usage for solving all types of machine learning problems. Kernel methods rely on the notion of similarity between points in a higher (possibly infinite) dimensional feature space. For machine learning, the notion of similarity assumes that points close in the feature space should be close in the machine learning task space. In this paper, we discuss the use of variational quantum kernels with task-specific quantum metric learning to generate optimal quantum embeddings (a.k.a. quantum feature encodings) that are specific to machine learning tasks. Such task-specific optimal quantum embeddings, implicitly supporting feature selection, are valuable not only to quantum kernel methods in improving the latter's performance, but they can also be valuable to non-kernel QML methods based on parameterized quantum circuits (PQCs) as pretrained embeddings and for transfer learning. This further demonstrates the quantum utility, and quantum advantage (with classically-intractable quantum embeddings), of quantum kernel methods.

LG | Jul 16, 2022
Distance-Geometric Graph Attention Network (DG-GAT) for 3D Molecular Geometry

Daniel T. Chang

Deep learning for molecular science has so far mainly focused on 2D molecular graphs. Recently, however, there has been work to extend it to 3D molecular geometry, due to its scientific significance and critical importance in real-world applications. The 3D distance-geometric graph representation (DG-GR) adopts a unified scheme (distance) for representing the geometry of 3D graphs. It is invariant to rotation and translation of the graph, and it reflects pair-wise node interactions and their generally local nature, particularly relevant for 3D molecular geometry. To facilitate the incorporation of 3D molecular geometry in deep learning for molecular science, we adopt the new graph attention network with dynamic attention (GATv2) for use with DG-GR and propose the 3D distance-geometric graph attention network (DG-GAT). GATv2 is a great fit for DG-GR since the attention can vary by node and by distance between nodes. Experimental results of DG-GAT for the ESOL and FreeSolv datasets show major improvement (31% and 38%, respectively) over those of the standard graph convolution network based on 2D molecular graphs. The same is true for the QM9 dataset. Our work demonstrates the utility and value of DG-GAT for deep learning based on 3D molecular geometry.
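The invariance property claimed for the distance-geometric representation can be checked numerically. The sketch below is a minimal illustration, assuming the representation reduces to a pairwise distance matrix over node coordinates; the helper names and the random test data are hypothetical and not taken from the paper.

```python
import numpy as np

def distance_matrix(pos):
    """Pairwise Euclidean distances between nodes; pos has shape (n_nodes, 3)."""
    diff = pos[:, None, :] - pos[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rotation(rng):
    """Random proper rotation (determinant +1) from a QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to turn a reflection into a rotation
    return q

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 3))   # stand-in for atom coordinates
R = random_rotation(rng)
t = rng.normal(size=3)
moved = pos @ R.T + t           # rotate, then translate the whole graph

same = np.allclose(distance_matrix(pos), distance_matrix(moved))
print("distances unchanged:", same)
```

Because the distances are unchanged, any network that consumes only distances (rather than raw coordinates) inherits the same invariance without data augmentation.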

LG | May 13, 2022
Embodied-Symbolic Contrastive Graph Self-Supervised Learning for Molecular Graphs

Daniel T. Chang

Dual embodied-symbolic concept representations are the foundation for deep learning and symbolic AI integration. We discuss the use of dual embodied-symbolic concept representations for molecular graph representation learning, specifically with exemplar-based contrastive self-supervised learning (SSL). The embodied representations are learned from molecular graphs, and the symbolic representations are learned from the corresponding Chemical knowledge graph (KG). We use the Chemical KG to enhance molecular graphs with symbolic (semantic) knowledge and generate their augmented molecular graphs. We treat a molecular graph and its semantically augmented molecular graph as exemplars of the same semantic class, and use the pairs as positive pairs in exemplar-based contrastive SSL.
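To make the role of positive pairs concrete, the sketch below computes an InfoNCE-style contrastive loss in which row i of one embedding batch is the positive for row i of the other, and all other rows act as negatives. InfoNCE is swapped in here as a common contrastive objective; the abstract does not state which loss the paper uses, and the function name, temperature and random embeddings are assumptions for illustration.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss: z1[i] and z2[i] form a positive pair, other rows are negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                      # cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
graph_emb = rng.normal(size=(8, 16))                          # stand-in: molecular graph embeddings
augmented_emb = graph_emb + 0.01 * rng.normal(size=(8, 16))   # stand-in: augmented-graph embeddings

aligned = info_nce(graph_emb, augmented_emb)
mismatched = info_nce(graph_emb, np.roll(augmented_emb, 1, axis=0))
print(aligned, mismatched)
```

The loss is lower when each graph is paired with its own augmented version than when the pairing is shifted, which is the signal the encoder is trained on.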

LG | Dec 11, 2019 | Code
Bayesian Hyperparameter Optimization with BoTorch, GPyTorch and Ax

Daniel T. Chang

Deep learning models are full of hyperparameters, which are set manually before the learning process can start. To find the best configuration for these hyperparameters in such a high dimensional space, with time-consuming and expensive model training / validation, is not a trivial challenge. Bayesian optimization is a powerful tool for the joint optimization of hyperparameters, efficiently trading off exploration and exploitation of the hyperparameter space. In this paper, we discuss Bayesian hyperparameter optimization, including hyperparameter optimization, Bayesian optimization, and Gaussian processes. We also review BoTorch, GPyTorch and Ax, the new open-source frameworks that we use for Bayesian optimization, Gaussian process inference and adaptive experimentation, respectively. For experimentation, we apply Bayesian hyperparameter optimization, for optimizing group weights, to weighted group pooling, which couples unsupervised tiered graph autoencoders learning and supervised graph prediction learning for molecular graphs. We find that Ax, BoTorch and GPyTorch together provide a simple-to-use but powerful framework for Bayesian hyperparameter optimization, using Ax's high-level API that constructs and runs a full optimization loop and returns the best hyperparameter configuration.

LG | May 14, 2024
Hypergraph: A Unified and Uniform Definition with Application to Chemical Hypergraph and More

Daniel T. Chang

The conventional definition of hypergraph has two major issues: (1) there is not a standard definition of directed hypergraph and (2) there is not a formal definition of nested hypergraph. To resolve these issues, we propose a new definition of hypergraph that unifies the concepts of undirected, directed and nested hypergraphs, and that is uniform in using hyperedge as a single construct for representing high-order correlations among things, i.e., nodes and hyperedges. Specifically, we define a hyperedge to be a simple hyperedge, a nesting hyperedge, or a directed hyperedge. With this new definition, a hypergraph is nested if it has nesting hyperedge(s), and is directed if it has directed hyperedge(s). Otherwise, a hypergraph is a simple hypergraph. The uniformity and power of this new definition, with visualization, should facilitate the use of hypergraph for representing (hierarchical) high-order correlations in general and chemical systems in particular. Graph has been widely used as a mathematical structure for machine learning on molecular structures and 3D molecular geometries. However, graph has a major limitation: it can represent only pairwise correlations between nodes. Hypergraph extends graph with high-order correlations among nodes. This extension is significant or essential for machine learning on chemical systems. For molecules, this is significant as it allows the direct, explicit representation of multicenter bonds and molecular substructures. For chemical reactions, this is essential since most chemical reactions involve multiple participants. We propose the use of chemical hypergraph, a multilevel hypergraph with simple, nesting and directed hyperedges, as a single mathematical structure for representing chemical systems. We apply the new definition of hypergraph to chemical hypergraph and, as simplified versions, molecular hypergraph and chemical reaction hypergraph.
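The classification rule stated in this abstract (nested if any nesting hyperedge is present, directed if any directed hyperedge is present, simple otherwise) can be sketched as a small data structure. The field layouts below are assumptions: the abstract does not give the formal definitions, so the choice of a member set for nesting hyperedges and source/target sets for directed hyperedges is hypothetical, as are all class and function names.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Union

@dataclass(frozen=True)
class SimpleHyperedge:
    name: str
    nodes: FrozenSet[str]      # correlates nodes only

@dataclass(frozen=True)
class NestingHyperedge:
    name: str
    members: FrozenSet[str]    # assumed: names of nodes and/or other hyperedges

@dataclass(frozen=True)
class DirectedHyperedge:
    name: str
    sources: FrozenSet[str]    # assumed: e.g. reactants
    targets: FrozenSet[str]    # assumed: e.g. products

Hyperedge = Union[SimpleHyperedge, NestingHyperedge, DirectedHyperedge]

def classify(hyperedges: Iterable[Hyperedge]) -> dict:
    """Apply the rule from the abstract to a collection of hyperedges."""
    edges = list(hyperedges)
    nested = any(isinstance(e, NestingHyperedge) for e in edges)
    directed = any(isinstance(e, DirectedHyperedge) for e in edges)
    return {"nested": nested, "directed": directed, "simple": not (nested or directed)}

g1 = SimpleHyperedge("g1", frozenset({"a", "b", "c"}))
g2 = SimpleHyperedge("g2", frozenset({"c", "d"}))
n1 = NestingHyperedge("n1", frozenset({"g1", "g2", "e"}))
d1 = DirectedHyperedge("d1", frozenset({"g1"}), frozenset({"g2"}))

print(classify([g1, g2]))
print(classify([g1, g2, n1, d1]))
```

Under this reading a hypergraph can be both nested and directed at once, which matches the multilevel chemical hypergraph described in the abstract.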

AI | Jun 17, 2025
Individual Causal Inference with Structural Causal Model

Daniel T. Chang

Individual causal inference (ICI) uses causal inference methods to understand and predict the effects of interventions on individuals, considering their specific characteristics / facts. It aims to estimate individual causal effect (ICE), which varies across individuals. Estimating ICE can be challenging due to the limited data available for individuals, and the fact that most causal inference methods are population-based. Structural Causal Model (SCM) is fundamentally population-based. Therefore, causal discovery (structural learning and parameter learning), association queries and intervention queries are all naturally population-based. However, exogenous variables (U) in SCM can encode individual variations and thus provide the mechanism for individualized population per specific individual characteristics / facts. Based on this, we propose ICI with SCM as a "rung 3" causal inference, because it involves "imagining" what would be the causal effect of a hypothetical intervention on an individual, given the individual's observed characteristics / facts. Specifically, we propose the indiv-operator, indiv(W), to formalize/represent the population individualization process, and the individual causal query, P(Y | indiv(W), do(X), Z), to formalize/represent ICI. We show and argue that ICI with SCM is inference on individual alternatives (possible), not individual counterfactuals (non-actual).

LG | Feb 5, 2022
Exemplar-Based Contrastive Self-Supervised Learning with Few-Shot Class Incremental Learning

Daniel T. Chang

Humans are capable of learning new concepts from only a few (labeled) exemplars, incrementally and continually. This happens within the context that we can differentiate among the exemplars, and between the exemplars and large amounts of other data (unlabeled and labeled). This suggests, in human learning, supervised learning of concepts based on exemplars takes place within the larger context of contrastive self-supervised learning (CSSL) based on unlabeled and labeled data. We discuss extending CSSL (1) to be based mainly on exemplars and only secondly on data augmentation, and (2) to apply to both unlabeled data (a large amount is available in general) and labeled data (a few exemplars can be obtained with valuable supervised knowledge). A major benefit of the extensions is that exemplar-based CSSL, with supervised finetuning, supports few-shot class incremental learning (CIL). Specifically, we discuss exemplar-based CSSL including: nearest-neighbor CSSL, neighborhood CSSL with supervised pretraining, and exemplar CSSL with supervised finetuning. We further discuss using exemplar-based CSSL to facilitate few-shot learning and, in particular, few-shot CIL.

LG | Dec 10, 2021
Concept Representation Learning with Contrastive Self-Supervised Learning

Daniel T. Chang

Concept-oriented deep learning (CODL) is a general approach to meet the future challenges for deep learning: (1) learning with little or no external supervision, (2) coping with test examples that come from a different distribution than the training examples, and (3) integrating deep learning with symbolic AI. In CODL, as in human learning, concept representations are learned based on concept exemplars. Contrastive self-supervised learning (CSSL) provides a promising approach to do so, since it: (1) uses data-driven associations, to get away from semantic labels, (2) supports incremental and continual learning, to get away from (large) fixed datasets, and (3) accommodates emergent objectives, to get away from fixed objectives (tasks). We discuss major aspects of concept representation learning using CSSL. These include dual-level concept representations, CSSL for feature representations, exemplar similarity measures and self-supervised relational reasoning, incremental and continual CSSL, and contrastive self-supervised concept (class) incremental learning. The discussion leverages recent findings from cognitive neural science and CSSL.

LG | Jul 14, 2021
Hybrid Bayesian Neural Networks with Functional Probabilistic Layers

Daniel T. Chang

Bayesian neural networks provide a direct and natural way to extend standard deep neural networks to support probabilistic deep learning through the use of probabilistic layers that, traditionally, encode weight (and bias) uncertainty. In particular, hybrid Bayesian neural networks utilize standard deterministic layers together with a few probabilistic layers judiciously positioned in the networks for uncertainty estimation. A major aspect and benefit of Bayesian inference is that priors, in principle, provide the means to encode prior knowledge for use in inference and prediction. However, it is difficult to specify priors on weights since the weights have no intuitive interpretation. Further, the relationships of priors on weights to the functions computed by networks are difficult to characterize. In contrast, functions are intuitive to interpret and are direct since they map inputs to outputs. Therefore, it is natural to specify priors on functions to encode prior knowledge, and to use them in inference and prediction based on functions. To support this, we propose hybrid Bayesian neural networks with functional probabilistic layers that encode function (and activation) uncertainty. We discuss their foundations in functional Bayesian inference, functional variational inference, sparse Gaussian processes, and sparse variational Gaussian processes. We further perform a few proof-of-concept experiments using GPflux, a new library that provides Gaussian process layers and supports their use with deterministic Keras layers to form hybrid neural network and Gaussian process models.

LG | Jun 22, 2021
Bayesian Neural Networks: Essentials

Daniel T. Chang

Bayesian neural networks utilize probabilistic layers that capture uncertainty over weights and activations, and are trained using Bayesian inference. Since these probabilistic layers are designed to be drop-in replacements for their deterministic counterparts, Bayesian neural networks provide a direct and natural way to extend conventional deep neural networks to support probabilistic deep learning. However, it is nontrivial to understand, design and train Bayesian neural networks due to their complexities. We discuss the essentials of Bayesian neural networks including duality (deep neural networks, probabilistic models), approximate Bayesian inference, Bayesian priors, Bayesian posteriors, and deep variational learning. We use TensorFlow Probability APIs and code examples for illustration. The main problem with Bayesian neural networks is that the architecture of deep neural networks makes it quite redundant, and costly, to account for uncertainty for a large number of successive layers. Hybrid Bayesian neural networks, which use a few probabilistic layers judiciously positioned in the networks, provide a practical solution.
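A probabilistic layer that captures weight uncertainty can be sketched without any deep learning framework. The example below is a minimal NumPy illustration, assuming a Gaussian variational posterior over the weights of a single dense layer sampled with the reparameterization trick; it is not the TensorFlow Probability code the paper uses, and the function names and parameter values are hypothetical.

```python
import numpy as np

def softplus(x):
    """Maps an unconstrained parameter to a positive standard deviation."""
    return np.log1p(np.exp(x))

def bayesian_dense(x, w_mu, w_rho, rng):
    """One stochastic forward pass: sample w = mu + softplus(rho) * eps, eps ~ N(0, 1)."""
    eps = rng.normal(size=w_mu.shape)
    return x @ (w_mu + softplus(w_rho) * eps)

def predict(x, w_mu, w_rho, rng, n_samples=2000):
    """Monte Carlo predictive mean and standard deviation over sampled weights."""
    draws = np.stack([bayesian_dense(x, w_mu, w_rho, rng) for _ in range(n_samples)])
    return draws.mean(axis=0), draws.std(axis=0)

rng = np.random.default_rng(0)
x = np.array([[1.0, -2.0, 0.5]])
w_mu = np.array([[0.3], [0.1], [-0.4]])   # assumed posterior means
w_rho = np.full_like(w_mu, -3.0)          # assumed pre-softplus scales
mean, std = predict(x, w_mu, w_rho, rng)
print(mean, std)
```

Each call to `bayesian_dense` gives a different output for the same input, and the spread of those outputs is the layer's estimate of model uncertainty. In a hybrid network only a few layers would be built this way, with the rest kept deterministic.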

LG | May 31, 2021
Probabilistic Deep Learning with Probabilistic Neural Networks and Deep Probabilistic Models

Daniel T. Chang

Probabilistic deep learning is deep learning that accounts for uncertainty, both model uncertainty and data uncertainty. It is based on the use of probabilistic models and deep neural networks. We distinguish two approaches to probabilistic deep learning: probabilistic neural networks and deep probabilistic models. The former employs deep neural networks that utilize probabilistic layers which can represent and process uncertainty; the latter uses probabilistic models that incorporate deep neural network components which capture complex non-linear stochastic relationships between the random variables. We discuss some major examples of each approach including Bayesian neural networks and mixture density networks (for probabilistic neural networks), and variational autoencoders, deep Gaussian processes and deep mixed effects models (for deep probabilistic models). TensorFlow Probability is a library for probabilistic modeling and inference which can be used for both approaches of probabilistic deep learning. We include its code examples for illustration.

CV | Jul 6, 2020
Distance-Geometric Graph Convolutional Network (DG-GCN) for Three-Dimensional (3D) Graphs

Daniel T. Chang

The distance-geometric graph representation adopts a unified scheme (distance) for representing the geometry of three-dimensional (3D) graphs. It is invariant to rotation and translation of the graph, and it reflects pair-wise node interactions and their generally local nature. To facilitate the incorporation of geometry in deep learning on 3D graphs, we propose a message-passing graph convolutional network based on the distance-geometric graph representation: DG-GCN (distance-geometric graph convolutional network). It utilizes continuous-filter convolutional layers, with filter-generating networks, that enable learning of filter weights from distances, thereby incorporating the geometry of 3D graphs in graph convolutions. Our results for the ESOL and FreeSolv datasets show major improvement over those of standard graph convolutions. They also show significant improvement over those of geometric graph convolutions employing edge weight / edge distance power laws. Our work demonstrates the utility and value of DG-GCN for end-to-end deep learning on 3D graphs, particularly molecular graphs.
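The idea of generating filter weights from distances can be sketched in a few lines. The example below is an assumption-laden illustration in the style of SchNet-like continuous-filter convolutions: distances are expanded in radial basis functions, a small two-layer network turns them into per-edge filters, and each node sums the filtered features of its neighbours. The layer sizes, cutoff, random weights and function names are all hypothetical, and the actual DG-GCN layers may differ.

```python
import numpy as np

def rbf_expand(dist, centers, gamma=10.0):
    """Expand each distance into Gaussian radial basis features."""
    return np.exp(-gamma * (dist[..., None] - centers) ** 2)

def cfconv(h, dist, adj, W1, W2, centers):
    """Continuous-filter convolution.

    h: (n, f) node features, dist: (n, n) distances, adj: (n, n) 0/1 neighbour mask.
    """
    rbf = rbf_expand(dist, centers)            # (n, n, k)
    filt = np.tanh(rbf @ W1) @ W2              # (n, n, f): filter-generating network
    msgs = filt * h[None, :, :]                # message j -> i is filt[i, j] * h[j]
    return (adj[:, :, None] * msgs).sum(axis=1)

rng = np.random.default_rng(0)
n, f, k, m, cutoff = 6, 4, 8, 16, 2.5
pos = rng.normal(size=(n, 3))
dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
adj = ((dist < cutoff) & ~np.eye(n, dtype=bool)).astype(float)
h = rng.normal(size=(n, f))
centers = np.linspace(0.0, cutoff, k)
W1, W2 = rng.normal(size=(k, m)), rng.normal(size=(m, f))

out = cfconv(h, dist, adj, W1, W2, centers)
print(out.shape)
```

Since the layer sees geometry only through `dist`, its output does not change when the coordinates are rotated or translated.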

CV | Jun 2, 2020
Geometric Graph Representations and Geometric Graph Convolutions for Deep Learning on Three-Dimensional (3D) Graphs

Daniel T. Chang

The geometry of three-dimensional (3D) graphs, consisting of nodes and edges, plays a crucial role in many important applications. An excellent example is molecular graphs, whose geometry influences important properties of a molecule including its reactivity and biological activity. To facilitate the incorporation of geometry in deep learning on 3D graphs, we define three types of geometric graph representations: positional, angle-geometric and distance-geometric. For proof of concept, we use the distance-geometric graph representation for geometric graph convolutions. Further, to utilize standard graph convolution networks, we employ a simple edge weight / edge distance correlation scheme, whose parameters can be fixed using reference values or determined through Bayesian hyperparameter optimization. The results of geometric graph convolutions, for the ESOL and FreeSolv datasets, show significant improvement over those of standard graph convolutions. Our work demonstrates the feasibility and promise of incorporating geometry, using the distance-geometric graph representation, in deep learning on 3D graphs.

LG | Oct 24, 2019
Deep Learning for Molecular Graphs with Tiered Graph Autoencoders and Graph Prediction

Daniel T. Chang

Tiered graph autoencoders provide the architecture and mechanisms for learning tiered latent representations and latent spaces for molecular graphs that explicitly represent and utilize groups (e.g., functional groups). This enables the utilization and exploration of tiered molecular latent spaces, either individually - the node (atom) tier, the group tier, or the graph (molecule) tier - or jointly, as well as navigation across the tiers. In this paper, we discuss the use of tiered graph autoencoders together with graph prediction for molecular graphs. We show features of molecular graphs used, and groups in molecular graphs identified for some sample molecules. We briefly review graph prediction and the QM9 dataset for background information, and discuss the use of tiered graph embeddings for graph prediction, particularly weighted group pooling. We find that functional groups and ring groups effectively capture and represent the chemical essence of molecular graphs (structures). Further, tiered graph autoencoders and graph prediction together provide effective, efficient and interpretable deep learning for molecular graphs, with the former providing unsupervised, transferable learning and the latter providing supervised, task-optimized learning.

LG | Aug 22, 2019
Tiered Graph Autoencoders with PyTorch Geometric for Molecular Graphs

Daniel T. Chang

Tiered latent representations and latent spaces for molecular graphs, which consist of the atom (node) tier, the group tier and the molecule (graph) tier, provide a simple but effective way to explicitly represent and utilize groups (e.g., functional groups). They can be learned using the tiered graph autoencoder architecture. In this paper we discuss adapting tiered graph autoencoders for use with PyTorch Geometric, for both the deterministic tiered graph autoencoder model and the probabilistic tiered variational graph autoencoder model. We also discuss molecular structure information sources that can be accessed to extract training data for molecular graphs. To support transfer learning, a critical consideration is that the information must utilize standard unique molecule and constituent atom identifiers. As a result of using tiered graph autoencoders for deep learning, each molecular graph possesses tiered latent representations. At each tier, the latent representation consists of: node features, edge indices, edge features, membership matrix, and node embeddings. This enables the utilization and exploration of tiered molecular latent spaces, either individually (the node tier, the group tier, or the graph tier) or jointly, as well as navigation across the tiers.

LG | Mar 21, 2019
Tiered Latent Representations and Latent Spaces for Molecular Graphs

Daniel T. Chang

Molecular graphs generally contain subgraphs (known as groups) that are identifiable and significant in composition, functionality, geometry, etc. Flat latent representations (node embeddings or graph embeddings) fail to represent, and support the use of, groups. Fully hierarchical latent representations, on the other hand, are difficult to learn and, even if learned, may be too complex to use or interpret. We propose tiered latent representations and latent spaces for molecular graphs, consisting of the atom (node) tier, the group tier and the molecule (graph) tier, as a simple way to explicitly represent and utilize groups. Specifically, we propose an architecture for learning tiered latent representations and latent spaces using graph autoencoders, graph neural networks, differentiable group pooling and the membership matrix. We discuss its various components, major challenges and related work, for both a deterministic and a probabilistic model. We also briefly discuss the usage and exploration of tiered latent spaces. The tiered approach is applicable to other types of structured graphs similar in nature to molecular graphs.
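The step from the atom tier to the group tier via a membership matrix can be illustrated as follows. This is a minimal sketch, assuming a 0/1 membership matrix of shape (atoms, groups) and mean pooling; the abstract does not specify the pooling function, and the function name and toy values are hypothetical.

```python
import numpy as np

def group_pool(node_emb, membership):
    """Mean-pool atom embeddings into group embeddings.

    node_emb: (n_atoms, d); membership: (n_atoms, n_groups) with 1 where an atom belongs to a group.
    """
    sums = membership.T @ node_emb                 # (n_groups, d)
    counts = membership.sum(axis=0)[:, None]       # atoms per group
    return sums / np.maximum(counts, 1)            # guard against empty groups

# Four atoms, two groups: atoms 0-1 form group 0, atoms 2-3 form group 1.
node_emb = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]])
membership = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)

group_emb = group_pool(node_emb, membership)
graph_emb = group_emb.mean(axis=0)                 # group tier -> molecule tier
print(group_emb, graph_emb)
```

Because the pooling is a matrix product, it stays differentiable with respect to the atom embeddings, and the same membership matrix can be used to navigate back from a group to its atoms.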

LG | Feb 11, 2019
Probabilistic Generative Deep Learning for Molecular Design

Daniel T. Chang

Probabilistic generative deep learning for molecular design involves the discovery and design of new molecules and analysis of their structure, properties and activities by probabilistic generative models using the deep learning approach. It leverages the existing huge databases and publications of experimental results, and quantum-mechanical calculations, to learn and explore molecular structure, properties and activities. We discuss the major components of probabilistic generative deep learning for molecular design, which include molecular structure, molecular representations, deep generative models, molecular latent representations and latent space, molecular structure-property and structure-activity relationships, molecular similarity and molecular design. We highlight significant recent work using or applicable to this new approach.

LG | Dec 26, 2018
Latent Variable Modeling for Generative Concept Representations and Deep Generative Models

Daniel T. Chang

Latent representations are the essence of deep generative models and determine their usefulness and power. For latent representations to be useful as generative concept representations, their latent space must support latent space interpolation, attribute vectors and concept vectors, among other things. We investigate and discuss latent variable modeling, including latent variable models, latent representations and latent spaces, particularly hierarchical latent representations and latent space vectors and geometry. Our focus is on that used in variational autoencoders and generative adversarial networks.
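Two of the latent-space operations named in this abstract, interpolation and attribute vectors, have simple vector forms. The sketch below uses linear interpolation and a difference-of-means attribute vector, both common choices that are assumed here rather than taken from the paper; the latent codes are random stand-ins, and no encoder or decoder is included.

```python
import numpy as np

def interpolate(z_a, z_b, steps):
    """Linear path in latent space from z_a to z_b, endpoints included."""
    ts = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - ts) * z_a + ts * z_b

def attribute_vector(z_with, z_without):
    """Direction separating codes that have an attribute from codes that lack it."""
    return z_with.mean(axis=0) - z_without.mean(axis=0)

rng = np.random.default_rng(0)
z_a, z_b = rng.normal(size=8), rng.normal(size=8)
path = interpolate(z_a, z_b, steps=5)

z_with = rng.normal(loc=1.0, size=(20, 8))      # stand-in: codes of examples with the attribute
z_without = rng.normal(loc=0.0, size=(20, 8))   # stand-in: codes of examples without it
direction = attribute_vector(z_with, z_without)
edited = z_a + direction                        # move a code along the attribute direction
print(path.shape, edited.shape)
```

Decoding each point on `path`, or the `edited` code, is what would show whether the latent space actually supports these operations for a given model.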

LG | Nov 15, 2018
Concept-Oriented Deep Learning: Generative Concept Representations

Daniel T. Chang

Generative concept representations have three major advantages over discriminative ones: they can represent uncertainty, they support integration of learning and reasoning, and they are good for unsupervised and semi-supervised learning. We discuss probabilistic and generative deep learning, which generative concept representations are based on, and the use of variational autoencoders and generative adversarial networks for learning generative concept representations, particularly for concepts whose data are sequences, structured data or graphs.