Derek Xu

LG
h-index24
8papers
313citations
Novelty48%
AI Score31

8 Papers

CLNov 15, 2022
Introducing Semantics into Speech Encoders

Derek Xu, Shuyan Dong, Changhan Wang et al. · meta-ai, mit

Recent studies find existing self-supervised speech encoders contain primarily acoustic rather than semantic information. As a result, pipelined supervised automatic speech recognition (ASR) to large language model (LLM) systems achieve state-of-the-art results on semantic spoken language tasks by utilizing rich semantic representations from the LLM. These systems come at the cost of labeled audio transcriptions, which is expensive and time-consuming to obtain. We propose a task-agnostic unsupervised way of incorporating semantic information from LLMs into self-supervised speech encoders without labeled audio transcriptions. By introducing semantics, we improve existing speech encoder spoken language understanding performance by over 10\% on intent classification, with modest gains in named entity resolution and slot filling, and spoken question answering FF1 score by over 2\%. Our unsupervised approach achieves similar performance as supervised methods trained on over 100 hours of labeled audio transcripts, demonstrating the feasibility of unsupervised semantic augmentations to existing speech encoders.

LGSep 15, 2023
Unveiling Invariances via Neural Network Pruning

Derek Xu, Yizhou Sun, Wei Wang

Invariance describes transformations that do not alter data's underlying semantics. Neural networks that preserve natural invariance capture good inductive biases and achieve superior performance. Hence, modern networks are handcrafted to handle well-known invariances (ex. translations). We propose a framework to learn novel network architectures that capture data-dependent invariances via pruning. Our learned architectures consistently outperform dense neural networks on both vision and tabular datasets in both efficiency and effectiveness. We demonstrate our framework on multiple deep learning models across 3 vision and 40 tabular datasets.

MTRL-SCIAug 12, 2024
Inverse designing metamaterials with programmable nonlinear functional responses in graph space

Marco Maurizi, Derek Xu, Yu-Tong Wang et al.

Material responses to static and dynamic stimuli, represented as nonlinear curves, are design targets for engineering functionalities like structural support, impact protection, and acoustic and photonic bandgaps. Three-dimensional metamaterials offer significant tunability due to their internal structure, yet existing methods struggle to capture their complex behavior-to-structure relationships. We present GraphMetaMat, a graph-based framework capable of designing three-dimensional metamaterials with programmable responses and arbitrary manufacturing constraints. Integrating graph networks, physics biases, reinforcement learning, and tree search, GraphMetaMat can target stress-strain curves spanning four orders of magnitude and complex behaviors, as well as viscoelastic transmission responses with varying attenuation gaps. GraphMetaMat can create cushioning materials for protective equipment and vibration-damping panels for electric vehicles, outperforming commercial materials, and enabling the automatic design of materials with on-demand functionalities.

LGJul 21, 2022
Detecting Small Query Graphs in A Large Graph via Neural Subgraph Search

Yunsheng Bai, Derek Xu, Yizhou Sun et al.

Recent advances have shown the success of using reinforcement learning and search to solve NP-hard graph-related tasks, such as Traveling Salesman Optimization, Graph Edit Distance computation, etc. However, it remains unclear how one can efficiently and accurately detect the occurrences of a small query graph in a large target graph, which is a core operation in graph database search, biomedical analysis, social group finding, etc. This task is called Subgraph Matching which essentially performs subgraph isomorphism check between a query graph and a large target graph. One promising approach to this classical problem is the "learning-to-search" paradigm, where a reinforcement learning (RL) agent is designed with a learned policy to guide a search algorithm to quickly find the solution without any solved instances for supervision. However, for the specific task of Subgraph Matching, though the query graph is usually small given by the user as input, the target graph is often orders-of-magnitude larger. It poses challenges to the neural network design and can lead to solution and reward sparsity. In this paper, we propose NSUBS with two innovations to tackle the challenges: (1) A novel encoder-decoder neural network architecture to dynamically compute the matching information between the query and the target graphs at each search state; (2) A novel look-ahead loss function for training the policy network. Experiments on six large real-world target graphs show that NSUBS can significantly improve the subgraph matching performance.

LGJan 18, 2022Code
Ray Based Distributed Autonomous Vehicle Research Platform

Derek Xu

My project tackles the question of whether Ray can be used to quickly train autonomous vehicles using a simulator (Carla), and whether a platform robust enough for further research purposes can be built around it. Ray is an open-source framework that enables distributed machine learning applications. Distributed computing is a technique which parallelizes computational tasks, such as training a model, among many machines. Ray abstracts away the complex coordination of these machines, making it rapidly scalable. Carla is a vehicle simulator that generates data used to train a model. The bulk of the project was writing the training logic that Ray would use to train my distributed model. Imitation learning is the best fit for autonomous vehicles. Imitation learning is an alternative to reinforcement learning and it works by trying to learn the optimal policy by imitating an expert (usually a human) given a set of demonstrations. A key deliverable for the project was showcasing my trained agent in a few benchmark tests, such as navigating a complex turn through traffic. Beyond that, the broader ambition was to develop a research platform where others could quickly train and run experiments on huge amounts of Carla vehicle data. Thus, my end product is not a single model, but a large-scale, open-source research platform (RayCarla) for autonomous vehicle researchers to utilize.

LGFeb 2, 2024
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data

Wei-Yao Wang, Wei-Wei Du, Derek Xu et al.

Self-supervised learning (SSL) has been incorporated into many state-of-the-art models in various domains, where SSL defines pretext tasks based on unlabeled datasets to learn contextualized and robust representations. Recently, SSL has become a new trend in exploring the representation learning capability in the realm of tabular data, which is more challenging due to not having explicit relations for learning descriptive representations. This survey aims to systematically review and summarize the recent progress and challenges of SSL for non-sequential tabular data (SSL4NS-TD). We first present a formal definition of NS-TD and clarify its correlation to related studies. Then, these approaches are categorized into three groups - predictive learning, contrastive learning, and hybrid learning, with their motivations and strengths of representative methods in each direction. Moreover, application issues of SSL4NS-TD are presented, including automatic data engineering, cross-table transferability, and domain knowledge integration. In addition, we elaborate on existing benchmarks and datasets for NS-TD applications to analyze the performance of existing tabular models. Finally, we discuss the challenges of SSL4NS-TD and provide potential directions for future research. We expect our work to be useful in terms of encouraging more research on lowering the barrier to entry SSL for the tabular domain, and of improving the foundations for implicit tabular data.

SEDec 3, 2024
Does Few-Shot Learning Help LLM Performance in Code Synthesis?

Derek Xu, Tong Xie, Botao Xia et al.

Large language models (LLMs) have made significant strides at code generation through improved model design, training, and chain-of-thought. However, prompt-level optimizations remain an important yet under-explored aspect of LLMs for coding. This work focuses on the few-shot examples present in most code generation prompts, offering a systematic study on whether few-shot examples improve LLM's coding capabilities, which few-shot examples have the largest impact, and how to select impactful examples. Our work offers 2 approaches for selecting few-shot examples, a model-free method, CODEEXEMPLAR-FREE, and a model-based method, CODEEXEMPLAR-BASED. The 2 methods offer a trade-off between improved performance and reliance on training data and interpretability. Both methods significantly improve CodeLlama's coding ability across the popular HumanEval+ coding benchmark. In summary, our work provides valuable insights into how to pick few-shot examples in code generation prompts to improve LLM code generation capabilities.

LGFeb 8, 2020
GLSearch: Maximum Common Subgraph Detection via Learning to Search

Yunsheng Bai, Derek Xu, Yizhou Sun et al.

Detecting the Maximum Common Subgraph (MCS) between two input graphs is fundamental for applications in drug synthesis, malware detection, cloud computing, etc. However, MCS computation is NP-hard, and state-of-the-art MCS solvers rely on heuristic search algorithms which in practice cannot find good solution for large graph pairs given a limited computation budget. We propose GLSearch, a Graph Neural Network (GNN) based learning to search model. Our model is built upon the branch and bound algorithm, which selects one pair of nodes from the two input graphs to expand at a time. Instead of using heuristics, we propose a novel GNN-based Deep Q-Network (DQN) to select the node pair, allowing the search process faster and more adaptive. To further enhance the training of DQN, we leverage the search process to provide supervision in a pre-training stage and guide our agent during an imitation learning stage. Experiments on synthetic and real-world large graph pairs demonstrate that our model learns a search strategy that is able to detect significantly larger common subgraphs given the same computation budget. Our GLSearch can be potentially extended to solve many other combinatorial problems with constraints on graphs.