COMar 20, 2023
Seven open problems in applied combinatoricsSinan G. Aksoy, Ryan Bennink, Yuzhou Chen et al.
We present and discuss seven different open problems in applied combinatorics. The application areas relevant to this compilation include quantum computing, algorithmic differentiation, topological data analysis, iterative methods, hypergraph cut algorithms, and power systems.
LGFeb 9
What do Geometric Hallucination Detection Metrics Actually Measure?Eric Yeats, John Buckheit, Sarah Scullen et al.
Hallucination remains a barrier to deploying generative models in high-consequence applications. This is especially true in cases where external ground truth is not readily available to validate model outputs. This situation has motivated the study of geometric signals in the internal state of an LLM that are predictive of hallucination and require limited external knowledge. Given that there are a range of factors that can lead model output to be called a hallucination (e.g., irrelevance vs incoherence), in this paper we ask what specific properties of a hallucination these geometric statistics actually capture. To assess this, we generate a synthetic dataset which varies distinct properties of output associated with hallucination. This includes output correctness, confidence, relevance, coherence, and completeness. We find that different geometric statistics capture different types of hallucinations. Along the way we show that many existing geometric detection methods have substantial sensitivity to shifts in task domain (e.g., math questions vs. history questions). Motivated by this, we introduce a simple normalization method to mitigate the effect of domain shift on geometric statistics, leading to AUROC gains of +34 points in multi-domain settings.
NEApr 28, 2021
Neuromorphic Computing is Turing-CompletePrasanna Date, Catherine Schuman, Bill Kay et al.
Neuromorphic computing is a non-von Neumann computing paradigm that performs computation by emulating the human brain. Neuromorphic systems are extremely energy-efficient and known to consume thousands of times less power than CPUs and GPUs. They have the potential to drive critical use cases such as autonomous vehicles, edge computing and internet of things in the future. For this reason, they are sought to be an indispensable part of the future computing landscape. Neuromorphic systems are mainly used for spike-based machine learning applications, although there are some non-machine learning applications in graph theory, differential equations, and spike-based simulations. These applications suggest that neuromorphic computing might be capable of general-purpose computing. However, general-purpose computability of neuromorphic computing has not been established yet. In this work, we prove that neuromorphic computing is Turing-complete and therefore capable of general-purpose computing. Specifically, we present a model of neuromorphic computing, with just two neuron parameters (threshold and leak), and two synaptic parameters (weight and delay). We devise neuromorphic circuits for computing all the μ-recursive functions (i.e., constant, successor and projection functions) and all the μ-recursive operators (i.e., composition, primitive recursion and minimization operators). Given that the μ-recursive functions and operators are precisely the ones that can be computed using a Turing machine, this work establishes the Turing-completeness of neuromorphic computing.
CRDec 29, 2020
A Comprehensive Guide to CAN IDS Data & Introduction of the ROAD DatasetMiki E. Verma, Robert A. Bridges, Michael D. Iannacone et al.
Although ubiquitous in modern vehicles, Controller Area Networks (CANs) lack basic security properties and are easily exploitable. A rapidly growing field of CAN security research has emerged that seeks to detect intrusions on CANs. Producing vehicular CAN data with a variety of intrusions is out of reach for most researchers as it requires expensive assets and expertise. To assist researchers, we present the first comprehensive guide to the existing open CAN intrusion datasets, including a quality analysis of each dataset and an enumeration of each's benefits, drawbacks, and suggested use case. Current public CAN IDS datasets are limited to real fabrication (simple message injection) attacks and simulated attacks often in synthetic data, which lack fidelity. In general, the physical effects of attacks on the vehicle are not verified in the available datasets. Only one dataset provides signal-translated data but not a corresponding raw binary version. Overall, the available data pigeon-holes CAN IDS works into testing on limited, often inappropriate data (usually with attacks that are too easily detectable to truly test the method), and this lack data has stymied comparability and reproducibility of results. As our primary contribution, we present the ROAD (Real ORNL Automotive Dynamometer) CAN Intrusion Dataset, consisting of over 3.5 hours of one vehicle's CAN data. ROAD contains ambient data recorded during a diverse set of activities, and attacks of increasing stealth with multiple variants and instances of real fuzzing, fabrication, and unique advanced attacks, as well as simulated masquerade attacks. To facilitate benchmarking CAN IDS methods that require signal-translated inputs, we also provide the signal time series format for many of the CAN captures. Our contributions aim to facilitate appropriate benchmarking and needed comparability in the CAN IDS field.
NEApr 21, 2020
Hyperparameter Optimization in Binary Communication Networks for Neuromorphic DeploymentMaryam Parsa, Catherine D. Schuman, Prasanna Date et al.
Training neural networks for neuromorphic deployment is non-trivial. There have been a variety of approaches proposed to adapt back-propagation or back-propagation-like algorithms appropriate for training. Considering that these networks often have very different performance characteristics than traditional neural networks, it is often unclear how to set either the network topology or the hyperparameters to achieve optimal performance. In this work, we introduce a Bayesian approach for optimizing the hyperparameters of an algorithm for training binary communication networks that can be deployed to neuromorphic hardware. We show that by optimizing the hyperparameters on this algorithm for each dataset, we can achieve improvements in accuracy over the previous state-of-the-art for this algorithm on each dataset (by up to 15 percent). This jump in performance continues to emphasize the potential when converting traditional neural networks to binary communication applicable to neuromorphic hardware.