Nicholas R. Allgood

2papers

2 Papers

5.8QUANT-PHApr 4
Adaptive Quantum Optimized Centroid Initialization

Nicholas R. Allgood, Ajinkya Borle, Charles K. Nicholas

Prototype-based clustering algorithms such as k-means are sensitive to the selection of initial cluster centroids, with poor initialization leading to slower convergence and suboptimal solutions trapped in local minima. We present Adaptive Quantum Optimized Centroid Initialization (AQOCI), a method that formulates the centroid initialization problem as a Quadratic Unconstrained Binary Optimization (QUBO) problem and solves it using quantum annealing or quantum-inspired solvers. AQOCI extends a prior method (QOCI) by introducing an iterative refinement mechanism inspired by the Gauss-Seidel and Jacobi methods, enabling the recovery of real-valued centroid coordinates from binary solver outputs through adaptive scaling and offset adjustments. We evaluate AQOCI using three solver backends: TABU search, simulated annealing, and D-Wave's HybridBQM on synthetic Gaussian data with controlled sweeps over cluster separation, cluster count, dimensionality, and sample size, as well as on the MOTIF malware classification dataset, comparing against standard k-means with random initialization and k-means++ initialization. On the MOTIF dataset, AQOCI produces clusterings that are competitive with and, at smaller sample sizes, superior to k-means++, with V-measure improvements of up to 26\%. On synthetic data with heavily overlapping clusters, AQOCI--SimAnn outperforms k-means++ in V-measure. On well-separated synthetic data, k-means++ is clearly superior, and AQOCI exhibits a consistent performance plateau attributable to the binary encoding resolution. The dimensionality sweep demonstrates scalability to at least $d = 10$ without degradation.

QUANT-PHMay 6, 2020
A Quantum Algorithm To Locate Unknown Hashgrams

Nicholas R. Allgood, Charles K. Nicholas

Quantum computing has evolved quickly in recent years and is showing significant benefits in a variety of fields, especially in the realm of cybersecurity. The combination of software used to locate the most frequent hashes and $n$-grams that identify malicious software could greatly benefit from a quantum algorithm. By loading the table of hashes and $n$-grams into a quantum computer we can speed up the process of mapping $n$-grams to their hashes. The first phase will be to use KiloGram to find the top-$k$ hashes and $n$-grams for a large malware corpus. From here, the resulting hash table is then loaded into a quantum simulator. A quantum search algorithm is then used search among every permutation of the entangled key and value pairs to find the desired hash value. This prevents one from having to re-compute hashes for a set of $n$-grams, which can take on average $O(MN)$ time, whereas the quantum algorithm could take $O(\sqrt{N})$ in the number of table lookups to find the desired hash values.