Gregory Morse

CR
7papers
9citations
Novelty34%
AI Score46

7 Papers

PLApr 17
Emulation-Completeness of Programming Languages

Gregory Morse, Tamás Kozsik

We study when a programming language can emulate programs written in that same language without delegating the guest program back to the host evaluator or compiler. We call this property emulation-completeness. The central observation is that Turing-completeness by itself is not enough: a self-emulator must not only compute the guest program's result, but must also account for the guest-visible state on which realistic programs depend, including control flow, exceptions, callbacks, timing, memory usage, and runtime metadata such as stack traces or line numbers. This paper is a systematization paper. Its contribution is not a new emulator implementation, but a precise vocabulary and a structured taxonomy for reasoning about self-emulation. We distinguish source-level evaluation from compiled-code emulation, define syntactic and compiled-code emulation-completeness, and separate weak from strong emulation-completeness according to how much observable runtime behavior must be preserved. We then organize the requirements into two classes: language-side requirements, which determine whether the guest semantics can be represented explicitly inside the language, and emulator-side requirements, which determine whether the resulting emulator can faithfully mask or reproduce relevant observations. The discussion is grounded by concrete examples, including publicly documented details from Erlang, where argument limits, bitstring pattern matching, and message reception expose subtle mismatches between direct execution and self-emulation. The resulting framework is intended as guidance for language designers, implementers of evaluators and emulators, and researchers interested in secure sandboxing, decompilation, and reflective execution.

DSApr 15
Fully Dynamic Maintenance of Loop Nesting Forests in Reducible Flow Graphs

Gregory Morse, Tamás Kozsik

Loop nesting forests (LNFs) are a fundamental abstraction for reasoning about control-flow structure, enabling applications such as compiler optimizations, program analysis, and dominator computation. While efficient static algorithms for constructing LNFs are well understood, maintaining them under dynamic graph updates has remained largely unexplored due to the lack of efficient dynamic depth-first search (DFS) maintenance. In this paper, we present the first fully dynamic algorithm for maintaining loop nesting forests in reducible control-flow graphs. Our approach leverages recent advances in dynamic DFS maintenance to incrementally update loop structure under edge insertions and deletions. We show that updates can be confined to local regions of the depth-first spanning tree, avoiding global recomputation. We provide formal invariants, correctness arguments, and complexity analysis, and demonstrate how the maintained LNF enables efficient derivation of dominance information. Our results establish LNFs as a practical dynamic abstraction for modern compiler and analysis pipelines.

CRApr 15
A compact QUBO encoding of computational logic formulae demonstrated on cryptography constructions

Gregory Morse, Tamás Kozsik, Oskar Mencer et al.

We aim to advance the state-of-the-art in Quadratic Unconstrained Binary Optimization formulation with a focus on cryptography algorithms. As the minimal QUBO encoding of the linear constraints of optimization problems emerges as the solution of integer linear programming (ILP) problems, by solving special boolean logic formulas (like ANF and DNF) for their integer coefficients it is straightforward to handle any normal form, or any substitution for multi-input AND, OR or XOR operations in a QUBO form. To showcase the efficiency of the proposed approach we considered the most widespread cryptography algorithms including AES-128/192/256, MD5, SHA1 and SHA256. For each of these, we achieved QUBO instances reduced by thousands of logical variables compared to previously published results, while keeping the QUBO matrix sparse and the magnitude of the coefficients low. In the particular case of AES-256 cryptography function we obtained more than 8x reduction in variable count compared to previous results. The demonstrated reduction in QUBO sizes notably increases the vulnerability of cryptography algorithms against future quantum annealers, capable of embedding around $30$ thousands of logical variables.

PLApr 15
Erlang Binary and Source Code Obfuscation

Gregory Morse, Tamás Kozsik

This paper studies obfuscation techniques for Erlang programs at the source, abstract syntax tree, BEAM assembly, and BEAM bytecode levels. We focus on transformations that complicate reverse engineering, decompilation, and recompilation while remaining grounded in the actual behavior of the Erlang compiler, validator, loader, and virtual machine. The paper categorizes opcode-level dependency tricks, receive-based loop encodings, irregular control-flow constructions, mutability-oriented performance obfuscation, and self-modifying code enabled by dynamic module loading. A recurring theme is that effective obfuscation in BEAM often arises not from arbitrary corruption, but from exploiting representational gaps between high-level Erlang semantics and the lower-level execution model accepted by the toolchain and runtime.

DSApr 14
Fully Dynamic Breadth First Search and Spanning Trees in Directed Graphs

Gregory Morse, Tamás Kozsik

We study the problem of maintaining a breadth-first spanning tree and the induced BFS ordering in a directed graph under edge updates. While semi-dynamic algorithms are known, maintaining the spanning tree, level information, and numbering together in the fully dynamic setting is less developed. This preprint presents a framework for fully dynamic BFS in directed graphs together with supporting data structures for maintaining the BFS tree, single-source shortest paths, and single-source reachability under both insertions and deletions.

CRApr 14
Tamper-Proofing with Self-Modifying Code

Gregory Morse, Tamás Kozsik

Classical computability theory tells us that self-modifying code (SMC) on a deterministic universal Turing machine can be simulated by non-SMC code on the same model. That abstraction, however, omits the external timing inputs, concurrency, and microarchitectural state that dominate practical execution on modern processors. We argue that once timing, ordering, and self-introspective effects are treated as observables, a practically faithful non-SMC reproduction of timed SMC becomes detectably expensive on commodity systems. We present a tamper-proofing model that combines introspective and polymorphic SMC, reliable clocks, and runtime timing predicates to bind integrity checks to execution behavior. We distinguish static and dynamic SMC generation, characterize the timing semantics needed to avoid catastrophic pipeline clears, and give x86-64 design primitives for checksum-driven self-patching. We also report timer measurements, performance comparisons, and performance-monitoring counter evidence showing that careful engineering -- especially loop unrolling and cross-page modification -- substantially reduces the overhead of SMC while preserving its tamper-detection value. The paper concludes with an efficiency analysis, a threat model, and deployment guidance for trusted code executing in untrusted environments.

NEJun 6, 2014
Unsupervised Feature Learning through Divergent Discriminative Feature Accumulation

Paul A. Szerlip, Gregory Morse, Justin K. Pugh et al.

Unlike unsupervised approaches such as autoencoders that learn to reconstruct their inputs, this paper introduces an alternative approach to unsupervised feature learning called divergent discriminative feature accumulation (DDFA) that instead continually accumulates features that make novel discriminations among the training set. Thus DDFA features are inherently discriminative from the start even though they are trained without knowledge of the ultimate classification problem. Interestingly, DDFA also continues to add new features indefinitely (so it does not depend on a hidden layer size), is not based on minimizing error, and is inherently divergent instead of convergent, thereby providing a unique direction of research for unsupervised feature learning. In this paper the quality of its learned features is demonstrated on the MNIST dataset, where its performance confirms that indeed DDFA is a viable technique for learning useful features.