Tad Hogg

RO
h-index: 45
14 papers
973 citations
Novelty: 37%
AI Score: 30

14 Papers

LG · Jan 24, 2025
Humanity's Last Exam

Long Phan, Alice Gatti, Ziwen Han et al. · amazon-science, apple-ml

Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 2,500 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To inform research and policymaking upon a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai.

RO · Jun 7, 2021
Acoustic Power Management by Swarms of Microscopic Robots

Tad Hogg

Microscopic robots in the body could harvest energy from ultrasound to provide on-board control of autonomous behaviors such as measuring and communicating diagnostic information and precisely delivering drugs. This paper evaluates the acoustic power available to micron-size robots that collect energy using pistons. Acoustic attenuation and viscous drag on the pistons are the major limitations on the available power. Frequencies around 100kHz can deliver hundreds of picowatts to a robot in low-attenuation tissue within about 10cm of transducers on the skin, but much less in high-attenuation tissue such as a lung. However, applications of microscopic robots could involve such large numbers that the robots significantly increase attenuation, thereby reducing power for robots deep in the body. This paper describes how robots can collectively manage where and when they harvest energy to mitigate this attenuation so that a swarm of a few hundred billion robots can provide tens of picowatts to each robot, on average.

QUANT-PH · Dec 8, 2020
Classical symmetries and the Quantum Approximate Optimization Algorithm

Ruslan Shaydulin, Stuart Hadfield, Tad Hogg et al.

We study the relationship between the Quantum Approximate Optimization Algorithm (QAOA) and the underlying symmetries of the objective function to be optimized. Our approach formalizes the connection between quantum symmetry properties of the QAOA dynamics and the group of classical symmetries of the objective function. The connection is general and includes but is not limited to problems defined on graphs. We show a series of results exploring the connection and highlight examples of hard problem classes where a nontrivial symmetry subgroup can be obtained efficiently. In particular we show how classical objective function symmetries lead to invariant measurement outcome probabilities across states connected by such symmetries, independent of the choice of algorithm parameters or number of layers. To illustrate the power of the developed connection, we apply machine learning techniques towards predicting QAOA performance based on symmetry considerations. We provide numerical evidence that a small set of graph symmetry properties suffices to predict the minimum QAOA depth required to achieve a target approximation ratio on the MaxCut problem, in a practically important setting where QAOA parameter schedules are constrained to be linear and hence easier to optimize.
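As a minimal sketch of what a "classical symmetry of the objective function" means for MaxCut, the snippet below checks by brute force whether a transformation of bit strings leaves the cut value unchanged. The 4-cycle graph and the helper names (`cut_value`, `is_symmetry`, `flip_all`, `rotate`, `swap_01`) are illustrative assumptions, not taken from the paper, and the code does not simulate QAOA itself.

```python
from itertools import product

def cut_value(edges, bits):
    """Count edges whose endpoints are assigned to opposite sides of the cut."""
    return sum(1 for u, v in edges if bits[u] != bits[v])

def is_symmetry(edges, n, transform):
    """Return True if `transform` preserves the cut value of every n-bit string."""
    return all(
        cut_value(edges, bits) == cut_value(edges, transform(bits))
        for bits in product((0, 1), repeat=n)
    )

# Hypothetical example graph (an assumption for illustration): the 4-cycle 0-1-2-3-0.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]
N = 4

def flip_all(bits):
    # Global bit flip: swaps the two sides of the cut.
    return tuple(1 - b for b in bits)

def rotate(bits):
    # Cyclic relabeling of the vertices, which is an automorphism of the 4-cycle.
    return bits[1:] + bits[:1]

def swap_01(bits):
    # Swapping vertices 0 and 1 is not an automorphism of the 4-cycle.
    return (bits[1], bits[0]) + bits[2:]

flip_ok = is_symmetry(EDGES, N, flip_all)
rotate_ok = is_symmetry(EDGES, N, rotate)
swap_ok = is_symmetry(EDGES, N, swap_01)
print(flip_ok, rotate_ok, swap_ok)
```

The global bit flip and the graph automorphism both preserve the objective, while an arbitrary vertex swap does not. According to the abstract, bit strings related by such symmetries have equal QAOA measurement probabilities regardless of parameters or depth.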

SI · Oct 23, 2020
Origins of Algorithmic Instabilities in Crowdsourced Ranking

Keith Burghardt, Tad Hogg, Raissa M. D'Souza et al.

Crowdsourcing systems aggregate decisions of many people to help users quickly identify high-quality options, such as the best answers to questions or interesting news stories. A long-standing issue in crowdsourcing is how option quality and human judgement heuristics interact to affect collective outcomes, such as the perceived popularity of options. We address this limitation by conducting a controlled experiment where subjects choose between two ranked options whose quality can be independently varied. We use this data to construct a model that quantifies how judgement heuristics and option quality combine when deciding between two options. The model reveals popularity-ranking can be unstable: unless the quality difference between the two options is sufficiently high, the higher quality option is not guaranteed to be eventually ranked on top. To rectify this instability, we create an algorithm that accounts for judgement heuristics to infer the best option and rank it first. This algorithm is guaranteed to be optimal if data matches the model. When the data does not match the model, however, simulations show that in practice this algorithm performs better or at least as well as popularity-based and recency-based ranking for any two-choice question. Our work suggests that algorithms relying on inference of mathematical models of user behavior can substantially improve outcomes in crowdsourcing systems.

CR · Oct 7, 2020
Privacy and Data Balkanization: Circumventing the Barriers

Bernardo A. Huberman, Tad Hogg

The rapid growth in digital data forms the basis for a wide range of new services and research, e.g., large-scale medical studies. At the same time, increasingly restrictive privacy concerns and laws are leading to significant overhead in arranging for sharing or combining different data sets to obtain these benefits. For new applications, where the benefit of combined data is not yet clear, this overhead can inhibit organizations from even trying to determine whether they can mutually benefit from sharing their data. In this paper, we discuss techniques to overcome this difficulty by employing private information transfer to determine whether there is a benefit from sharing data, and whether there is room to negotiate acceptable prices. These techniques involve cryptographic protocols. While currently considered secure, these protocols are potentially vulnerable to the development of quantum technology, particularly for ensuring privacy over significant periods of time into the future. To mitigate this concern, we describe how developments in practical quantum technology can improve the security of these protocols.

HC · Sep 20, 2019
Quantifying the Impact of Cognitive Biases in Question-Answering Systems

Keith Burghardt, Tad Hogg, Kristina Lerman

Crowdsourcing can identify high-quality solutions to problems; however, individual decisions are constrained by cognitive biases. We investigate some of these biases in an experimental model of a question-answering system. In both natural and controlled experiments, we observe a strong position bias in favor of answers appearing earlier in a list of choices. This effect is enhanced by three cognitive factors: the attention an answer receives, its perceived popularity, and cognitive load, measured by the number of choices a user has to process. While separately weak, these effects synergistically amplify position bias and decouple user choices of best answers from their intrinsic quality. We end our paper by discussing the novel ways we can apply these findings to substantially improve how high-quality answers are found in question-answering systems.

LG · Apr 23, 2019
Quantum-assisted associative adversarial network: Applying quantum annealing in deep learning

Max Wilson, Thomas Vandal, Tad Hogg et al.

We present an algorithm for learning a latent variable generative model via generative adversarial learning where the canonical uniform noise input is replaced by samples from a graphical model. This graphical model is learned by a Boltzmann machine which learns a low-dimensional feature representation of data extracted by the discriminator. A quantum annealer, the D-Wave 2000Q, is used to sample from this model. This algorithm joins a growing family of algorithms that use a quantum annealing subroutine in deep learning, and provides a framework to test the advantages of quantum-assisted learning in GANs. Fully connected, symmetric bipartite and Chimera graph topologies are compared on a reduced stochastically binarized MNIST dataset, for both classical and quantum annealing sampling methods. The quantum-assisted associative adversarial network successfully learns a generative model of the MNIST dataset for all topologies, and is also applied to the LSUN dataset bedrooms class for the Chimera topology. Evaluated using the Fréchet inception distance and inception score, the quantum and classical versions of the algorithm are found to have equivalent performance for learning an implicit generative model of the MNIST dataset.

RO · Oct 17, 2018
Identifying Vessel Branching from Fluid Stresses on Microscopic Robots

Tad Hogg

Objects moving in fluids experience patterns of stress on their surfaces determined by the geometry of nearby boundaries. Flows at low Reynolds number, as occur in microscopic vessels such as capillaries in biological tissues, have relatively simple relations between stresses and nearby vessel geometry. Using these relations, this paper shows how a microscopic robot moving with such flows can use changes in stress on its surface to identify when it encounters vessel branches.

RO · Apr 2, 2018
Stress-Based Navigation for Microscopic Robots in Viscous Fluids

Tad Hogg

Objects moving in fluids experience patterns of stress on their surfaces determined by their motion and the geometry of nearby boundaries. Fish and underwater robots can use these patterns for navigation. This paper extends this stress-based navigation to microscopic robots in tiny vessels, where robots can exploit the physics of fluids at low Reynolds number. This applies, for instance, in vessels with sizes and flow speeds comparable to those of capillaries in biological tissues. We describe how a robot can use simple computations to estimate its motion, orientation and distance to nearby vessel walls from fluid-induced stresses on its surface. Numerically evaluating these estimates for a variety of vessel sizes and robot positions shows they are most accurate when robots are close to vessel walls.
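To make the "low Reynolds number" regime concrete, the following sketch evaluates the standard definition Re = ρUL/μ for a capillary-scale flow. The numerical values (water-like density and viscosity, a speed of 1 mm/s, a length scale of 10 microns) are rough illustrative assumptions, not figures taken from the paper.

```python
def reynolds_number(density, speed, length, viscosity):
    """Reynolds number Re = density * speed * length / viscosity (SI units)."""
    return density * speed * length / viscosity

# Assumed, order-of-magnitude values for flow in a capillary-sized vessel:
re_capillary = reynolds_number(
    density=1.0e3,    # kg/m^3, roughly water
    speed=1.0e-3,     # m/s, about 1 mm/s
    length=1.0e-5,    # m, about 10 microns
    viscosity=1.0e-3, # Pa*s, roughly water
)
print(f"Re ~ {re_capillary:.0e}")
```

With these assumed values Re is far below 1, so viscous forces dominate inertia. This is the regime in which the abstract says robots can exploit simple relations between surface stresses and nearby vessel geometry.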

SI · Jan 20, 2016
The DARPA Twitter Bot Challenge

V. S. Subrahmanian, Amos Azaria, Skylar Durst et al.

A number of organizations ranging from terrorist groups such as ISIS to politicians and nation states reportedly conduct explicit campaigns to influence opinion on social media, posing a risk to democratic processes. There is thus a growing need to identify and eliminate "influence bots" - realistic, automated identities that illicitly shape discussion on sites like Twitter and Facebook - before they get too influential. Spurred by such events, DARPA held a 4-week competition in February/March 2015 in which multiple teams supported by the DARPA Social Media in Strategic Communications program competed to identify a set of previously identified "influence bots" serving as ground truth on a specific topic within Twitter. Past work regarding influence bots often has difficulty supporting claims about accuracy, since there is limited ground truth (though some exceptions do exist [3,7]). However, with the exception of [3], no past work has looked specifically at identifying influence bots on a specific topic. This paper describes the DARPA Challenge and describes the methods used by the three top-ranked teams.

RO · Jul 4, 2015
Energy Dissipation by Metamorphic Micro-Robots in Viscous Fluids

Tad Hogg

Microscopic robots could perform tasks with high spatial precision, such as acting on precisely-targeted cells in biological tissues. Some tasks may benefit from robots that change shape, such as elongating to improve chemical gradient sensing or contracting to squeeze through narrow channels. This paper evaluates the energy dissipation for shape-changing (i.e., metamorphic) robots whose size is comparable to bacteria. Unlike larger robots, surface forces dominate the dissipation. Theoretical estimates indicate that the power likely to be available to the robots, as determined by previous studies, is sufficient to change shape fairly rapidly even in highly-viscous biological fluids. Achieving this performance will require significant improvements in manufacturing and material properties compared to current micromachines. Furthermore, optimally varying the speed of shape change only slightly reduces energy use compared to uniform speed, thereby simplifying robot controllers.

SOC-PH · Oct 24, 2014
Disentangling the Effects of Social Signals

Tad Hogg, Kristina Lerman

Peer recommendation is a crowdsourcing task that leverages the opinions of many to identify interesting content online, such as news, images, or videos. Peer recommendation applications often use social signals, e.g., the number of prior recommendations, to guide people to the more interesting content. How people react to social signals, in combination with content quality and its presentation order, determines the outcomes of peer recommendation, i.e., item popularity. Using Amazon Mechanical Turk, we experimentally measure the effects of social signals in peer recommendation. Specifically, after controlling for variation due to item content and its position, we find that social signals affect item popularity about half as much as position and content do. These effects are somewhat correlated, so social signals exacerbate the "rich get richer" phenomenon, which results in a wider variance of popularity. Further, social signals change individual preferences, creating a "herding" effect that biases people's judgments about the content. Despite this, we find that social signals improve the efficiency of peer recommendation by reducing the effort devoted to evaluating content while maintaining recommendation quality.

RO · Nov 4, 2013
Using Surface-Motions for Locomotion of Microscopic Robots in Viscous Fluids

Tad Hogg

Microscopic robots could perform tasks with high spatial precision, such as acting in biological tissues on the scale of individual cells, provided they can reach precise locations. This paper evaluates the feasibility of in vivo locomotion for micron-size robots. Two appealing methods rely only on surface motions: steady tangential motion and small amplitude oscillations. These methods contrast with common microorganism propulsion based on flagella or cilia, which are more likely to damage nearby cells if used by robots made of stiff materials. The power potentially available to robots in tissue supports speeds ranging from one to hundreds of microns per second, over the range of viscosities found in biological tissue. We discuss design trade-offs among propulsion method, speed, power, shear forces and robot shape, and relate those choices to robot task requirements. This study shows that realizing such locomotion requires substantial improvements in fabrication capabilities and material properties over current technology.

RO · Feb 2, 2012
Acoustic Communication for Medical Nanorobots

Tad Hogg, Robert A. Freitas

Communication among microscopic robots (nanorobots) can coordinate their activities for biomedical tasks. The feasibility of in vivo ultrasonic communication is evaluated for micron-size robots broadcasting into various types of tissues. Frequencies between 10MHz and 300MHz give the best tradeoff between efficient acoustic generation and attenuation for communication over distances of about 100 microns. Based on these results, we find power available from ambient oxygen and glucose in the bloodstream can readily support communication rates of about 10,000 bits/second between micron-sized robots. We discuss techniques, such as directional acoustic beams, that can increase this rate. The acoustic pressure fields enabling this communication are unlikely to damage nearby tissue, and short bursts at considerably higher power could be of therapeutic use.