CVOct 17, 2022
Understanding CNN Fragility When Learning With Imbalanced DataDamien Dablain, Kristen N. Jacobson, Colin Bellinger et al.
Convolutional neural networks (CNNs) have achieved impressive results on imbalanced image data, but they still have difficulty generalizing to minority classes and their decisions are difficult to interpret. These problems are related because the method by which CNNs generalize to minority classes, which requires improvement, is wrapped in a blackbox. To demystify CNN decisions on imbalanced data, we focus on their latent features. Although CNNs embed the pattern knowledge learned from a training set in model parameters, the effect of this knowledge is contained in feature and classification embeddings (FE and CE). These embeddings can be extracted from a trained model and their global, class properties (e.g., frequency, magnitude and identity) can be analyzed. We find that important information regarding the ability of a neural network to generalize to minority classes resides in the class top-K CE and FE. We show that a CNN learns a limited number of class top-K CE per category, and that their number and magnitudes vary based on whether the same class is balanced or imbalanced. This calls into question whether a CNN has learned intrinsic class features, or merely frequently occurring ones that happen to exist in the sampled class distribution. We also hypothesize that latent class diversity is as important as the number of class examples, which has important implications for re-sampling and cost-sensitive methods. These methods generally focus on rebalancing model weights, class numbers and margins; instead of diversifying class latent features through augmentation. We also demonstrate that a CNN has difficulty generalizing to test data if the magnitude of its top-K latent features do not match the training set. We use three popular image datasets and two cost-sensitive algorithms commonly employed in imbalanced learning for our experiments.
ROMar 11
POrTAL: Plan-Orchestrated Tree Assembly for LookaheadEvan Conway, David Porfirio, David Chan et al.
When tasking robots in partially observable environments, these robots must efficiently and robustly plan to achieve task goals under uncertainty. Although many probabilistic planning algorithms exist for this purpose, these algorithms can be inefficient if executed with the robot's limited computational resources, or may produce policies that take more steps than expected to achieve the goal. We therefore created a new, lightweight, probabilistic planning algorithm, Plan-Orchestrated Tree Assembly for Lookahead (POrTAL), that combines the strengths of two baseline planning algorithms, FF-Replan and POMCP. We demonstrate that POrTAL is an anytime algorithm that generally outperforms these baselines in terms of the final executed plan length given bounded computation time, especially for problems with only moderate levels of uncertainty.
AINov 10, 2025
Procedural Knowledge Improves Agentic LLM WorkflowsVincent Hsiao, Mark Roberts, Leslie Smith
Large language models (LLMs) often struggle when performing agentic tasks without substantial tool support, prom-pt engineering, or fine tuning. Despite research showing that domain-dependent, procedural knowledge can dramatically increase planning efficiency, little work evaluates its potential for improving LLM performance on agentic tasks that may require implicit planning. We formalize, implement, and evaluate an agentic LLM workflow that leverages procedural knowledge in the form of a hierarchical task network (HTN). Empirical results of our implementation show that hand-coded HTNs can dramatically improve LLM performance on agentic tasks, and using HTNs can boost a 20b or 70b parameter LLM to outperform a much larger 120b parameter LLM baseline. Furthermore, LLM-created HTNs improve overall performance, though less so. The results suggest that leveraging expertise--from humans, documents, or LLMs--to curate procedural knowledge will become another important tool for improving LLM workflows.
QMDec 19, 2023
New Horizons: Pioneering Pharmaceutical R&D with Generative AI from lab to the clinic -- an industry perspectiveGuy Doron, Sam Genway, Mark Roberts et al.
The rapid advance of generative AI is reshaping the strategic vision for R&D across industries. The unique challenges of pharmaceutical R&D will see applications of generative AI deliver value along the entire value chain from early discovery to regulatory approval. This perspective reviews these challenges and takes a three-horizon approach to explore the generative AI applications already delivering impact, the disruptive opportunities which are just around the corner, and the longer-term transformation which will shape the future of the industry. Selected applications are reviewed for their potential to drive increase productivity, accelerate timelines, improve the quality of research, data and decision making, and support a sustainable future for the industry. Recommendations are given for Pharma R&D leaders developing a generative AI strategy today which will lay the groundwork for getting real value from the technology and safeguarding future growth. Generative AI is today providing new, efficient routes to accessing and combining organisational data to drive productivity. Next, this impact will reach clinical development, enhancing the patient experience, driving operational efficiency, and unlocking digital innovation to better tackle the future burden of disease. Looking to the furthest horizon, rapid acquisition of rich multi-omics data, which capture the 'language of life', in combination with next generation AI technologies will allow organisations to close the loop around phases of the pipeline through rapid, automated generation and testing of hypotheses from bench to bedside. This provides a vision for the future of R&D with sustainability at the core, with reduced timescales and reduced dependency on resources, while offering new hope to patients to treat the untreatable and ultimately cure diseases.
AIAug 15, 2025
Landmark-Assisted Monte Carlo PlanningDavid H. Chan, Mark Roberts, Dana S. Nau
Landmarks$\unicode{x2013}$conditions that must be satisfied at some point in every solution plan$\unicode{x2013}$have contributed to major advancements in classical planning, but they have seldom been used in stochastic domains. We formalize probabilistic landmarks and adapt the UCT algorithm to leverage them as subgoals to decompose MDPs; core to the adaptation is balancing between greedy landmark achievement and final goal achievement. Our results in benchmark domains show that well-chosen landmarks can significantly improve the performance of UCT in online probabilistic planning, while the best balance of greedy versus long-term goal achievement is problem-dependent. The results suggest that landmarks can provide helpful guidance for anytime algorithms solving MDPs.
AIApr 22, 2025
HTN Plan Repair Algorithms Compared: Strengths and Weaknesses of Different MethodsPaul Zaidins, Robert P. Goldman, Ugur Kuter et al.
This paper provides theoretical and empirical comparisons of three recent hierarchical plan repair algorithms: SHOPFixer, IPyHOPPER, and Rewrite. Our theoretical results show that the three algorithms correspond to three different definitions of the plan repair problem, leading to differences in the algorithms' search spaces, the repair problems they can solve, and the kinds of repairs they can make. Understanding these distinctions is important when choosing a repair method for any given application. Building on the theoretical results, we evaluate the algorithms empirically in a series of benchmark planning problems. Our empirical results provide more detailed insight into the runtime repair performance of these systems and the coverage of the repair problems solved, based on algorithmic properties such as replanning, chronological backtracking, and backjumping over plan trees.
AIFeb 21, 2025
Automating Curriculum Learning for Reinforcement Learning using a Skill-Based Bayesian NetworkVincent Hsiao, Mark Roberts, Laura M. Hiatt et al.
A major challenge for reinforcement learning is automatically generating curricula to reduce training time or improve performance in some target task. We introduce SEBNs (Skill-Environment Bayesian Networks) which model a probabilistic relationship between a set of skills, a set of goals that relate to the reward structure, and a set of environment features to predict policy performance on (possibly unseen) tasks. We develop an algorithm that uses the inferred estimates of agent success from SEBN to weigh the possible next tasks by expected improvement. We evaluate the benefit of the resulting curriculum on three environments: a discrete gridworld, continuous control, and simulated robotics. The results show that curricula constructed using SEBN frequently outperform other baselines.
AIApr 9, 2024
Automatically Learning HTN Methods from LandmarksRuoxi Li, Dana Nau, Mark Roberts et al.
Hierarchical Task Network (HTN) planning usually requires a domain engineer to provide manual input about how to decompose a planning problem. Even HTN-MAKER, a well-known method-learning algorithm, requires a domain engineer to annotate the tasks with information about what to learn. We introduce CURRICULAMA, an HTN method learning algorithm that completely automates the learning process. It uses landmark analysis to compose annotated tasks and leverages curriculum learning to order the learning of methods from simpler to more complex. This eliminates the need for manual input, resolving a core issue with HTN-MAKER. We prove CURRICULAMA's soundness, and show experimentally that it has a substantially similar convergence rate in learning a complete set of methods to HTN-MAKER.
ROJan 30, 2024
Human-Centric Goal Reasoning with Ripple-Down RulesKenji Brameld, Germán Castro, Claude Sammut et al.
ActorSim is a goal reasoning framework developed at the Naval Research Laboratory. Originally, all goal reasoning rules were hand-crafted. This work extends ActorSim with the capability of learning by demonstration, that is, when a human trainer disagrees with a decision made by the system, the trainer can take over and show the system the correct decision. The learning component uses Ripple-Down Rules (RDR) to build new decision rules to correctly handle similar cases in the future. The system is demonstrated using the RoboCup Rescue Agent Simulation, which simulates a city-wide disaster, requiring emergency services, including fire, ambulance and police, to be dispatched to different sites to evacuate civilians from dangerous situations. The RDRs are implemented in a scripting language, FrameScript, which is used to mediate between ActorSim and the agent simulator. Using Ripple-Down Rules, ActorSim can scale to an order of magnitude more goals than the previous version.
AIJul 6, 2016
Cost-Optimal Algorithms for Planning with Procedural Control KnowledgeVikas Shivashankar, Ron Alford, Mark Roberts et al.
There is an impressive body of work on developing heuristics and other reasoning algorithms to guide search in optimal and anytime planning algorithms for classical planning. However, very little effort has been directed towards developing analogous techniques to guide search towards high-quality solutions in hierarchical planning formalisms like HTN planning, which allows using additional domain-specific procedural control knowledge. In lieu of such techniques, this control knowledge often needs to provide the necessary search guidance to the planning algorithm, which imposes a substantial burden on the domain author and can yield brittle or error-prone domain models. We address this gap by extending recent work on a new hierarchical goal-based planning formalism called Hierarchical Goal Network (HGN) Planning to develop the Hierarchically-Optimal Goal Decomposition Planner (HOpGDP), an HGN planning algorithm that computes hierarchically-optimal plans. HOpGDP is guided by $h_{HL}$, a new HGN planning heuristic that extends existing admissible landmark-based heuristics from classical planning to compute admissible cost estimates for HGN planning problems. Our experimental evaluation across three benchmark planning domains shows that HOpGDP compares favorably to both optimal classical planners due to its ability to use domain-specific procedural knowledge, and a blind-search version of HOpGDP due to the search guidance provided by $h_{HL}$.