Anthony Wang

RO
h-index11
5papers
266citations
Novelty33%
AI Score32

5 Papers

LGSep 23, 2023
Modeling Student Performance in Game-Based Learning Environments

Hyunbae Jeon, Harry He, Anthony Wang et al.

This study investigates game-based learning in the context of the educational game "Jo Wilder and the Capitol Case," focusing on predicting student performance using various machine learning models, including K-Nearest Neighbors (KNN), Multi-Layer Perceptron (MLP), and Random Forest. The research aims to identify the features most predictive of student performance and correct question answering. By leveraging gameplay data, we establish complete benchmarks for these models and explore the importance of applying proper data aggregation methods. By compressing all numeric data to min/max/mean/sum and categorical data to first, last, count, and nunique, we reduced the size of the original training data from 4.6 GB to 48 MB of preprocessed training data, maintaining high F1 scores and accuracy. Our findings suggest that proper preprocessing techniques can be vital in enhancing the performance of non-deep-learning-based models. The MLP model outperformed the current state-of-the-art French Touch model, achieving an F-1 score of 0.83 and an accuracy of 0.74, suggesting its suitability for this dataset. Future research should explore using larger datasets, other preprocessing techniques, more advanced deep learning techniques, and real-world applications to provide personalized learning recommendations to students based on their predicted performance. This paper contributes to the understanding of game-based learning and provides insights into optimizing educational game experiences for improved student outcomes and skill development.

LGJul 23, 2025
Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains

Anisha Gunjal, Anthony Wang, Elaine Lau et al.

Reinforcement Learning with Verifiable Rewards (RLVR) has proven effective for complex reasoning tasks with clear correctness signals such as math and coding. However, extending it to real-world reasoning tasks is challenging, as evaluation depends on nuanced, multi-criteria judgments rather than binary correctness. Instance-specific rubrics have recently been used in evaluation benchmarks to capture such judgments, but their potential as reward signals for on-policy post-training remains underexplored. We introduce $\textbf{Rubrics as Rewards}$ (RaR), an on-policy reinforcement learning method that extends RLVR beyond verifiable domains by using rubric-based feedback. Across both medical and science domains, we evaluate multiple strategies for aggregating rubric feedback into rewards. The best RaR variant achieves relative improvements of up to $31\%$ on HealthBench and $7\%$ on GPQA-Diamond over popular LLM-as-judge baselines that rely on direct Likert-based rewards. These results demonstrate that RaR-trained policies adapt well to diverse evaluation formats, performing strongly on both rubric-based and multiple-choice tasks. Moreover, we find that using rubrics as structured reward signals yields better alignment for smaller judges and reduces performance variance across judge scales.

ROMay 6, 2024
Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning

Caleb Chuck, Carl Qi, Michael J. Munje et al.

Reinforcement Learning is a promising tool for learning complex policies even in fast-moving and object-interactive domains where human teleoperation or hard-coded policies might fail. To effectively reflect this challenging category of tasks, we introduce a dynamic, interactive RL testbed based on robot air hockey. By augmenting air hockey with a large family of tasks ranging from easy tasks like reaching, to challenging ones like pushing a block by hitting it with a puck, as well as goal-based and human-interactive tasks, our testbed allows a varied assessment of RL capabilities. The robot air hockey testbed also supports sim-to-real transfer with three domains: two simulators of increasing fidelity and a real robot system. Using a dataset of demonstration data gathered through two teleoperation systems: a virtualized control environment, and human shadowing, we assess the testbed with behavior cloning, offline RL, and RL from scratch.

ROJul 8, 2021
Design and Deployment of an Autonomous Unmanned Ground Vehicle for Urban Firefighting Scenarios

Kshitij Jindal, Anthony Wang, Dinesh Thakur et al.

Autonomous mobile robots have the potential to solve missions that are either too complex or dangerous to be accomplished by humans. In this paper, we address the design and autonomous deployment of a ground vehicle equipped with a robotic arm for urban firefighting scenarios. We describe the hardware design and algorithm approaches for autonomous navigation, planning, fire source identification and abatement in unstructured urban scenarios. The approach employs on-board sensors for autonomous navigation and thermal camera information for source identification. A custom electro{mechanical pump is responsible to eject water for fire abatement. The proposed approach is validated through several experiments, where we show the ability to identify and abate a sample heat source in a building. The whole system was developed and deployed during the Mohamed Bin Zayed International Robotics Challenge (MBZIRC) 2020, for Challenge No. 3 Fire Fighting Inside a High-Rise Building and during the Grand Challenge where our approach scored the highest number of points among all UGV solutions and was instrumental to win the first place.

RONov 16, 2020
Mobile Manipulator for Autonomous Localization, Grasping and Precise Placement of Construction Material in a Semi-structured Environment

Petr Stibinger, George Broughton, Filip Majer et al.

Mobile manipulators have the potential to revolutionize modern agriculture, logistics and manufacturing. In this work, we present the design of a ground-based mobile manipulator for automated structure assembly. The proposed system is capable of autonomous localization, grasping, transportation and deployment of construction material in a semi-structured environment. Special effort was put into making the system invariant to lighting changes, and not reliant on external positioning systems. Therefore, the presented system is self-contained and capable of operating in outdoor and indoor conditions alike. Finally, we present means to extend the perceptive radius of the vehicle by using it in cooperation with an autonomous drone, which provides aerial reconnaissance. Performance of the proposed system has been evaluated in a series of experiments conducted in real-world conditions.