ROJun 1
Dexterity-BEV: Aligning 3D World and Actions for Generalizable Robot Policies LearningHuayi Zhou, Wei Gao, Dekun Lu et al.
End-to-end manipulation policies, combined with web-scale pretrained Vision-Language Models (VLMs), show the promise for generalizable and dexterous robotic manipulation. However, they inherit two key limitations from 2D foundation models: 1) the reliance on 2D RGB inputs that ignores the intrinsically 3D nature of manipulation; and 2) the lack of spatial 3D alignment between input-output spaces as well as across diverse robot embodiments, camera setups, and trajectory datasets. In this paper, we present a series of contributions to address these issues. First, we introduce aligned vertex map and vertex spectrum -- a pixel-wise 3D representation that elevates 2D visual inputs to 3D, using camera calibration and optional depth. This novel input representation marries 3D awareness with the generalization of 2D large VLMs. Then, we propose to align the inputs and outputs of manipulation policies by expressing per-pixel 3D information of each camera view and robot actions to a shared coordinate. Based on this, we designate a canonical Bird's-Eye-View (BEV) alignment frame and innovatively propose to construct BEV images, producing a view-invariant representation robust to camera pose variations. To enable training and evaluation at scale, we develop a comprehensive data processing pipeline to perform such alignments; we also introduce a novel temporal alignment scheme for trajectories across diverse robots, human operators, and datasets. These contributions collectively mitigate input and output spatial-temporal misalignments, improving the consistency and generalization for real-world manipulation. Pretrained checkpoint, source code and data processing pipeline are available in https://hnuzhy.github.io/projects/Dex-BEV.
LGOct 23, 2025
Assessing the Feasibility of Early Cancer Detection Using Routine Laboratory Data: An Evaluation of Machine Learning Approaches on an Imbalanced DatasetShumin Li
The development of accessible screening tools for early cancer detection in dogs represents a significant challenge in veterinary medicine. Routine laboratory data offer a promising, low-cost source for such tools, but their utility is hampered by the non-specificity of individual biomarkers and the severe class imbalance inherent in screening populations. This study assesses the feasibility of cancer risk classification using the Golden Retriever Lifetime Study (GRLS) cohort under real-world constraints, including the grouping of diverse cancer types and the inclusion of post-diagnosis samples. A comprehensive benchmark evaluation was conducted, systematically comparing 126 analytical pipelines that comprised various machine learning models, feature selection methods, and data balancing techniques. Data were partitioned at the patient level to prevent leakage. The optimal model, a Logistic Regression classifier with class weighting and recursive feature elimination, demonstrated moderate ranking ability (AUROC = 0.815; 95% CI: 0.793-0.836) but poor clinical classification performance (F1-score = 0.25, Positive Predictive Value = 0.15). While a high Negative Predictive Value (0.98) was achieved, insufficient recall (0.79) precludes its use as a reliable rule-out test. Interpretability analysis with SHapley Additive exPlanations (SHAP) revealed that predictions were driven by non-specific features like age and markers of inflammation and anemia. It is concluded that while a statistically detectable cancer signal exists in routine lab data, it is too weak and confounded for clinically reliable discrimination from normal aging or other inflammatory conditions. This work establishes a critical performance ceiling for this data modality in isolation and underscores that meaningful progress in computational veterinary oncology will require integration of multi-modal data sources.
CYOct 13, 2025
The Adoption Paradox: A Comparative Analysis of Veterinary AI Adoption in China and the North AmericaShumin Li, Xiaoyun Lai
This study compares the perception, adoption, and application of artificial intelligence (AI) among veterinary professionals in China and North America (NA), testing the hypothesis that adoption patterns are shaped by regional market and demographic factors. A descriptive, cross-sectional survey was conducted with 455 veterinary professionals in China between May and July 2025. The results were compared with published data from a 2024 survey of 3,968 veterinary professionals in the United States and Canada. The Chinese cohort, primarily composed of clinicians (81.5%), showed a high AI adoption rate (71.0%) despite low familiarity (55.4%). Their AI use was focused on clinical tasks, such as disease diagnosis (50.1%) and prescription calculation (44.8%). In contrast, the NA cohort reported high familiarity (83.8%) but a lower adoption rate (39.2%). Their priorities were administrative, including imaging analysis (39.0%) and record-keeping (39.0%). Concerns about AI reliability and accuracy were the top barrier in both groups. Our findings reveal an "adoption paradox" where the Chinese market demonstrates a practitioner-driven, bottom-up adoption model focused on augmenting clinical efficacy, while the NA market shows a more cautious, structured, top-down integration aimed at improving administrative efficiency. This suggests that a one-size-fits-all approach to AI development and integration is insufficient, and tailored, region-specific strategies are necessary to responsibly incorporate AI into global veterinary practice.