CV Mar 6
SurgFormer: Scalable Learning of Organ Deformation with Resection Support and Real-Time Inference
Ashkan Shahbazi, Elaheh Akbari, Kyvia Pereira et al.
We introduce SurgFormer, a multiresolution gated transformer for data-driven soft-tissue simulation on volumetric meshes. High-fidelity biomechanical solvers are often too costly for interactive use, so we train SurgFormer on solver-generated data to predict nodewise displacement fields at near-real-time rates. SurgFormer builds a fixed mesh hierarchy and applies repeated multibranch blocks that combine local message passing, coarse global self-attention, and pointwise feedforward updates. The branch outputs are fused by learned per-node, per-channel gates, which adaptively integrate local and long-range information while remaining scalable on large meshes. For cut-conditioned simulation, resection information is encoded as a learned cut embedding and provided as an additional input, enabling a unified model for both standard deformation prediction and topology-altering cases. We also introduce two surgical simulation datasets generated under a unified protocol with XFEM-based supervision: a cholecystectomy resection dataset and an appendectomy manipulation and resection dataset with cut and uncut cases. To our knowledge, this is the first learned volumetric surrogate setting to study XFEM-supervised, cut-conditioned deformation within the same volumetric pipeline as standard deformation prediction. Across diverse baselines, SurgFormer achieves strong accuracy with favorable efficiency, making it a practical backbone for both tasks. Code, data, and project page: https://mint-vu.github.io/SurgFormer/
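The gated fusion of the three branches can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the gates are normalized, so a softmax over the three branches is assumed here, and the names `gated_fuse` and `gate_logits` are hypothetical. In a trained model the gate logits would come from a learned layer; here they are simply passed in.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_fuse(local, global_, pointwise, gate_logits):
    """Fuse three branch outputs with per-node, per-channel gates.

    local, global_, pointwise: arrays of shape (N, C), one per branch.
    gate_logits: array of shape (N, C, 3), one logit per node, channel, branch.
    ASSUMPTION: gates are a softmax over the branch axis.
    """
    branches = np.stack([local, global_, pointwise], axis=-1)  # (N, C, 3)
    gates = softmax(gate_logits, axis=-1)                      # sums to 1 per (node, channel)
    return (gates * branches).sum(axis=-1)                     # (N, C)
```

With all-zero logits every gate equals 1/3 and the output is the plain average of the branches; a large logit for one branch makes the output follow that branch for the given node and channel.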
CV Mar 19, 2025
Multi-Modal Gesture Recognition from Video and Surgical Tool Pose Information via Motion Invariants
Jumanh Atoum, Garrison L. H. Johnston, Nabil Simaan et al.
Recognizing surgical gestures in real time is a stepping stone towards automated activity recognition, skill assessment, intra-operative assistance, and eventually surgical automation. Current robotic surgical systems provide rich multi-modal data such as video and kinematics. While some recent multi-modal neural networks learn the relationships between vision and kinematics data, current approaches treat kinematics information as independent signals, with no underlying relation between tool-tip poses. However, instrument poses are geometrically related, and the underlying geometry can aid neural networks in learning gesture representations. Therefore, we propose combining motion invariant measures (curvature and torsion) with vision and kinematics data using a relational graph network to capture the underlying relations between different data streams. We show that gesture recognition improves when combining invariant signals with tool position, achieving 90.3% frame-wise accuracy on the JIGSAWS suturing dataset. Our results show that motion invariant signals coupled with position represent gesture motion better than traditional position and quaternion representations, and they highlight the need for geometry-aware modeling of kinematics in gesture recognition.
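As a rough illustration of the two motion invariants named above, the sketch below estimates curvature and torsion of a sampled tool-tip path using the textbook Frenet-Serret formulas and finite differences. The abstract does not describe how the paper computes or filters these signals, so the function name `curvature_torsion`, the uniform sampling interval `dt`, and the use of `np.gradient` are all assumptions made for illustration.

```python
import numpy as np

def curvature_torsion(r, dt, eps=1e-12):
    """Estimate curvature and torsion along a sampled 3D path.

    r: array of shape (T, 3), positions sampled every dt seconds (assumed uniform).
    Returns (kappa, tau), each of shape (T,).
    """
    d1 = np.gradient(r, dt, axis=0)    # first derivative (velocity)
    d2 = np.gradient(d1, dt, axis=0)   # second derivative
    d3 = np.gradient(d2, dt, axis=0)   # third derivative
    cross = np.cross(d1, d2)
    cross_norm = np.linalg.norm(cross, axis=1)
    speed = np.linalg.norm(d1, axis=1)
    # kappa = |r' x r''| / |r'|^3
    kappa = cross_norm / np.maximum(speed**3, eps)
    # tau = (r' x r'') . r''' / |r' x r''|^2
    tau = np.einsum("ij,ij->i", cross, d3) / np.maximum(cross_norm**2, eps)
    return kappa, tau
```

Both quantities are unchanged by rigid rotations and translations of the path, which is what makes them invariant descriptors. Repeated finite differencing amplifies noise, so real kinematic recordings would likely need smoothing before this step.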
RO Aug 11, 2020
Kinematic Modeling and Compliance Modulation of Redundant Manipulators Under Bracing Constraints
Garrison L. H. Johnston, Andrew L. Orekhov, Nabil Simaan
Collaborative robots should ideally use low-torque actuators for passive safety. However, some applications require these robots to reach deep into confined spaces while assisting a human operator in physically demanding tasks. In this paper, we consider in-situ collaborative robots (ISCRs), which must balance two conflicting demands: passive safety, which dictates low-torque actuation, and the need to reach deep into confined spaces. We consider the judicious use of bracing as a possible solution and present a modeling framework that takes into account the constrained kinematics and the effect of bracing on end-effector compliance. We then define a redundancy resolution framework that minimizes the directional compliance of the end-effector while maximizing end-effector dexterity. Kinematic simulation results show that the redundancy resolution strategy successfully decreases compliance and improves kinematic conditioning while satisfying the constraints imposed by the bracing task. This modeling framework can support future research on the choice of bracing locations and the formulation of an admittance control framework for collaborative control of ISCRs under bracing constraints. Such robots can benefit workers by reducing the physiological burdens that contribute to musculoskeletal injury.
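To make the two quantities in this abstract concrete, here is a minimal sketch of directional end-effector compliance and a generic null-space redundancy resolution step. It is not the paper's formulation: bracing constraints are not modeled, joint compliance is assumed to map to the end-effector as J C_q J^T under small deflections, and the function names and the gradient-projection scheme are illustrative assumptions.

```python
import numpy as np

def directional_compliance(J, Cq, u):
    """Compliance of the end-effector along direction u.

    J:  (m, n) task Jacobian.
    Cq: (n, n) joint-space compliance matrix (assumed known).
    u:  (m,) direction; normalized internally.
    ASSUMPTION: small deflections, so Cx = J Cq J^T.
    """
    u = u / np.linalg.norm(u)
    Cx = J @ Cq @ J.T
    return float(u @ Cx @ u)

def redundancy_step(J, xdot, grad_h, alpha=1.0):
    """Joint velocity that tracks xdot and descends a cost h in the null space.

    grad_h: (n,) gradient of a secondary cost (for example, directional
    compliance) with respect to the joint angles, supplied by the caller.
    """
    J_pinv = np.linalg.pinv(J)
    N = np.eye(J.shape[1]) - J_pinv @ J   # null-space projector of J
    return J_pinv @ xdot - alpha * N @ grad_h
```

Because the secondary term is projected into the null space of `J`, it changes the joint motion without disturbing the commanded end-effector velocity, provided `J` has full row rank.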