CV · Aug 7, 2023
Nerve Block Target Localization and Needle Guidance for Autonomous Robotic Ultrasound Guided Regional Anesthesia
Abhishek Tyagi, Abhay Tyagi, Manpreet Kaur et al.
Visual servoing for autonomous robotic systems capable of administering ultrasound (US) guided regional anesthesia requires real-time segmentation of nerves, needle tip localization, and needle trajectory extrapolation. First, we recruited 227 patients to build a large dataset of 41,000 anesthesiologist-annotated images from US videos of brachial plexus nerves and developed models to localize nerves in the US images. Generalizability of the best-suited model was tested on datasets constructed from separate US scanners. Using these nerve segmentation predictions, we define automated anesthesia needle targets by fitting an ellipse to the nerve contours. Next, we developed an image analysis tool to guide the needle toward its target. For needle segmentation, a neural network pre-trained on natural RGB images was first fine-tuned on a large US dataset for domain transfer and then adapted for the needle using a small dataset. The angle of the segmented needle trajectory is calculated using the Radon transform, and the trajectory is extrapolated from the needle tip. The intersection of the extrapolated trajectory with the needle target guides needle navigation for drug delivery. The average needle trajectory error was within the 5 mm range that experienced anesthesiologists consider acceptable. The entire dataset has been released publicly for further study by the research community at https://github.com/Regional-US/
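The last guidance step, intersecting the extrapolated needle trajectory with the elliptical target, can be sketched as a ray-ellipse intersection. This is only an illustrative sketch, not the authors' implementation: the function name, the parameterization of the ellipse (center, semi-axes, rotation), and the use of image-plane units are all assumptions, and the needle angle is taken as given rather than computed with a Radon transform.

```python
import math

def ray_ellipse_intersection(tip, angle_deg, center, semi_axes, rotation_deg=0.0):
    """Return the first point where a ray from `tip` hits the ellipse, or None.

    Hypothetical helper: `tip` is the needle tip (x, y), `angle_deg` the
    trajectory angle, and the ellipse stands in for the fitted nerve target.
    """
    phi = math.radians(angle_deg)
    rot = math.radians(rotation_deg)
    dx, dy = math.cos(phi), math.sin(phi)
    # Express the tip and direction in the ellipse's own frame (centered, unrotated).
    px, py = tip[0] - center[0], tip[1] - center[1]
    c, s = math.cos(rot), math.sin(rot)
    px, py = c * px + s * py, -s * px + c * py
    dx, dy = c * dx + s * dy, -s * dx + c * dy
    a, b = semi_axes
    # Substitute p + t*d into x^2/a^2 + y^2/b^2 = 1 and solve the quadratic in t.
    qa = (dx / a) ** 2 + (dy / b) ** 2
    qb = 2 * (px * dx / a ** 2 + py * dy / b ** 2)
    qc = (px / a) ** 2 + (py / b) ** 2 - 1
    disc = qb * qb - 4 * qa * qc
    if disc < 0:
        return None  # the extrapolated trajectory misses the target
    root = math.sqrt(disc)
    # Keep only intersections ahead of the needle tip (t >= 0).
    ts = [t for t in ((-qb - root) / (2 * qa), (-qb + root) / (2 * qa)) if t >= 0]
    if not ts:
        return None
    t = min(ts)
    return (tip[0] + t * math.cos(phi), tip[1] + t * math.sin(phi))
```

A `None` result would indicate that the current trajectory does not reach the target and the needle angle needs adjusting.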
AI · May 26
PolyFusionAgent: A Multimodal Foundation Model and Autonomous AI Assistant for Polymer Property Prediction and Inverse Design
Manpreet Kaur, Xingying Zhang, Qian Liu
Polymer discovery is central to fields ranging from energy storage to biomedicine, but it is hindered by an astronomically large chemical design space and fragmented representations of structure, properties, and prior knowledge. This fragmentation leaves many AI models disconnected from physical and experimental reality, restricting their ability to support directly actionable design decisions. Here we introduce PolyFusionAgent, an interactive framework coupling a multimodal polymer foundation model (PolyFusion) with a tool-augmented, literature-grounded design agent (PolyAgent). PolyFusion aligns complementary polymer views including sequence, topology, 3D geometry, and fingerprints across millions of polymers to learn a shared latent space transferable across chemistries and data regimes, improving thermophysical property prediction and enabling property-conditioned generation of chemically valid, structurally novel polymers beyond the reference design space. PolyAgent closes the design loop by linking prediction and inverse design with evidence retrieval from the polymer literature, proposing, evaluating, and contextualizing hypotheses with explicit precedent in one workflow. Together, PolyFusionAgent enables interactive, evidence-linked polymer discovery combining large-scale representation learning, multimodal chemical knowledge, and verifiable scientific reasoning.
LG · Oct 10, 2025 · Code
WARC-Bench: Web Archive Based Benchmark for GUI Subtask Executions
Sanjari Srivastava, Gang Li, Cheng Chang et al.
Training web agents to navigate complex, real-world websites requires them to master $\textit{subtasks}$: short-horizon interactions on multiple UI components (e.g., choosing the correct date in a date picker, or scrolling in a container to extract information). We introduce WARC-Bench (Web Archive Benchmark), a novel web navigation benchmark featuring 438 tasks designed to evaluate multimodal AI agents on subtasks. WARC-Bench enables sandboxed interactions with dynamic and realistic webpages using Web ARChive files. We show that WARC-Bench is challenging for leading computer-use models, with the highest observed success rate being 64.8%. To improve open-source models on subtasks, we explore two common training techniques: supervised fine-tuning (SFT) and reinforcement learning with verifiable rewards (RLVR). Experiments show that SFT models obtain a 48.8% success rate on the benchmark. Training with RLVR over SFT checkpoints, even in data-scarce settings, improves the score to 52.8% on WARC-Bench, outperforming many frontier models. Our analysis concludes that mastering these subtasks is essential for robust web planning and navigation, and is a capability not extensively evaluated by existing benchmarks.
SE · Dec 23, 2014 · Code
Toward Refactoring of DMARF and GIPSY Case Studies -- a Team 9 SOEN6471-S14 Project Report
Manpreet Kaur, Ravjeet Singh, Sukhveer Kaur et al.
Software architecture consists of a series of decisions taken to give a structural solution that meets all the technical and operational requirements. This paper is concerned with code refactoring, the process of changing the internal structure of code without altering its external behavior. It focuses on experimental studies of two open source systems, DMARF and GIPSY. We went through various research papers and analyzed the architectures of both systems. Refactoring improves the understandability, maintainability, and extensibility of the code. Code smells were identified with tools such as JDeodorant, Logiscope, and CodePro. DMARF and GIPSY were reverse engineered with Object Aid UML to understand the systems. For better understanding, use cases, a domain model, and a design class diagram were built.
CV · Dec 2, 2024
Mutli-View 3D Reconstruction using Knowledge Distillation
Aditya Dutt, Ishikaa Lunawat, Manpreet Kaur
Large foundation models like Dust3r can produce high-quality outputs such as pointmaps, camera intrinsics, and depth estimates, given stereo-image pairs as input. However, applying these outputs to tasks like visual localization requires a large amount of inference time and compute resources. To address these limitations, we propose a knowledge distillation pipeline: we build a student-teacher model with Dust3r as the teacher and explore multiple student architectures that are trained using the 3D reconstructed points output by Dust3r. Our goal is to build student models that can learn scene-specific representations and output 3D points with performance comparable to Dust3r's. The dataset we used to train our models is 12Scenes. We test two main architectures: a CNN-based architecture and a Vision Transformer based architecture. For each architecture, we also compare pre-trained models against models built from scratch. We qualitatively compare the reconstructed 3D points output by the student model against Dust3r's and discuss the various features learned by the student model. We also perform ablation studies on the models through hyperparameter tuning. Overall, we observe that the Vision Transformer presents the best performance visually and quantitatively.
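Training a student on the teacher's 3D points amounts to regressing the student's predicted points onto the teacher's. The abstract does not state which loss was used, so the sketch below assumes a simple mean Euclidean distance between corresponding points; the function name and the list-of-tuples point format are hypothetical.

```python
import math

def point_distillation_loss(student_points, teacher_points):
    """Mean Euclidean distance between corresponding student and teacher 3D points.

    Assumed loss for illustration only; both inputs are equal-length
    sequences of (x, y, z) tuples, one per pixel or sample.
    """
    if len(student_points) != len(teacher_points):
        raise ValueError("student and teacher must predict the same number of points")
    if not student_points:
        raise ValueError("at least one point is required")
    total = 0.0
    for s, t in zip(student_points, teacher_points):
        # math.dist computes the Euclidean distance between two points.
        total += math.dist(s, t)
    return total / len(student_points)
```

The loss is zero when the student reproduces the teacher's points exactly and grows with the average per-point displacement.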
LG · Dec 2, 2024
Cross Domain Adaptation using Adversarial networks with Cyclic loss
Manpreet Kaur, Ankur Tomar, Srijan Mishra et al.
Deep learning methods are highly local and sensitive to the domain of the data they are trained on. Even a slight deviation from the domain distribution significantly affects the prediction accuracy of deep networks. In this work, we investigate a set of techniques aimed at increasing the accuracy of generator networks that translate from one domain to another in an adversarial setting. In particular, we experimented with activations and encoder-decoder network architectures, and introduced a loss, called cyclic loss, that constrains the generator network so that it learns effective source-target translation. This machine learning problem is motivated by the many applications of domain adaptation networks, such as generating labeled data from synthetic inputs in an unsupervised fashion, and using these translation networks in conjunction with the original domain network to generalize deep learning networks across domains.
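The abstract does not define the cyclic loss. One common form of such a constraint, assumed here purely for illustration, is a cycle-consistency penalty: translate a source sample to the target domain, translate it back, and penalize the difference from the original. The function names and the toy translators below are hypothetical.

```python
def cycle_loss(batch, forward, backward):
    """Mean absolute error between each sample x and backward(forward(x)).

    `forward` maps source -> target and `backward` maps target -> source;
    each sample is a flat list of floats. Assumed form of the loss.
    """
    total, count = 0.0, 0
    for x in batch:
        x_rec = backward(forward(x))
        total += sum(abs(a - b) for a, b in zip(x, x_rec))
        count += len(x)
    return total / count

# Toy translators standing in for generator networks.
def to_target(x):
    return [2.0 * v for v in x]

def to_source_exact(y):
    return [v / 2.0 for v in y]

def to_source_biased(y):
    return [v / 2.0 + 1.0 for v in y]
```

With the exact inverse the loss is zero, while the biased inverse is penalized in proportion to its reconstruction error, which is the pressure that keeps a generator's translation recoverable.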
IR · Aug 26, 2020
Joint Modelling of Cyber Activities and Physical Context to Improve Prediction of Visitor Behaviors
Manpreet Kaur, Flora D. Salim, Yongli Ren et al.
This paper investigates the cyber-physical behavior of users in a large indoor shopping mall by leveraging anonymized (opt-in) Wi-Fi association and browsing logs recorded by the mall operators. Our analysis shows that many users exhibit a high correlation between their cyber activities and their physical context. To find this correlation, we propose a mechanism to semantically label a physical space with rich categorical information from DBPedia concepts and compute a contextual similarity that relates a user's activities to the mall context. We demonstrate the application of cyber-physical contextual similarity in two situations: user visit intent classification and future location prediction. The experimental results demonstrate that exploiting contextual similarity significantly improves the accuracy of such applications.
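A contextual similarity of this kind can be illustrated by comparing the categories of what a user browses with the categories attached to the place where they are. The sketch below assumes both sides are represented as bags of category labels and compared with cosine similarity; the paper's actual measure is not given in the abstract, and the function name and example labels are hypothetical.

```python
import math
from collections import Counter

def contextual_similarity(browsing_categories, location_categories):
    """Cosine similarity between two bags of category labels (assumed measure)."""
    a, b = Counter(browsing_categories), Counter(location_categories)
    # Only categories present on both sides contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no categories observed on one side
    return dot / (norm_a * norm_b)
```

For example, `contextual_similarity(["Fashion", "Shoes"], ["Fashion", "Shoes"])` is close to 1, while browsing categories that share nothing with the location's labels score 0.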