CV, Oct 30, 2023
RayDF: Neural Ray-surface Distance Fields with Multi-view Consistency
Zhuoman Liu, Bo Yang, Yan Luximon et al.
In this paper, we study the problem of continuous 3D shape representations. The majority of existing successful methods are coordinate-based implicit neural representations. However, they are inefficient to render novel views or recover explicit surface points. A few works start to formulate 3D shapes as ray-based neural functions, but the learned structures are inferior due to the lack of multi-view geometry consistency. To tackle these challenges, we propose a new framework called RayDF. It consists of three major components: 1) the simple ray-surface distance field, 2) the novel dual-ray visibility classifier, and 3) a multi-view consistency optimization module to drive the learned ray-surface distances to be multi-view geometry consistent. We extensively evaluate our method on three public datasets, demonstrating remarkable performance in 3D surface point reconstruction on both synthetic and challenging real-world 3D scenes, clearly surpassing existing coordinate-based and ray-based baselines. Most notably, our method achieves a 1000x faster speed than coordinate-based methods to render an 800x800 depth image, showing the superiority of our method for 3D shape representation. Our code and data are available at https://github.com/vLAR-group/RayDF
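To illustrate why a ray-based representation makes explicit surface points cheap to recover, here is a minimal sketch. It assumes a unit sphere and uses an analytic ray-sphere intersection as a stand-in for the learned ray-surface distance network; the function names `ray_surface_distance` and `surface_point` are hypothetical and are not taken from the RayDF code base. Once a distance is available for a ray, the surface point follows from a single evaluation, with no iterative marching along the ray.

```python
import math

def ray_surface_distance(origin, direction, radius=1.0):
    """Analytic stand-in for a learned ray-surface distance field.

    Assumes a sphere of the given radius centred at (0, 0, 0) and a
    unit-length direction. Returns the distance to the first hit,
    or None if the ray misses the sphere.
    """
    b = sum(o * d for o, d in zip(origin, direction))
    c = sum(o * o for o in origin) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None  # ray misses the sphere
    t = -b - math.sqrt(disc)
    return t if t >= 0 else None

def surface_point(origin, direction, distance):
    """Recover the explicit surface point: p = origin + distance * direction."""
    return tuple(o + distance * d for o, d in zip(origin, direction))

origin = (0.0, 0.0, -3.0)
direction = (0.0, 0.0, 1.0)  # unit length
dist = ray_surface_distance(origin, direction)
if dist is not None:
    print(dist, surface_point(origin, direction, dist))
```

Rendering a depth image under this formulation amounts to one distance query per pixel ray, which is the property the abstract contrasts with coordinate-based methods.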
DC, May 1
SURGE: SuperBatch Unified Resource-efficient GPU Encoding for Heterogeneous Partitioned Data
Shashank Kapadia, Deep Narayan Mishra, Sujal Reddy Alugubelli et al.
We present SURGE, a streaming GPU encoding system deployed in production to generate embeddings for over 800 million texts across 40,000 logical partitions. Production embedding pipelines face a tension between logical data partitioning and efficient GPU utilization: processing each partition independently incurs $P$ inter-process communication (IPC) calls whose overhead limits throughput for compute-light models. Our contributions are analytical: (i) a cost model (Theorem 1) predicting throughput within 2% across three encoders spanning a 15$\times$ parameter range; (ii) a memory-safety bound (Lemma 3) enabling a streaming two-threshold policy with peak memory $O(B_{\min} + n_{\max})$ rather than $O(N)$; and (iii) a $ϕ$/CV decision framework characterizing when the pattern applies beyond our workload. The naive fix of batching at fixed size requires $O(N)$ peak memory (32.7 GB at 10M texts; infeasible beyond ~60M on 192 GB nodes), produces no output until all encoding completes, and offers no fault tolerance. SURGE achieves the same throughput with $O(B_{\min} + n_{\max})$ bounded memory (2.6 GB), 68$\times$ faster time-to-first-output, and crash recovery at SuperBatch granularity. On 10M texts with 4 NVIDIA L4 GPUs, SURGE delivers 26,413 texts/s -- matching fixed-batch throughput while using 12.6$\times$ less memory. We validate on bge-base (109M, $d$=768, error 1.3%) and across log-normal $σ$ in {1.0, 1.72, 2.5} (speedup invariant within $\pm$3%), and compare against a partition-batched baseline (PB-PBP-LB), against which SURGE retains a 7% throughput edge and 2.5$\times$ faster TTFO. Complementary engineering -- zero-copy Arrow serialization (22-25$\times$ speedup) and async I/O pipelining (up to 93% benefit) -- realizes the design but is not the contribution.
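The abstract does not spell out the two-threshold policy, so the following is only a simplified sketch of the general idea of streaming partitions into bounded-size groups. It uses a single flush threshold `b_min` (an assumption, not the paper's policy) and a hypothetical function name `superbatches`. It shows why buffered memory stays below $B_{\min} + n_{\max}$: the buffer holds fewer than `b_min` texts before each append, and a single partition adds at most `n_max` texts.

```python
def superbatches(partitions, b_min):
    """Group (partition_id, texts) pairs into batches of at least b_min texts.

    Simplified, assumed policy: append whole partitions to a buffer and
    flush as soon as the buffered text count reaches b_min. Partitions are
    never split, so partition boundaries survive inside each batch.
    """
    buffer, buffered = [], 0
    for pid, texts in partitions:
        buffer.append((pid, texts))
        buffered += len(texts)
        if buffered >= b_min:
            yield buffer
            buffer, buffered = [], 0
    if buffer:  # flush the final, possibly undersized, batch
        yield buffer

# Toy stream of five partitions with sizes 3, 2, 4, 1, 1.
sizes = [3, 2, 4, 1, 1]
stream = [(i, ["text"] * n) for i, n in enumerate(sizes)]
for batch in superbatches(stream, b_min=5):
    print([pid for pid, _ in batch], sum(len(t) for _, t in batch))
```

Because each batch is yielded as soon as it is full, downstream encoding can start before the whole input has been read, which is the streaming behaviour the abstract contrasts with fixed-size batching over all $N$ texts.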
AI, Jul 30, 2024
A Scalable Tool For Analyzing Genomic Variants Of Humans Using Knowledge Graphs and Machine Learning
Shivika Prasanna, Ajay Kumar, Deepthi Rao et al.
The integration of knowledge graphs and graph machine learning (GML) in genomic data analysis offers several opportunities for understanding complex genetic relationships, especially at the RNA level. We present a comprehensive approach for leveraging these technologies to analyze genomic variants, specifically in the context of RNA sequencing (RNA-seq) data from COVID-19 patient samples. The proposed method involves extracting variant-level genetic information, annotating the data with additional metadata using SnpEff, and converting the enriched Variant Call Format (VCF) files into Resource Description Framework (RDF) triples. The resulting knowledge graph is further enhanced with patient metadata and stored in a graph database, facilitating efficient querying and indexing. We utilize the Deep Graph Library (DGL) to perform graph machine learning tasks, including node classification with GraphSAGE and Graph Convolutional Networks (GCNs). Our approach demonstrates significant utility using our proposed tool, VariantKG, in three key scenarios: enriching graphs with new VCF data, creating subgraphs based on user-defined features, and conducting graph machine learning for node classification.
CV, Nov 13, 2024
Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models
Chengdong Dong, Vijayakumar Bhagavatula, Zhenyu Zhou et al.
The remarkable progress in neural-network-driven visual data generation, especially with neural rendering techniques like Neural Radiance Fields and 3D Gaussian splatting, offers a powerful alternative to GANs and diffusion models. These methods can produce high-fidelity images and lifelike avatars, highlighting the need for robust detection methods. In response, an unsupervised training technique is proposed that enables the model to extract comprehensive features from the Fourier spectrum magnitude, thereby overcoming the challenges of reconstructing the spectrum due to its centrosymmetric properties. By leveraging the spectral domain and dynamically combining it with spatial domain information, we create a robust multimodal detector that demonstrates superior generalization capabilities in identifying challenging synthetic images generated by the latest image synthesis techniques. To address the absence of a 3D neural rendering-based fake image database, we develop a comprehensive database that includes images generated by diverse neural rendering techniques, providing a robust foundation for evaluating and advancing detection methods.
CV, Jul 15, 2025
A Mixed-Primitive-based Gaussian Splatting Method for Surface Reconstruction
Haoxuan Qu, Yujun Cai, Hossein Rahmani et al.
Recently, Gaussian Splatting (GS) has received a lot of attention in surface reconstruction. However, while 3D objects in the real world can have complex and diverse shapes, existing GS-based methods are limited to a single type of splatting primitive (Gaussian ellipse or Gaussian ellipsoid) to represent object surfaces during reconstruction. In this paper, we highlight that this can be insufficient for representing object surfaces in high quality. Thus, we propose a novel framework that, for the first time, enables Gaussian Splatting to incorporate multiple types of (geometrical) primitives during its surface reconstruction process. Specifically, we first propose a compositional splatting strategy that enables the splatting and rendering of different types of primitives in the Gaussian Splatting pipeline. In addition, we design a mixed-primitive-based initialization strategy and a vertex pruning mechanism so that surface representation learning can make effective use of the different primitive types. Extensive experiments show the efficacy of our framework and its accurate surface reconstruction performance.
QM, May 2, 2023
A Novel Deep Learning based Model for Erythrocytes Classification and Quantification in Sickle Cell Disease
Manish Bhatia, Balram Meena, Vipin Kumar Rathi et al.
The shape of erythrocytes, or red blood cells, is altered in several pathological conditions. Therefore, identifying and quantifying different erythrocyte shapes can help diagnose various diseases and assist in designing a treatment strategy. Machine Learning (ML) can be efficiently used to identify and quantify distorted erythrocyte morphologies. In this paper, we propose a customized deep convolutional neural network (CNN) model to classify and quantify the distorted and normal morphology of erythrocytes in images taken from the blood samples of patients suffering from sickle cell disease (SCD). We chose SCD as a model disease condition due to the presence of diverse erythrocyte morphologies in the blood samples of SCD patients. For the analysis, we used 428 raw microscopic images of SCD blood samples and generated a dataset consisting of 10,377 single-cell images. We focused on three well-defined erythrocyte shapes: discocytes, oval, and sickle. We used an 18-layer deep CNN architecture to identify and quantify these shapes with 81% accuracy, outperforming other models. We also used SHAP and LIME for further interpretability. The proposed model can help clinicians analyze SCD blood samples quickly and accurately and make the right decisions for better management of SCD.
CV, Dec 31, 2019
Segmentation-Aware and Adaptive Iris Recognition
Kuo Wang, Ajay Kumar
Iris recognition has emerged as one of the most accurate and convenient biometrics for human identification and has been increasingly employed in a wide range of e-security applications. The quality of iris images acquired at a distance or under less constrained imaging environments is known to degrade the iris matching accuracy. Periocular information is inherently embedded in such iris images and can be exploited to assist iris recognition under such non-ideal scenarios. Our analysis of such iris templates also indicates significant degradation and reduction in the region of interest, where iris recognition can benefit from a similarity distance that considers the importance of different binary bits, instead of the direct use of the Hamming distance common in the literature. Periocular information can be dynamically reinforced, by incorporating the differences in the effective area of available iris regions, for more accurate iris recognition. This paper presents such a segmentation-assisted adaptive framework for more accurate less-constrained iris recognition. The effectiveness of this framework is evaluated on three publicly available iris databases using within-dataset and cross-dataset performance evaluation, and the results validate the merit of the proposed iris recognition framework.
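The contrast between a plain Hamming distance and a distance that weighs bits by importance can be sketched as follows. This is only an illustration under assumed inputs: the per-bit `weights` are hypothetical values standing in for whatever importance measure a method derives, and the function names are not taken from the paper.

```python
def hamming_distance(a, b):
    """Fraction of positions at which two equal-length binary templates differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def weighted_distance(a, b, weights):
    """Mismatch score in which each bit contributes its own weight.

    The weights are hypothetical per-bit importance values; a bit with a
    low weight (for example, one from a degraded region) counts less.
    """
    mismatch = sum(w for x, y, w in zip(a, b, weights) if x != y)
    return mismatch / sum(weights)

template_a = [1, 0, 1, 1]
template_b = [1, 1, 1, 0]
weights = [1.0, 0.5, 1.0, 0.5]  # assumed importance of each bit

print(hamming_distance(template_a, template_b))
print(weighted_distance(template_a, template_b, weights))
```

With uniform weights the two scores coincide, so the weighted form is a strict generalization of the Hamming baseline.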
CV, Sep 13, 2019
A Collaborative Approach using Ridge-Valley Minutiae for More Accurate Contactless Fingerprint Identification
Ritesh Vyas, Ajay Kumar
Contactless fingerprint identification has emerged as a reliable and user-friendly alternative for personal identification in a range of e-business and law-enforcement applications. It is, however, well known from the literature that contactless fingerprint images deliver remarkably low matching accuracies compared with those obtained from contact-based fingerprint sensors. This paper develops a new approach to significantly improve the contactless fingerprint matching capabilities available today. We systematically analyze the extent of complementary ridge-valley information and introduce new approaches to achieve significantly higher matching accuracy than the state-of-the-art fingerprint matchers commonly employed today. We also investigate the least explored options for fingerprint color-space conversions, which can play a key role in more accurate contactless fingerprint matching. This paper presents experimental results from different publicly available contactless fingerprint databases using NBIS, MCC and COTS matchers. Our consistently outperforming results validate the effectiveness of the proposed approach for more accurate contactless fingerprint identification.
QUANT-PH, May 4, 2019
Matrix Product State Based Quantum Classifier
Amandeep Singh Bhatia, Mandeep Kaur Saggi, Ajay Kumar et al.
In recent years, interest in extending the success of neural networks to quantum computing has increased significantly. Tensor network theory has become increasingly popular and is widely used to simulate strongly entangled correlated systems. The matrix product state (MPS) is a well-designed class of tensor network states that plays an important role in the processing of quantum information. In this paper, we show that a matrix product state, as a one-dimensional array of tensors, can be used to classify classical and quantum data. We have performed binary classification of the classical machine learning dataset Iris encoded in a quantum state. Further, we have investigated the performance by considering different parameters on the ibmqx4 quantum computer and shown that MPS circuits can be used to attain better accuracy. The learning ability of the MPS quantum classifier is also tested by classifying evapotranspiration ($ET_{o}$) for the Patiala meteorological station located in Northern Punjab (India), using three years of historical data (Agri). Furthermore, we have used different classification performance metrics to measure its capability. Finally, the results are plotted and the degree of correspondence among the values of each sample is shown.
CV, Dec 29, 2018
A Deep Learning based Framework to Detect and Recognize Humans using Contactless Palmprints in the Wild
Yang Liu, Ajay Kumar
Contactless and online palmprint identification offers improved user convenience, hygiene, and user security, and is highly desirable in a range of applications. This technical report details an accurate and generalizable deep learning-based framework to detect and recognize humans using contactless palmprint images in the wild. Our network is based on a fully convolutional network that generates deeply learned residual features. We design a soft-shifted triplet loss function to more effectively learn discriminative palmprint features. Online palmprint identification also requires a contactless palm detector, which is adapted and trained from the Faster R-CNN architecture, to detect the palmprint region under varying backgrounds. Our reproducible experimental results on publicly available contactless palmprint databases suggest that the proposed framework consistently outperforms several classical and state-of-the-art palmprint recognition methods. More importantly, the model presented in this report offers superior generalization capability: unlike other popular methods in the literature, it does not essentially require database-specific parameter tuning.
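The abstract does not define the soft-shifted variant, so the sketch below shows only the plain triplet margin loss that such variants typically build on. The Euclidean distance, the margin value, and the function names are assumptions for illustration and are not the report's formulation.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Plain triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative sample is farther from the anchor
    than the positive sample by at least the margin.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

anchor = (0.0, 0.0)
same_palm = (0.0, 1.0)    # feature from the same identity
other_palm = (3.0, 4.0)   # feature from a different identity

print(triplet_loss(anchor, same_palm, other_palm))  # well separated
print(triplet_loss(anchor, other_palm, same_palm))  # roles swapped, so penalized
```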
CR, Nov 15, 2018
McEliece Cryptosystem Based On Extended Golay Code
Amandeep Singh Bhatia, Ajay Kumar
With increasing advancements in technology, it is expected that the emergence of a quantum computer will potentially break many of the public-key cryptosystems currently in use. It will compromise the confidentiality and integrity of communications. In this regard, Post-Quantum Cryptography acts as a privacy protector: it deals with cryptosystems that run on conventional computers and are secure against attacks by quantum computers. The practice of code-based cryptography is a trade-off between security and efficiency. In this chapter, we explore the most successful McEliece cryptosystem, based on the extended Golay code [24, 12, 8]. We examine the implications of using an extended Golay code in place of the usual Goppa code in the McEliece cryptosystem. Further, we implement a McEliece cryptosystem based on the extended Golay code using MATLAB. The extended Golay code has many practical applications. Its main advantages are a codeword length of 24 and a minimum Hamming distance of 8, which allows detecting up to 7 bit errors or correcting 3 or fewer errors, and it can be transmitted at a high data rate.
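The error-handling figures for the [24, 12, 8] code follow from standard bounds on a block code with minimum distance $d$: it can detect up to $d - 1$ errors when used for detection only, correct up to $t = \lfloor (d - 1)/2 \rfloor$ errors, and, while correcting $t$ errors, still detect up to $d - 1 - t$ errors. The short sketch below just evaluates these bounds; the helper name `code_capabilities` is hypothetical.

```python
def code_capabilities(n, k, d):
    """Standard capability bounds for an [n, k, d] block code."""
    t = (d - 1) // 2  # guaranteed error-correcting capability
    return {
        "rate": k / n,                        # information bits per code bit
        "max_detect_only": d - 1,             # detection with no correction
        "max_correct": t,                     # correction capability
        "detect_while_correcting": d - 1 - t  # detectable while correcting t
    }

golay = code_capabilities(24, 12, 8)
print(golay)
```

For the extended Golay code this gives a rate of 1/2, detection of up to 7 errors in detection-only mode, and correction of up to 3 errors with detection of 4-error patterns at the same time.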