27.3 CV Apr 19
DGSSM: Diffusion guided state-space models for multimodal salient object detection
Suklav Ghosh, Arijit Sur, Pinaki Mitra
Salient object detection (SOD) requires modeling both long-range contextual dependencies and fine-grained structural details, which remains challenging for convolutional, transformer-based, and Mamba-based state space models. While recent Mamba-based state space approaches enable efficient global reasoning, they often struggle to recover precise object boundaries. In contrast, diffusion models capture strong structural priors through iterative denoising, but their use in discriminative dense prediction is still limited due to computational cost and integration challenges. In this work, we propose DGSSM, a diffusion-guided state space (Mamba) framework that formulates multimodal salient object detection as a progressive denoising process. The framework integrates diffusion structural priors with multi-scale state space encoding, adaptive saliency prompting, and an iterative Mamba diffusion refinement mechanism to improve boundary accuracy. A boundary-aware refinement head and self-distillation strategy further enhance spatial coherence and feature consistency. Extensive experiments on 13 public benchmarks across RGB, RGB-D, and RGB-T settings demonstrate that DGSSM consistently outperforms state-of-the-art methods across multiple evaluation metrics while maintaining a compact model size. These results suggest that diffusion-guided state space modeling is an effective and generalizable paradigm for multimodal dense prediction tasks.
CR Mar 30, 2024
Information Security and Privacy in the Digital World: Some Selected Topics
Jaydip Sen, Joceli Mayer, Subhasis Dasgupta et al.
In the era of generative artificial intelligence and the Internet of Things, while there is explosive growth in the volume of data and the associated need for processing, analysis, and storage, several new challenges are faced in identifying spurious and fake information and protecting the privacy of sensitive data. This has led to an increasing demand for more robust and resilient schemes for authentication, integrity protection, encryption, non-repudiation, and privacy-preservation of data. The chapters in this book present some of the state-of-the-art research works in the field of cryptography and security in computing and communications.
MM Jun 19, 2021
Multi-Contextual Design of Convolutional Neural Network for Steganalysis
Brijesh Singh, Arijit Sur, Pinaki Mitra
In recent times, deep learning-based steganalysis classifiers have become popular due to their state-of-the-art performance. Most deep steganalysis classifiers extract noise residuals using high-pass filters as a preprocessing step and feed them to a deep model for classification. However, recent steganographic schemes do not always restrict their embedding to the high-frequency zone; instead, they distribute it according to the embedding policy. Therefore, besides the noise residual, learning the embedding zone is another challenging task. In this work, unlike conventional approaches, the proposed model first extracts the noise residual using learned denoising kernels to boost the signal-to-noise ratio. After preprocessing, the sparse noise residuals are fed to a novel Multi-Contextual Convolutional Neural Network (M-CNET) that uses heterogeneous context sizes to learn the sparse and low-amplitude representation of noise residuals. Model performance is further improved by incorporating a self-attention module to focus on the areas prone to steganographic embedding. A comprehensive set of experiments shows the proposed scheme's efficacy over the prior art. In addition, an ablation study justifies the contribution of the various modules of the proposed architecture.
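The conventional preprocessing step mentioned in this abstract, extracting a noise residual with a fixed high-pass filter, can be sketched as follows. This is only an illustration of that baseline, not the learned denoising kernels proposed in the paper; the kernel coefficients and the function name `noise_residual` are assumptions chosen for the example.

```python
# Assumed example of a fixed 3x3 high-pass kernel; its coefficients sum to
# zero, so flat image regions produce a zero residual.
KERNEL = [
    [-1,  2, -1],
    [ 2, -4,  2],
    [-1,  2, -1],
]


def noise_residual(image):
    """Correlate a 2D image (list of rows) with KERNEL, without padding.

    The output is two rows and two columns smaller than the input.
    """
    height, width = len(image), len(image[0])
    residual = []
    for i in range(height - 2):
        row = []
        for j in range(width - 2):
            acc = 0
            # Weighted sum over the 3x3 neighbourhood anchored at (i, j).
            for di in range(3):
                for dj in range(3):
                    acc += KERNEL[di][dj] * image[i + di][j + dj]
            row.append(acc)
        residual.append(row)
    return residual
```

A flat patch yields a zero residual, while a single perturbed pixel produces a nonzero response, which is why such filters suppress image content and emphasise small embedding changes.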
CR Mar 20, 2020
The application of $σ$-LFSR in Key-Dependent Feedback Configuration for Word-Oriented Stream Ciphers
Subrata Nandi, Srinivasan Krishnaswamy, Behrouz Zolfaghari et al.
In this paper, we propose and evaluate a method for generating key-dependent feedback configurations (KDFC) for $σ$-LFSRs. $σ$-LFSRs with such configurations can be applied to any stream cipher that uses a word-based LFSR. Here, a configuration generation algorithm uses the secret key (K) and the initialization vector (IV) to generate a feedback configuration. We have mathematically analysed the feedback configurations generated by this method. As a test case, we have applied this method to SNOW 2.0 and studied its impact on resistance to various attacks. Further, we have tested the generated keystream for randomness and briefly described its implementation and the challenges involved.
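For readers unfamiliar with word-based LFSRs, the sketch below shows one update step of a register whose feedback gains are matrices over GF(2), which is the usual description of a $σ$-LFSR. It assumes words are stored as integers and each matrix as a list of row bitmasks; the function names are hypothetical. It does not implement the paper's configuration generation algorithm, which the abstract does not specify, so the gain matrices are simply passed in.

```python
def gf2_matvec(rows, word):
    """Multiply a GF(2) matrix by a bit-vector.

    rows: list of integers, rows[i] is the bitmask of matrix row i.
    word: integer whose bits are the vector entries.
    """
    out = 0
    for i, row in enumerate(rows):
        # Output bit i is the parity of the bitwise AND of row i and the word.
        parity = bin(row & word).count("1") & 1
        out |= parity << i
    return out


def sigma_lfsr_step(state, gains):
    """Advance the register by one word.

    state: list of words, oldest first.
    gains: list of matrices, gains[k] multiplies state[k].
    """
    feedback = 0
    for matrix, word in zip(gains, state):
        feedback ^= gf2_matvec(matrix, word)  # addition over GF(2) is XOR
    # Drop the oldest word and append the feedback word.
    return state[1:] + [feedback]
```

In a key-dependent scheme, the `gains` argument would be derived from the key and IV rather than fixed in advance.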
CR Aug 15, 2013
Privatizing user credential information of Web services in a shared user environment
Pinaki Mitra, Rinku Das, Girish Sundaram
User credential security is one of the most important concerns on the Web. Most websites on the Internet that support user accounts store users' credentials in a database. Nowadays, most web browsers offer an auto-login feature for favorite websites such as Yahoo, Google, and Gmail using this credential information, which facilitates the misuse of user credentials. Privatizing user credential information of web services in a shared user environment provides a feature enhancement in which the root user can privatize his stored credentials by enforcing masking techniques, so that even if a user logs on to the system with the root user's credentials, he will not be able to access the privatized data. With a web browser's auto-login feature, a root user can disable the feature manually by deleting entries from the browser's saved password list. However, this takes a considerable amount of time, and the biggest problem is that he has to enter those credentials again the next time he visits these websites. This application resumes the auto-login feature whenever the root user disables the masked mode. The application has two parts: Masked Application Mode and Disabling the Masked Application Mode. When the system enters masked application mode, other users cannot use the root user's credentials. If another user tries to access any of the masked web pages, that user must authenticate with his own credentials. Disabling the masked mode requires authentication from the root user. As long as this credential is not shared, masked mode can be disabled only by the root user.
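The two-part behaviour described in this abstract can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the authors' implementation: the class `CredentialStore`, its method names, and the plain-text root secret are all hypothetical and serve only to show that masking hides saved credentials without deleting them.

```python
import hmac


class CredentialStore:
    """Hypothetical model of saved credentials with a masked mode."""

    def __init__(self, root_secret):
        self._root_secret = root_secret  # assumed secret known only to root
        self._saved = {}                 # site -> (username, password)
        self._masked = False

    def save(self, site, username, password):
        self._saved[site] = (username, password)

    def enable_masked_mode(self):
        self._masked = True

    def disable_masked_mode(self, secret):
        """Turn masked mode off only if the root secret matches."""
        ok = hmac.compare_digest(secret.encode(), self._root_secret.encode())
        if ok:
            self._masked = False
        return ok

    def auto_login(self, site):
        """Return saved credentials, or None while masked mode is on."""
        if self._masked:
            return None
        return self._saved.get(site)
```

Because the saved entries are never deleted, auto-login resumes as soon as masked mode is disabled, which is the advantage over clearing the browser's saved password list.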
CL Aug 14, 2013
System and Methods for Converting Speech to SQL
Sachin Kumar, Ashish Kumar, Pinaki Mitra et al.
This paper concerns the conversion of a spoken English language query into SQL for retrieving data from an RDBMS. A user submits a query as a speech signal through the user interface and gets the result of the query in text format. We have developed acoustic and language models with which a speech utterance can be converted into an English text query, so that natural language processing techniques can be applied to this text query to generate an equivalent SQL query. For the conversion of speech into English text, the HTK and Julius tools have been used; for the conversion of the English text query into an SQL query, we have implemented a system that uses rule-based translation. The translation uses a lexical analyzer, a parser, and syntax-directed translation techniques, as in compilers. The JFlex and BYACC tools have been used to build the lexical analyzer and parser, respectively. The system is domain independent, i.e., it can run on different databases, as it generates lex files from the underlying database.
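As a rough illustration of rule-based translation from an English text query to SQL, the sketch below handles a single hand-written query pattern. It is a simplification: the system described above uses a generated lexical analyzer and parser, whereas this example swaps in a regular expression. The `SCHEMA` dictionary, the pattern, and the function name `english_to_sql` are hypothetical.

```python
import re

# Hypothetical schema standing in for metadata read from the database.
SCHEMA = {"employees": {"name", "salary", "department"}}

# One rule: "show <column> of <table> [where <column> is <value>]".
RULE = re.compile(
    r"^show (?P<col>\w+) of (?P<table>\w+)"
    r"(?: where (?P<wcol>\w+) is (?P<val>\w+))?$"
)


def english_to_sql(query):
    """Translate a query matching RULE into a SQL SELECT statement."""
    # The whole query is lowercased, so string values are lowercased too.
    match = RULE.match(query.strip().lower())
    if match is None:
        raise ValueError("query does not match any rule")
    table, col = match["table"], match["col"]
    if table not in SCHEMA or col not in SCHEMA[table]:
        raise ValueError("unknown table or column")
    sql = f"SELECT {col} FROM {table}"
    if match["wcol"] is not None:
        if match["wcol"] not in SCHEMA[table]:
            raise ValueError("unknown column in condition")
        value = match["val"]
        # Quote non-numeric values as SQL string literals.
        literal = value if value.isdigit() else f"'{value}'"
        sql += f" WHERE {match['wcol']} = {literal}"
    return sql
```

Validating table and column names against the schema mirrors the idea of deriving the lexer's vocabulary from the underlying database, which is what makes the approach domain independent.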