Amit Kumar Jaiswal

IR
h-index11
15papers
300citations
Novelty36%
AI Score43

15 Papers

CLMay 23Code
HiMed: Incentivizing Hindi Reasoning in Medical LLMs

Dingfeng Jiang, Han Yan, Chenze Ma et al.

Medical large language models hold promise for reducing healthcare disparities, yet Hindi remains severely underrepresented. While medical LLMs excel in high-resource languages, their performance degrades sharply in Hindi, particularly on Indian systems of medicine. We argue that robust cross-lingual medical transfer requires Hindi reasoning. To this end, we introduce HiMed, a Hindi reasoning medical corpus and benchmark suite covering both Western and Indian medicine. We further propose HiMed-8B, a Hindi-form medical reasoning LLM, through the design of decaying scaffolding reward. Extensive experiments demonstrate improvement in Hindi medical reasoning performance and reduction in the English--Hindi accuracy gap. Ablation studies validate the contribution of each training stage and reward component. All data and code are available on GitHub: https://github.com/FreedomIntelligence/HiMed.

AIAug 20, 2024
Investigating Context Effects in Similarity Judgements in Large Language Models

Sagar Uprety, Amit Kumar Jaiswal, Haiming Liu et al.

Large Language Models (LLMs) have revolutionised the capability of AI models in comprehending and generating natural language text. They are increasingly being used to empower and deploy agents in real-world scenarios, which make decisions and take actions based on their understanding of the context. Therefore researchers, policy makers and enterprises alike are working towards ensuring that the decisions made by these agents align with human values and user expectations. That being said, human values and decisions are not always straightforward to measure and are subject to different cognitive biases. There is a vast section of literature in Behavioural Science which studies biases in human judgements. In this work we report an ongoing investigation on alignment of LLMs with human judgements affected by order bias. Specifically, we focus on a famous human study which showed evidence of order effects in similarity judgements, and replicate it with various popular LLMs. We report the different settings where LLMs exhibit human-like order effect bias and discuss the implications of these findings to inform the design and development of LLM based applications.

CLAug 16, 2023
Lightweight Adaptation of Neural Language Models via Subspace Embedding

Amit Kumar Jaiswal, Haiming Liu

Traditional neural word embeddings are usually dependent on a richer diversity of vocabulary. However, the language models recline to cover major vocabularies via the word embedding parameters, in particular, for multilingual language models that generally cover a significant part of their overall learning parameters. In this work, we present a new compact embedding structure to reduce the memory footprint of the pre-trained language models with a sacrifice of up to 4% absolute accuracy. The embeddings vectors reconstruction follows a set of subspace embeddings and an assignment procedure via the contextual relationship among tokens from pre-trained language models. The subspace embedding structure calibrates to masked language models, to evaluate our compact embedding structure on similarity and textual entailment tasks, sentence and paraphrase tasks. Our experimental evaluation shows that the subspace embeddings achieve compression rates beyond 99.8% in comparison with the original embeddings for the language models on XNLI and GLUE benchmark suites.

LGOct 20, 2023
Towards Subject Agnostic Affective Emotion Recognition

Amit Kumar Jaiswal, Haiming Liu, Prayag Tiwari

This paper focuses on affective emotion recognition, aiming to perform in the subject-agnostic paradigm based on EEG signals. However, EEG signals manifest subject instability in subject-agnostic affective Brain-computer interfaces (aBCIs), which led to the problem of distributional shift. Furthermore, this problem is alleviated by approaches such as domain generalisation and domain adaptation. Typically, methods based on domain adaptation confer comparatively better results than the domain generalisation methods but demand more computational resources given new subjects. We propose a novel framework, meta-learning based augmented domain adaptation for subject-agnostic aBCIs. Our domain adaptation approach is augmented through meta-learning, which consists of a recurrent neural network, a classifier, and a distributional shift controller based on a sum-decomposable function. Also, we present that a neural network explicating a sum-decomposable function can effectively estimate the divergence between varied domains. The network setting for augmented domain adaptation follows meta-learning and adversarial learning, where the controller promptly adapts to new domains employing the target data via a few self-adaptation steps in the test phase. Our proposed approach is shown to be effective in experiments on a public aBICs dataset and achieves similar performance to state-of-the-art domain adaptation methods while avoiding the use of additional computational resources.

IRAug 17, 2023
A Model-Agnostic Framework for Recommendation via Interest-aware Item Embeddings

Amit Kumar Jaiswal, Yu Xiong

Item representation holds significant importance in recommendation systems, which encompasses domains such as news, retail, and videos. Retrieval and ranking models utilise item representation to capture the user-item relationship based on user behaviours. While existing representation learning methods primarily focus on optimising item-based mechanisms, such as attention and sequential modelling. However, these methods lack a modelling mechanism to directly reflect user interests within the learned item representations. Consequently, these methods may be less effective in capturing user interests indirectly. To address this challenge, we propose a novel Interest-aware Capsule network (IaCN) recommendation model, a model-agnostic framework that directly learns interest-oriented item representations. IaCN serves as an auxiliary task, enabling the joint learning of both item-based and interest-based representations. This framework adopts existing recommendation models without requiring substantial redesign. We evaluate the proposed approach on benchmark datasets, exploring various scenarios involving different deep neural networks, behaviour sequence lengths, and joint learning ratios of interest-oriented item representations. Experimental results demonstrate significant performance enhancements across diverse recommendation models, validating the effectiveness of our approach.

IRFeb 23, 2024
Towards a Theoretical Understanding of Two-Stage Recommender Systems

Amit Kumar Jaiswal

Production-grade recommender systems rely heavily on a large-scale corpus used by online media services, including Netflix, Pinterest, and Amazon. These systems enrich recommendations by learning users' and items' embeddings projected in a low-dimensional space with two-stage models (two deep neural networks), which facilitate their embedding constructs to predict users' feedback associated with items. Despite its popularity for recommendations, its theoretical behaviors remain comprehensively unexplored. We study the asymptotic behaviors of the two-stage recommender that entail a strong convergence to the optimal recommender system. We establish certain theoretical properties and statistical assurance of the two-stage recommender. In addition to asymptotic behaviors, we demonstrate that the two-stage recommender system attains faster convergence by relying on the intrinsic dimensions of the input features. Finally, we show numerically that the two-stage recommender enables encapsulating the impacts of items' and users' attributes on ratings, resulting in better performance compared to existing methods conducted using synthetic and real-world data experiments.

LGAug 6, 2025
Multimodal RAG Enhanced Visual Description

Amit Kumar Jaiswal, Haiming Liu, Ingo Frommholz

Textual descriptions for multimodal inputs entail recurrent refinement of queries to produce relevant output images. Despite efforts to address challenges such as scaling model size and data volume, the cost associated with pre-training and fine-tuning remains substantial. However, pre-trained large multimodal models (LMMs) encounter a modality gap, characterised by a misalignment between textual and visual representations within a common embedding space. Although fine-tuning can potentially mitigate this gap, it is typically expensive and impractical due to the requirement for extensive domain-driven data. To overcome this challenge, we propose a lightweight training-free approach utilising Retrieval-Augmented Generation (RAG) to extend across the modality using a linear mapping, which can be computed efficiently. During inference, this mapping is applied to images embedded by an LMM enabling retrieval of closest textual descriptions from the training set. These textual descriptions, in conjunction with an instruction, cater as an input prompt for the language model to generate new textual descriptions. In addition, we introduce an iterative technique for distilling the mapping by generating synthetic descriptions via the language model facilitating optimisation for standard utilised image description measures. Experimental results on two benchmark multimodal datasets demonstrate significant improvements.

QMMay 2, 2023
A Novel Deep Learning based Model for Erythrocytes Classification and Quantification in Sickle Cell Disease

Manish Bhatia, Balram Meena, Vipin Kumar Rathi et al.

The shape of erythrocytes or red blood cells is altered in several pathological conditions. Therefore, identifying and quantifying different erythrocyte shapes can help diagnose various diseases and assist in designing a treatment strategy. Machine Learning (ML) can be efficiently used to identify and quantify distorted erythrocyte morphologies. In this paper, we proposed a customized deep convolutional neural network (CNN) model to classify and quantify the distorted and normal morphology of erythrocytes from the images taken from the blood samples of patients suffering from Sickle cell disease ( SCD). We chose SCD as a model disease condition due to the presence of diverse erythrocyte morphologies in the blood samples of SCD patients. For the analysis, we used 428 raw microscopic images of SCD blood samples and generated the dataset consisting of 10, 377 single-cell images. We focused on three well-defined erythrocyte shapes, including discocytes, oval, and sickle. We used 18 layered deep CNN architecture to identify and quantify these shapes with 81% accuracy, outperforming other models. We also used SHAP and LIME for further interpretability. The proposed model can be helpful for the quick and accurate analysis of SCD blood samples by the clinicians and help them make the right decision for better management of SCD.

CLDec 17, 2021
Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages

Thomas Mandl, Sandip Modha, Gautam Kishore Shahi et al.

The widespread of offensive content online such as hate speech poses a growing societal problem. AI tools are necessary for supporting the moderation process at online platforms. For the evaluation of these identification tools, continuous experimentation with data sets in different languages are necessary. The HASOC track (Hate Speech and Offensive Content Identification) is dedicated to develop benchmark data for this purpose. This paper presents the HASOC subtrack for English, Hindi, and Marathi. The data set was assembled from Twitter. This subtrack has two sub-tasks. Task A is a binary classification problem (Hate and Not Offensive) offered for all three languages. Task B is a fine-grained classification problem for three classes (HATE) Hate speech, OFFENSIVE and PROFANITY offered for English and Hindi. Overall, 652 runs were submitted by 65 teams. The performance of the best classification algorithms for task A are F1 measures 0.91, 0.78 and 0.83 for Marathi, Hindi and English, respectively. This overview presents the tasks and the data development as well as the detailed results. The systems submitted to the competition applied a variety of technologies. The best performing algorithms were mainly variants of transformer architectures.

CLAug 12, 2021
Overview of the HASOC track at FIRE 2020: Hate Speech and Offensive Content Identification in Indo-European Languages

Thomas Mandla, Sandip Modha, Gautam Kishore Shahi et al.

With the growth of social media, the spread of hate speech is also increasing rapidly. Social media are widely used in many countries. Also Hate Speech is spreading in these countries. This brings a need for multilingual Hate Speech detection algorithms. Much research in this area is dedicated to English at the moment. The HASOC track intends to provide a platform to develop and optimize Hate Speech detection algorithms for Hindi, German and English. The dataset is collected from a Twitter archive and pre-classified by a machine learning system. HASOC has two sub-task for all three languages: task A is a binary classification problem (Hate and Not Offensive) while task B is a fine-grained classification problem for three classes (HATE) Hate speech, OFFENSIVE and PROFANITY. Overall, 252 runs were submitted by 40 teams. The performance of the best classification algorithms for task A are F1 measures of 0.51, 0.53 and 0.52 for English, Hindi, and German, respectively. For task B, the best classification algorithms achieved F1 measures of 0.26, 0.33 and 0.29 for English, Hindi, and German, respectively. This article presents the tasks and the data development as well as the results. The best performing algorithms were mainly variants of the transformer architecture BERT. However, also other systems were applied with good success

IRAug 5, 2020
Reinforcement Learning-driven Information Seeking: A Quantum Probabilistic Approach

Amit Kumar Jaiswal, Haiming Liu, Ingo Frommholz

Understanding an information forager's actions during interaction is very important for the study of interactive information retrieval. Although information spread in uncertain information space is substantially complex due to the high entanglement of users interacting with information objects~(text, image, etc.). However, an information forager, in general, accompanies a piece of information (information diet) while searching (or foraging) alternative contents, typically subject to decisive uncertainty. Such types of uncertainty are analogous to measurements in quantum mechanics which follow the uncertainty principle. In this paper, we discuss information seeking as a reinforcement learning task. We then present a reinforcement learning-based framework to model forager exploration that treats the information forager as an agent to guide their behaviour. Also, our framework incorporates the inherent uncertainty of the foragers' action using the mathematical formalism of quantum mechanics.

IRJan 19, 2020
Information Foraging for Enhancing Implicit Feedback in Content-based Image Recommendation

Amit Kumar Jaiswal, Haiming Liu, Ingo Frommholz

User implicit feedback plays an important role in recommender systems. However, finding implicit features is a tedious task. This paper aims to identify users' preferences through implicit behavioural signals for image recommendation based on the Information Scent Model of Information Foraging Theory. In the first part, we hypothesise that the users' perception is improved with visual cues in the images as behavioural signals that provide users' information scent during information seeking. We designed a content-based image recommendation system to explore which image attributes (i.e., visual cues or bookmarks) help users find their desired image. We found that users prefer recommendations predicated by visual cues and therefore consider the visual cues as good information scent for their information seeking. In the second part, we investigated if visual cues in the images together with the images itself can be better perceived by the users than each of them on its own. We evaluated the information scent artifacts in image recommendation on the Pinterest image collection and the WikiArt dataset. We find our proposed image recommendation system supports the implicit signals through Information Foraging explanation of the information scent model.

IRJun 30, 2019
Effects of Foraging in Personalized Content-based Image Recommendation

Amit Kumar Jaiswal, Haiming Liu, Ingo Frommholz

A major challenge of recommender systems is to help users locating interesting items. Personalized recommender systems have become very popular as they attempt to predetermine the needs of users and provide them with recommendations to personalize their navigation. However, few studies have addressed the question of what drives the users' attention to specific content within the collection and what influences the selection of interesting items. To this end, we employ the lens of Information Foraging Theory (IFT) to image recommendation to demonstrate how the user could utilize visual bookmarks to locate interesting images. We investigate a personalized content-based image recommendation system to understand what affects user attention by reinforcing visual attention cues based on IFT. We further find that visual bookmarks (cues) lead to a stronger scent of the recommended image collection. Our evaluation is based on the Pinterest image collection.

CVJun 23, 2019
Semi-Supervised Learning for Cancer Detection of Lymph Node Metastases

Amit Kumar Jaiswal, Ivan Panshin, Dimitrij Shulkin et al.

Pathologists find tedious to examine the status of the sentinel lymph node on a large number of pathological scans. The examination process of such lymph node which encompasses metastasized cancer cells is histopathologically organized. However, the task of finding metastatic tissues is gradual which is often challenging. In this work, we present our deep convolutional neural network based model validated on PatchCamelyon (PCam) benchmark dataset for fundamental machine learning research in histopathology diagnosis. We find that our proposed model trained with a semi-supervised learning approach by using pseudo labels on PCam-level significantly leads to better performances to strong CNN baseline on the AUC metric.

CRJul 30, 2018
Parsec: A State Channel for the Internet of Value

Amit Kumar Jaiswal

We propose Parsec, a web-scale State channel for the Internet of Value to exterminate the consensus bottleneck in Blockchain by leveraging a network of state channels which enable to robustly transfer value off-chain. It acts as an infrastructure layer developed on top of Ethereum Blockchain, as a network protocol which allows coherent routing and interlocking channel transfers for trade-off between parties. A web-scale solution for state channels is implemented to enable a layer of value transfer to the internet. Existing network protocol on State Channels include Raiden for Ethereum and Lightning Network for Bitcoin. However, we intend to leverage existing web-scale technologies used by large Internet companies such as Uber, LinkedIn or Netflix. We use Apache Kafka to scale the global payment operation to trillions of operations per day enabling near-instant, low-fee, scalable, and privacy-sustainable payments. Our architecture follows Event Sourcing pattern which solves current issues of payment solutions such as scaling, transfer, interoperability, low-fees, micropayments and to name a few. To the best of knowledge, our proposed model achieve better performance than state-of-the-art lightning network on the Ethereum based (fork) cryptocoins.