Umang Jain

AI
h-index61
3papers
92citations
Novelty28%
AI Score29

3 Papers

AIMar 17, 2025
The Amazon Nova Family of Models: Technical Report and Model Card

Amazon AGI, Aaron Langford, Aayush Shah et al. · amazon-science

We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents and text. Amazon Nova Micro is a text-only model that delivers our lowest-latency responses at very low cost. Amazon Nova Canvas is an image generation model that creates professional grade images with rich customization controls. Amazon Nova Reel is a video generation model offering high-quality outputs, customization, and motion control. Our models were built responsibly and with a commitment to customer trust, security, and reliability. We report benchmarking results for core capabilities, agentic performance, long context, functional adaptation, runtime performance, and human evaluation.

CLOct 13, 2024
Investigating Implicit Bias in Large Language Models: A Large-Scale Study of Over 50 LLMs

Divyanshu Kumar, Umang Jain, Sahil Agarwal et al.

Large Language Models (LLMs) are being adopted across a wide range of tasks, including decision-making processes in industries where bias in AI systems is a significant concern. Recent research indicates that LLMs can harbor implicit biases even when they pass explicit bias evaluations. Building upon the frameworks of the LLM Implicit Association Test (IAT) Bias and LLM Decision Bias, this study highlights that newer or larger language models do not automatically exhibit reduced bias; in some cases, they displayed higher bias scores than their predecessors, such as in Meta's Llama series and OpenAI's GPT models. This suggests that increasing model complexity without deliberate bias mitigation strategies can unintentionally amplify existing biases. The variability in bias scores within and across providers underscores the need for standardized evaluation metrics and benchmarks for bias assessment. The lack of consistency indicates that bias mitigation is not yet a universally prioritized goal in model development, which can lead to unfair or discriminatory outcomes. By broadening the detection of implicit bias, this research provides a more comprehensive understanding of the biases present in advanced models and underscores the critical importance of addressing these issues to ensure the development of fair and responsible AI systems.

CVAug 22, 2025
A Multimodal-Multitask Framework with Cross-modal Relation and Hierarchical Interactive Attention for Semantic Comprehension

Mohammad Zia Ur Rehman, Devraj Raghuvanshi, Umang Jain et al.

A major challenge in multimodal learning is the presence of noise within individual modalities. This noise inherently affects the resulting multimodal representations, especially when these representations are obtained through explicit interactions between different modalities. Moreover, the multimodal fusion techniques while aiming to achieve a strong joint representation, can neglect valuable discriminative information within the individual modalities. To this end, we propose a Multimodal-Multitask framework with crOss-modal Relation and hIErarchical iNteractive aTtention (MM-ORIENT) that is effective for multiple tasks. The proposed approach acquires multimodal representations cross-modally without explicit interaction between different modalities, reducing the noise effect at the latent stage. To achieve this, we propose cross-modal relation graphs that reconstruct monomodal features to acquire multimodal representations. The features are reconstructed based on the node neighborhood, where the neighborhood is decided by the features of a different modality. We also propose Hierarchical Interactive Monomadal Attention (HIMA) to focus on pertinent information within a modality. While cross-modal relation graphs help comprehend high-order relationships between two modalities, HIMA helps in multitasking by learning discriminative features of individual modalities before late-fusing them. Finally, extensive experimental evaluation on three datasets demonstrates that the proposed approach effectively comprehends multimodal content for multiple tasks.