CVMay 31, 2025
Multimodal Generative AI with Autoregressive LLMs for Human Motion Understanding and Generation: A Way ForwardMuhammad Islam, Tao Huang, Euijoon Ahn et al.
This paper presents an in-depth survey on the use of multimodal Generative Artificial Intelligence (GenAI) and autoregressive Large Language Models (LLMs) for human motion understanding and generation, offering insights into emerging methods, architectures, and their potential to advance realistic and versatile motion synthesis. Focusing exclusively on text and motion modalities, this research investigates how textual descriptions can guide the generation of complex, human-like motion sequences. The paper explores various generative approaches, including autoregressive models, diffusion models, Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and transformer-based models, by analyzing their strengths and limitations in terms of motion quality, computational efficiency, and adaptability. It highlights recent advances in text-conditioned motion generation, where textual inputs are used to control and refine motion outputs with greater precision. The integration of LLMs further enhances these models by enabling semantic alignment between instructions and motion, improving coherence and contextual relevance. This systematic survey underscores the transformative potential of text-to-motion GenAI and LLM architectures in applications such as healthcare, humanoids, gaming, animation, and assistive technologies, while addressing ongoing challenges in generating efficient and realistic human motion.
CLJun 2, 2025
Unified Large Language Models for Misinformation Detection in Low-Resource Linguistic SettingsMuhammad Islam, Javed Ali Khan, Mohammed Abaker et al.
The rapid expansion of social media platforms has significantly increased the dissemination of forged content and misinformation, making the detection of fake news a critical area of research. Although fact-checking efforts predominantly focus on English-language news, there is a noticeable gap in resources and strategies to detect news in regional languages, such as Urdu. Advanced Fake News Detection (FND) techniques rely heavily on large, accurately labeled datasets. However, FND in under-resourced languages like Urdu faces substantial challenges due to the scarcity of extensive corpora and the lack of validated lexical resources. Current Urdu fake news datasets are often domain-specific and inaccessible to the public. They also lack human verification, relying mainly on unverified English-to-Urdu translations, which compromises their reliability in practical applications. This study highlights the necessity of developing reliable, expert-verified, and domain-independent Urdu-enhanced FND datasets to improve fake news detection in Urdu and other resource-constrained languages. This paper presents the first benchmark large FND dataset for Urdu news, which is publicly available for validation and deep analysis. We also evaluate this dataset using multiple state-of-the-art pre-trained large language models (LLMs), such as XLNet, mBERT, XLM-RoBERTa, RoBERTa, DistilBERT, and DeBERTa. Additionally, we propose a unified LLM model that outperforms the others with different embedding and feature extraction techniques. The performance of these models is compared based on accuracy, F1 score, precision, recall, and human judgment for vetting the sample results of news.
IVMar 26, 2021
Evaluation of Preprocessing Techniques for U-Net Based Automated Liver SegmentationMuhammad Islam, Kaleem Nawaz Khan, Muhammad Salman Khan
To extract liver from medical images is a challenging task due to similar intensity values of liver with adjacent organs, various contrast levels, various noise associated with medical images and irregular shape of liver. To address these issues, it is important to preprocess the medical images, i.e., computerized tomography (CT) and magnetic resonance imaging (MRI) data prior to liver analysis and quantification. This paper investigates the impact of permutation of various preprocessing techniques for CT images, on the automated liver segmentation using deep learning, i.e., U-Net architecture. The study focuses on Hounsfield Unit (HU) windowing, contrast limited adaptive histogram equalization (CLAHE), z-score normalization, median filtering and Block-Matching and 3D (BM3D) filtering. The segmented results show that combination of three techniques; HU-windowing, median filtering and z-score normalization achieve optimal performance with Dice coefficient of 96.93%, 90.77% and 90.84% for training, validation and testing respectively.
CRFeb 19, 2021
Differential Privacy-based Permissioned Blockchain for Private Data Sharing in Industrial IoTMuhammad Islam, Mubashir Husain Rehmani, Jinjun Chen
Permissioned blockchain such as Hyperledger fabric enables a secure supply chain model in Industrial Internet of Things (IIoT) through multichannel and private data collection mechanisms. Sharing of Industrial data including private data exchange at every stage between supply chain partners helps to improve product quality, enable future forecast, and enhance management activities. However, the existing data sharing and querying mechanism in Hyperledger fabric is not suitable for supply chain environment in IIoT because the queries are evaluated on actual data stored on ledger which consists of sensitive information such as business secrets, and special discounts offered to retailers and individuals. To solve this problem, we propose a differential privacy-based permissioned blockchain using Hyperledger fabric to enable private data sharing in supply chain in IIoT (DH-IIoT). We integrate differential privacy into the chaindcode (smart contract) of Hyperledger fabric to achieve privacy preservation. As a result, the query response consists of perturbed data which protects the sensitive information in the ledger. The proposed work (DH-IIoT) is evaluated by simulating a permissioned blockchain using Hyperledger fabric. We compare our differential privacy integrated chaincode of Hyperledger fabric with the default chaincode setting of Hyperledger fabric for supply chain scenario. The results confirm that the proposed work maintains 96.15% of accuracy in the shared data while guarantees the protection of sensitive ledger's data.
AIJul 6, 2020
Fuzzy Integral = Contextual Linear Order StatisticDerek Anderson, Matthew Deardorff, Timothy Havens et al.
The fuzzy integral is a powerful parametric nonlin-ear function with utility in a wide range of applications, from information fusion to classification, regression, decision making,interpolation, metrics, morphology, and beyond. While the fuzzy integral is in general a nonlinear operator, herein we show that it can be represented by a set of contextual linear order statistics(LOS). These operators can be obtained via sampling the fuzzy measure and clustering is used to produce a partitioning of the underlying space of linear convex sums. Benefits of our approach include scalability, improved integral/measure acquisition, generalizability, and explainable/interpretable models. Our methods are both demonstrated on controlled synthetic experiments, and also analyzed and validated with real-world benchmark data sets.