Sheikh Izzal Azid

h-index11
2papers

2 Papers

11.8CVApr 5
A Physics-Informed, Behavior-Aware Digital Twin for Robust Multimodal Forecasting of Core Body Temperature in Precision Livestock Farming

Riasad Alvi, Mohaimenul Azam Khan Raiaan, Sadia Sultana Chowa et al.

Precision livestock farming requires accurate and timely heat stress prediction to ensure animal welfare and optimize farm management. This study presents a physics-informed digital twin (DT) framework combined with an uncertainty-aware, expert-weighted stacked ensemble for multimodal forecasting of Core Body Temperature (CBT) in dairy cattle. Using the high-frequency, heterogeneous MmCows dataset, the DT integrates an ordinary differential equation (ODE)-based thermoregulation model that simulates metabolic heat production and dissipation, a Gaussian process for capturing cow-specific deviations, a Kalman filter for aligning predictions with real-time sensor data, and a behavioral Markov chain that models activity-state transitions under varying environmental conditions. The DT outputs key physiological indicators, such as predicted CBT, heat stress probability, and behavioral state distributions are fused with raw sensor data and enriched through multi-scale temporal analysis and cross-modal feature engineering to form a comprehensive feature set. The predictive methodology is designed in a three-stage stacked ensemble, where stage 1 trains modality-specific LightGBM 'expert' models on distinct feature groups, stage 2 collects their predictions as meta-features, and at stage 3 Optuna-tuned LightGBM meta-model yields the final CBT forecast. Predictive uncertainty is quantified via bootstrapping and validated using Prediction Interval Coverage Probability (PICP). Ablation analysis confirms that incorporating DT-derived features and multimodal fusion substantially enhances performance. The proposed framework achieves a cross-validated R2 of 0.783, F1 score of 84.25% and PICP of 92.38% for 2-hour ahead forecasting, providing a robust, uncertainty-aware, and physically principled system for early heat stress detection and precision livestock management.

CVSep 15, 2025
CLAIRE: A Dual Encoder Network with RIFT Loss and Phi-3 Small Language Model Based Interpretability for Cross-Modality Synthetic Aperture Radar and Optical Land Cover Segmentation

Debopom Sutradhar, Arefin Ittesafun Abian, Mohaimenul Azam Khan Raiaan et al.

Accurate land cover classification from satellite imagery is crucial in environmental monitoring and sustainable resource management. However, it remains challenging due to the complexity of natural landscapes, the visual similarity between classes, and the significant class imbalance in the available datasets. To address these issues, we propose a dual encoder architecture that independently extracts modality-specific features from optical and Synthetic Aperture Radar (SAR) imagery, which are then fused using a cross-modality attention-fusion module named Cross-modality Land cover segmentation with Attention and Imbalance-aware Reasoning-Enhanced Explanations (CLAIRE). This fusion mechanism highlights complementary spatial and textural features, enabling the network to better capture detailed and diverse land cover patterns. We incorporate a hybrid loss function that utilizes Weighted Focal Loss and Tversky Loss named RIFT (Rare-Instance Focal-Tversky) to address class imbalance and improve segmentation performance across underrepresented categories. Our model achieves competitive performance across multiple benchmarks: a mean Intersection over Union (mIoU) of 56.02% and Overall Accuracy (OA) of 84.56% on the WHU-OPT-SAR dataset; strong generalization with a mIoU of 59.89% and OA of 73.92% on the OpenEarthMap-SAR dataset; and remarkable robustness under cloud-obstructed conditions, achieving an mIoU of 86.86% and OA of 94.58% on the PIE-RGB-SAR dataset. Additionally, we introduce a metric-driven reasoning module generated by a Small Language Model (Phi-3), which generates expert-level, sample-specific justifications for model predictions, thereby enhancing transparency and interpretability.