Alex Albert

h-index: 32

Papers: 3

Citations: 9

Novelty: 33%

AI Score: 24

Ranked #168,798 of 194,257 authors (top 87%); #54,022 in CV (top 91%)

3 Papers

8.4 | CV | Jun 9, 2025

Prompt to Protection: A Comparative Study of Multimodal LLMs in Construction Hazard Recognition

Nishi Chaudhary, S M Jamil Uddin, Sathvik Sharath Chandra et al.

The recent emergence of multimodal large language models (LLMs) has introduced new opportunities for improving visual hazard recognition on construction sites. Unlike traditional computer vision models that rely on domain-specific training and extensive datasets, modern LLMs can interpret and describe complex visual scenes using simple natural language prompts. However, despite growing interest in their applications, there has been limited investigation into how different LLMs perform in safety-critical visual tasks within the construction domain. To address this gap, this study conducts a comparative evaluation of five state-of-the-art LLMs: Claude-3 Opus, GPT-4.5, GPT-4o, GPT-o3, and Gemini 2.0 Pro, to assess their ability to identify potential hazards from real-world construction images. Each model was tested under three prompting strategies: zero-shot, few-shot, and chain-of-thought (CoT). Zero-shot prompting involved minimal instruction, few-shot incorporated basic safety context and a hazard source mnemonic, and CoT provided step-by-step reasoning examples to scaffold model thinking. Quantitative analysis was performed using precision, recall, and F1-score metrics across all conditions. Results reveal that prompting strategy significantly influenced performance, with CoT prompting consistently producing higher accuracy across models. Additionally, LLM performance varied under different conditions, with GPT-4.5 and GPT-o3 outperforming others in most settings. The findings also demonstrate the critical role of prompt design in enhancing the accuracy and consistency of multimodal LLMs for construction safety applications. This study offers actionable insights into the integration of prompt engineering and LLMs for practical hazard recognition, contributing to the development of more reliable AI-assisted safety systems.
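The precision, recall, and F1-score metrics named in the abstract can be illustrated with a minimal sketch. It assumes that each model response has already been reduced to a set of hazard labels and that a ground-truth set exists for the image. The function name `precision_recall_f1` and the example labels are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(predicted, actual):
    """Score one model response against the ground-truth hazards for an image.

    Both arguments are iterables of hazard labels (hypothetical format).
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)

    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative labels only; not data from the study.
    model_output = {"fall", "electrical", "struck-by"}
    ground_truth = {"fall", "electrical", "chemical", "gravity"}
    p, r, f = precision_recall_f1(model_output, ground_truth)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
    # prints "precision=0.67 recall=0.50 f1=0.57"
```

Comparing prompting strategies would then amount to averaging these scores per model and per strategy; how the study matched free-text model output to hazard labels is not stated in the abstract.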

1.8 | CV | Jan 30, 2019

Real-world Mapping of Gaze Fixations Using Instance Segmentation for Road Construction Safety Applications

Idris Jeelani, Khashayar Asadi, Hariharan Ramshankar et al.

Research studies have shown that a large proportion of hazards remain unrecognized, which exposes construction workers to unanticipated safety risks. Recent studies have also found a strong correlation between the viewing patterns of workers, captured using eye-tracking devices, and their hazard recognition performance. Therefore, it is important to analyze the viewing patterns of workers to gain a better understanding of their hazard recognition performance. This paper proposes a method that automatically maps the gaze fixations collected using a wearable eye-tracker to predefined areas of interest. The proposed method detects these areas or objects of interest (i.e., hazards) through a computer vision-based segmentation technique and transfer learning. The mapped fixation data is then used to analyze the viewing behaviors of workers and compute their attention distribution. The method is implemented on a road under construction as a case study to evaluate its performance.
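The mapping step described in the abstract can be sketched as follows, under simplifying assumptions that are not from the paper: a single frame whose segmentation output is already available as a 2D grid of integer instance labels (0 for background), and fixations given as (x, y, duration) tuples in pixel coordinates. The names `map_fixations` and `attention_distribution` and the example labels are hypothetical.

```python
import math
from collections import defaultdict


def map_fixations(label_map, fixations, names):
    """Accumulate fixation duration per area of interest.

    label_map: 2D list of instance ids, indexed as label_map[row][col].
    fixations: iterable of (x, y, duration_ms) in pixel coordinates.
    names: dict mapping instance id -> area-of-interest name.
    """
    height = len(label_map)
    width = len(label_map[0]) if height else 0
    dwell = defaultdict(float)
    for x, y, duration in fixations:
        # floor (not int) so slightly negative coordinates stay off-frame
        col, row = math.floor(x), math.floor(y)
        if 0 <= row < height and 0 <= col < width:
            label = names.get(label_map[row][col], "background")
        else:
            label = "off-frame"
        dwell[label] += duration
    return dict(dwell)


def attention_distribution(dwell):
    """Normalize dwell times to fractions of total fixation time."""
    total = sum(dwell.values())
    return {label: t / total for label, t in dwell.items()} if total else {}


if __name__ == "__main__":
    # Toy 4x4 frame; ids and names are illustrative only.
    frame = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [2, 2, 0, 0],
        [2, 2, 0, 0],
    ]
    names = {1: "excavator", 2: "open trench"}
    fixations = [(2.5, 0.4, 300), (0.2, 3.1, 500), (3.0, 3.0, 200)]
    print(attention_distribution(map_fixations(frame, fixations, names)))
```

With a wearable eye-tracker the scene changes every frame, so a real pipeline would look up each fixation in the segmentation of its own video frame; the single-frame grid here only illustrates the lookup and normalization.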

3.0 | HC | Aug 20, 2018

Automating Analysis of Construction Workers Viewing Patterns for Personalized Safety Training and Management

Idris Jeelani, Kevin Han, Alex Albert

Unrecognized hazards substantially increase the likelihood of workplace fatalities and injuries. However, recent research has demonstrated that a large proportion of hazards remain unrecognized in dynamic construction environments. Recent studies have suggested a strong correlation between the viewing patterns of workers and their hazard recognition performance. Hence, it is important to study and analyze the viewing patterns of workers to gain a better understanding of their hazard recognition performance. The objective of this exploratory research is to examine hazard recognition as a visual search process and to identify the visual search factors that affect it. Further, the study proposes a framework for developing a vision-based tool capable of recording and analyzing the viewing patterns of construction workers and generating feedback for personalized training and proactive safety management.