Ahmed Nasser

AI
h-index10
4papers
62citations
Novelty60%
AI Score39

4 Papers

CVDec 3, 2025
HieroGlyphTranslator: Automatic Recognition and Translation of Egyptian Hieroglyphs to English

Ahmed Nasser, Marwan Mohamed, Alaa Sherif et al.

Egyptian hieroglyphs, the ancient Egyptian writing system, are composed entirely of drawings. Translating these glyphs into English poses various challenges, including the fact that a single glyph can have multiple meanings. Deep learning translation applications are evolving rapidly, producing remarkable results that significantly impact our lives. In this research, we propose a method for the automatic recognition and translation of ancient Egyptian hieroglyphs from images to English. This study utilized two datasets for classification and translation: the Morris Franken dataset and the EgyptianTranslation dataset. Our approach is divided into three stages: segmentation (using Contour and Detectron2), mapping symbols to Gardiner codes, and translation (using the CNN model). The model achieved a BLEU score of 42.2, a significant result compared to previous research.

AIMay 29, 2025
Second Opinion Matters: Towards Adaptive Clinical AI via the Consensus of Expert Model Ensemble

Amit Kumthekar, Zion Tilley, Henry Duong et al.

Despite the growing clinical adoption of large language models (LLMs), current approaches heavily rely on single model architectures. To overcome risks of obsolescence and rigid dependence on single model systems, we present a novel framework, termed the Consensus Mechanism. Mimicking clinical triage and multidisciplinary clinical decision-making, the Consensus Mechanism implements an ensemble of specialized medical expert agents enabling improved clinical decision making while maintaining robust adaptability. This architecture enables the Consensus Mechanism to be optimized for cost, latency, or performance, purely based on its interior model configuration. To rigorously evaluate the Consensus Mechanism, we employed three medical evaluation benchmarks: MedMCQA, MedQA, and MedXpertQA Text, and the differential diagnosis dataset, DDX+. On MedXpertQA, the Consensus Mechanism achieved an accuracy of 61.0% compared to 53.5% and 45.9% for OpenAI's O3 and Google's Gemini 2.5 Pro. Improvement was consistent across benchmarks with an increase in accuracy on MedQA ($Δ\mathrm{Accuracy}_{\mathrm{consensus\text{-}O3}} = 3.4\%$) and MedMCQA ($Δ\mathrm{Accuracy}_{\mathrm{consensus\text{-}O3}} = 9.1\%$). These accuracy gains extended to differential diagnosis generation, where our system demonstrated improved recall and precision (F1$_\mathrm{consensus}$ = 0.326 vs. F1$_{\mathrm{O3\text{-}high}}$ = 0.2886) and a higher top-1 accuracy for DDX (Top1$_\mathrm{consensus}$ = 52.0% vs. Top1$_{\mathrm{O3\text{-}high}}$ = 45.2%).

CYApr 14, 2019
Boomerang: Rebounding the Consequences of Reputation Feedback on Crowdsourcing Platforms

Snehalkumar, S. Gaikwad, Durim Morina et al.

Paid crowdsourcing platforms suffer from low-quality work and unfair rejections, but paradoxically, most workers and requesters have high reputation scores. These inflated scores, which make high-quality work and workers difficult to find, stem from social pressure to avoid giving negative feedback. We introduce Boomerang, a reputation system for crowdsourcing that elicits more accurate feedback by rebounding the consequences of feedback directly back onto the person who gave it. With Boomerang, requesters find that their highly-rated workers gain earliest access to their future tasks, and workers find tasks from their highly-rated requesters at the top of their task feed. Field experiments verify that Boomerang causes both workers and requesters to provide feedback that is more closely aligned with their private opinions. Inspired by a game-theoretic notion of incentive-compatibility, Boomerang opens opportunities for interaction design to incentivize honest reporting over strategic dishonesty.

HCJul 18, 2017
Prototype Tasks: Improving Crowdsourcing Results through Rapid, Iterative Task Design

Snehalkumar "Neil" S. Gaikwad, Nalin Chhibber, Vibhor Sehgal et al.

Low-quality results have been a long-standing problem on microtask crowdsourcing platforms, driving away requesters and justifying low wages for workers. To date, workers have been blamed for low-quality results: they are said to make as little effort as possible, do not pay attention to detail, and lack expertise. In this paper, we hypothesize that requesters may also be responsible for low-quality work: they launch unclear task designs that confuse even earnest workers, under-specify edge cases, and neglect to include examples. We introduce prototype tasks, a crowdsourcing strategy requiring all new task designs to launch a small number of sample tasks. Workers attempt these tasks and leave feedback, enabling the re- quester to iterate on the design before publishing it. We report a field experiment in which tasks that underwent prototype task iteration produced higher-quality work results than the original task designs. With this research, we suggest that a simple and rapid iteration cycle can improve crowd work, and we provide empirical evidence that requester "quality" directly impacts result quality.