Alhim Vera

AI
h-index4
3papers
11citations
Novelty40%
AI Score30

3 Papers

CVSep 11, 2024
ODYSSEE: Oyster Detection Yielded by Sensor Systems on Edge Electronics

Xiaomin Lin, Vivek Mange, Arjun Suresh et al.

Oysters are a vital keystone species in coastal ecosystems, providing significant economic, environmental, and cultural benefits. As the importance of oysters grows, so does the relevance of autonomous systems for their detection and monitoring. However, current monitoring strategies often rely on destructive methods. While manual identification of oysters from video footage is non-destructive, it is time-consuming, requires expert input, and is further complicated by the challenges of the underwater environment. To address these challenges, we propose a novel pipeline using stable diffusion to augment a collected real dataset with realistic synthetic data. This method enhances the dataset used to train a YOLOv10-based vision model. The model is then deployed and tested on an edge platform in underwater robotics, achieving a state-of-the-art 0.657 mAP@50 for oyster detection on the Aqua2 platform.

AIOct 9, 2025
Multimodal Safety Evaluation in Generative Agent Social Simulations

Alhim Vera, Karen Sanchez, Carlos Hinojosa et al.

Can generative agents be trusted in multimodal environments? Despite advances in large language and vision-language models that enable agents to act autonomously and pursue goals in rich settings, their ability to reason about safety, coherence, and trust across modalities remains limited. We introduce a reproducible simulation framework for evaluating agents along three dimensions: (1) safety improvement over time, including iterative plan revisions in text-visual scenarios; (2) detection of unsafe activities across multiple categories of social situations; and (3) social dynamics, measured as interaction counts and acceptance ratios of social exchanges. Agents are equipped with layered memory, dynamic planning, multimodal perception, and are instrumented with SocialMetrics, a suite of behavioral and structural metrics that quantifies plan revisions, unsafe-to-safe conversions, and information diffusion across networks. Experiments show that while agents can detect direct multimodal contradictions, they often fail to align local revisions with global safety, reaching only a 55 percent success rate in correcting unsafe plans. Across eight simulation runs with three models - Claude, GPT-4o mini, and Qwen-VL - five agents achieved average unsafe-to-safe conversion rates of 75, 55, and 58 percent, respectively. Overall performance ranged from 20 percent in multi-risk scenarios with GPT-4o mini to 98 percent in localized contexts such as fire/heat with Claude. Notably, 45 percent of unsafe actions were accepted when paired with misleading visuals, showing a strong tendency to overtrust images. These findings expose critical limitations in current architectures and provide a reproducible platform for studying multimodal safety, coherence, and social dynamics.

AIMay 6, 2025
Is AI currently capable of identifying wild oysters? A comparison of human annotators against the AI model, ODYSSEE

Brendan Campbell, Alan Williams, Kleio Baxevani et al.

Oysters are ecologically and commercially important species that require frequent monitoring to track population demographics (e.g. abundance, growth, mortality). Current methods of monitoring oyster reefs often require destructive sampling methods and extensive manual effort. Therefore, they are suboptimal for small-scale or sensitive environments. A recent alternative, the ODYSSEE model, was developed to use deep learning techniques to identify live oysters using video or images taken in the field of oyster reefs to assess abundance. The validity of this model in identifying live oysters on a reef was compared to expert and non-expert annotators. In addition, we identified potential sources of prediction error. Although the model can make inferences significantly faster than expert and non-expert annotators (39.6 s, $2.34 \pm 0.61$ h, $4.50 \pm 1.46$ h, respectively), the model overpredicted the number of live oysters, achieving lower accuracy (63\%) in identifying live oysters compared to experts (74\%) and non-experts (75\%) alike. Image quality was an important factor in determining the accuracy of the model and the annotators. Better quality images improved human accuracy and worsened model accuracy. Although ODYSSEE was not sufficiently accurate, we anticipate that future training on higher-quality images, utilizing additional live imagery, and incorporating additional annotation training classes will greatly improve the model's predictive power based on the results of this analysis. Future research should address methods that improve the detection of living vs. dead oysters.