RO · Aug 16, 2024 · Code
S-RAF: A Simulation-Based Robustness Assessment Framework for Responsible Autonomous Driving
Daniel Omeiza, Pratik Somaiya, Jo-Ann Pattinson et al.
As artificial intelligence (AI) technology advances, ensuring the robustness and safety of AI-driven systems has become paramount. However, varying perceptions of robustness among AI developers create misaligned evaluation metrics, complicating the assessment and certification of safety-critical and complex AI systems such as autonomous driving (AD) agents. To address this challenge, we introduce the Simulation-Based Robustness Assessment Framework (S-RAF) for autonomous driving. S-RAF leverages the CARLA driving simulator to rigorously assess AD agents across diverse conditions, including faulty sensors, environmental changes, and complex traffic situations. By quantifying robustness and its relationship with other safety-critical factors, such as carbon emissions, S-RAF aids developers and stakeholders in building safe and responsible driving agents and in streamlining safety certification processes. Furthermore, S-RAF offers significant advantages, such as reduced testing costs and the ability to explore edge cases that may be unsafe to test in the real world. The code for this framework is available here: https://github.com/cognitive-robots/rai-leaderboard
CY · Mar 6
What are AI researchers worried about?
Cian O'Donovan, Sarp Gurakan, Ananya Karanam et al.
As AI attracts vast investment and attention, there are competing concerns about the technology's opportunities and uncertainties that blend technical and social questions. The public debate, dominated by a few powerful voices, tends to highlight extreme promises and threats. We wanted to know whether public discussions or technology companies' priorities were representative of AI researchers' opinions. Our survey of more than 4,000 AI researchers is, we think, the largest conducted to date. It was designed to understand attitudes to a variety of issues and to include some comparisons with public attitudes derived from existing surveys. Most previous surveys of AI researchers have asked them to predict when a technological threshold will be passed or to estimate the probability of some catastrophic event. These surveys mask the uncertainty and diversity that normally characterise scientific research. Our hypothesis was that the opinions of AI researchers would be markedly different from those of members of the public. While there are areas of divergence, particularly in attitudes to the technology's potential benefits, our survey shows some surprising convergence between researchers' and publics' opinions, particularly in the assessment and prioritisation of risk. Responses to an open text question 'What one thing most worries you about AI?' reveal that only 3% of AI researchers prioritise existential risks, despite the prominence given to these risks in media and policy. AI technologies and AI researchers seem to be much more 'normal' than public representations suggest. Our survey results suggest the possibility of new forms of public dialogue on AI's harms, risks and opportunities. Rather than speculating on future potential risks, policymakers and AI researchers should collaborate on understanding and mitigating the range of sociotechnical risks that are already of clear public concern.
CV · Aug 27, 2025
Assessing the Geolocation Capabilities, Limitations and Societal Risks of Generative Vision-Language Models
Oliver Grainge, Sania Waheed, Jack Stilgoe et al.
Geo-localization is the task of identifying the location of an image using visual cues alone. It has beneficial applications, such as improving disaster response, enhancing navigation, and supporting geography education. Recently, Vision-Language Models (VLMs) have increasingly demonstrated capabilities as accurate image geo-locators. This brings significant privacy risks, including those related to stalking and surveillance, given the widespread use of AI models and the sharing of photos on social media. The precision of these models is likely to improve in the future. Despite these risks, there is little work on systematically evaluating the geolocation precision of generative VLMs, their limits, and their potential for unintended inferences. To bridge this gap, we conduct a comprehensive assessment of the geolocation capabilities of 25 state-of-the-art VLMs on four benchmark image datasets captured in diverse environments. Our results offer insight into the internal reasoning of VLMs and highlight their strengths, limitations, and potential societal risks. Our findings indicate that current VLMs perform poorly on generic street-level images yet achieve notably high accuracy (61%) on images resembling social media content, raising significant and urgent privacy concerns.