Gonçalo Marcelino

MM
h-index: 22
4 papers
8 citations
Novelty: 20%
AI Score: 20

4 Papers

CL · Apr 1, 2025
Scraping the Shadows: Deep Learning Breakthroughs in Dark Web Intelligence

Ingmar Bakermans, Daniel De Pascale, Gonçalo Marcelino et al.

Darknet markets (DNMs) facilitate the trade of illegal goods on a global scale. Gathering data on DNMs is critical to ensuring law enforcement agencies can effectively combat crime. Manually extracting data from DNMs is an error-prone and time-consuming task. Aiming to automate this process, we develop a framework for extracting data from DNMs and evaluate three state-of-the-art Named Entity Recognition (NER) models, ELMo-BiLSTM (Shah et al., 2022), UniversalNER (Zhou et al., 2024), and GLiNER (Zaratiana et al., 2023), on the task of extracting complex entities from DNM product listing pages. We propose a new annotated dataset, which we use to train, fine-tune, and evaluate the models. Our findings show that state-of-the-art NER models perform well in information extraction from DNMs, achieving 91% precision, 96% recall, and an F1 score of 94%. In addition, fine-tuning enhances model performance, with UniversalNER achieving the best performance.
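As a rough illustration of how the precision, recall, and F1 figures quoted above are typically computed for NER, the sketch below scores predicted entity spans against gold annotations using exact span-and-label matching. The matching criterion, the function name `precision_recall_f1`, the entity labels, and the toy spans are all assumptions made for illustration; the abstract does not specify the paper's evaluation protocol.

```python
from typing import Set, Tuple

# An entity is modelled as (start_token, end_token, label).
# This representation and the labels used below are hypothetical.
Entity = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Score predictions against gold entities using exact matching."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: two of three predictions match the three gold entities.
gold_entities = {(0, 2, "PRODUCT"), (5, 6, "PRICE"), (9, 10, "ORIGIN")}
predicted_entities = {(0, 2, "PRODUCT"), (5, 6, "PRICE"), (7, 8, "QUANTITY")}

p, r, f1 = precision_recall_f1(gold_entities, predicted_entities)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Exact matching is the strictest common choice; partial-overlap or token-level scoring would give different numbers for the same predictions.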

MM · Oct 13, 2021
Assisting News Media Editors with Cohesive Visual Storylines

Gonçalo Marcelino, David Semedo, André Mourão et al.

Creating a cohesive, high-quality, relevant media story is a challenge that news media editors face on a daily basis. This challenge is aggravated by the flood of highly relevant information that is constantly pouring into the newsroom. To assist news media editors in this daunting task, this paper proposes a framework to organize news content into cohesive, high-quality, relevant visual storylines. First, we formalize, in a nonsubjective manner, the concept of visual story transition. Leveraging it, we propose four graph-based methods of storyline creation, aiming for global story cohesiveness. These were created and implemented to take full advantage of existing graph algorithms, ensuring their correctness and good computational performance. They leverage a strong ensemble-based estimator, which was trained to predict story transition quality based on both the semantic and visual features present in the pair of images under scrutiny. A user study covered a total of 28 curated stories about sports and cultural events. Experiments showed that (i) visual transitions in storylines can be learned with a quality above 90%, and (ii) the proposed graph methods can produce cohesive storylines with quality in the range of 88% to 96%.

MM · Aug 9, 2019
A Benchmark of Visual Storytelling in Social Media

Gonçalo Marcelino, David Semedo, André Mourão et al.

Media editors in the newsroom are constantly pressed to provide "like being there" coverage of live events. Social media provides a disorganised collection of images and videos that media professionals need to grasp before publishing their latest news updates. Automated news visual storyline editing with social media content can be very challenging, as it entails not only finding the right content but also making sure that news content evolves coherently over time. To tackle these issues, this paper proposes a benchmark for assessing social media visual storylines. The SocialStories benchmark, comprising a total of 40 curated stories covering sports and cultural events, provides the experimental setup and introduces novel quantitative metrics to perform a rigorous evaluation of visual storytelling with social media data.

IR · Oct 9, 2018
Ranking News-Quality Multimedia

Gonçalo Marcelino, Ricardo Pinto, João Magalhães

News editors need to find the photos that best illustrate a news piece and fulfill news-media quality standards, while being pressed to also find the most recent photos of live events. Recently, it became common to use social-media content in the context of news media for its unique value in terms of immediacy and quality. Consequently, the number of images to be considered and filtered is now too large to be handled by a single person. To aid the news editor in this process, we propose a framework designed to deliver high-quality, news-press type photos to the user. The framework, composed of two parts, is based on a ranking algorithm tuned to rank professional media highly and a visual SPAM detection module designed to filter out low-quality media. The core ranking algorithm leverages aesthetic, social, and deep-learning semantic features. Evaluation showed that the proposed framework is effective at finding high-quality photos (true-positive rate), achieving a retrieval MAP of 64.5% and a classification precision of 70%.
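For readers unfamiliar with the retrieval metric mentioned above, the following is a minimal sketch of mean average precision (MAP) over ranked result lists. It assumes binary relevance judgments and that every relevant item appears somewhere in its ranked list; the function names and toy rankings are hypothetical and are not taken from the paper's evaluation setup.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one ranked list.

    relevance[i] is True when the item at rank i + 1 is relevant.
    Assumes all relevant items are present in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Toy rankings for two hypothetical queries.
toy_rankings = [
    [True, False, True],  # relevant photos at ranks 1 and 3
    [False, True],        # relevant photo at rank 2
]
print(f"MAP = {mean_average_precision(toy_rankings):.3f}")
```

Because precision is sampled only at the ranks of relevant items, MAP rewards rankings that place high-quality photos near the top of the list.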