Luiz Pizzato

CL
h-index: 8
Papers: 8
Citations: 53
Novelty: 34%
AI Score: 46

8 Papers

CL Mar 10, 2023
Detection of Abuse in Financial Transaction Descriptions Using Machine Learning

Anna Leontjeva, Genevieve Richards, Kaavya Sriskandaraja et al.

Since introducing changes to the New Payments Platform (NPP) to include longer messages as payment descriptions, it has been identified that people are now using it for communication, and in some cases, the system was being used as a targeted form of domestic and family violence. This type of tech-assisted abuse poses new challenges in terms of identification, actions and approaches to rectify this behaviour. Commonwealth Bank of Australia's Artificial Intelligence Labs team (CBA AI Labs) has developed a new system using advances in deep learning models for natural language processing (NLP) to create a powerful abuse detector that periodically scores all the transactions, and identifies cases of high-risk abuse in millions of records. In this paper, we describe the problem of tech-assisted abuse in the context of banking services, outline the developed model and its performance, and the operating framework more broadly.

CL Mar 12
ConCISE: A Reference-Free Conciseness Evaluation Metric for LLM-Generated Answers

Seyed Mohssen Ghafari, Ronny Kol, Juan C. Quiroz et al.

Large language models (LLMs) frequently generate responses that are lengthy and verbose, filled with redundant or unnecessary details. This diminishes clarity and user satisfaction, and it increases costs for model developers, especially with well-known proprietary models that charge based on the number of output tokens. In this paper, we introduce a novel reference-free metric for evaluating the conciseness of responses generated by LLMs. Our method quantifies non-essential content without relying on gold standard references and averages three calculations: i) a compression ratio between the original response and an LLM abstractive summary; ii) a compression ratio between the original response and an LLM extractive summary; and iii) word-removal compression, where an LLM removes as many non-essential words as possible from the response while preserving its meaning, with the number of tokens removed indicating the conciseness score. Experimental results demonstrate that our proposed metric identifies redundancy in LLM outputs, offering a practical tool for automated evaluation of response brevity in conversational AI systems without the need for ground truth human annotations.
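
The averaging step described in this abstract can be illustrated with a minimal sketch. It assumes the three shortened texts (abstractive summary, extractive summary, word-pruned response) have already been produced by an LLM and are passed in as strings. The whitespace tokenizer, the direction of the ratio (shortened length over original length), and the function names are assumptions made for illustration, not the paper's definitions.

```python
def token_count(text: str) -> int:
    """Count tokens by whitespace splitting (an assumption; a real
    setup would likely use the model's own tokenizer)."""
    return len(text.split())


def retained_ratio(original: str, shortened: str) -> float:
    """Fraction of the original's tokens that survive shortening.

    Values near 1.0 mean little could be removed, i.e. the original
    was already concise. Capped at 1.0 in case the shortened text
    happens to be longer than the original.
    """
    n = token_count(original)
    if n == 0:
        return 1.0
    return min(token_count(shortened) / n, 1.0)


def conciseness_score(original: str, abstractive: str,
                      extractive: str, pruned: str) -> float:
    """Average the three ratios into a single score (hypothetical
    normalisation: higher means more concise)."""
    ratios = [retained_ratio(original, s)
              for s in (abstractive, extractive, pruned)]
    return sum(ratios) / len(ratios)
```

For an 8-token response whose three shortened versions keep 4, 6, and 2 tokens, this sketch averages 0.5, 0.75, and 0.25 into a single score.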

AI May 14
Prompt Segmentation and Annotation Optimisation: Controlling LLM Behaviour via Optimised Segment-Level Annotations

Devika Prasad, Luke Gerschwitz, Tong Li et al.

Prompt engineering is crucial for effective interaction with generative artificial intelligence systems, yet existing optimisation methods often operate over an unstructured and vast prompt space, leading to high computational costs and potential distortions of the original intent. We introduce Prompt Segmentation and Annotation Optimisation (PSAO), a structured prompt optimisation framework designed to improve prompt optimisation controllability and efficiency. PSAO decomposes a prompt into interpretable segments (e.g., sentences) and augments each with human-readable annotations (e.g., {not important}, {important}, {very important}). These annotations guide large language models (LLMs) in allocating focus and clarifying confusion during response generation. We formally define the segmentations and annotations and demonstrate that optimised segment-level annotations can lead to improved LLM responses, with the original prompt retained as a candidate in the optimisation space to prevent performance degradation. Empirical evaluations indicate that PSAO benefits from annotations in terms of improved reasoning accuracy and self-consistency. However, developing efficient methods for identifying optimal segmentations and annotations remains challenging and is reserved for future investigation. This work is intended as a proof of concept, demonstrating the feasibility and potential of segment-level annotation optimisation.
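
A small sketch may help make the segment-and-annotate idea concrete. It splits a prompt into sentences, attaches one of the three annotation labels quoted in the abstract to each segment, and keeps the original prompt as a candidate. The sentence splitter, the placement of the label after each segment, the exhaustive enumeration, and the `score` callback are all assumptions for illustration; the abstract explicitly leaves efficient search methods to future work.

```python
import itertools
import re

# Labels taken from the abstract's examples.
LABELS = ("{not important}", "{important}", "{very important}")


def segment(prompt: str) -> list[str]:
    """Split a prompt into sentence-level segments (naive splitter)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", prompt.strip()) if s]


def annotate(segments: list[str], labels: tuple[str, ...]) -> str:
    """Append one label to each segment and rejoin them."""
    return " ".join(f"{seg} {lab}" for seg, lab in zip(segments, labels))


def candidates(prompt: str) -> list[str]:
    """Original prompt first, then every labelling of its segments.

    Exhaustive enumeration grows as 3**n, so this is only viable
    for very short prompts.
    """
    segs = segment(prompt)
    out = [prompt]
    for combo in itertools.product(LABELS, repeat=len(segs)):
        out.append(annotate(segs, combo))
    return out


def best_prompt(prompt: str, score) -> str:
    """Pick the candidate with the highest score; `score` is a
    hypothetical callable, e.g. accuracy on a validation set."""
    return max(candidates(prompt), key=score)
```

Because the unannotated prompt is always in the candidate list, the selected prompt can never score below the original under the chosen `score` function.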

DB Aug 5, 2025
A Robust and Efficient Pipeline for Enterprise-Level Large-Scale Entity Resolution

Sandeepa Kannangara, Arman Abrahamyan, Daniel Elias et al.

Entity resolution (ER) remains a significant challenge in data management, especially when dealing with large datasets. This paper introduces MERAI (Massive Entity Resolution using AI), a robust and efficient pipeline designed to address record deduplication and linkage issues in high-volume datasets at an enterprise level. The pipeline's resilience and accuracy have been validated through various large-scale record deduplication and linkage projects. To evaluate MERAI's performance, we compared it with two well-known entity resolution libraries, Dedupe and Splink. While Dedupe failed to scale beyond 2 million records due to memory constraints, MERAI successfully processed datasets of up to 15.7 million records and produced accurate results across all experiments. Experimental data demonstrates that MERAI outperforms both baseline systems in terms of matching accuracy, with consistently higher F1 scores in both deduplication and record linkage tasks. MERAI offers a scalable and reliable solution for enterprise-level large-scale entity resolution, ensuring data integrity and consistency in real-world applications.

LG Aug 1, 2025
FeatureCuts: Feature Selection for Large Data by Optimizing the Cutoff

Andy Hu, Devika Prasad, Luiz Pizzato et al.

In machine learning, the process of feature selection involves finding a reduced subset of features that captures most of the information required to train an accurate and efficient model. This work presents FeatureCuts, a novel feature selection algorithm that adaptively selects the optimal feature cutoff after performing filter ranking. Evaluated on 14 publicly available datasets and one industry dataset, FeatureCuts achieved, on average, 15 percentage points more feature reduction and up to 99.6% less computation time while maintaining model performance, compared to existing state-of-the-art methods. When the selected features are used in a wrapper method such as Particle Swarm Optimization (PSO), it enables 25 percentage points more feature reduction, requires 66% less computation time, and maintains model performance when compared to PSO alone. The minimal overhead of FeatureCuts makes it scalable for large datasets typically seen in enterprise applications.
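
The general pattern of "filter ranking, then choose a cutoff" can be sketched as follows. This is not the FeatureCuts algorithm: the abstract does not describe how the cutoff is optimised, so the sketch uses absolute Pearson correlation as the filter and a naive scan over every cutoff, both of which are assumptions. The `evaluate` callback and the `tolerance` parameter are hypothetical.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation; returns 0.0 for constant inputs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def rank_features(columns: dict[str, list[float]],
                  target: list[float]) -> list[str]:
    """Filter step: order feature names by |correlation| with target."""
    return sorted(columns,
                  key=lambda name: abs(pearson(columns[name], target)),
                  reverse=True)


def choose_cutoff(ranked: list[str], evaluate,
                  tolerance: float = 0.01) -> list[str]:
    """Return the shortest ranked prefix whose score is within
    `tolerance` of the best prefix score.

    `evaluate` is a hypothetical callable mapping a feature list to
    a model score. Scanning every cutoff is the naive baseline.
    """
    scores = [evaluate(ranked[:k]) for k in range(1, len(ranked) + 1)]
    best = max(scores)
    for k, s in enumerate(scores, start=1):
        if s >= best - tolerance:
            return ranked[:k]
    return ranked
```

The scan above trains or scores one model per candidate cutoff, which is the cost an adaptive cutoff search would aim to avoid on large datasets.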

ST Oct 28, 2024
Do LLM Personas Dream of Bull Markets? Comparing Human and AI Investment Strategies Through the Lens of the Five-Factor Model

Harris Borman, Anna Leontjeva, Luiz Pizzato et al.

Large Language Models (LLMs) have demonstrated the ability to adopt a personality and behave in a human-like manner. There is a large body of research that investigates the behavioural impacts of personality in less obvious areas such as investment attitudes or creative decision making. In this study, we investigated whether an LLM persona with a specific Big Five personality profile would perform an investment task similarly to a human with the same personality traits. We used a simulated investment task to determine if these results could be generalised into actual behaviours. In this simulated environment, our results show these personas produced meaningful behavioural differences in all assessed categories, with these behaviours generally being consistent with expectations derived from human research. We found that LLMs are able to generalise traits into expected behaviours in three areas (learning style, impulsivity and risk appetite), while environmental attitudes could not be accurately represented. In addition, we showed that LLMs produce behaviour that is more reflective of human behaviour in a simulation environment compared to a survey environment.

SI Jul 17, 2020
Reciprocal Recommender Systems: Analysis of State-of-Art Literature, Challenges and Opportunities towards Social Recommendation

Ivan Palomares, Carlos Porcel, Luiz Pizzato et al.

There exist situations of decision-making under information overload on the Internet, where people have an overwhelming number of available options to choose from, e.g. products to buy in an e-commerce site, or restaurants to visit in a large city. Recommender systems arose as a data-driven personalized decision support tool to assist users in these situations: they are able to process user-related data, filtering and recommending items based on the users' preferences, needs and/or behaviour. Unlike most conventional recommender approaches where items are inanimate entities recommended to the users and success is solely determined upon the end user's reaction to the recommendation(s) received, in a Reciprocal Recommender System (RRS) users become the item being recommended to other users. Hence, both the end user and the user being recommended should accept the 'matching' recommendation to yield a successful RRS performance. The operation of an RRS entails not only predicting accurate preference estimates upon user interaction data as classical recommenders do, but also calculating mutual compatibility between (pairs of) users, typically by applying fusion processes on unilateral user-to-user preference information. This paper presents a snapshot-style analysis of the extant literature that summarizes the state-of-the-art RRS research to date, focusing on the algorithms, fusion processes and fundamental characteristics of RRS, both inherited from conventional user-to-item recommendation models and those inherent to this emerging family of approaches. Representative RRS models are likewise highlighted. Following this, we discuss the challenges and opportunities for future research on RRSs, with special focus on (i) fusion strategies to account for reciprocity and (ii) emerging application domains related to social recommendation.
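
The fusion step mentioned in this abstract, combining two unilateral preference estimates into one mutual compatibility score, can be sketched with the harmonic mean. The harmonic mean is one fusion operator used in reciprocal recommendation, chosen here as an assumption since the abstract does not name a specific one. The `prefs` dictionary layout and the function names are hypothetical.

```python
def harmonic_mean(p_ab: float, p_ba: float) -> float:
    """Fuse two unilateral preferences in [0, 1].

    The result is low whenever either side is low, so a one-sided
    match is penalised more than with an arithmetic mean.
    """
    if p_ab <= 0 or p_ba <= 0:
        return 0.0
    return 2 * p_ab * p_ba / (p_ab + p_ba)


def reciprocal_ranking(user: str,
                       prefs: dict[tuple[str, str], float]
                       ) -> list[tuple[str, float]]:
    """Rank other users for `user` by mutual compatibility.

    `prefs[(a, b)]` is the estimated preference of a for b; a
    missing reverse entry is treated as 0.0.
    """
    scores = {
        other: harmonic_mean(p, prefs.get((other, user), 0.0))
        for (src, other), p in prefs.items()
        if src == user
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

With this fusion, a candidate whom the user likes strongly but who shows little interest in return ranks below a candidate with moderate interest on both sides.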

IR May 1, 2019
Beyond Personalization: Research Directions in Multistakeholder Recommendation

Himan Abdollahpouri, Gediminas Adomavicius, Robin Burke et al.

Recommender systems are personalized information access applications; they are ubiquitous in today's online environment, and effective at finding items that meet user needs and tastes. As the reach of recommender systems has extended, it has become apparent that the single-minded focus on the user common to academic research has obscured other important aspects of recommendation outcomes. Properties such as fairness, balance, profitability, and reciprocity are not captured by typical metrics for recommender system evaluation. The concept of multistakeholder recommendation has emerged as a unifying framework for describing and understanding recommendation settings where the end user is not the sole focus. This article describes the origins of multistakeholder recommendation, and the landscape of system designs. It provides illustrative examples of current research, as well as outlining open questions and research directions for the field.