Alejandro Buendia

CL
h-index 35
4 papers
1,393 citations
Novelty 52%
AI Score 30

4 Papers

CL · Oct 19, 2022
TabLLM: Few-shot Classification of Tabular Data with Large Language Models

Stefan Hegselmann, Alejandro Buendia, Hunter Lang et al. · mit

We study the application of large language models to zero-shot and few-shot classification of tabular data. We prompt the large language model with a serialization of the tabular data to a natural-language string, together with a short description of the classification problem. In the few-shot setting, we fine-tune the large language model using some labeled examples. We evaluate several serialization methods including templates, table-to-text models, and large language models. Despite its simplicity, we find that this technique outperforms prior deep-learning-based tabular classification methods on several benchmark datasets. In most cases, even zero-shot classification obtains non-trivial performance, illustrating the method's ability to exploit prior knowledge encoded in large language models. Unlike many deep learning methods for tabular datasets, this approach is also competitive with strong traditional baselines like gradient-boosted trees, especially in the very-few-shot setting.
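The template-based serialization mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the "The <column> is <value>." sentence pattern, the function names `serialize_row` and `build_prompt`, and the example row and question are all assumptions made for the sketch.

```python
def serialize_row(row):
    """Turn one table row (a dict of column -> value) into a natural-language string.

    Assumed template: one "The <column> is <value>." sentence per feature.
    """
    return " ".join(f"The {col} is {val}." for col, val in row.items())


def build_prompt(row, question):
    """Combine the serialized row with a short task description (hypothetical layout)."""
    return f"{serialize_row(row)}\n\n{question}\nAnswer:"


if __name__ == "__main__":
    # Hypothetical row and question, for illustration only.
    example_row = {"age": 42, "education": "Masters", "occupation": "engineer"}
    question = "Does this person earn more than 50000 dollars per year? Yes or no?"
    print(build_prompt(example_row, question))
```

In a zero-shot setting the resulting string would be passed to the language model as-is; in a few-shot setting, pairs of such prompts and labels would serve as fine-tuning examples.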

NA · Aug 8, 2018
Random Walk Laplacian and Network Centrality Measures

Daniel Boley, Alejandro Buendia, Golshan Golnari

Random walks over directed graphs are used to model activities in many domains, such as social networks, influence propagation, and Bayesian graphical models. They are often used to compute the importance or centrality of individual nodes according to a variety of criteria. Here we show how the pseudoinverse of the "random walk" Laplacian can be used to quickly compute measures such as the average number of visits to a given node and various centrality and betweenness measures for individual nodes, both for the network in general and when a subset of nodes is to be avoided. We show that with a single matrix inversion it is possible to rapidly compute many such quantities.
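One of the quantities named in the abstract, the average number of visits to a node before the walk reaches a set of avoided nodes, can be illustrated with a small example. The sketch below does not reproduce the paper's pseudoinverse formulation. It swaps in the standard absorbing-chain result instead: with Q the transition matrix restricted to the non-avoided nodes, the entries of N = (I - Q)^{-1} give the expected visit counts. The function names and the example graph are assumptions.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("I - Q is singular; some kept node may never reach an avoided node")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_val = aug[col][col]
        aug[col] = [v / pivot_val for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def expected_visits(P, avoid, target):
    """Expected visits to `target` (counting the start) before the walk hits `avoid`.

    P is a row-stochastic transition matrix given as a list of lists.
    Returns a dict mapping each non-avoided start node to its expected visit count.
    """
    keep = [i for i in range(len(P)) if i not in avoid]
    # Build I - Q, where Q is P restricted to the kept nodes.
    i_minus_q = [[(1.0 if i == j else 0.0) - P[i][j] for j in keep] for i in keep]
    # The target's column of N = (I - Q)^{-1} solves (I - Q) x = e_target.
    e_target = [1.0 if j == target else 0.0 for j in keep]
    return dict(zip(keep, solve_linear_system(i_minus_q, e_target)))


if __name__ == "__main__":
    # Hypothetical 4-node path graph 0 - 1 - 2 - 3; nodes 0 and 3 are to be avoided.
    P = [
        [1.0, 0.0, 0.0, 0.0],
        [0.5, 0.0, 0.5, 0.0],
        [0.0, 0.5, 0.0, 0.5],
        [0.0, 0.0, 0.0, 1.0],
    ]
    print(expected_visits(P, avoid={0, 3}, target=1))
```

Solving one linear system per target keeps the sketch short; computing the full inverse once would give all start/target pairs at the same time, which is closer in spirit to the single-inversion point made in the abstract.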

HC · Jan 17, 2024
Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study

Niklas Mannhardt, Elizabeth Bondi-Kelly, Barbara Lam et al. · microsoft-research

Large language models (LLMs) have immense potential to make information more accessible, particularly in medicine, where complex medical jargon can hinder patient comprehension of clinical notes. We developed a patient-facing tool using LLMs to make clinical notes more readable by simplifying, extracting information from, and adding context to the notes. We piloted the tool with clinical notes donated by patients with a history of breast cancer and synthetic notes from a clinician. Participants (N=200, healthy, female-identifying patients) were randomly assigned three clinical notes in our tool with varying levels of augmentations and answered quantitative and qualitative questions evaluating their understanding of follow-up actions. Augmentations significantly increased their quantitative understanding scores. In-depth interviews were conducted with participants (N=7, patients with a history of breast cancer), revealing both positive sentiments about the augmentations and concerns about AI. We also performed a qualitative clinician-driven analysis of the model's error modes.

CL · Jun 9, 2020
Examination and Extension of Strategies for Improving Personalized Language Modeling via Interpolation

Liqun Shao, Sahitya Mantravadi, Tom Manzini et al.

In this paper, we detail novel strategies for interpolating personalized language models and methods for handling out-of-vocabulary (OOV) tokens, both aimed at improving personalized language modeling. Using publicly available data from Reddit, we demonstrate improvements in offline metrics at the user level by interpolating a global LSTM-based authoring model with a user-personalized n-gram model. By optimizing this approach with a back-off to uniform OOV penalty and the interpolation coefficient, we observe that over 80% of users receive a lift in perplexity, with an average perplexity lift of 5.2% per user. In doing this research we extend previous work in building NLIs and improve the robustness of metrics for downstream tasks.
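The interpolation idea in this abstract can be sketched in a few lines. This is an illustrative simplification, not the authors' implementation: the global LSTM is replaced by a fixed table of word probabilities, the user model is a unigram model rather than a general n-gram model, and the "back-off to uniform" step is assumed to mean substituting 1/vocab_size for any word a model has not seen. All names and numbers are hypothetical.

```python
import math
from collections import Counter


def unigram_model(tokens):
    """Maximum-likelihood unigram probabilities from a user's token history."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolated_prob(word, global_probs, user_probs, lam, vocab_size):
    """Linear interpolation: lam * p_user(word) + (1 - lam) * p_global(word).

    Words missing from a model back off to a uniform 1 / vocab_size (assumed OOV rule).
    The result is not renormalized, so this is a scoring sketch, not a proper distribution.
    """
    uniform = 1.0 / vocab_size
    p_user = user_probs.get(word, uniform)
    p_global = global_probs.get(word, uniform)
    return lam * p_user + (1.0 - lam) * p_global


def perplexity(tokens, global_probs, user_probs, lam, vocab_size):
    """Per-token perplexity of `tokens` under the interpolated model."""
    log_sum = sum(
        math.log(interpolated_prob(w, global_probs, user_probs, lam, vocab_size))
        for w in tokens
    )
    return math.exp(-log_sum / len(tokens))


if __name__ == "__main__":
    # Hypothetical stand-in for the global model's output probabilities.
    global_probs = {"the": 0.5, "cat": 0.25, "dog": 0.25}
    user_probs = unigram_model(["cat", "cat", "the", "cat"])
    held_out = ["cat", "the"]
    for lam in (0.0, 0.5):
        print(lam, perplexity(held_out, global_probs, user_probs, lam, vocab_size=4))
```

Setting `lam` to 0 recovers the global model alone, so comparing perplexity at `lam = 0` against a tuned `lam` per user gives the kind of per-user lift the abstract reports.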