Whanhee Cho


2 Papers

40.0 | DB | May 19
Example-Driven Intent Synthesis for Constrained Data Bundle Retrieval: Focused Text Snippet Extraction and Beyond

Whanhee Cho, Kuangfei Long, Mahmood Jasim et al.

Selecting a bundle of items that collectively satisfies constraints is a fundamental task across databases, recommender systems, and text summarization. Unlike traditional retrieval that returns individual or top-k items, bundle retrieval is inherently combinatorial and, in general, NP-hard. Although package queries can efficiently retrieve bundles given a well-formed query, two key user-centric challenges remain: (1) expressing and tuning multi-dimensional bundle intent through a user-friendly interface, and (2) ensuring feasibility when the query yields empty results. We introduce Ex2Bundle, an Example-driven Bundle retrieval framework that enables users to specify their intent through example bundles and automatically synthesizes package queries that capture the intent implicit in those example bundles via aggregate constraints. Ex2Bundle also addresses a challenge unique to bundle retrieval: when inferred aggregate constraints are infeasible over the target data, our data-aware constraint relaxation minimally adjusts the constraint bounds while preserving alignment with user intent. We instantiate a specific application of focused text snippet extraction by example to demonstrate the efficacy of the Ex2Bundle framework. Extensive experiments over real-world datasets and a user study demonstrate that Ex2Bundle improves usability and consistently returns intent-aligned bundles even under distributional shifts of the target database.
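The two steps described in the abstract, inferring aggregate constraints from example bundles and relaxing them when the target data admits no feasible bundle, can be illustrated with a toy sketch. This is not the Ex2Bundle implementation: the function names, the single SUM constraint, the fixed bundle size, and the brute-force enumeration are all assumptions made for illustration, and the exhaustive search is exponential in the number of items.

```python
from itertools import combinations

def infer_bounds(example_bundles, attr):
    """Derive [lo, hi] bounds on SUM(attr) from the example bundles (hypothetical helper)."""
    totals = [sum(item[attr] for item in bundle) for bundle in example_bundles]
    return min(totals), max(totals)

def retrieve_with_relaxation(items, attr, lo, hi, size):
    """Return a bundle of `size` items and the bounds it satisfies.

    If some bundle already has SUM(attr) in [lo, hi], the bounds are returned
    unchanged. Otherwise the bounds are widened just enough to admit the
    bundle whose total lies closest to the original interval.
    Assumes len(items) >= size.
    """
    def total(bundle):
        return sum(item[attr] for item in bundle)

    def gap(bundle):
        # Distance of the bundle's total from the interval [lo, hi]; 0 if inside.
        t = total(bundle)
        return max(lo - t, t - hi, 0)

    best = min(combinations(items, size), key=gap)
    t = total(best)
    return best, (min(lo, t), max(hi, t))

# Example bundles express the intent: two snippets totalling 100 to 120 words.
examples = [[{"words": 40}, {"words": 60}], [{"words": 50}, {"words": 70}]]
lo, hi = infer_bounds(examples, "words")

# The target data has only short snippets, so the inferred bounds are infeasible.
target = [{"words": 10}, {"words": 20}, {"words": 30}]
bundle, relaxed = retrieve_with_relaxation(target, "words", lo, hi, size=2)
```

In this sketch only the violated lower bound moves, which mirrors the idea of adjusting bounds minimally; a practical system would rely on a package-query or integer-programming solver rather than enumeration.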

CL | Nov 7, 2024 | Code
ACCIO: Table Understanding Enhanced via Contrastive Learning with Aggregations

Whanhee Cho

Attention to table understanding with recent natural language models has been growing. However, most related work focuses on learning the structure of the table directly. Just as humans improve their understanding of sentences by comparing them, they can also improve their understanding of tables by comparing them. Building on this idea, we introduce ACCIO (tAble understanding enhanCed via Contrastive learnIng with aggregatiOns), a novel approach that enhances table understanding by contrasting original tables with their pivot summaries through contrastive learning. ACCIO trains an encoder to bring these table pairs closer together. Validated on column type annotation, ACCIO achieves a macro F1 score of 91.1, which is competitive with state-of-the-art methods. This work represents the first attempt to utilize pairs of tables for table embedding, promising significant advancements in table comprehension. Our code is available at https://github.com/whnhch/ACCIO/.
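The idea of pulling a table and its pivot summary closer together can be sketched with a standard contrastive objective. The abstract does not state which loss ACCIO uses, so the InfoNCE form below, the cosine similarity, the temperature value, and the function names are assumptions for illustration, and the embeddings are hand-written vectors standing in for encoder outputs.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(table_embs, pivot_embs, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    pivot_embs[i] is treated as the positive for table_embs[i]; every other
    pivot embedding in the batch acts as a negative.
    """
    losses = []
    for i, t in enumerate(table_embs):
        logits = [cosine(t, p) / temperature for p in pivot_embs]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)

# Toy embeddings: each table matches its own pivot summary in the aligned case.
tables = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(tables, [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce(tables, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each table embedding is most similar to its own pivot summary and large when the pairs are mismatched, which is the signal an encoder would be trained to minimize.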