Vijay Saraswat

CL
5papers
1,899citations
Novelty48%
AI Score27

5 Papers

CLNov 24, 2021
Handling tree-structured text: parsing directory pages

Sarang Shrivastava, Afreen Shaikh, Shivani Shrivastava et al.

The determination of the reading sequence of text is fundamental to document understanding. This problem is easily solved in pages where the text is organized into a sequence of lines and vertical alignment runs the height of the page (producing multiple columns which can be read from left to right). We present a situation -- the directory page parsing problem -- where information is presented on the page in an irregular, visually-organized, two-dimensional format. Directory pages are fairly common in financial prospectuses and carry information about organizations, their addresses and relationships that is key to business tasks in client onboarding. Interestingly, directory pages sometimes have hierarchical structure, motivating the need to generalize the reading sequence to a reading tree. We present solutions to the problem of identifying directory pages and constructing the reading tree, using (learnt) classifiers for text segments and a bottom-up (right to left, bottom-to-top) traversal of segments. The solution is a key part of a production service supporting automatic extraction of organization, address and relationship information from client onboarding documents.

CLJun 1, 2021
KGPool: Dynamic Knowledge Graph Context Selection for Relation Extraction

Abhishek Nadgeri, Anson Bastos, Kuldeep Singh et al.

We present a novel method for relation extraction (RE) from a single sentence, mapping the sentence and two given entities to a canonical fact in a knowledge graph (KG). Especially in this presumed sentential RE setting, the context of a single sentence is often sparse. This paper introduces the KGPool method to address this sparsity, dynamically expanding the context with additional facts from the KG. It learns the representation of these facts (entity alias, entity descriptions, etc.) using neural methods, supplementing the sentential context. Unlike existing methods that statically use all expanded facts, KGPool conditions this expansion on the sentence. We study the efficacy of KGPool by evaluating it with different neural models and KGs (Wikidata and NYT Freebase). Our experimental evaluation on standard datasets shows that by feeding the KGPool representation into a Graph Neural Network, the overall method is significantly more accurate than state-of-the-art methods.

CLNov 14, 2017
Learning an Executable Neural Semantic Parser

Jianpeng Cheng, Siva Reddy, Vijay Saraswat et al.

This paper describes a neural semantic parser that maps natural language utterances onto logical forms which can be executed against a task-specific environment, such as a knowledge base or a database, to produce a response. The parser generates tree-structured logical forms with a transition-based approach which combines a generic tree-generation algorithm with domain-general operations defined by the logical language. The generation process is modeled by structured recurrent neural networks, which provide a rich encoding of the sentential context and generation history for making predictions. To tackle mismatches between natural language and logical form tokens, various attention mechanisms are explored. Finally, we consider different training settings for the neural semantic parser, including a fully supervised training where annotated logical forms are given, weakly-supervised training where denotations are provided, and distant supervision where only unlabeled sentences and a knowledge base are available. Experiments across a wide range of datasets demonstrate the effectiveness of our parser.

CLApr 27, 2017
Learning Structured Natural Language Representations for Semantic Parsing

Jianpeng Cheng, Siva Reddy, Vijay Saraswat et al.

We introduce a neural semantic parser that converts natural language utterances to intermediate representations in the form of predicate-argument structures, which are induced with a transition system and subsequently mapped to target domains. The semantic parser is trained end-to-end using annotated logical forms or their denotations. We obtain competitive results on various datasets. The induced predicate-argument structures shed light on the types of representations useful for semantic parsing and how these are different from linguistically motivated ones.

AIApr 24, 2015
Logical Conditional Preference Theories

Cristina Cornelio, Andrea Loreggia, Vijay Saraswat

CP-nets represent the dominant existing framework for expressing qualitative conditional preferences between alternatives, and are used in a variety of areas including constraint solving. Over the last fifteen years, a significant literature has developed exploring semantics, algorithms, implementation and use of CP-nets. This paper introduces a comprehensive new framework for conditional preferences: logical conditional preference theories (LCP theories). To express preferences, the user specifies arbitrary (constraint) Datalog programs over a binary ordering relation on outcomes. We show how LCP theories unify and generalize existing conditional preference proposals, and leverage the rich semantic, algorithmic and implementation frameworks of Datalog.