DB Apr 4
Direct Access for Answers to Conjunctive Queries with Aggregation
Idan Eldar, Nofar Carmeli, Benny Kimelfeld
We study the fine-grained complexity of conjunctive queries with grouping and aggregation. For common aggregate functions (e.g., min, max, count, sum), such a query can be phrased as an ordinary conjunctive query over a database annotated with a suitable commutative semiring. We investigate the ability to evaluate such queries by constructing in loglinear time a data structure that provides logarithmic-time direct access to the answers ordered by a given lexicographic order. This task is nontrivial since the number of answers might be larger than loglinear in the size of the input, so the data structure needs to provide a compact representation of the space of answers. In the absence of aggregation and annotation, past research established a sufficient tractability condition on queries and orders. For queries without self-joins, this condition is not just sufficient, but also necessary (under conventional lower-bound assumptions in fine-grained complexity). We show that all past results continue to hold for annotated databases, assuming that the annotation itself does not participate in the lexicographic order. Yet, past algorithms do not apply to the count-distinct aggregation, which has no efficient representation as a commutative semiring; for this aggregation, we establish the corresponding tractability condition. We then show how the complexity of the problem changes when we include the aggregate and annotation value in the order. We also study the impact of having all relations but one annotated by the multiplicative identity (one), as happens when we translate aggregate queries into semiring annotations, and having a semiring with an idempotent addition, such as the case of min, max, and count-distinct over a logarithmic-size domain.
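To make the translation from aggregates to semiring annotations concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the query `Q(x) :- R(x, y), S(y, z)`, the `Semiring` class, and the `aggregate` function are hypothetical names chosen for illustration. Annotations of joined tuples are multiplied and alternative derivations of the same answer are added, so counting uses the ordinary (+, ×) semiring and min uses the (min, +) semiring.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (hypothetical helper, not from the paper)."""
    zero: Any                      # identity of add
    one: Any                       # identity of mul
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# count: ordinary arithmetic on naturals
COUNT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
# min: addition is min, multiplication is +, so "one" is 0 and "zero" is infinity
MIN_PLUS = Semiring(float("inf"), 0, min, lambda a, b: a + b)


def aggregate(R: Dict[Tuple, Any], S: Dict[Tuple, Any], sr: Semiring) -> Dict:
    """Annotation of each answer x of Q(x) :- R(x, y), S(y, z).

    R maps (x, y) to an annotation, S maps (y, z) to an annotation.
    """
    # Sum out z: total annotation of S for each join value y.
    s_by_y: Dict[Any, Any] = {}
    for (y, _z), a in S.items():
        s_by_y[y] = sr.add(s_by_y.get(y, sr.zero), a)
    # Join with R and sum out y.
    out: Dict[Any, Any] = {}
    for (x, y), a in R.items():
        if y in s_by_y:
            out[x] = sr.add(out.get(x, sr.zero), sr.mul(a, s_by_y[y]))
    return out


R_tuples = [(1, "a"), (1, "b"), (2, "a")]
S_tuples = [("a", 10), ("a", 20), ("b", 5)]

# count(*): every tuple is annotated with the multiplicative identity 1.
counts = aggregate({t: 1 for t in R_tuples}, {t: 1 for t in S_tuples}, COUNT)

# min(z): only S carries a value; R is annotated with the identity (0 here).
mins = aggregate({t: 0 for t in R_tuples}, {t: t[1] for t in S_tuples}, MIN_PLUS)
```

In the min example only one relation carries a non-identity annotation, which is the situation the abstract describes for translated aggregate queries; min is also an idempotent addition.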
DB Mar 19
Let's Play Tag: Linear Time Evaluation of Conjunctive Queries under TGD Constraints
Nofar Carmeli, Carsten Lutz, Marcin Przybyłko
We study the limits of linear time evaluation of conjunctive queries under constraints expressed as tuple-generating dependencies (TGDs), across several modes of query evaluation: single-testing, all-testing, counting, lexicographic direct access, and enumeration. While full classifications seem far beyond reach, we propose an approach that, for some evaluation modes and classes of TGDs, makes it possible to lift known dichotomies from the unconstrained setting. In particular, our approach applies to all mentioned evaluation modes except enumeration, when the constraints fall into one of two classes: non-recursive sets of TGDs in which every TGD uses at most binary relation symbols in the head or has at most two frontier variables; and frontier-guarded full TGDs. We further provide a collection of examples showcasing the challenges that arise for enumeration and for less restrictive classes of TGDs.
DB Mar 10
Direct Access for Conjunctive Queries with Negations
Florent Capelli, Nofar Carmeli, Oliver Irwin et al.
Given a conjunctive query $Q$ and a database $D$, direct access to the answers of $Q$ over $D$ is the operation of returning, given an index $k$, the $k$-th answer with respect to some order on the answers. While this problem is $\#\mathcal{P}$-hard in general with respect to combined complexity, many conjunctive queries have an underlying structure that allows direct access to their answers, for some lexicographic order, in time polylogarithmic in the size of the database after a polynomial-time precomputation. Previous work has precisely characterised the tractable classes and given fine-grained lower bounds on the precomputation time needed depending on the structure of the query. In this paper, we generalise these tractability results to the case of signed conjunctive queries, that is, conjunctive queries that may contain negative atoms. Our technique is based on a class of circuits that can represent relational data. We first show that this class supports tractable direct access after a polynomial-time preprocessing. We then give bounds on the size of the circuit needed to represent the answer set of signed conjunctive queries depending on their structure. Combined, the two results allow us to prove the tractability of direct access for a large class of conjunctive queries. On the one hand, we recover the known tractable classes from the literature in the case of positive conjunctive queries. On the other hand, we generalise and unify known tractability results about negative conjunctive queries, that is, queries having only negated atoms. In particular, we show that the class of $\beta$-acyclic negative conjunctive queries and the class of bounded nest set width negative conjunctive queries admit tractable direct access.
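As an illustration of what direct access means, here is a minimal Python sketch for one small positive query, `Q(x, y, z) :- R(x, y), S(y, z)`, with answers ordered lexicographically by `(x, y, z)`. It is an assumption-laden toy, not the circuit-based technique of the paper: the class `DirectAccess` and its methods are hypothetical names, and the approach (sorting plus prefix sums of join counts) only covers this particular query and order.

```python
from bisect import bisect_right
from collections import defaultdict
from itertools import accumulate


class DirectAccess:
    """k-th answer of Q(x, y, z) :- R(x, y), S(y, z) in lexicographic (x, y, z) order."""

    def __init__(self, R, S):
        # Group S by the join variable y and sort each group by z.
        s_by_y = defaultdict(list)
        for y, z in set(S):
            s_by_y[y].append(z)
        for zs in s_by_y.values():
            zs.sort()
        self.s_by_y = dict(s_by_y)
        # Keep only R tuples that have a join partner, sorted by (x, y).
        self.r = sorted((x, y) for x, y in set(R) if y in self.s_by_y)
        # prefix[i] = number of answers produced by r[0..i].
        self.prefix = list(accumulate(len(self.s_by_y[y]) for _, y in self.r))

    def __len__(self):
        return self.prefix[-1] if self.prefix else 0

    def access(self, k):
        """Return the k-th answer (0-based) without materialising the answer set."""
        if not 0 <= k < len(self):
            raise IndexError(k)
        # First R tuple whose cumulative answer count exceeds k.
        i = bisect_right(self.prefix, k)
        x, y = self.r[i]
        offset = k - (self.prefix[i - 1] if i else 0)
        return (x, y, self.s_by_y[y][offset])


da = DirectAccess(
    R=[(1, "a"), (1, "b"), (2, "a")],
    S=[("a", 10), ("a", 20), ("b", 5)],
)
third = da.access(2)  # the answer at index 2 in (x, y, z) order
```

Preprocessing here is dominated by sorting, and each access is one binary search plus a list lookup, while the number of answers can be much larger than the input.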
CL May 29, 2020
Constructing Explainable Opinion Graphs from Review
Nofar Carmeli, Xiaolan Wang, Yoshihiko Suhara et al.
The Web is a major resource of both factual and subjective information. While there are significant efforts to organize factual information into knowledge bases, there is much less work on organizing opinions, which are abundant in subjective data, into a structured format. We present ExplainIt, a system that extracts and organizes opinions into an opinion graph, which is useful for downstream applications such as generating explainable review summaries and facilitating search over opinion phrases. In such graphs, a node represents a set of semantically similar opinions extracted from reviews and an edge between two nodes signifies that one node explains the other. ExplainIt mines explanations with a supervised method and groups similar opinions together in a weakly supervised way before combining the clusters of opinions together with their explanation relationships into an opinion graph. We experimentally demonstrate that the explanation relationships generated in the opinion graph are of good quality. Our labeled datasets for explanation mining and grouping opinions are publicly available.