DB LG Aug 16, 2020

DeepSampling: Selectivity Estimation with Predicted Error and Response Time

arXiv:2008.06831v11.2

Originality: Incremental advance

AI Analysis

DeepSampling gives spatial databases a tool to manage AQP accuracy, addressing a known bottleneck in interactive query processing.

The paper tackles the problem of approximate query processing (AQP) for spatial data by proposing DeepSampling, a deep-learning model that predicts accuracy metrics like error and response time for selectivity estimation, enabling control over sample size to achieve desired accuracy.

The rapid growth of spatial data urges the research community to find efficient processing techniques for interactive queries on large volumes of data. Approximate Query Processing (AQP) is the most prominent technique that can provide real-time answers to ad-hoc queries based on a random sample. Unfortunately, existing AQP methods provide an answer without any accuracy metrics because of the complex relationship between the sample size, the query parameters, the data distribution, and the result accuracy. This paper proposes DeepSampling, a deep-learning-based model that predicts the accuracy of a sample-based AQP algorithm, specifically selectivity estimation, given the sample size, the input distribution, and the query parameters. The model can also be reversed to estimate the sample size that would produce a desired accuracy. DeepSampling is the first system that provides a reliable tool for existing spatial databases to control the accuracy of AQP.
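
To make the setting concrete, here is a minimal sketch of the sample-based selectivity estimation that the abstract refers to. It is not the DeepSampling model; it only illustrates the baseline estimator whose error such a model would try to predict. The function names, the uniform synthetic data, and the query rectangle are assumptions chosen for illustration.

```python
import random

def in_rect(point, query):
    """Return True if a 2D point lies inside an axis-aligned rectangle."""
    x, y = point
    x1, y1, x2, y2 = query
    return x1 <= x <= x2 and y1 <= y <= y2

def true_selectivity(points, query):
    """Exact fraction of points inside the query rectangle (full scan)."""
    return sum(in_rect(p, query) for p in points) / len(points)

def estimate_selectivity(points, query, sample_size, rng):
    """Estimate selectivity from a uniform random sample drawn without replacement."""
    sample = rng.sample(points, sample_size)
    return sum(in_rect(p, query) for p in sample) / sample_size

# Hypothetical dataset: 20,000 uniformly distributed points in the unit square.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(20000)]
query = (0.2, 0.2, 0.6, 0.7)  # (x1, y1, x2, y2), an assumed range query

exact = true_selectivity(points, query)
for n in (100, 1000, 10000):
    est = estimate_selectivity(points, query, n, rng)
    print(f"sample={n:5d}  estimate={est:.4f}  abs_error={abs(est - exact):.4f}")
```

The error printed for each sample size depends on the data distribution and the query as well as on the sample size, which is the relationship the abstract describes as hard to characterize in advance.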
