Scardina: Scalable Join Cardinality Estimation by Multiple Density Estimators
The work addresses a known bottleneck in database query optimizers, although the contribution appears incremental because it builds on existing machine learning-based methods.
The paper tackles the problem of inaccurate cardinality estimation in databases with many tables and strong correlations, proposing Scardina, a method using multiple density estimators that effectively handles large, complex schemas.
In recent years, machine learning-based cardinality estimation methods have been replacing traditional methods. This change is expected to benefit one of the most important applications of cardinality estimation, the query optimizer, and thereby speed up query processing. However, none of the existing methods precisely estimates cardinalities when relational schemas consist of many tables with strong correlations between tables or attributes. This paper describes how multiple density estimators can be combined to effectively estimate cardinalities for data with large, complex schemas and strong correlations. We propose Scardina, a new join cardinality estimation method that uses multiple partitioned models based on the schema structure.
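To make the idea of combining several density estimators concrete, the following is a minimal toy sketch, not Scardina's actual architecture. It assumes one estimator per table, a single equi-join between two tables, and an exact empirical distribution standing in for a learned model. The class `EmpiricalDensity`, the function `estimate_join_cardinality`, and the sample tables are all hypothetical names introduced only for illustration.

```python
from collections import Counter


class EmpiricalDensity:
    """Toy stand-in for a learned density estimator over one table.

    Assumption: it stores the exact empirical distribution of the rows,
    whereas a real system would fit a compact model instead.
    """

    def __init__(self, rows):
        self.n = len(rows)            # table size
        self.counts = Counter(rows)   # rows must be hashable tuples

    def key_mass(self, key_idx, predicate):
        """Return {key value: P(key = value AND predicate holds)}."""
        mass = Counter()
        for row, count in self.counts.items():
            if predicate(row):
                mass[row[key_idx]] += count / self.n
        return mass


def estimate_join_cardinality(left, right, left_key, right_key,
                              left_pred, right_pred):
    """Combine two per-table estimators for an equi-join.

    For each join key value, multiply the expected number of matching
    rows on each side, then sum over the key values.
    """
    left_mass = left.key_mass(left_key, left_pred)
    right_mass = right.key_mass(right_key, right_pred)
    return sum(
        (left.n * p) * (right.n * right_mass.get(key, 0.0))
        for key, p in left_mass.items()
    )


# Hypothetical sample data: customers(id, region) and orders(customer_id, amount).
customers = [(1, "EU"), (2, "EU"), (3, "US")]
orders = [(1, 10), (1, 20), (2, 5), (3, 50), (3, 60), (3, 70)]

estimate = estimate_join_cardinality(
    EmpiricalDensity(customers), EmpiricalDensity(orders),
    left_key=0, right_key=0,
    left_pred=lambda row: row[1] == "EU",
    right_pred=lambda row: row[1] >= 10,
)

# Brute-force join result, for comparison with the estimate.
true_cardinality = sum(
    1 for c in customers for o in orders
    if c[0] == o[0] and c[1] == "EU" and o[1] >= 10
)
print(estimate, true_cardinality)
```

Because the toy estimators are exact, the estimate agrees with the brute-force count up to floating-point rounding. With learned, approximate models, the same combination step would inherit each model's error, which is one reason accuracy becomes hard as the number of tables and the strength of their correlations grow.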