Academic Paper

Discovery of Cross Joins
Document Type
Periodical
Source
IEEE Transactions on Knowledge and Data Engineering (IEEE Trans. Knowl. Data Eng.), 35(7):6839-6851, Jul. 2023
Subject
Computing and Processing
Databases
Approximation algorithms
Probabilistic logic
Medical diagnostic imaging
Lifting equipment
Estimation
Benchmark testing
Algorithm
cross join
database
data mining
discovery
experiment
parameterized intractability
profiling
Language
ISSN
1041-4347
1558-2191
2326-3865
Abstract
A cross join between two attribute sets holds on a relation whenever the relation's projection onto the union of the attribute sets is the cross join between its projections onto the first and the second attribute set. Hence, the cross join is a fundamental operator on database relations. For example, it can rewrite the division operator into a simple projection, or measure the independence of tuple values between two attribute sets during cardinality estimation. It is therefore surprising that this is the first research on the discovery problem of cross joins. We show that the problem of deciding whether there is a cross join that holds on a given relation is not only NP-complete but W[3]-complete in its arguably most natural parameter, namely its arity. We establish the first algorithms that discover all cross joins that hold on a given relation. We illustrate in experiments with benchmark data that our algorithms perform well within the limits established by our hardness results. Our treatment of cross joins and the design of our algorithms enable us to extend our findings to the discovery of cross joins that meet a given approximation ratio. Our experiments quantify the trade-off between discovery time and targeted ratio.
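The definition in the first sentence of the abstract can be made concrete with a small sketch. It is only an illustration of checking one candidate cross join on a toy relation, not the discovery algorithms of the paper. The function names (`project`, `cross_join_holds`, `cross_join_ratio`) and the sample data are hypothetical, the two attribute sets are assumed to be disjoint, and the ratio used here (size of the joint projection divided by the size of the full cross product) is an assumed reading of "approximation ratio", which the abstract does not define.

```python
from itertools import product


def project(rows, attrs):
    """Set-valued projection of a relation (list of dicts) onto attrs."""
    return {tuple(row[a] for a in attrs) for row in rows}


def cross_join_holds(rows, x, y):
    """True if the projection onto x + y equals the cross product of the
    projections onto x and onto y (x and y assumed disjoint)."""
    joint = project(rows, list(x) + list(y))
    full = {a + b for a, b in product(project(rows, x), project(rows, y))}
    return joint == full


def cross_join_ratio(rows, x, y):
    """Assumed measure: |r[XY]| / (|r[X]| * |r[Y]|); equals 1.0 exactly
    when the cross join holds. An empty relation is treated as 1.0."""
    size_x = len(project(rows, x))
    size_y = len(project(rows, y))
    if size_x == 0 or size_y == 0:
        return 1.0
    return len(project(rows, list(x) + list(y))) / (size_x * size_y)


# Hypothetical toy relation.
rows = [
    {"student": "ann", "course": "db", "term": "s1"},
    {"student": "ann", "course": "ml", "term": "s1"},
    {"student": "bob", "course": "db", "term": "s1"},
    {"student": "bob", "course": "ml", "term": "s2"},
]

print(cross_join_holds(rows, ["student"], ["course"]))  # every pair occurs
print(cross_join_holds(rows, ["student"], ["term"]))    # (ann, s2) is missing
print(cross_join_ratio(rows, ["student"], ["term"]))    # 3 of 4 pairs occur
```

Checking a single candidate in this way is cheap. The hardness results quoted in the abstract concern the search over attribute-set pairs, which this sketch does not attempt.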