SkinnerDB: Reinforcement Learning for Query Optimization


Written on January 7, 2024 in English with a size of 3.78 KB

SkinnerDB: Regret-Bounded Query Evaluation via Reinforcement Learning

INTRODUCTION AND PROBLEM DEFINITION:

This work addresses query optimization. More specifically, SkinnerDB focuses on finding the optimal join order, because join ordering has the most impact on performance in practice.

SkinnerDB aims for near-optimal expected execution cost without needing any a-priori information, and it makes no strong assumptions either.

CONTRIBUTIONS:

Introduced a new quality criterion for query evaluation strategies that compares expected and optimal execution cost.
Proposed several adaptive execution strategies based on reinforcement learning.
Formally proved correctness and regret bounds for those execution strategies.
Experimentally compared those strategies, implemented in SkinnerDB, against various baselines.

COMPARED TO OTHER METHODS:

--Traditional query optimizers rely on a-priori information. They predict cost from coarse-grained data statistics and under simplifying assumptions.

--SkinnerDB maintains no data statistics and uses no simplifying cost or cardinality models. Instead, it learns (near-)optimal query plans with reinforcement learning (RL).

--Many recent machine learning methods have been applied to query optimization, but they learn from previous queries and give good results only if the current query is similar to earlier ones.

--SkinnerDB, however, learns within a single query and therefore does not suffer from any kind of generalization error across queries.

HOW IT WORKS:

Preprocessing involves: filtering on base tables and batching.
Post-processing involves: grouping, aggregation, and sorting.
Query-->preprocessor-->join processor-->postprocessor-->result.
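
The three phases above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function names (preprocess, join_phase, postprocess, run_query), the row-as-dict layout, and the column names are hypothetical, batching is left out, and a plain hash join stands in for the join processor, which is the part SkinnerDB actually learns.

```python
from collections import defaultdict

def preprocess(table, keep_row):
    """Pre-processing: filter a base table with a unary predicate."""
    return [row for row in table if keep_row(row)]

def join_phase(left, right, key):
    """Join phase. A plain hash join is a stand-in (assumption) for the
    join processor; SkinnerDB learns the join order in this phase."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def postprocess(rows, group_key, value_key):
    """Post-processing: group, aggregate (sum), and sort."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return sorted(totals.items())

def run_query(orders, customers):
    """Query --> preprocessor --> join processor --> postprocessor --> result."""
    filtered = preprocess(orders, lambda row: row["amount"] > 0)
    joined = join_phase(filtered, customers, "cust")
    return postprocess(joined, "name", "amount")
```

Only the middle phase depends on the join order, which is why the learning effort is concentrated there.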

UCT FOR JOIN ORDERING FROM RL

UCT stands for “Upper Confidence bounds for Trees”. The UCT algorithm balances between exploration and exploitation in a principled manner that results in probabilistic guarantees. UCT works well on very large search spaces.

The UCT algorithm maintains two counters per node: the number of times the node was visited and the average reward obtained for paths passing through that node. If counters are available for all relevant nodes, the UCT algorithm selects at each step the child node c that maximizes $r_c + w \cdot \sqrt{\log(v_p) / v_c}$, where $r_c$ is the average reward for c, $v_c$ and $v_p$ are the numbers of visits for the child and parent node, and w is a weight that balances exploration against exploitation.
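
The selection rule can be sketched in a few lines of Python. This is a minimal sketch, not SkinnerDB's code: the function names uct_select and update, the dictionary layout, the default weight w = sqrt(2), and the rule of trying unvisited children first are all assumptions made for illustration.

```python
import math

def uct_select(children, parent_visits, w=math.sqrt(2)):
    """Return the child maximizing r_c + w * sqrt(log(v_p) / v_c).

    children maps a child label to a (visits, average_reward) pair.
    """
    best_child, best_score = None, float("-inf")
    for child, (visits, avg_reward) in children.items():
        if visits == 0:
            # Assumption: a child without counters is explored first.
            return child
        score = avg_reward + w * math.sqrt(math.log(parent_visits) / visits)
        if score > best_score:
            best_child, best_score = child, score
    return best_child

def update(children, child, reward):
    """Update both counters of a child after observing a reward."""
    visits, avg_reward = children[child]
    visits += 1
    avg_reward += (reward - avg_reward) / visits  # running average
    children[child] = (visits, avg_reward)
```

With counters {"A": (10, 0.5), "B": (2, 0.4)} and 12 parent visits, the exploration term makes the rarely visited child "B" win even though "A" has the higher average reward; with w = 0 the rule becomes purely greedy and picks "A".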

EXECUTION STRATEGIES FOR INTRAQUERY LEARNING

For this method to work efficiently there are some very precise requirements on the execution engine.

Early quality feedback: so that the system can learn fast.
Low switching overhead: since the system will try different join orders frequently.
Small execution state: because the system will stop and resume join executions and should not do too much redundant work.
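
A rough sketch of how these requirements fit together, assuming a time-sliced engine: each join order runs as a resumable Python generator (its loop indices are the small execution state), a slice is a fixed budget of steps, and all orders write into one shared result set. The names join_steps, run_time_sliced, and same_value are hypothetical, round-robin scheduling stands in for the RL policy, and, as a simplification, join orders here do not share progress with each other.

```python
from itertools import islice, permutations

def join_steps(tables, order, predicate, results):
    """Depth-first multiway join for one join order, as a generator.

    Yields once per examined partial tuple, so it can be paused after
    any step and resumed later.
    """
    n = len(order)

    def extend(depth, chosen):
        if depth == n:
            # Store the result as one row index per table, in table order.
            results.add(tuple(chosen[t] for t in range(n)))
            return
        table = order[depth]
        for i in range(len(tables[table])):
            chosen[table] = i
            yield  # one unit of work; the caller may switch orders here
            if predicate(tables, chosen):
                yield from extend(depth + 1, chosen)
        chosen.pop(table, None)

    yield from extend(0, {})

def run_time_sliced(tables, orders, predicate, budget=4):
    """Alternate between join orders in slices of `budget` steps."""
    results = set()
    runs = {o: join_steps(tables, o, predicate, results) for o in orders}
    turn = 0
    while True:
        order = orders[turn % len(orders)]  # stand-in for the RL policy
        steps = sum(1 for _ in islice(runs[order], budget))
        if steps < budget:
            return results  # this order finished, so the result is complete
        turn += 1

def same_value(tables, chosen):
    """Example equi-join predicate: all chosen rows carry the same value."""
    return len({tables[t][i] for t, i in chosen.items()}) <= 1

if __name__ == "__main__":
    tables = [[1, 2, 3], [2, 3, 4], [3, 4, 2]]
    orders = list(permutations(range(3)))
    print(sorted(run_time_sliced(tables, orders, same_value)))
    # prints [(1, 0, 2), (2, 1, 0)]
```

Resuming a generator is cheap, which mirrors the low switching overhead requirement. A real policy would also turn the progress made in each slice into a reward, which is the early quality feedback.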

SUMMARY: SkinnerDB

Uses reinforcement learning for intra-query learning.
NO data statistics -- NO cardinality model -- NO cost model.
Formally guarantees near-optimal expected cost.
Intra-query learning pays off for difficult queries and does not suffer from the poor generalization of inter-query learning.

CORRECTNESS:

--Correctness means that the proposed system produces the same result tuples as an ordinary join operation.

--The paper proves this with a verbal argument. In short, the system produces no duplicates because it keeps track of all component tuple indices in a set.

--SkinnerDB produces each result tuple at least once (i.e., it does not miss any): completed result tuples are always added to the overall result, and component tuples are processed in a way that covers all combinations.
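
The no-duplicates argument can be illustrated with a short sketch. It assumes that a result tuple is identified by one row index per joined table; the helper names ordinary_join and merge_partial_results are hypothetical and not taken from the paper.

```python
from itertools import product

def ordinary_join(tables, predicate):
    """Reference result: check every combination of row indices."""
    index_ranges = (range(len(t)) for t in tables)
    return {idx for idx in product(*index_ranges) if predicate(tables, idx)}

def merge_partial_results(*partial_runs):
    """Collect index tuples from several partial executions in one set.

    A set ignores repeated insertions, so a tuple found by more than
    one execution is stored only once.
    """
    results = set()
    for run in partial_runs:
        for index_tuple in run:
            results.add(tuple(index_tuple))
    return results
```

For example, if one execution reports [(1, 0)] and another reports [(1, 0), (1, 1)], the merged set is {(1, 0), (1, 1)}. As long as the executions together cover all combinations, the merged set equals the result of the ordinary join.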
