Title: Machine Learning Empowered Query and System Optimization


Technical Area: Database



Query optimization is a typical NP-hard problem, in which conventional database optimization techniques, e.g., dynamic programming and genetic algorithms, are applied to select the optimal access plan for a given query within bounded time. Cost estimation is usually the foundation for such a decision. However, cost estimation is determined by cardinality estimation, which relies heavily on the available statistics. In modern database systems, statistics are collected and utilized in a limited way, so data skew and correlation are not handled well. As a result, the selectivity estimation of predicates, single or compound, is often inaccurate, which causes cost estimation errors and in turn leads to sub-optimal query plans and query performance problems.
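To make the effect of correlation concrete, the following minimal sketch compares the selectivity of a compound predicate estimated under a column-independence assumption with the true selectivity. The toy table, the column values, and the helper name `selectivity` are hypothetical and chosen only for illustration; the independence assumption is one common simplification and is not claimed to be the method of any particular database system.

```python
# Hypothetical toy table of (make, model) rows; the two columns are correlated
# because every "Civic" is also a "Honda".
rows = (
    [("Honda", "Civic")] * 30
    + [("Honda", "Accord")] * 20
    + [("Toyota", "Camry")] * 50
)


def selectivity(table, predicate):
    """Fraction of rows in `table` that satisfy `predicate`."""
    return sum(1 for row in table if predicate(row)) / len(table)


# Per-column selectivities, as single-column statistics would provide them.
sel_make = selectivity(rows, lambda r: r[0] == "Honda")
sel_model = selectivity(rows, lambda r: r[1] == "Civic")

# Estimate for "make = 'Honda' AND model = 'Civic'" assuming independence:
# the per-column selectivities are simply multiplied.
estimated_sel = sel_make * sel_model

# True selectivity of the compound predicate, computed from the data itself.
actual_sel = selectivity(rows, lambda r: r[0] == "Honda" and r[1] == "Civic")

estimated_rows = estimated_sel * len(rows)
actual_rows = actual_sel * len(rows)

print(f"estimated rows: {estimated_rows:.1f}, actual rows: {actual_rows:.1f}")
```

In this toy case the independence-based estimate is half of the true cardinality. Errors of this kind compound across multi-way joins, which is one reason inaccurate selectivity estimates can lead to sub-optimal plans.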


Furthermore, queries are becoming more and more complex as database systems evolve to incorporate new hardware, new data sources, and new computation models. Conventional query optimization approaches are becoming less capable of handling such complex scenarios.


Besides query optimization, system performance also relies on many other sub-systems, e.g., workload management, resource management, database physical design, etc. Tuning a database system usually requires extensive expert experience with many such sub-systems. However, manual tuning by such experts is less and less feasible with the thousands of database instances provided by database services in the cloud, which raises huge challenges for cloud database service providers.


Meanwhile, artificial intelligence, especially machine learning, has in recent years shown a promising direction for solving traditionally challenging problems in a large number of domains. Therefore, this collaborative research project is to explore the opportunities and appropriate approaches for solving challenging query and system optimization problems by exploiting evolving machine learning technologies.



With this collaborative research project, we aim to solve the following problems:


One or more patents and/or paper publications are expected for each of the above problems. Detailed implementation is expected upon mutual agreement between the research institute and Alibaba Group.


Related Research Topics