Title: Self-driving Database System


Technical Area: Database



A database management system (DBMS) is an important component of any modern data-intensive application. Over the last two decades, both researchers and vendors have built advisory tools to assist database administrators (DBAs) with anomaly detection, trouble diagnosis, system tuning, and physical design. Most of this previous work is incomplete because it still requires humans to make the final decisions. The main reason is that making a self-driving DBMS perform well has historically been a difficult task. For example, typical DBMSs have hundreds of configuration “knobs” that control everything in the system, such as the cache sizes and how frequently data is flushed to disk. Getting the right configuration for these knobs is hard because they are not independent, not standardized, and not uniform. Automatic anomaly detection and trouble diagnosis are important as well, but they are difficult in a large-scale DBMS because of the complexity and variety of the applications it serves.



The goal is to develop the foundations and corresponding practical techniques for 1) workload and resource consumption prediction; 2) automatic configuration and tuning of the DBMS; 3) system scheduling and resource allocation. Specifically, the projects will focus on (but are not limited to):

1. DBMS workload prediction.
Based on the SQL workload of the Alibaba OLTP DBMS, design a prediction algorithm for the future workload.

2. Resource consumption prediction.
Based on the SQL workload and resource consumption history, design a prediction algorithm for future resource consumption.

3. Intelligent DBA.
Based on the workload prediction, tune the DBMS to adapt to the upcoming workload and changes.

4. Automatic system expansion/shrinking, automatic anomaly detection and trouble diagnosis.
Based on the resource consumption prediction, reallocate resources among database instances and schedule database instances onto different hosts. Based on the large-scale logging system, perform anomaly detection and trouble diagnosis using artificial intelligence algorithms.

5. Parameter tuning.
Develop practical techniques for the automatic configuration of the DBMS by 1) reusing performance data gathered from previous sessions to tune new DBMS deployments; 2) using online feedback to adjust the configuration for continuous optimization.
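
To make the two parts of topic 5 concrete, the following is a minimal sketch, not a proposed method. It assumes that past tuning sessions are stored as pairs of a workload feature vector and the best configuration found, that a new deployment starts from the configuration of the most similar past workload, and that online feedback is a measured latency driving a simple hill-climbing loop. The knob names (buffer_pool_mb, flush_interval_ms), the history entries, and the synthetic_latency function are all hypothetical placeholders; a real system would measure the running DBMS and would likely use a stronger search strategy.

```python
import math

# Hypothetical history from earlier tuning sessions:
# (workload features = (read ratio, write ratio), best-known configuration).
HISTORY = [
    ((0.9, 0.1), {"buffer_pool_mb": 4096, "flush_interval_ms": 1000}),  # read-heavy
    ((0.2, 0.8), {"buffer_pool_mb": 1024, "flush_interval_ms": 200}),   # write-heavy
]


def warm_start(workload):
    """Reuse past data: copy the config of the most similar past workload."""
    _, config = min(HISTORY, key=lambda item: math.dist(item[0], workload))
    return dict(config)


def tune_online(config, measure, steps, rounds=20):
    """Online feedback: keep a knob change only if the measured latency drops."""
    best = measure(config)
    for _ in range(rounds):
        improved = False
        for knob, step in steps.items():
            for delta in (step, -step):
                candidate = dict(config)
                candidate[knob] = config[knob] + delta
                if candidate[knob] <= 0:
                    continue  # skip invalid knob values
                score = measure(candidate)
                if score < best:
                    config, best, improved = candidate, score, True
        if not improved:
            break  # no single-knob change helps any more
    return config, best


def synthetic_latency(config):
    """Stand-in for a real benchmark run (assumption, not a DBMS model).

    By construction the minimum is at 3072 MB and 600 ms.
    """
    return (abs(config["buffer_pool_mb"] - 3072) / 1024
            + abs(config["flush_interval_ms"] - 600) / 1000)


if __name__ == "__main__":
    start = warm_start((0.8, 0.2))
    steps = {"buffer_pool_mb": 512, "flush_interval_ms": 100}
    tuned, latency = tune_online(start, synthetic_latency, steps)
    print(start, tuned, latency)
```

The sketch adjusts one knob at a time, which ignores the interactions between knobs noted above; it is meant only to show where reused history and online feedback enter the loop.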


Related Research Topics