Title: Distributed Transactions and Distributed Query Processing
Technical Area: Database
We are living in the age of big data. Enterprises are constantly challenged by strikingly different usage patterns around big data, such as OLTP, OLAP, logging, and offline analysis. Hybrid transactional/analytical processing (HTAP) databases are a fast-emerging trend. A hybrid database product that supports both online transaction processing and analytics has the potential to become a one-stop solution that meets most of the requirements of enterprise-level applications.
Current OLTP systems either do not support distributed transactions or support them through a single master/coordinator that enforces transactional consistency. This not only introduces a single point of failure in the cluster but also limits the ability to scale horizontally. Spanner does provide distributed transactions, but it relies on prohibitively expensive special hardware (GPS plus atomic clocks). A more cost-effective, high-performing, and scalable distributed transaction solution is needed. In the OLAP world, an effective, high-quality distributed SQL query plan is the key factor. A distributed query plan must take into account resources such as hardware, network throughput, and disk layout. Many of these resources change dynamically in a distributed environment, and this state of constant flux makes it challenging to incorporate them into query plan generation in an efficient and scalable manner.
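To make the query planning concern concrete, the following is a minimal sketch of how network throughput and table sizes could feed into a single plan decision: choosing between a broadcast join and a shuffle (repartition) join. It is an illustration under stated assumptions, not a proposed design. The names `ClusterState`, `broadcast_bytes`, `shuffle_bytes`, and `choose_join_strategy` are hypothetical, data is assumed to be evenly partitioned across nodes, and network transfer is assumed to be the only cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    """Hypothetical snapshot of cluster resources at planning time."""
    nodes: int
    network_bytes_per_sec: float  # assumed aggregate usable throughput, > 0


def broadcast_bytes(small_bytes: float, nodes: int) -> float:
    # Every node must end up with the full small table, so each node's
    # partition is sent to the other (nodes - 1) nodes.
    return small_bytes * (nodes - 1)


def shuffle_bytes(left_bytes: float, right_bytes: float, nodes: int) -> float:
    # Both tables are repartitioned by join key; with even hashing,
    # a fraction (nodes - 1) / nodes of the rows leaves its current node.
    return (left_bytes + right_bytes) * (nodes - 1) / nodes


def choose_join_strategy(left_bytes: float, right_bytes: float,
                         cluster: ClusterState) -> tuple[str, float]:
    """Return (strategy, estimated transfer seconds) for the cheaper plan."""
    small = min(left_bytes, right_bytes)
    t_broadcast = broadcast_bytes(small, cluster.nodes) / cluster.network_bytes_per_sec
    t_shuffle = (shuffle_bytes(left_bytes, right_bytes, cluster.nodes)
                 / cluster.network_bytes_per_sec)
    if t_broadcast <= t_shuffle:
        return "broadcast", t_broadcast
    return "shuffle", t_shuffle


if __name__ == "__main__":
    cluster = ClusterState(nodes=10, network_bytes_per_sec=1_000_000_000)
    # Small dimension table joined with a large fact table.
    print(choose_join_strategy(10_000_000, 10_000_000_000, cluster))
    # Two large tables of equal size.
    print(choose_join_strategy(10_000_000_000, 10_000_000_000, cluster))
```

Because the estimated time depends on `network_bytes_per_sec`, the same plan costs more when measured throughput drops. A real optimizer would also need to weigh CPU, memory, disk layout, and data skew, and refresh these inputs as the cluster changes.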
The main target is to build a scale-out database cluster with high-performance distributed transaction processing and high-performance distributed SQL query processing. The measurable outputs of this research may include but are not limited to:
- A decentralized distributed transaction system prototype that delivers high performance (up to millions of QPS with latency on the order of a hundred microseconds), scales near-linearly, and supports 1000+ nodes.
- A novel distributed query optimizer that produces high-quality query plans based on host hardware, network throughput, and data layout.
- Publication of the research work in top conferences (CCF-A level preferred).
Related Research Topics
- Decentralized distributed transaction system prototype that delivers high performance. It should adapt to transactions of varying size and recover quickly from crashes in large-scale transaction processing.
- Parallel/distributed query plan optimizers that precisely and efficiently account for the resources of the whole distributed database and generate high-quality query plans dynamically.
- High-performance distributed transaction architecture for large-scale systems with 1000+ nodes, and multi-region transaction systems that support globally distributed transactions.
- Distributed systems built on top of heterogeneous systems, such as batch computing engines and streaming engines, and the transaction coordination model for these systems.