Theme Title: Distributed Storage System Based on Hybrid Model
Technical Area: Storage
It is well known that there is a large price gap between high-performance storage devices and hard disk devices, so a hybrid storage model that combines both is a cost-effective design. Building on the distributed storage system of a cloud service, this research focuses on building a hybrid storage system from several kinds of disks, such as SSDs and HDDs. On this hybrid system, the research focuses on the strategy for distributing data across the devices and on a mathematical model for evaluating the value of a hybrid storage system.
This research proposes a method for building a hybrid storage system for a cloud cluster. First, based on several kinds of storage devices, including high-performance devices and hard disks, it investigates an intelligent data placement strategy that places data on the appropriate device and triggers data migration at the appropriate time. Because data migration incurs additional disk I/O and network cost, the research aims to reduce this cost as much as possible. Second, it proposes an evaluation model that calculates the value of a hybrid storage cluster and can guide the construction of a cost-effective hybrid system. Third, it develops a new buffer eviction algorithm for multi-level hybrid storage systems. Because the cluster sits at the bottom level of the cloud service, its buffer cannot observe data access frequency, so the problem needs to be reviewed from a new perspective, such as data access locality.
This research plans to publish at least one paper at a top conference and to complete one prototype system within a year.
Related Research Topics
This research will work on three topics:
1. Analysis of “hot/cold” data access.
This research plans to use intelligent methods, such as artificial intelligence, machine learning, and data mining, to predict whether data is ‘hot’ or ‘cold’. ‘Hot’ data is then placed on high-performance devices and ‘cold’ data on hard disks.
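The hot/cold decision can be illustrated with a deliberately simple stand-in for the learned predictor. The sketch below is not the method this research proposes; it assumes a hypothetical `HotColdTracker` that keeps an exponentially decayed access score per block and compares it to a threshold. The half-life, the threshold, and the tier names are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class HotColdTracker:
    """Hypothetical heuristic: exponentially decayed access score per block."""
    half_life_s: float = 3600.0   # assumption: score halves after one idle hour
    hot_threshold: float = 3.0    # assumption: score >= threshold means 'hot'
    _score: dict = field(default_factory=dict)
    _last_seen: dict = field(default_factory=dict)

    def _decayed(self, block_id, now):
        # Score of a block at time `now`, after decay since its last access.
        last = self._last_seen.get(block_id)
        if last is None:
            return 0.0
        elapsed = max(0.0, now - last)
        return self._score[block_id] * 0.5 ** (elapsed / self.half_life_s)

    def record_access(self, block_id, now):
        # Decay the old score to `now`, then add one for this access.
        self._score[block_id] = self._decayed(block_id, now) + 1.0
        self._last_seen[block_id] = now

    def tier(self, block_id, now):
        # 'ssd' stands for the high-performance device, 'hdd' for hard disk.
        hot = self._decayed(block_id, now) >= self.hot_threshold
        return "ssd" if hot else "hdd"


tracker = HotColdTracker()
for t in (0, 10, 20, 30):
    tracker.record_access("blk-1", t)   # accessed four times in 30 seconds
tracker.record_access("blk-2", 0)       # accessed once
print(tracker.tier("blk-1", 30), tracker.tier("blk-2", 30))
```

A learned model could replace `_decayed` as the scoring function while keeping the same `tier` interface, so the placement logic would not need to change.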
2. Mathematical model for evaluating hybrid storage.
This research plans to propose a mathematical method for evaluating the value of a hybrid system, taking into account factors such as performance, storage price, network cost, and migration cost.
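To make the idea of a value model concrete, the following sketch combines the four factors named above into a single score. The formula (inverse of mean latency times total cost), the `ClusterConfig` fields, and every constant are assumptions chosen for illustration; the actual model is what this research sets out to define.

```python
from dataclasses import dataclass

# All constants are illustrative placeholders, not measured values.
SSD_PRICE_PER_TB = 100.0
HDD_PRICE_PER_TB = 20.0
SSD_LATENCY_MS = 0.1
HDD_LATENCY_MS = 10.0
DISK_IO_COST_PER_MIGRATED_TB = 0.5
NETWORK_COST_PER_MIGRATED_TB = 0.5


@dataclass
class ClusterConfig:
    ssd_tb: float          # high-performance capacity
    hdd_tb: float          # hard disk capacity
    ssd_hit_ratio: float   # fraction of requests served from SSD
    migrated_tb: float     # data moved between tiers in the period


def mean_latency_ms(cfg):
    # Performance term: latency weighted by where requests are served.
    h = cfg.ssd_hit_ratio
    return h * SSD_LATENCY_MS + (1.0 - h) * HDD_LATENCY_MS


def total_cost(cfg):
    # Storage price plus the disk I/O and network cost of migration.
    storage = cfg.ssd_tb * SSD_PRICE_PER_TB + cfg.hdd_tb * HDD_PRICE_PER_TB
    migration = cfg.migrated_tb * (
        DISK_IO_COST_PER_MIGRATED_TB + NETWORK_COST_PER_MIGRATED_TB
    )
    return storage + migration


def value_score(cfg):
    # Higher is better: low latency at low cost.
    return 1.0 / (mean_latency_ms(cfg) * total_cost(cfg))


hybrid = ClusterConfig(ssd_tb=10, hdd_tb=100, ssd_hit_ratio=0.8, migrated_tb=5)
hdd_only = ClusterConfig(ssd_tb=0, hdd_tb=110, ssd_hit_ratio=0.0, migrated_tb=0)
print(value_score(hybrid) > value_score(hdd_only))
```

With these placeholder numbers the hybrid configuration scores higher than the all-HDD one of equal capacity, but the ranking depends entirely on the assumed prices, latencies, and hit ratio.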
3. Buffer eviction algorithm.
This research proposes a group of new algorithms for the multi-level buffer in a hybrid storage system. Data access frequency is no longer the eviction criterion; it is replaced by other signals, such as access locality and data prefetching.
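One way to picture eviction without frequency information is sketched below. It is only an illustration under stated assumptions, not one of the proposed algorithms: the hypothetical `LocalityBuffer` prefetches the blocks that follow each accessed address and, when full, evicts the cached block whose address is farthest from the current access.

```python
class LocalityBuffer:
    """Hypothetical buffer: evicts by address distance, not access frequency."""

    def __init__(self, capacity, prefetch=1):
        if capacity <= prefetch:
            raise ValueError("capacity must exceed the prefetch depth")
        self.capacity = capacity
        self.prefetch = prefetch   # number of following blocks to prefetch
        self.blocks = set()        # addresses currently buffered

    def _admit(self, addr, anchor):
        if addr in self.blocks:
            return
        while len(self.blocks) >= self.capacity:
            # Victim: the block farthest from the current access (anchor);
            # ties are broken toward the higher address.
            victim = max(self.blocks, key=lambda a: (abs(a - anchor), a))
            self.blocks.remove(victim)
        self.blocks.add(addr)

    def access(self, addr):
        """Return True on a buffer hit, then admit addr and its successors."""
        hit = addr in self.blocks
        for a in range(addr, addr + 1 + self.prefetch):
            self._admit(a, addr)
        return hit


buf = LocalityBuffer(capacity=4, prefetch=1)
results = [buf.access(a) for a in (10, 11, 12, 500, 13)]
print(results, sorted(buf.blocks))
```

In this trace the sequential accesses to 11, 12, and 13 hit because of prefetching, while the isolated access to 500 misses; no per-block access counter is kept at any point.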