Performance evaluation of cloud object storage for big data
Reliable and fast storage systems are increasingly critical in fields such as artificial intelligence and data analysis. This thesis proposes a new architecture for large-scale data storage systems, focusing on comparing and optimizing the performance of different software- and hardware-defined storage technologies in order to reduce latency and improve throughput. The main contributions of this thesis are two findings: (i) combining shingled magnetic recording (SMR) drives for storing data with solid-state drives (SSDs) for storing metadata is a viable solution for implementing large data storage systems, and (ii) combining conventional magnetic recording (CMR) drives for storing data with SSDs for storing metadata shows the highest performance for high-performance computing. Experiments carried out in multiple settings demonstrate that the proposed architecture improves performance for both sequential and random reads and writes. The prototypes are also evaluated with realistic workloads, which show the superiority of the proposed data storage configurations. These results provide new opportunities for efficiently processing and storing data and metadata in large-scale data analysis systems.