Building QoS-Aware Cloud Services




Rahman, Joy



Cloud services are increasingly popular for processing large-scale data due to their on-demand scalability and cost-effectiveness. The cloud as a service platform is also enabling and accelerating the recent shift of enterprise application architectures to containers and microservices for better scalability, portability, and reliability. However, existing cloud services often suffer from unpredictable performance and from inefficiencies inherent in cloud architecture. One example of architecture-induced inefficiency is that big data analytics incurs large overheads from data movement between two decoupled service layers, i.e., compute (e.g., Amazon EC2) and storage (e.g., Amazon S3). Furthermore, our study shows that containerized microservices in the cloud suffer from performance interference at multiple levels, i.e., inter-tenant and inter-container. Existing approaches to auto-scaling containerized applications are ineffective in the presence of performance issues arising from contention for shared resources. The key research challenges lie in determining which components of a microservices-based application should be scaled, and by how much, to meet a performance SLO (service level objective) target in the face of dynamic workloads, inter-component performance dependencies, and cloud-induced performance interference.

For the first problem, we developed a novel approach to in-situ big data processing on cloud storage. It uses compute-storage multiplexing to improve data processing throughput by hiding data transfer overheads, which allows the storage cluster to put to use spare compute cycles that would otherwise be wasted. Our study examined the feasibility of the proposed approach and identified important research challenges that must be addressed to avoid performance SLO violations for cloud storage requests when data processing jobs are offloaded to the storage cluster.

For the second problem, (1) we developed a machine-learning-based performance modeling approach that combines multi-layer data, including container-level, VM-level, and hardware performance counter-based metrics, to predict the end-to-end tail latency of containerized microservices even in the presence of cloud-induced performance interference. (2) We further enhanced our modeling approach with a probabilistic machine learning technique that is highly adaptive to changing system dynamics and that directly provides confidence bounds on its predictions. This is critical for making robust resource management decisions in an uncertain cloud environment. (3) We developed a robust and efficient resource scaling technique that uses the proposed models to meet the performance SLO target of containerized microservices.
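
As a rough illustration of why confidence bounds matter for scaling decisions, the sketch below picks the smallest replica count whose predicted tail latency stays under the SLO even at the upper end of the prediction interval. It is not the method developed in this work: the predictor interface, the Gaussian-style bound (mean plus z standard deviations), and the names `smallest_safe_replicas` and `toy_predict` are all assumptions made for this example.

```python
from typing import Callable, Optional, Tuple

# Hypothetical predictor type: maps a replica count to the (mean, std) of the
# predicted tail latency in milliseconds. It stands in for a probabilistic model.
Predictor = Callable[[int], Tuple[float, float]]


def smallest_safe_replicas(predict: Predictor, slo_ms: float,
                           max_replicas: int, z: float = 1.96) -> Optional[int]:
    """Return the smallest replica count whose upper latency bound meets the SLO."""
    for replicas in range(1, max_replicas + 1):
        mean, std = predict(replicas)
        # Upper confidence bound; assumes roughly Gaussian prediction error.
        upper = mean + z * std
        if upper <= slo_ms:
            return replicas
    return None  # no candidate within max_replicas satisfies the SLO


def toy_predict(replicas: int) -> Tuple[float, float]:
    # Toy model (an assumption for illustration): latency and uncertainty
    # both shrink as replicas are added.
    return 400.0 / replicas, 40.0 / replicas
```

With the toy predictor and a 200 ms SLO, two replicas have a predicted mean of exactly 200 ms but an upper bound above it, so the sketch selects three replicas. Scaling on the mean alone would have chosen a configuration that risks violating the SLO.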




big data, cloud, microservices, performance, response-time, web-services



Computer Science