Nima Mahmoudi

h-index: 8

Papers: 4

Citations: 20

Novelty: 36%

AI Score: 27

Ranked #156,475 of 194,257 authors (top 81%); #734 in DC (top 75%)

4 Papers

5.1 | DC | Feb 23, 2022 | Code

Performance Modeling of Metric-Based Serverless Computing Platforms

Nima Mahmoudi, Hamzeh Khazaei

Analytical performance models are very effective in ensuring that the quality of service and cost of a deployment remain acceptable under different conditions and workloads. While various analytical performance models have been proposed for previous paradigms in cloud computing, serverless computing lacks models that can provide developers with performance guarantees. Moreover, most serverless computing platforms still require developers to specify the configuration for their deployment, which can affect both its performance and cost, without giving them any direct and immediate feedback. In previous studies, we built such performance models for steady-state and transient analysis of scale-per-request serverless computing platforms (e.g., AWS Lambda, Azure Functions, Google Cloud Functions) that could give developers immediate feedback about the quality of service and cost of their deployments. In this work, we aim to develop analytical performance models for the latest trend in serverless computing platforms, which use the concurrency value and the rate of requests per second for autoscaling decisions. Examples of such platforms are Knative and Google Cloud Run (a managed Knative service by Google). The proposed performance model can help developers and providers predict the performance and cost of deployments with different configurations, which could help them tune the configuration toward the best outcome. We validate the applicability and accuracy of the proposed performance model through extensive real-world experimentation on Knative and show that it accurately predicts the steady-state characteristics of a given workload with a minimal amount of data collection.
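To give a rough sense of what concurrency-based autoscaling means in the abstract above, the following back-of-the-envelope sketch estimates a steady-state instance count. It is not the paper's model. It assumes Little's law (mean in-flight requests = arrival rate × mean response time) and a simple rule that divides in-flight requests by a per-instance concurrency target. The function name `estimate_instances` and its parameters are hypothetical.

```python
import math


def estimate_instances(arrival_rate_rps: float,
                       mean_response_time_s: float,
                       target_concurrency: float) -> int:
    """Rough steady-state instance estimate (illustrative assumption only)."""
    if arrival_rate_rps < 0 or mean_response_time_s < 0:
        raise ValueError("rate and response time must be non-negative")
    if target_concurrency <= 0:
        raise ValueError("target_concurrency must be positive")

    # Little's law: average number of requests in flight at steady state.
    in_flight = arrival_rate_rps * mean_response_time_s

    # Assumed scaling rule: enough instances so that each one handles
    # at most `target_concurrency` requests at a time.
    return math.ceil(in_flight / target_concurrency)


if __name__ == "__main__":
    # 40 req/s at 0.25 s each gives 10 requests in flight on average.
    print(estimate_instances(40, 0.25, 4))
```

Such an estimate ignores cold starts, scaling delays, and burstiness, which are the kinds of effects an analytical model would need to capture.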

2.3 | DC | Feb 23, 2022 | Code

MLProxy: SLA-Aware Reverse Proxy for Machine Learning Inference Serving on Serverless Computing Platforms

Nima Mahmoudi, Hamzeh Khazaei

Serving machine learning inference workloads on the cloud is still a challenging task at the production level. Optimally configuring an inference workload to meet SLA requirements while minimizing infrastructure costs is highly complicated due to the complex interaction between batch configuration, resource configuration, and a variable arrival process. Serverless computing has emerged in recent years to automate most infrastructure management tasks. Workload batching has shown the potential to improve the response time and cost-effectiveness of machine learning serving workloads. However, it is not yet supported out of the box by serverless computing platforms. Our experiments have shown that for various machine learning workloads, batching can greatly improve the system's efficiency by reducing the processing overhead per request. In this work, we present MLProxy, an adaptive reverse proxy to support efficient machine learning serving workloads on serverless computing systems. MLProxy supports adaptive batching to ensure SLA compliance while optimizing serverless costs. We performed rigorous experiments on Knative to demonstrate the effectiveness of MLProxy. We showed that MLProxy could reduce the cost of serverless deployment by up to 92% while reducing SLA violations by up to 99%, results that can be generalized across state-of-the-art model serving frameworks.

8.5 | SE | Jul 10, 2019 | Code

Executability of Python Snippets in Stack Overflow

Md Monir Hossain, Nima Mahmoudi, Changyuan Lin et al.

Online resources today contain an abundance of code snippets for documentation, collaboration, learning, and problem-solving purposes. Their executability in a "plug and play" manner enables us to confirm their quality and use them directly in projects. In practice, however, that is often not the case due to requirement violations or incompleteness. Investigating executability on a large scale is difficult because of the many different errors that can occur during execution. We have developed a scalable framework to investigate this for SOTorrent Python snippets. We found that with minor adjustments, 27.92% of snippets are executable. Executability has not changed significantly over time. Code snippets referenced in GitHub are more likely to be directly executable. However, executability does not significantly affect the chances of an answer being selected as the accepted answer. These properties help us understand and improve the interaction of users with online resources that include code snippets.
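As an illustration of what checking a snippet's executability can involve, here is a minimal sketch. It is not the framework described in the abstract. It assumes a two-step check: parse the snippet with the built-in `compile`, then run it in a separate interpreter process with a timeout. The function name `classify_snippet` and the result labels are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def classify_snippet(source: str, timeout_s: float = 5.0) -> str:
    """Return 'syntax_error', 'runtime_error', 'timeout', or 'executable'."""
    # Step 1: parse only. compile() does not run the snippet.
    try:
        compile(source, "<snippet>", "exec")
    except (SyntaxError, ValueError):
        return "syntax_error"

    # Step 2: run in a fresh interpreter so a crash or infinite loop
    # cannot take down the calling process.
    with tempfile.TemporaryDirectory() as workdir:
        path = Path(workdir) / "snippet.py"
        path.write_text(source, encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, str(path)],
                cwd=workdir,
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return "timeout"

    return "executable" if result.returncode == 0 else "runtime_error"


if __name__ == "__main__":
    for code in ("print(1 + 1)", "print(", "import no_such_module_xyz"):
        print(repr(code), "->", classify_snippet(code))
```

A separate process provides only basic isolation. Running untrusted snippets at scale would also call for sandboxing, which this sketch does not attempt.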

4.3 | DC | Feb 9, 2019

Performance Modeling of Microservice Platforms

Hamzeh Khazaei, Nima Mahmoudi, Cornel Barna et al.

Microservice architecture has transformed the way developers build and deploy applications in today's cloud computing centers. This new approach provides increased scalability, flexibility, manageability, and performance while reducing the complexity of the whole software development life cycle. The increase in cloud resource utilization also benefits microservice providers. Various microservice platforms have emerged to facilitate the DevOps of containerized services by enabling continuous integration and delivery. Microservice platforms seamlessly deploy application containers on virtual or physical machines provided by public/private cloud infrastructures. In this paper, we study and evaluate the provisioning performance of microservice platforms by incorporating the details of all layers (i.e., both micro and macro layers) in the modelling process. To this end, we first build a microservice platform on top of the Amazon EC2 cloud and then leverage it to develop a comprehensive performance model for what-if analysis and capacity planning of microservice platforms at scale. In other words, the proposed performance model provides a systematic approach to measuring the elasticity of the microservice platform by analyzing the provisioning performance at both the microservice platform and the back-end macroservice infrastructures.