Maja Vukovic

SE
5papers
318citations
Novelty40%
AI Score28

5 Papers

CLJun 5, 2023
CoSiNES: Contrastive Siamese Network for Entity Standardization

Jiaqing Yuan, Michele Merler, Mihir Choudhury et al.

Entity standardization maps noisy mentions from free-form text to standard entities in a knowledge base. The unique challenge of this task relative to other entity-related tasks is the lack of surrounding context and numerous variations in the surface form of the mentions, especially when it comes to generalization across domains where labeled data is scarce. Previous research mostly focuses on developing models either heavily relying on context, or dedicated solely to a specific domain. In contrast, we propose CoSiNES, a generic and adaptable framework with Contrastive Siamese Network for Entity Standardization that effectively adapts a pretrained language model to capture the syntax and semantics of the entities in a new domain. We construct a new dataset in the technology domain, which contains 640 technical stack entities and 6,412 mentions collected from industrial content management systems. We demonstrate that CoSiNES yields higher accuracy and faster runtime than baselines derived from leading methods in this domain. CoSiNES also achieves competitive performance in four standard datasets from the chemistry, medicine, and biomedical domains, demonstrating its cross-domain applicability.

SEJul 20, 2021Code
Mono2Micro: A Practical and Effective Tool for Decomposing Monolithic Java Applications to Microservices

Anup Kalia, Jin Xiao, Rahul Krishna et al.

In migrating production workloads to cloud, enterprises often face the daunting task of evolving monolithic applications toward a microservice architecture. At IBM, we developed a tool called Mono2Micro to assist with this challenging task. Mono2Micro performs spatio-temporal decomposition, leveraging well-defined business use cases and runtime call relations to create functionally cohesive partitioning of application classes. Our preliminary evaluation of Mono2Micro showed promising results. How well does Mono2Micro perform against other decomposition techniques, and how do practitioners perceive the tool? This paper describes the technical foundations of Mono2Micro and presents results to answer these two questions. To answer the first question, we evaluated Mono2Micro against four existing techniques on a set of open-source and proprietary Java applications and using different metrics to assess the quality of decomposition and tool's efficiency. Our results show that Mono2Micro significantly outperforms state-of-the-art baselines in specific metrics well-defined for the problem domain. To answer the second question, we conducted a survey of twenty-one practitioners in various industry roles who have used Mono2Micro. This study highlights several benefits of the tool, interesting practitioner perceptions, and scope for further improvements. Overall, these results show that Mono2Micro can provide a valuable aid to practitioners in creating functionally cohesive and explainable microservice decompositions.

SEJun 12, 2021Code
Lessons learned from hyper-parameter tuning for microservice candidate identification

Rahul Yedida, Rahul Krishna, Anup Kalia et al.

When optimizing software for the cloud, monolithic applications need to be partitioned into many smaller *microservices*. While many tools have been proposed for this task, we warn that the evaluation of those approaches has been incomplete; e.g. minimal prior exploration of hyperparameter optimization. Using a set of open source Java EE applications, we show here that (a) such optimization can significantly improve microservice partitioning; and that (b) an open issue for future work is how to find which optimizer works best for different problems. To facilitate that future work, see [https://github.com/yrahul3910/ase-tuned-mono2micro](https://github.com/yrahul3910/ase-tuned-mono2micro) for a reproduction package for this research.

LGSep 29, 2021
An Expert System for Redesigning Software for Cloud Applications

Rahul Yedida, Rahul Krishna, Anup Kalia et al.

Cloud-based software has many advantages. When services are divided into many independent components, they are easier to update. Also, during peak demand, it is easier to scale cloud services (just hire more CPUs). Hence, many organizations are partitioning their monolithic enterprise applications into cloud-based microservices. Recently there has been much work using machine learning to simplify this partitioning task. Despite much research, no single partitioning method can be recommended as generally useful. More specifically, those prior solutions are "brittle"; i.e. if they work well for one kind of goal in one dataset, then they can be sub-optimal if applied to many datasets and multiple goals. In order to find a generally useful partitioning method, we propose DEEPLY. This new algorithm extends the CO-GCN deep learning partition generator with (a) a novel loss function and (b) some hyper-parameter optimization. As shown by our experiments, DEEPLY generally outperforms prior work (including CO-GCN, and others) across multiple datasets and goals. To the best of our knowledge, this is the first report in SE of such stable hyper-parameter optimization. To aid reuse of this work, DEEPLY is available on-line at https://bit.ly/2WhfFlB.

SEJun 25, 2019
Technical Health Check For Cloud Service Providers

Muhammed Fatih Bulut, Hongtan Sun, Pritpal Arora et al.

Understanding the overall health of an IT Infrastructure is a key part of IT Service Management. Traditional approach to perform technical health check is by visiting customer's physical site and rigorously examining the IT infrastructure with Subject Matter Experts. Alternatively, periodic surveys are sent to Technical Architects who are responsible for the managed IT infrastructure. In essence, both site visits and surveys suffer from reactive nature, and subjective assessment. In this paper, we present technical health check for cloud providers, that monitors, assesses operational data and depicts the current health of an IT infrastructure in real time. We also discuss challenges and opportunities of technical health check in Hybrid Cloud Environment.