Berkay Kaplan

h-index: 1

4 papers

10 citations

Novelty: 24%

AI Score: 40

Ranked #100,327 of 205,806 authors (top 49%); #2,553 in CR (top 35%)

4 Papers

22.9 NI Apr 26 Code

Globalping: A Community-Driven, Open-Source Platform for Scalable, Real-Time Network Measurements

Berkay Kaplan

We present Globalping, an open-source, community-driven platform for scalable, real-time global network measurements. It democratizes access to network diagnostics by offering every user, including non-technical users, technical users, and companies, the ability to perform ping, traceroute, and DNS lookups from a globally distributed network of user-hosted probes using either the intuitive Globalping front-end or the REST API. Unlike solutions such as RIPE Atlas, Globalping offers official integrations with other platforms, such as Slack and GitHub, which make it even more effective for real-time monitoring and collaboration.

CR Aug 21, 2021 Code

A Survey on Common Threats in npm and PyPi Registries

Berkay Kaplan, Jingyu Qian

Software engineers regularly use JavaScript and Python for both front-end and back-end automation tasks. On top of JavaScript and Python, there are several frameworks to facilitate automation tasks further. Some of these frameworks are Node Package Manager (npm) and Python Package Index (PyPi), which are open source (OS) package libraries. The public registries that npm and PyPi use to host packages allow any user with a verified email to publish code. The lack of a comprehensive scanning tool when publishing to the registry creates security concerns. Users can report malicious code on the registry; however, attackers can still cause damage until their package is removed from the platform. Furthermore, several packages depend on each other, making them more vulnerable to a bad package in the dependency tree. The heavy code reuse creates security artifacts that developers have to consider, such as the package reach. This project will illustrate a high-level overview of common risks associated with OS registries and the package dependency structure. There are several attack types, such as typosquatting and combosquatting, in the OS package registries. Outdated packages pose a security risk, and we will examine the extent of technical lag present in the npm environment. In this paper, our main contribution consists of a survey of common threats in OS registries. Afterward, we will offer countermeasures to mitigate the risks presented. These remedies will heavily focus on the applications of Machine Learning (ML) to detect suspicious activities. To the best of our knowledge, the ML-focused countermeasures are the first proposed possible solutions to the security problems listed. In addition, this project is the first survey of threats in npm and PyPi, although several studies focus on a subset of threats.

LG Sep 19, 2025

Unsupervised Outlier Detection in Audit Analytics: A Case Study Using USA Spending Data

Buhe Li, Berkay Kaplan, Maksym Lazirko et al.

This study investigates the effectiveness of unsupervised outlier detection methods in audit analytics, utilizing USA spending data from the U.S. Department of Health and Human Services (DHHS) as a case example. We employ and compare multiple outlier detection algorithms, including Histogram-based Outlier Score (HBOS), Robust Principal Component Analysis (PCA), Minimum Covariance Determinant (MCD), and K-Nearest Neighbors (KNN) to identify anomalies in federal spending patterns. The research addresses the growing need for efficient and accurate anomaly detection in large-scale governmental datasets, where traditional auditing methods may fall short. Our methodology involves data preparation, algorithm implementation, and performance evaluation using precision, recall, and F1 scores. Results indicate that a hybrid approach, combining multiple detection strategies, enhances the robustness and accuracy of outlier identification in complex financial data. This study contributes to the field of audit analytics by providing insights into the comparative effectiveness of various outlier detection models and demonstrating the potential of unsupervised learning techniques in improving audit quality and efficiency. The findings have implications for auditors, policymakers, and researchers seeking to leverage advanced analytics in governmental financial oversight and risk management.

CR Sep 21, 2021

Attacks on Visualization-Based Malware Detection: Balancing Effectiveness and Executability

Hadjer Benkraouda, Jingyu Qian, Hung Quoc Tran et al.

With the rapid development of machine learning for image classification, researchers have found new applications of visualization techniques in malware detection. By converting binary code into images, researchers have shown satisfactory results in applying machine learning to extract features that are difficult to discover manually. Such visualization-based malware detection methods can capture malware patterns from many different malware families and improve malware detection speed. On the other hand, recent research has also shown adversarial attacks against such visualization-based malware detection. Attackers can generate adversarial examples by perturbing the malware binary in non-reachable regions, such as padding at the end of the binary. Alternatively, attackers can perturb the malware image embedding and then verify the executability of the malware post-transformation. One major limitation of the first attack scenario is that a simple pre-processing step can remove the perturbations before classification. For the second attack scenario, it is hard to maintain the original malware's executability and functionality. In this work, we provide a literature review of existing malware visualization techniques and attacks against them. We summarize the limitations of previous work and design a new adversarial example attack against visualization-based malware detection that can evade pre-processing filtering and maintain the original malware functionality. We test our attack on a public malware dataset and achieve a 98% success rate.