Natan Katz

LG
h-index8
4papers
2citations
Novelty38%
AI Score28

4 Papers

LGOct 25, 2022
Parametric PDF for Goodness of Fit

Natan Katz, Uri Itai

The goodness of fit methods for classification problems relies traditionally on confusion matrices. This paper aims to enrich these methods with a risk evaluation and stability analysis tools. For this purpose, we present a parametric PDF framework.

LGAug 11, 2022
Goodness of Fit Metrics for Multi-class Predictor

Uri Itai, Natan Katz

The multi-class prediction had gained popularity over recent years. Thus measuring fit goodness becomes a cardinal question that researchers often have to deal with. Several metrics are commonly used for this task. However, when one has to decide about the right measurement, he must consider that different use-cases impose different constraints that govern this decision. A leading constraint at least in \emph{real world} multi-class problems is imbalanced data: Multi categorical problems hardly provide symmetrical data. Hence, when we observe common KPIs (key performance indicators), e.g., Precision-Sensitivity or Accuracy, one can seldom interpret the obtained numbers into the model's actual needs. We suggest generalizing Matthew's correlation coefficient into multi-dimensions. This generalization is based on a geometrical interpretation of the generalized confusion matrix.

LGOct 28, 2025
Filtering instances and rejecting predictions to obtain reliable models in healthcare

Maria Gabriela Valeriano, David Kohan Marzagão, Alfredo Montelongo et al.

Machine Learning (ML) models are widely used in high-stakes domains such as healthcare, where the reliability of predictions is critical. However, these models often fail to account for uncertainty, providing predictions even with low confidence. This work proposes a novel two-step data-centric approach to enhance the performance of ML models by improving data quality and filtering low-confidence predictions. The first step involves leveraging Instance Hardness (IH) to filter problematic instances during training, thereby refining the dataset. The second step introduces a confidence-based rejection mechanism during inference, ensuring that only reliable predictions are retained. We evaluate our approach using three real-world healthcare datasets, demonstrating its effectiveness at improving model reliability while balancing predictive performance and rejection rate. Additionally, we use alternative criteria - influence values for filtering and uncertainty for rejection - as baselines to evaluate the efficiency of the proposed method. The results demonstrate that integrating IH filtering with confidence-based rejection effectively enhances model performance while preserving a large proportion of instances. This approach provides a practical method for deploying ML systems in safety-critical applications.

CRAug 16, 2024
ML Study of MaliciousTransactions in Ethereum

Natan Katz

Smart contracts are a major tool in Ethereum transactions. Therefore hackers can exploit them by adding code vulnerabilities to their sources and using these vulnerabilities for performing malicious transactions. This paper presents two successful approaches for detecting malicious contracts: one uses opcode and relies on GPT2 and the other uses the Solidity source and a LORA fine-tuned CodeLlama. Finally, we present an XGBOOST model that combines gas properties and Hexa-decimal signatures for detecting malicious transactions. This approach relies on early assumptions that maliciousness is manifested by the uncommon usage of the contracts' functions and the effort to pursue the transaction.