PL SE May 26

PoTo: A Hybrid Andersen's Points-to Analysis for Python

Ingkarat Rak-amnouykit, Ana Milanova, Guillaume Baudart, Martin Hirzel, Julian Dolby

arXiv:2409.03918

AI Analysis

This work addresses the challenge of static analysis for Python, which is increasingly important for large and complex programs, by providing a practical points-to analysis that handles dynamic features and external libraries.

PoTo is a hybrid Andersen-style points-to analysis for Python that combines static analysis with concrete evaluation of external library calls, which lets it scale to type inference for large programs. PoTo+, a type inference built on PoTo, outperforms Pytype and DLInfer on existing Python packages.

As Python is increasingly being adopted for large and complex programs, the importance of static analysis for Python (such as type inference) grows. Unfortunately, static analysis for Python remains a challenging task due to its dynamic language features and its abundant external libraries. To help fill this gap, this paper presents PoTo, an Andersen-style context-insensitive and flow-insensitive points-to analysis for Python. PoTo addresses Python-specific challenges and works for large programs via a novel hybrid evaluation, integrating traditional static points-to analysis with concrete evaluation in the Python interpreter for external library calls. Next, this paper presents PoTo+, a static type inference for Python built on the points-to analysis. We evaluate PoTo+ and compare it to two state-of-the-art Python type inference techniques: (1) the static rule-based Pytype and (2) the deep-learning-based DLInfer. Our results show that PoTo+ outperforms both Pytype and DLInfer on existing Python packages.
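To make the hybrid idea concrete, here is a minimal sketch of an Andersen-style, flow- and context-insensitive fixpoint in which external calls are evaluated in the interpreter rather than modeled statically. It is not PoTo's implementation: the tuple-based constraint format, the `solve` and `infer_types` helpers, and the choice to summarize a concrete result by its type name are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def solve(constraints):
    """Iterate a toy constraint list to a fixpoint (order does not matter)."""
    pts = defaultdict(set)   # variable -> abstract objects (site, type_name)
    heap = defaultdict(set)  # (abstract object, field) -> abstract objects

    def flow(target, new):
        # Add `new` to `target`; report whether the set grew.
        before = len(target)
        target |= new
        return len(target) != before

    changed = True
    while changed:
        changed = False
        for kind, *rest in constraints:
            if kind == "new":        # x = C()
                x, site, type_name = rest
                changed |= flow(pts[x], {(site, type_name)})
            elif kind == "copy":     # x = y
                x, y = rest
                changed |= flow(pts[x], set(pts[y]))
            elif kind == "store":    # x.f = y
                x, f, y = rest
                for obj in list(pts[x]):
                    changed |= flow(heap[(obj, f)], set(pts[y]))
            elif kind == "load":     # x = y.f
                x, y, f = rest
                for obj in list(pts[y]):
                    changed |= flow(pts[x], set(heap[(obj, f)]))
            elif kind == "ext":      # x = fn(*args), run in the interpreter
                x, site, fn, args = rest
                try:
                    result = fn(*args)
                except Exception:
                    continue  # no concrete value available; add nothing
                # Assumption: the concrete result is abstracted to its type.
                changed |= flow(pts[x], {(site, type(result).__name__)})
    return pts


def infer_types(pts, var):
    """Read candidate types for `var` off its points-to set."""
    return {type_name for _, type_name in pts[var]}


# The load is listed first on purpose: a flow-insensitive analysis
# ignores statement order and still reaches the same result.
program = [
    ("load", "v", "b", "val"),                  # v = b.val
    ("copy", "b", "a"),                         # b = a
    ("store", "a", "val", "r"),                 # a.val = r
    ("ext", "r", "site2", math.sqrt, (4.0,)),   # r = math.sqrt(4.0)
    ("new", "a", "site1", "Box"),               # a = Box()
]
pts = solve(program)
types_of_v = infer_types(pts, "v")
```

In this sketch the external call `math.sqrt(4.0)` is never analyzed; it is simply executed, and the type of its result flows through the store, copy, and load constraints to `v`. A type inference in the spirit of PoTo+ would then report the types collected in each variable's points-to set.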
