PyTracer: Automatically profiling numerical instabilities in Python
PyTracer addresses the problem of unreliable scientific computing for Python users in data science by offering a scalable and generic tool, though the contribution appears incremental because it builds on existing numerical analysis concepts.
The researchers tackled the challenge of analyzing numerical stability in large Python programs by developing PyTracer, a profiler that automatically quantifies and visualizes numerical instabilities, demonstrating its capabilities on key functions in SciPy and Scikit-learn.
Numerical stability is a crucial requirement of reliable scientific computing. However, despite the pervasiveness of Python in data science, analyzing large Python programs remains challenging due to the lack of scalable numerical analysis tools available for this language. To fill this gap, we developed PyTracer, a profiler to quantify numerical instability in Python applications. PyTracer transparently instruments Python code to produce numerical traces and visualizes them interactively in a Plotly dashboard. We designed PyTracer to be agnostic to the numerical noise model, allowing for tool evaluation through Monte-Carlo Arithmetic, random rounding, random data perturbation, or structured noise for a particular application. We illustrate PyTracer's capabilities by testing the numerical stability of key functions in both SciPy and Scikit-learn, two dominant Python libraries for mathematical modeling. Through these evaluations, we demonstrate PyTracer as a scalable, automatic, and generic framework for numerical profiling in Python.
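To make the idea of noise-based profiling concrete, the following is a minimal sketch of one of the noise models named above, random data perturbation. It is not PyTracer's API: the names `perturb`, `significant_digits`, and `profile`, the uniform relative noise, and the noise magnitude are all assumptions chosen for illustration. A function is run repeatedly on slightly perturbed inputs, and the number of stable decimal digits is estimated as -log10(sigma / |mu|), a commonly used estimate in Monte Carlo style analyses.

```python
import math
import random
import statistics


def perturb(x, rel_noise, rng):
    """Return x with a small uniform relative perturbation (illustrative noise model)."""
    return x * (1.0 + rng.uniform(-rel_noise, rel_noise))


def significant_digits(samples):
    """Estimate stable decimal digits as -log10(sigma / |mu|) over the samples."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    if sigma == 0.0:
        # No observed variation: report roughly the precision of binary64.
        return 53 * math.log10(2)
    if mu == 0.0:
        return 0.0
    return max(0.0, -math.log10(sigma / abs(mu)))


def profile(func, args, n_samples=30, rel_noise=1e-12, seed=0):
    """Run func on perturbed copies of args; return (mean result, estimated digits)."""
    rng = random.Random(seed)
    samples = [
        func(*(perturb(a, rel_noise, rng) for a in args))
        for _ in range(n_samples)
    ]
    return statistics.fmean(samples), significant_digits(samples)


def naive_difference(a, b):
    # Subtracting nearly equal numbers amplifies input noise (cancellation).
    return a - b


def product(a, b):
    # Multiplication propagates relative noise without amplifying it.
    return a * b


inputs = (1.0 + 1e-8, 1.0)
_, digits_diff = profile(naive_difference, inputs)
_, digits_prod = profile(product, inputs)
print(f"difference: ~{digits_diff:.1f} stable digits")
print(f"product:    ~{digits_prod:.1f} stable digits")
```

Under these assumptions the subtraction retains far fewer stable digits than the product for the same input noise, which is the kind of contrast a numerical profiler is meant to surface per function. A real tool would collect such statistics automatically at function boundaries rather than through explicit calls.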