SE Jun 27, 2019

What Do Developers Ask About ML Libraries? A Large-scale Study Using Stack Overflow

Md Johirul Islam, Hoan Anh Nguyen, Rangeet Pan, Hridesh Rajan

arXiv:1906.11940v1 17.937 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenges software developers face when integrating ML into their systems, but it is incremental, building on existing studies of developer questions.

The study analyzed 3,243 Stack Overflow posts to identify difficulties developers face when using ML libraries, revealing urgent needs for software engineering research, such as better debugging tools and API design improvements.

Modern software systems are increasingly including machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To that end, this work reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely Tensorflow, Keras, scikit-learn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. We classify these questions into seven typical stages of an ML pipeline to understand the correlation between the library and the stage. Then we study the questions and perform statistical analysis to explore the answer to four research objectives (finding the most difficult stage, understanding the nature of problems, nature of libraries and studying whether the difficulties stayed consistent over time). Our findings reveal the urgent need for software engineering (SE) research in this area. Both static and dynamic analyses are mostly absent and badly needed to help developers find errors earlier. While there has been some early research on debugging, much more work is needed. API misuses are prevalent and API design improvements are sorely needed. Last and somewhat surprisingly, a tug of war between providing higher levels of abstractions and the need to understand the behavior of the trained model is prevalent.
