SEMar 21, 2025
A Study of LLMs' Preferences for Libraries and Programming LanguagesLukas Twist, Jie M. Zhang, Mark Harman et al.
Large Language Models (LLMs) are increasingly used to generate code, influencing users' choices of libraries and programming languages in critical real-world projects. However, little is known about their systematic biases or preferences toward certain libraries and programming languages, which can significantly impact software development practices. To fill this gap, we perform the first empirical study of LLMs' preferences for libraries and programming languages when generating code, covering eight diverse LLMs. Our results reveal that LLMs exhibit a strong tendency to overuse widely adopted libraries such as NumPy; in up to 48% of cases, this usage is unnecessary and deviates from the ground-truth solutions. LLMs also exhibit a significant preference toward Python as their default language. For high-performance project initialisation tasks where Python is not the optimal language, it remains the dominant choice in 58% of cases, and Rust is not used a single time. These results indicate that LLMs may prioritise familiarity and popularity over suitability and task-specific optimality. This will introduce security vulnerabilities and technical debt, and limit exposure to newly developed, better-suited tools and languages. Understanding and addressing these biases is essential for the responsible integration of LLMs into software development workflows.
LGFeb 17, 2022
Gradients without BackpropagationAtılım Güneş Baydin, Barak A. Pearlmutter, Don Syme et al.
Using backpropagation to compute gradients of objective functions for optimization has remained a mainstay of machine learning. Backpropagation, or reverse-mode differentiation, is a special case within the general family of automatic differentiation algorithms that also includes the forward mode. We present a method to compute gradients based solely on the directional derivative that one can compute exactly and efficiently via the forward mode. We call this formulation the forward gradient, an unbiased estimate of the gradient that can be evaluated in a single forward run of the function, entirely eliminating the need for backpropagation in gradient descent. We demonstrate forward gradient descent in a range of problems, showing substantial savings in computation and enabling training up to twice as fast in some cases.
PLDec 7, 2015
In the Age of Web: Typed Functional-First Programming RevisitedTomas Petricek, Don Syme, Zach Bray
Most programming languages were designed before the age of web. This matters because the web changes many assumptions that typed functional language designers take for granted. For example, programs do not run in a closed world, but must instead interact with (changing and likely unreliable) services and data sources, communication is often asynchronous or event-driven, and programs need to interoperate with untyped environments. In this paper, we present how the F# language and libraries face the challenges posed by the web. Technically, this comprises using type providers for integration with external information sources and for integration with untyped programming environments, using lightweight meta-programming for targeting JavaScript and computation expressions for writing asynchronous code. In this inquiry, the holistic perspective is more important than each of the features in isolation. We use a practical case study as a starting point and look at how F# language and libraries approach the challenges posed by the web. The specific lessons learned are perhaps less interesting than our attempt to uncover hidden assumptions that no longer hold in the age of web.