Jeyashree Krishnan

2 Papers

32.3 · LG · May 9 · Code
Machine Learning Research Has Outpaced Its Communication Norms and NeurIPS Should Act

Ajay Mandyam Rangarajan, Jeyashree Krishnan

Machine learning research has grown exponentially while its communication norms have not. We argue NeurIPS should adopt explicit, measurable writing standards. We analyze 2.8 million arXiv papers (1991-2025), 24,772 NeurIPS papers (1987-2024), and 24.5 million PubMed papers (1990-2025), applying classical readability scores, the Hohmann writing style suite (including sensational language), acronym density and reuse, an LLM-as-judge readability protocol, and citations from OpenAlex and Semantic Scholar. Four patterns emerge. First, NeurIPS abstracts have become harder to read on every classical readability metric: Flesch Reading Ease falls from about 24 in 1987 to 13 in 2024, and sensational language rises by about 50 percent in NeurIPS abstracts between 2015 and 2024. Second, acronym density in NeurIPS titles has grown from 0.33 per 100 words in 1987 to 3.21 in 2024, and about 89 percent of NeurIPS acronyms are used fewer than ten times, ten points above the science-wide baseline. Third, more readable NeurIPS papers tend to receive more citations, suggesting readability and impact are correlated and that less readable papers risk remaining fragmented. LLM-as-judge scores rate NeurIPS abstracts as roughly stable from 1987 to 2022, with early signs of improvement thereafter, a pattern that disagrees with every classical readability metric and raises a design question for enforcement: is the target reader a human or an LLM? Lastly, NeurIPS volume has grown roughly 50-fold between 1987 and 2024. Assuming the goal is to optimise for human readers, we propose seven standards NeurIPS could pilot at NeurIPS 2027: an acronym budget with a venue-approved term list, a human readability threshold, stricter citation standards, standalone visual elements, a plain language summary, a pre-registered acronym glossary, and open-source audit tooling.
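As an illustration of the "acronyms per 100 words" measure mentioned in the abstract, here is a minimal Python sketch. The abstract does not say how acronyms are detected, so the detection rule below (a run of two or more capital letters or digits starting with a capital, optionally followed by a plural "s") and the names `ACRONYM_RE`, `WORD_RE`, and `acronym_density` are assumptions made for this example, not the authors' tooling.

```python
import re

# Assumption: an acronym is a token such as "LLM", "GPT4" or "GANs".
# The paper's actual detection rule is not stated in the abstract.
ACRONYM_RE = re.compile(r"\b[A-Z][A-Z0-9]+s?\b")

# Assumption: a word is a run of letters, digits, apostrophes or hyphens.
WORD_RE = re.compile(r"\b[\w'-]+\b")


def acronym_density(text: str) -> float:
    """Return the number of acronym tokens per 100 words of `text`."""
    words = WORD_RE.findall(text)
    if not words:
        return 0.0
    acronyms = ACRONYM_RE.findall(text)
    return 100.0 * len(acronyms) / len(words)


if __name__ == "__main__":
    title = "The new LLM reads one short title with ten words"
    print(acronym_density(title))
```

A simple rule like this will miss mixed-case acronyms and will count all-caps words that are not acronyms, so any real audit tool would likely need a curated term list alongside it.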

SC · Feb 6, 2023
Tree-Based Learning on Amperometric Time Series Data Demonstrates High Accuracy for Classification

Jeyashree Krishnan, Zeyu Lian, Pieter E. Oomen et al.

Elucidating exocytosis processes provides insights into cellular neurotransmission mechanisms and may have potential in neurodegenerative disease research. Amperometry is an established electrochemical method for the detection of neurotransmitters released from and stored inside cells. An important aspect of the amperometry method is the sub-millisecond temporal resolution of the current recordings, which leads to several hundreds of gigabytes of high-quality data. In this study, we present a universal method for the classification of diverse amperometric datasets using data-driven approaches in computational science. We demonstrate a very high prediction accuracy (greater than or equal to 95%). This includes an end-to-end systematic machine learning workflow for amperometric time series datasets consisting of pre-processing; feature extraction; model identification; training and testing; followed by feature importance evaluation - all implemented. We tested the method on heterogeneous amperometric time series datasets generated using different experimental approaches, chemical stimulations, electrode types, and varying recording times. We identified an overarching set of common features across these datasets which enables accurate predictions. Further, we showed that the information relevant for the classification of amperometric traces is neither in the spiky segments alone, nor can it be retrieved from just the temporal structure of spikes. In fact, the transients between spikes and the trace baselines carry essential information for a successful classification, demonstrating that an effective feature representation of amperometric time series requires the full time series. To our knowledge, this is one of the first studies to propose a scheme for machine learning, and in particular supervised learning, on full amperometry time series data.