3 Papers

SEDec 26, 2025
State-of-the-art Small Language Coder Model: Mify-Coder

Abhinav Parmar, Abhisek Panigrahi, Abhishek Kumar Dwivedi et al.

We present Mify-Coder, a 2.5B-parameter code model trained on 4.2T tokens using a compute-optimal strategy built on the Mify-2.5B foundation model. Mify-Coder achieves comparable accuracy and safety while significantly outperforming much larger baseline models on standard coding and function-calling benchmarks, demonstrating that compact models can match frontier-grade models in code generation and agent-driven workflows. Our training pipeline combines high-quality curated sources with synthetic data generated through agentically designed prompts, refined iteratively using enterprise-grade evaluation datasets. LLM-based quality filtering further enhances data density, enabling frugal yet effective training. Through disciplined exploration of CPT-SFT objectives, data mixtures, and sampling dynamics, we deliver frontier-grade code intelligence within a single continuous training trajectory. Empirical evidence shows that principled data and compute discipline allow smaller models to achieve competitive accuracy, efficiency, and safety compliance. Quantized variants of Mify-Coder enable deployment on standard desktop environments without requiring specialized hardware.

26.8AIApr 2Code
VISTA: Visualization of Token Attribution via Efficient Analysis

Syed Ahmed, Bharathi Vokkaliga Ganesh, Jagadish Babu P et al.

Understanding how Large Language Models (LLMs) process information from prompts remains a significant challenge. To shed light on this "black box," attention visualization techniques have been developed to capture neuron-level perceptions and interpret how models focus on different parts of input data. However, many existing techniques are tailored to specific model architectures, particularly within the Transformer family, and often require backpropagation, resulting in nearly double the GPU memory usage and increased computational cost. A lightweight, model-agnostic approach for attention visualization remains lacking. In this paper, we introduce a model-agnostic token importance visualization technique to better understand how generative AI systems perceive and prioritize information from input text, without incurring additional computational cost. Our method leverages perturbation-based strategies combined with a three-matrix analytical framework to generate relevance maps that illustrate token-level contributions to model predictions. The framework comprises: (1) the Angular Deviation Matrix, which captures shifts in semantic direction; (2) the Magnitude Deviation Matrix, which measures changes in semantic intensity; and (3) the Dimensional Importance Matrix, which evaluates contributions across individual vector dimensions. By systematically removing each token and measuring the resulting impact across these three complementary dimensions, we derive a composite importance score that provides a nuanced and mathematically grounded measure of token significance. To support reproducibility and foster wider adoption, we provide open-source implementations of all proposed and utilized explainability techniques, with code and resources publicly available at https://github.com/Infosys/Infosys-Responsible-AI-Toolkit

SESep 7, 2021
Interests, Difficulties, Sentiments, and Tool Usages of Concurrency Developers: A Large-Scale Study on Stack Overflow

Mehdi Bagherzadeh, Syed Ahmed, Srilakshmi Sripathi et al.

Context: Software developers are increasingly facing the challenges of writing code that is not only concurrent but also correct. Objective: To help these developers, it is necessary to understand concurrency topics they are interested in, their difficulty in finding answers for questions in these topics, their sentiment for these topics, and how they use concurrency tools and techniques to guarantee correctness. Method: We conduct a large-scale study on the entirety of Stack Overflow to understand interests, difficulties, sentiment, and tool usages of concurrency developers. We discuss the implications of our findings for the practice, research, and education of concurrent software development, and investigate the relation of our findings with the findings of the previous work. Results: A few findings of our study are: (1) questions that concurrency developers ask can be grouped into a hierarchy with 27 concurrency topics under 8 major categories, (2) thread safety is among the most popular concurrency topics and client-server concurrency is among the least popular, (3) irreproducible behavior is among the most difficult topics and memory consistency is among the least difficult, (4) data scraping is among the most positive concurrency topics and irreproducible behavior is among the most negative, (5) root cause identification has the most number of questions for usage of data race tools and alternative use has the least. Conclusion: The results of our study can not only help concurrency developers but also concurrency educators and researchers to better decide where to focus their efforts, by trading off one concurrency topic against another.