78.6SEApr 9Code
Tokalator: A Context Engineering Toolkit for Artificial Intelligence Coding AssistantsVahid Farajijobehdar, İlknur Köseoğlu Sarı, Nazım Kemal Üre et al.
Artificial Intelligence (AI)-assisted coding environments operate within finite context windows of 128,000-1,000,000 tokens (as of early 2026), yet existing tools offer limited support for monitoring and optimizing token consumption. As developers open multiple files, model attention becomes diluted and Application Programming Interface (API) costs increase in proportion to input and output as conversation length grows. Tokalator is an open-source context-engineering toolkit that includes a VS Code extension with real-time budget monitoring and 11 slash commands; nine web-based calculators for Cobb-Douglas quality modeling, caching break-even analysis, and $O(T^2)$ conversation cost proofs; a community catalog of agents, prompts, and instruction files; an MCP server and Command Line Interface (CLI); a Python econometrics API; and a PostgreSQL-backed usage tracker. The system supports 17 Large Language Models (LLMs) across three providers (Anthropic, OpenAI, Google) and is validated by 124 unit tests. An initial deployment on the Visual Studio Marketplace recorded 313 acquisitions with a 206.02\% conversion rate as of v3.1.3. A structured survey of 50 developers across three community sessions indicated that instruction-file injection and low-relevance open tabs are among the primary invisible budget consumers in typical AI-assisted development sessions.
CLJan 30
Leveraging LLMs For Turkish Skill ExtractionEzgi Arslan İltüzer, Özgür Anıl Özlü, Vahid Farajijobehdar et al.
Skill extraction is a critical component of modern recruitment systems, enabling efficient job matching, personalized recommendations, and labor market analysis. Despite Türkiye's significant role in the global workforce, Turkish, a morphologically complex language, lacks both a skill taxonomy and a dedicated skill extraction dataset, resulting in underexplored research in skill extraction for Turkish. This article seeks the answers to three research questions: 1) How can skill extraction be effectively performed for this language, in light of its low resource nature? 2)~What is the most promising model? 3) What is the impact of different Large Language Models (LLMs) and prompting strategies on skill extraction (i.e., dynamic vs. static few-shot samples, varying context information, and encouraging causal reasoning)? The article introduces the first Turkish skill extraction dataset and performance evaluations of automated skill extraction using LLMs. The manually annotated dataset contains 4,819 labeled skill spans from 327 job postings across different occupation areas. The use of LLM outperforms supervised sequence labeling when used in an end-to-end pipeline, aligning extracted spans with standardized skills in the ESCO taxonomy more effectively. The best-performing configuration, utilizing Claude Sonnet 3.7 with dynamic few-shot prompting for skill identification, embedding-based retrieval, and LLM-based reranking for skill linking, achieves an end-to-end performance of 0.56, positioning Turkish alongside similar studies in other languages, which are few in the literature. Our findings suggest that LLMs can improve skill extraction performance in low-resource settings, and we hope that our work will accelerate similar research on skill extraction for underrepresented languages.