Nikoo Salehfard

h-index: 1
2 papers

2 Papers

CL · Mar 5 · Code
ARC-TGI: Human-Validated Task Generators with Reasoning Chain Templates for ARC-AGI

Jens Lehmann, Syeda Khushbakht, Nikoo Salehfard et al.

The Abstraction and Reasoning Corpus (ARC-AGI) probes few-shot abstraction and rule induction on small visual grids, but progress is difficult to measure on static collections of hand-authored puzzles due to overfitting, dataset leakage, and memorisation. We introduce ARC-TGI (ARC Task Generators Inventory), an open-source framework for task-family generators: compact Python programs that sample diverse ARC-AGI tasks while preserving a latent rule. ARC-TGI is built around a solver-facing representation: each generated task is paired with natural-language input and transformation reasoning chains and partially evaluated Python code implementing sampling, transformation, and episode construction. Crucially, ARC-TGI supports task-level constraints so that training examples collectively expose the variations needed to infer the underlying rule, a requirement for human-solvable ARC tasks that independent per-example sampling often fails to guarantee. All generators undergo human refinement and local verification to keep both grids and reasoning traces natural and consistent under variation. We release 461 generators covering 180 ARC-Mini tasks, 215 ARC-AGI-1 tasks (200 train, 15 test), and 66 ARC-AGI-2 tasks (55 train, 11 test), enabling scalable dataset sampling and controlled benchmarking.
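To make the idea of a task-family generator with a task-level constraint concrete, here is a minimal sketch in Python. It is not taken from ARC-TGI: the function names, the mirror rule, the grid sizes, and the specific constraint (training inputs must together cover at least two grid shapes and three colours) are all assumptions chosen for illustration.

```python
import random

BACKGROUND = 0
PALETTE = [1, 2, 3, 4]  # hypothetical non-background colours


def sample_grid(rng, height, width, density=0.3):
    """Sample one input grid; each cell is coloured with probability `density`."""
    return [
        [rng.choice(PALETTE) if rng.random() < density else BACKGROUND
         for _ in range(width)]
        for _ in range(height)
    ]


def transform(grid):
    """Latent rule of this illustrative family: mirror the grid left to right."""
    return [list(reversed(row)) for row in grid]


def satisfies_task_constraint(train_pairs):
    """Task-level check: the training inputs must *collectively* show
    at least two grid shapes and at least three distinct colours."""
    shapes = {(len(p["input"]), len(p["input"][0])) for p in train_pairs}
    colours = {c for p in train_pairs for row in p["input"] for c in row
               if c != BACKGROUND}
    return len(shapes) >= 2 and len(colours) >= 3


def generate_task(seed, n_train=3):
    """Sample one task (train pairs plus one test pair) that preserves the rule."""
    rng = random.Random(seed)
    while True:  # resample the whole episode until the task-level check passes
        pairs = []
        while len(pairs) < n_train + 1:
            grid = sample_grid(rng, rng.randint(3, 6), rng.randint(3, 6))
            # Per-example check: skip grids the rule leaves unchanged.
            if transform(grid) != grid:
                pairs.append({"input": grid, "output": transform(grid)})
        train, test = pairs[:n_train], pairs[n_train:]
        if satisfies_task_constraint(train):
            return {"train": train, "test": test}


task = generate_task(seed=0)
```

The per-example check and the task-level check are kept separate on purpose: sampling each example independently can satisfy the first while still producing a training set that under-determines the rule, which is the failure mode the abstract describes.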

LG · Apr 19, 2025
Integrating Single-Cell Foundation Models with Graph Neural Networks for Drug Response Prediction

Till Rossner, Ziteng Li, Jonas Balke et al.

AI-driven drug response prediction holds great promise for advancing personalized cancer treatment. However, the inherent heterogeneity of cancer and the high cost of data generation make accurate prediction challenging. In this study, we investigate whether incorporating the pretrained foundation model scGPT can enhance the performance of existing drug response prediction frameworks. Our approach builds on the DeepCDR framework, which encodes drug representations from graph structures and cell representations from multi-omics profiles. We adapt this framework by using scGPT to generate enriched cell representations, relying on its pretrained knowledge to compensate for the limited amount of data. We evaluate our modified framework on IC$_{50}$ prediction, using the Pearson correlation coefficient (PCC) and a leave-one-drug-out validation strategy, and compare it against the original DeepCDR framework and a prior scFoundation-based approach. The scGPT-based framework not only outperforms previous approaches but also exhibits greater training stability, highlighting the value of leveraging scGPT-derived knowledge in this domain.
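The evaluation protocol named above, leave-one-drug-out validation scored by PCC, can be sketched as follows. This is an illustrative sketch only: the least-squares model is a stand-in for the DeepCDR and scGPT networks, the data are synthetic, and all function names are hypothetical rather than taken from the paper.

```python
import numpy as np


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two 1-D arrays."""
    yt = np.asarray(y_true, dtype=float)
    yp = np.asarray(y_pred, dtype=float)
    yt = yt - yt.mean()
    yp = yp - yp.mean()
    denom = np.sqrt((yt ** 2).sum() * (yp ** 2).sum())
    return float((yt * yp).sum() / denom) if denom > 0 else float("nan")


def fit_linear(features, targets):
    """Stand-in model (assumption): ordinary least squares weights."""
    return np.linalg.lstsq(features, targets, rcond=None)[0]


def predict_linear(weights, features):
    return features @ weights


def leave_one_drug_out(drug_ids, features, ic50, fit, predict):
    """Hold out all samples of one drug, train on the rest, and score the
    held-out drug by PCC. Returns a dict mapping drug id to PCC."""
    drug_ids = np.asarray(drug_ids)
    scores = {}
    for drug in np.unique(drug_ids):
        held_out = drug_ids == drug
        model = fit(features[~held_out], ic50[~held_out])
        scores[drug.item()] = pearson(ic50[held_out],
                                      predict(model, features[held_out]))
    return scores


# Synthetic example: 3 drugs, 10 (drug, cell line) samples each.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))           # placeholder pair features
y = X @ np.array([1.0, -2.0, 0.5])     # noise-free synthetic IC50 values
drugs = np.repeat(["A", "B", "C"], 10)
scores = leave_one_drug_out(drugs, X, y, fit_linear, predict_linear)
```

Because every sample of the held-out drug is excluded from training, each score reflects generalisation to an unseen drug rather than to unseen cell lines for a known drug.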