AIAug 16, 2025
Overcoming Knowledge Discrepancies: Structuring Reasoning Threads through Knowledge Balancing in Interactive ScenariosDaniel Burkhardt, Xiangwei Cheng
Reasoning in interactive problem solving scenarios requires models to construct reasoning threads that reflect user understanding and align with structured domain knowledge. However, current reasoning models often lack explicit semantic hierarchies, user-domain knowledge alignment, and principled mechanisms to prune reasoning threads for effectiveness. These limitations result in lengthy generic output that does not guide users through goal-oriented reasoning steps. To address this, we propose a prototype-inspired, two-phases Reasoning-Threads-Evaluation (ReT-Eval) framework, drawing inspiration from human-like reasoning strategies that emphasize structured knowledge reuse. In the first phase, semantically relevant knowledge structures are extracted from a sparse domain knowledge graph using a graph neural network and enriched with intrinsic large language model knowledge to resolve knowledge discrepancies. In the second phase, these threads are evaluated and pruned using a reward-guided strategy aimed at maintaining semantic coherence to generate effective reasoning threads. Experiments and expert evaluations show that ReT-Eval enhances user understanding and outperforms state-of-the-art reasoning models.
LGJul 14, 2025
Benchmarking and Evaluation of AI Models in Biology: Outcomes and Recommendations from the CZI Virtual Cells WorkshopElizabeth Fahsbender, Alma Andersson, Jeremy Ash et al.
Artificial intelligence holds immense promise for transforming biology, yet a lack of standardized, cross domain, benchmarks undermines our ability to build robust, trustworthy models. Here, we present insights from a recent workshop that convened machine learning and computational biology experts across imaging, transcriptomics, proteomics, and genomics to tackle this gap. We identify major technical and systemic bottlenecks such as data heterogeneity and noise, reproducibility challenges, biases, and the fragmented ecosystem of publicly available resources and propose a set of recommendations for building benchmarking frameworks that can efficiently compare ML models of biological systems across tasks and data modalities. By promoting high quality data curation, standardized tooling, comprehensive evaluation metrics, and open, collaborative platforms, we aim to accelerate the development of robust benchmarks for AI driven Virtual Cells. These benchmarks are crucial for ensuring rigor, reproducibility, and biological relevance, and will ultimately advance the field toward integrated models that drive new discoveries, therapeutic insights, and a deeper understanding of cellular systems.
LGJan 25, 2019
Finding Archetypal Spaces Using Neural NetworksDavid van Dijk, Daniel Burkhardt, Matthew Amodio et al.
Archetypal analysis is a data decomposition method that describes each observation in a dataset as a convex combination of "pure types" or archetypes. These archetypes represent extrema of a data space in which there is a trade-off between features, such as in biology where different combinations of traits provide optimal fitness for different environments. Existing methods for archetypal analysis work well when a linear relationship exists between the feature space and the archetypal space. However, such methods are not applicable to systems where the feature space is generated non-linearly from the combination of archetypes, such as in biological systems or image transformations. Here, we propose a reformulation of the problem such that the goal is to learn a non-linear transformation of the data into a latent archetypal space. To solve this problem, we introduce Archetypal Analysis network (AAnet), which is a deep neural network framework for learning and generating from a latent archetypal representation of data. We demonstrate state-of-the-art recovery of ground-truth archetypes in non-linear data domains, show AAnet can generate from data geometry rather than from data density, and use AAnet to identify biologically meaningful archetypes in single-cell gene expression data.