Zitong Lu

h-index6

6papers

26citations

Novelty54%

AI Score47

Ranked #34,729 of 194,257 authors (top 18%)#14 in NC (top 7%)

6 Papers

4.3NCApr 21, 2023Code

Generate your neural signals from mine: individual-to-individual EEG converters

Zitong Lu, Julie D. Golomb

Most models in cognitive and computational neuroscience trained on one subject do not generalize to other subjects due to individual differences. An ideal individual-to-individual neural converter is expected to generate real neural signals of one subject from those of another one, which can overcome the problem of individual differences for cognitive and computational models. In this study, we propose a novel individual-to-individual EEG converter, called EEG2EEG, inspired by generative models in computer vision. We applied THINGS EEG2 dataset to train and test 72 independent EEG2EEG models corresponding to 72 pairs across 9 subjects. Our results demonstrate that EEG2EEG is able to effectively learn the mapping of neural representations in EEG signals from one subject to another and achieve high conversion performance. Additionally, the generated EEG signals contain clearer representations of visual information than that can be obtained from real data. This method establishes a novel and state-of-the-art framework for neural conversion of EEG signals, which can realize a flexible and high-performance mapping from individual to individual and provide insight for both neural engineering and cognitive neuroscience.

5.3CLMay 27

VLMs May Not Globally Enhance Human Alignment over LLMs During Natural Reading

Jinzhou Wu, Zhengwu Ma, Jixing Li et al.

Large language models (LLMs) have become increasingly useful computational models of human language processing, but it remains unclear whether vision-language learning makes text representations more human-like during natural reading. Here, we address this question by comparing tightly matched LLM and vision-language model (VLM) pairs under a strictly text-only setting, allowing us to isolate the effect of multimodal training history from online visual input or cross-modal fusion. We evaluate model alignment with a human natural-reading dataset that includes whole-cortex fMRI responses and synchronized eye-tracking saccades. Our findings demonstrate that multimodal pretraining may not confer a uniform, global advantage in human alignment during natural reading, indicating that language-internal representations remain the key factor for modeling human text processing. However, the VLM advantage could emerge more selectively when sentences contain stronger visual semantic content, with converging evidence from both fMRI and eye-movement alignments. Together, our findings provide a controlled in silico framework for testing how visual learning history shapes model-human alignment of language processing, suggesting that multimodal pretraining contributes selectively rather than globally to human-like language representations during natural reading.

11.1AIOct 17, 2025Code

ScholarEval: Research Idea Evaluation Grounded in Literature

Hanane Nour Moussa, Patrick Queiroz Da Silva, Daniel Adu-Ampratwum et al.

As AI tools become increasingly common for research ideation, robust evaluation is critical to ensure the validity and usefulness of generated ideas. We introduce ScholarEval, a retrieval augmented evaluation framework that assesses research ideas based on two fundamental criteria: soundness - the empirical validity of proposed methods based on existing literature, and contribution - the degree of advancement made by the idea across different dimensions relative to prior research. To evaluate ScholarEval, we introduce ScholarIdeas, the first expert-annotated dataset of multi-domain research ideas and reviews, comprised of 117 ideas across four disciplines: artificial intelligence, neuroscience, biochemistry, and ecology. Our evaluation shows that ScholarEval achieves significantly higher coverage of points mentioned in the human expert annotated rubrics in ScholarIdeas compared to all baselines. Furthermore, ScholarEval is consistently preferred over our strongest baseline o4-mini-deep-research, a reasoning and search-enabled agentic system by OpenAI, in terms of evaluation actionability, depth, and evidence support. Our large-scale user study also shows that ScholarEval significantly outperforms deep research in literature engagement, idea refinement, and usefulness. We openly release our code, dataset, and ScholarEval tool for the community to use and build on.

8.5IVJul 15, 2024

Teaching CORnet Human fMRI Representations for Enhanced Model-Brain Alignment

Zitong Lu, Yile Wang

Deep convolutional neural networks (DCNNs) have demonstrated excellent performance in object recognition and have been found to share some similarities with brain visual processing. However, the substantial gap between DCNNs and human visual perception still exists. Functional magnetic resonance imaging (fMRI) as a widely used technique in cognitive neuroscience can record neural activation in the human visual cortex during the process of visual perception. Can we teach DCNNs human fMRI signals to achieve a more brain-like model? To answer this question, this study proposed ReAlnet-fMRI, a model based on the SOTA vision model CORnet but optimized using human fMRI data through a multi-layer encoding-based alignment framework. This framework has been shown to effectively enable the model to learn human brain representations. The fMRI-optimized ReAlnet-fMRI exhibited higher similarity to the human brain than both CORnet and the control model in within-and across-subject as well as within- and across-modality model-brain (fMRI and EEG) alignment evaluations. Additionally, we conducted an in-depth analyses to investigate how the internal representations of ReAlnet-fMRI differ from CORnet in encoding various object dimensions. These findings provide the possibility of enhancing the brain-likeness of visual models by integrating human neural data, helping to bridge the gap between computer vision and visual neuroscience.

5.2CVJan 30, 2024

Achieving More Human Brain-Like Vision via Human EEG Representational Alignment

Zitong Lu, Yile Wang, Julie D. Golomb

Despite advancements in artificial intelligence, object recognition models still lag behind in emulating visual information processing in human brains. Recent studies have highlighted the potential of using neural data to mimic brain processing; however, these often rely on invasive neural recordings from non-human subjects, leaving a critical gap in understanding human visual perception. Addressing this gap, we present, 'Re(presentational)Al(ignment)net', a vision model aligned with human brain activity based on non-invasive EEG, demonstrating a significantly higher similarity to human brain representations. Our innovative image-to-brain multi-layer encoding framework advances human neural alignment by optimizing multiple model layers and enabling the model to efficiently learn and mimic the human brain's visual representational patterns across object categories and different modalities. Our findings suggest that ReAlnets better align artificial neural networks with human brain representations, making it more similar to human brain processing than traditional computer vision models, which takes an important step toward bridging the gap between artificial and human vision and achieving more brain-like artificial intelligence systems.

2.3NCFeb 23, 2025

Language learning shapes visual category-selectivity in deep neural networks

Zitong Lu, Yuxin Wang

Category-selective regions in the human brain-such as the fusiform face area (FFA), extrastriate body area (EBA), parahippocampal place area (PPA), and visual word form area (VWFA)-support high-level visual recognition. Here, we investigate whether artificial neural networks (ANNs) exhibit analogous category-selective neurons and how these representations are shaped by language experience. Using an fMRI-inspired functional localizer approach, we identified face-, body-, place-, and word-selective neurons in deep networks presented with category images and scrambled controls. Both the purely visual ResNet and a linguistically supervised Lang-Learned ResNet contained category-selective neurons that increased in proportion across layers. However, compared to the vision-only model, the Lang-Learned ResNet showed a greater number but lower specificity of category-selective neurons, along with reduced spatial localization and attenuated activation strength-indicating a shift toward more distributed, semantically aligned coding. These effects were replicated in the large-scale vision-language model CLIP. Together, our findings reveal that language experience systematically reorganizes visual category representations in ANNs, providing a computational parallel to how linguistic context may shape categorical organization in the human brain.