Jie Fan

LO
h-index15
17papers
314citations
Novelty42%
AI Score54

17 Papers

RONov 9, 2025
Image-based Morphological Characterization of Filamentous Biological Structures with Non-constant Curvature Shape Feature

Jie Fan, Francesco Visentin, Barbara Mazzolai et al.

Tendrils coil their shape to anchor the plant to supporting structures, allowing vertical growth toward light. Although climbing plants have been studied for a long time, extracting information regarding the relationship between the temporal shape change, the event that triggers it, and the contact location is still challenging. To help build this relation, we propose an image-based method by which it is possible to analyze shape changes over time in tendrils when mechano-stimulated in different portions of their body. We employ a geometric approach using a 3D Piece-Wise Clothoid-based model to reconstruct the configuration taken by a tendril after mechanical rubbing. The reconstruction shows high robustness and reliability with an accuracy of R2 > 0.99. This method demonstrates distinct advantages over deep learning-based approaches, including reduced data requirements, lower computational costs, and interpretability. Our analysis reveals higher responsiveness in the apical segment of tendrils, which might correspond to higher sensitivity and tissue flexibility in that region of the organs. Our study provides a methodology for gaining new insights into plant biomechanics and offers a foundation for designing and developing novel intelligent robotic systems inspired by climbing plants.

3.5ITMay 16
Achieving $α$-Fairness in Clustered Cell-Free Networking: A Tight Relaxation Approach

Chaowen Deng, Jie Fan, Boxiang Ren et al.

Clustered cell-free networking has emerged as a promising architecture to balance the high performance of cell-free massive MIMO and the scalability of traditional cellular systems. However, achieving fairness across subnetworks remains a critical yet largely unsolved challenge. This paper investigates the fairness problem in clustered cell-free networking and proposes a unified and tunable alpha-fairness scheme that effectively balances overall spectral efficiency and inter-subnetwork fairness. Using the closed-form deterministic equivalent of the ergodic sum capacity, we reformulate the combinatorial clustering problem as a continuous optimization problem. Leveraging the concavity/convexity properties of the alpha-fair objective, we classify the problem into four distinct cases according to the value of alpha. For each case, we establish the exact equivalence between the original integer program and its continuous relaxation, and develop efficient algorithms with guaranteed convergence. Extensive simulations show that the proposed scheme achieves up to 11% improvement in Jain's fairness index and 45% gain in minimum subnetwork capacity, with only a negligible 5% reduction in aggregate throughput.

CVOct 22, 2024Code
Towards Real Zero-Shot Camouflaged Object Segmentation without Camouflaged Annotations

Cheng Lei, Jie Fan, Xinran Li et al.

Camouflaged Object Segmentation (COS) faces significant challenges due to the scarcity of annotated data, where meticulous pixel-level annotation is both labor-intensive and costly, primarily due to the intricate object-background boundaries. Addressing the core question, "Can COS be effectively achieved in a zero-shot manner without manual annotations for any camouflaged object?" we affirmatively respond and introduce a robust zero-shot COS framework. This framework leverages the inherent local pattern bias of COS and employs a broad semantic feature space derived from salient object segmentation (SOS) for efficient zero-shot transfer. We incorporate an Masked Image Modeling (MIM) based image encoder optimized for Parameter-Efficient Fine-Tuning (PEFT), a Multimodal Large Language Model (M-LLM), and a Multi-scale Fine-grained Alignment (MFA) mechanism. The MIM pre-trained image encoder focuses on capturing essential low-level features, while the M-LLM generates caption embeddings processed alongside these visual cues. These embeddings are precisely aligned using MFA, enabling our framework to accurately interpret and navigate complex semantic contexts. To optimize operational efficiency, we introduce a learnable codebook that represents the M-LLM during inference, significantly reducing computational overhead. Our framework demonstrates its versatility and efficacy through rigorous experimentation, achieving state-of-the-art performance in zero-shot COS with $F_β^w$ scores of 72.9\% on CAMO and 71.7\% on COD10K. By removing the M-LLM during inference, we achieve an inference speed comparable to that of traditional end-to-end models, reaching 18.1 FPS. Code: https://github.com/R-LEI360725/ZSCOS-CaMF

AIOct 16, 2025Code
HugAgent: Benchmarking LLMs for Simulation of Individualized Human Reasoning

Chance Jiajie Li, Zhenze Mo, Yuhan Tang et al.

Simulating human reasoning in open-ended tasks has long been a central aspiration in AI and cognitive science. While large language models now approximate human responses at scale, they remain tuned to population-level consensus, often erasing the individuality of reasoning styles and belief trajectories. To advance the vision of more human-like reasoning in machines, we introduce HugAgent (Human-Grounded Agent Benchmark), which rethinks human reasoning simulation along three dimensions: (i) from averaged to individualized reasoning, (ii) from behavioral mimicry to cognitive alignment, and (iii) from vignette-based to open-ended data. The benchmark evaluates whether a model can predict a specific person's behavioral responses and the underlying reasoning dynamics in out-of-distribution scenarios, given partial evidence of their prior views. HugAgent adopts a dual-track design: a human track that automates and scales the think-aloud method to collect ecologically valid human reasoning data, and a synthetic track for further scalability and systematic stress testing. This architecture enables low-cost, extensible expansion to new tasks and populations. Experiments with state-of-the-art language models reveal persistent adaptation gaps, positioning HugAgent as the first extensible benchmark for aligning machine reasoning with the individuality of human thought. The benchmark, along with its complete data collection pipeline and companion chatbot, is open-sourced as HugAgent (https://anonymous.4open.science/r/HugAgent) and TraceYourThinking (https://anonymous.4open.science/r/trace-your-thinking).

IVJan 3, 2022Code
BDG-Net: Boundary Distribution Guided Network for Accurate Polyp Segmentation

Zihuan Qiu, Zhichuan Wang, Miaomiao Zhang et al.

Colorectal cancer (CRC) is one of the most common fatal cancer in the world. Polypectomy can effectively interrupt the progression of adenoma to adenocarcinoma, thus reducing the risk of CRC development. Colonoscopy is the primary method to find colonic polyps. However, due to the different sizes of polyps and the unclear boundary between polyps and their surrounding mucosa, it is challenging to segment polyps accurately. To address this problem, we design a Boundary Distribution Guided Network (BDG-Net) for accurate polyp segmentation. Specifically, under the supervision of the ideal Boundary Distribution Map (BDM), we use Boundary Distribution Generate Module (BDGM) to aggregate high-level features and generate BDM. Then, BDM is sent to the Boundary Distribution Guided Decoder (BDGD) as complementary spatial information to guide the polyp segmentation. Moreover, a multi-scale feature interaction strategy is adopted in BDGD to improve the segmentation accuracy of polyps with different sizes. Extensive quantitative and qualitative evaluations demonstrate the effectiveness of our model, which outperforms state-of-the-art models remarkably on five public polyp datasets while maintaining low computational complexity. Code: https://github.com/zihuanqiu/BDG-Net

LGApr 13, 2024
Fast Gradient Computation for Gromov-Wasserstein Distance

Wei Zhang, Zihao Wang, Jie Fan et al.

The Gromov-Wasserstein distance is a notable extension of optimal transport. In contrast to the classic Wasserstein distance, it solves a quadratic assignment problem that minimizes the pair-wise distance distortion under the transportation of distributions and thus could apply to distributions in different spaces. These properties make Gromov-Wasserstein widely applicable to many fields, such as computer graphics and machine learning. However, the computation of the Gromov-Wasserstein distance and transport plan is expensive. The well-known Entropic Gromov-Wasserstein approach has a cubic complexity since the matrix multiplication operations need to be repeated in computing the gradient of Gromov-Wasserstein loss. This becomes a key bottleneck of the method. Currently, existing methods accelerate the computation focus on sampling and approximation, which leads to low accuracy or incomplete transport plan. In this work, we propose a novel method to accelerate accurate gradient computation by dynamic programming techniques, reducing the complexity from cubic to quadratic. In this way, the original computational bottleneck is broken and the new entropic solution can be obtained with total quadratic time, which is almost optimal complexity. Furthermore, it can be extended to some variants easily. Extensive experiments validate the efficiency and effectiveness of our method.

CVSep 9, 2025
Object-level Correlation for Few-Shot Segmentation

Chunlin Wen, Yu Zhang, Jie Fan et al.

Few-shot semantic segmentation (FSS) aims to segment objects of novel categories in the query images given only a few annotated support samples. Existing methods primarily build the image-level correlation between the support target object and the entire query image. However, this correlation contains the hard pixel noise, \textit{i.e.}, irrelevant background objects, that is intractable to trace and suppress, leading to the overfitting of the background. To address the limitation of this correlation, we imitate the biological vision process to identify novel objects in the object-level information. Target identification in the general objects is more valid than in the entire image, especially in the low-data regime. Inspired by this, we design an Object-level Correlation Network (OCNet) by establishing the object-level correlation between the support target object and query general objects, which is mainly composed of the General Object Mining Module (GOMM) and Correlation Construction Module (CCM). Specifically, GOMM constructs the query general object feature by learning saliency and high-level similarity cues, where the general objects include the irrelevant background objects and the target foreground object. Then, CCM establishes the object-level correlation by allocating the target prototypes to match the general object feature. The generated object-level correlation can mine the query target feature and suppress the hard pixel noise for the final prediction. Extensive experiments on PASCAL-${5}^{i}$ and COCO-${20}^{i}$ show that our model achieves the state-of-the-art performance.

LOJul 22, 2025
Axiomatizing Rumsfeld Ignorance

Jie Fan

In a recent paper, Kit Fine presents some striking results concerning the logical properties of (first-order) ignorance, second-order ignorance and Rumsfeld ignorance. However, Rumsfeld ignorance is definable in terms of ignorance, which makes some existing results and the axiomatization problem trivial. A main reason is that the accessibility relations for the implicit knowledge operator contained in the packaged operators of ignorance and Rumsfeld ignorance are the same. In this work, we assume the two accessibility relations to be different so that one of them is an arbitrary subset of the other. This will avoid the definability issue and retain most of the previous validities. The main results are axiomatizations over various proper bi-frame classes. Finally we apply our framework to analyze Fine's results.

CYJun 8, 2025
Simulating Society Requires Simulating Thought

Chance Jiajie Li, Jiayi Wu, Zhenze Mo et al.

Simulating society with large language models (LLMs), we argue, requires more than generating plausible behavior; it demands cognitively grounded reasoning that is structured, revisable, and traceable. LLM-based agents are increasingly used to emulate individual and group behavior, primarily through prompting and supervised fine-tuning. Yet current simulations remain grounded in a behaviorist "demographics in, behavior out" paradigm, focusing on surface-level plausibility. As a result, they often lack internal coherence, causal reasoning, and belief traceability, making them unreliable for modeling how people reason, deliberate, and respond to interventions. To address this, we present a conceptual modeling paradigm, Generative Minds (GenMinds), which draws from cognitive science to support structured belief representations in generative agents. To evaluate such agents, we introduce the RECAP (REconstructing CAusal Paths) framework, a benchmark designed to assess reasoning fidelity via causal traceability, demographic grounding, and intervention consistency. These contributions advance a broader shift: from surface-level mimicry to generative agents that simulate thought, not just language, for social simulations.

ROApr 14, 2025
Toward Aligning Human and Robot Actions via Multi-Modal Demonstration Learning

Azizul Zahid, Jie Fan, Farong Wang et al.

Understanding action correspondence between humans and robots is essential for evaluating alignment in decision-making, particularly in human-robot collaboration and imitation learning within unstructured environments. We propose a multimodal demonstration learning framework that explicitly models human demonstrations from RGB video with robot demonstrations in voxelized RGB-D space. Focusing on the "pick and place" task from the RH20T dataset, we utilize data from 5 users across 10 diverse scenes. Our approach combines ResNet-based visual encoding for human intention modeling and a Perceiver Transformer for voxel-based robot action prediction. After 2000 training epochs, the human model reaches 71.67% accuracy, and the robot model achieves 71.8% accuracy, demonstrating the framework's potential for aligning complex, multimodal human and robot behaviors in manipulation tasks.

CHEM-PHAug 9, 2021
ChemiRise: a data-driven retrosynthesis engine

Xiangyan Sun, Ke Liu, Yuquan Lin et al.

We have developed an end-to-end, retrosynthesis system, named ChemiRise, that can propose complete retrosynthesis routes for organic compounds rapidly and reliably. The system was trained on a processed patent database of over 3 million organic reactions. Experimental reactions were atom-mapped, clustered, and extracted into reaction templates. We then trained a graph convolutional neural network-based one-step reaction proposer using template embeddings and developed a guiding algorithm on the directed acyclic graph (DAG) of chemical compounds to find the best candidate to explore. The atom-mapping algorithm and the one-step reaction proposer were benchmarked against previous studies and showed better results. The final product was demonstrated by retrosynthesis routes reviewed and rated by human experts, showing satisfying functionality and a potential productivity boost in real-life use cases.

QMMar 6, 2021
Molecular modeling with machine-learned universal potential functions

Ke Liu, Zekun Ni, Zhenyu Zhou et al.

Molecular modeling is an important topic in drug discovery. Decades of research have led to the development of high quality scalable molecular force fields. In this paper, we show that neural networks can be used to train a universal approximator for energy potential functions. By incorporating a fully automated training process we have been able to train smooth, differentiable, and predictive potential functions on large-scale crystal structures. A variety of tests have also been performed to show the superiority and versatility of the machine-learned model.

LOFeb 22, 2020
Notes on neighborhood semantics for logics of unknown truths and false beliefs

Jie Fan

In this article, we study logics of unknown truths and false beliefs under neighborhood semantics. We compare the relative expressivity of the two logics. It turns out that they are incomparable over various classes of neighborhood models, and the combination of the two logics are equally expressive as standard modal logic over any class of neighborhood models. We propose morphisms for each logic, which can help us explore the frame definability problem, show a general soundness and completeness result, and generalize some results in the literature. We axiomatize the two logics over various classes of neighborhood frames. Last but not least, we extend the results to the case of public announcements, which has good applications to Moore sentences and some others.

LOSep 24, 2018
A family of neighborhood contingency logics

Jie Fan

This article proposes the axiomatizations of contingency logics of various natural classes of neighborhood frames. In particular, by defining a suitable canonical neighborhood function, we give sound and complete axiomatizations of monotone contingency logic and regular contingency logic, thereby answering two open questions raised by Bakhtiari, van Ditmarsch, and Hansen. The canonical function is inspired by a function proposed by Kuhn in~1995. We show that Kuhn's function is actually equal to a related function originally given by Humberstone.

LGMar 16, 2018
Chemi-net: a graph convolutional network for accurate drug property prediction

Ke Liu, Xiangyan Sun, Lei Jia et al.

Absorption, distribution, metabolism, and excretion (ADME) studies are critical for drug discovery. Conventionally, these tasks, together with other chemical property predictions, rely on domain-specific feature descriptors, or fingerprints. Following the recent success of neural networks, we developed Chemi-Net, a completely data-driven, domain knowledge-free, deep learning method for ADME property prediction. To compare the relative performance of Chemi-Net with Cubist, one of the popular machine learning programs used by Amgen, a large-scale ADME property prediction study was performed on-site at Amgen. The results showed that our deep neural network method improved current methods by a large margin. We foresee that the significantly increased accuracy of ADME prediction seen with Chemi-Net over Cubist will greatly accelerate drug discovery.

BMJul 26, 2017
Prediction of amino acid side chain conformation using a deep neural network

Ke Liu, Xiangyan Sun, Jun Ma et al.

A deep neural network based architecture was constructed to predict amino acid side chain conformation with unprecedented accuracy. Amino acid side chain conformation prediction is essential for protein homology modeling and protein design. Current widely-adopted methods use physics-based energy functions to evaluate side chain conformation. Here, using a deep neural network architecture without physics-based assumptions, we have demonstrated that side chain conformation prediction accuracy can be improved by more than 25%, especially for aromatic residues compared with current standard methods. More strikingly, the prediction method presented here is robust enough to identify individual conformational outliers from high resolution structures in a protein data bank without providing its structural factors. We envisage that our amino acid side chain predictor could be used as a quality check step for future protein structure model validation and many other potential applications such as side chain assignment in Cryo-electron microscopy, crystallography model auto-building, protein folding and small molecule ligand docking.

AINov 30, 2013
Knowing Whether

Jie Fan, Yanjing Wang, Hans van Ditmarsch

Knowing whether a proposition is true means knowing that it is true or knowing that it is false. In this paper, we study logics with a modal operator Kw for knowing whether but without a modal operator K for knowing that. This logic is not a normal modal logic, because we do not have Kw (phi -> psi) -> (Kw phi -> Kw psi). Knowing whether logic cannot define many common frame properties, and its expressive power less than that of basic modal logic over classes of models without reflexivity. These features make axiomatizing knowing whether logics non-trivial. We axiomatize knowing whether logic over various frame classes. We also present an extension of knowing whether logic with public announcement operators and we give corresponding reduction axioms for that. We compare our work in detail to two recent similar proposals.