Hong Zheng

CL
h-index: 10
11 papers
52 citations
Novelty: 44%
AI Score: 36

11 Papers

IV · Jul 5, 2024 · Code
Unraveling Radiomics Complexity: Strategies for Optimal Simplicity in Predictive Modeling

Mahdi Ait Lhaj Loutfi, Teodora Boblea Podasca, Alex Zwanenburg et al.

Background: The high dimensionality of radiomic feature sets, the variability in radiomic feature types and potentially high computational requirements all underscore the need for an effective method to identify the smallest set of predictive features for a given clinical problem. Purpose: Develop a methodology and tools to identify and explain the smallest set of predictive radiomic features. Materials and Methods: 89,714 radiomic features were extracted from five cancer datasets: low-grade glioma, meningioma, non-small cell lung cancer (NSCLC), and two renal cell carcinoma cohorts (n=2104). Features were categorized by computational complexity into morphological, intensity, texture, linear filters, and nonlinear filters. Models were trained and evaluated on each complexity level using the area under the curve (AUC). The most informative features were identified, and their importance was explained. The optimal complexity level and associated most informative features were identified using systematic statistical significance analyses and a false discovery avoidance procedure, respectively. Their predictive importance was explained using a novel tree-based method. Results: MEDimage, a new open-source tool, was developed to facilitate radiomic studies. Morphological features were optimal for MRI-based meningioma (AUC: 0.65) and low-grade glioma (AUC: 0.68). Intensity features were optimal for CECT-based renal cell carcinoma (AUC: 0.82) and CT-based NSCLC (AUC: 0.76). Texture features were optimal for MRI-based renal cell carcinoma (AUC: 0.72). Tuning the Hounsfield unit range improved results for CECT-based renal cell carcinoma (AUC: 0.86). Conclusion: Our proposed methodology and software can estimate the optimal radiomics complexity level for specific medical outcomes, potentially simplifying the use of radiomics in predictive modeling across various contexts.
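The per-level comparison described in this abstract can be illustrated with a small sketch. It computes AUC with the rank-based (Mann-Whitney) formulation and picks the complexity level with the highest value. The function names `auc` and `best_level` and the toy scores are hypothetical, and the plain arg-max is a simplification: the abstract states that the optimal level is chosen through statistical significance analyses, which this sketch does not reproduce.

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_level(labels, scores_by_level):
    """Return (level, auc) for the level with the highest AUC.

    Levels are visited in insertion order, so on a tie the earlier
    (assumed simpler) level is kept. This arg-max is an assumption of
    the sketch, not the significance procedure used in the paper.
    """
    best = None
    for level, scores in scores_by_level.items():
        value = auc(labels, scores)
        if best is None or value > best[1]:
            best = (level, value)
    return best


# Toy data (hypothetical): four patients, two complexity levels.
labels = [0, 0, 1, 1]
scores_by_level = {
    "morphological": [0.1, 0.4, 0.35, 0.8],
    "intensity": [0.2, 0.3, 0.6, 0.9],
}
print(best_level(labels, scores_by_level))
```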

CL · Jul 8, 2023
Is ChatGPT a Good Personality Recognizer? A Preliminary Study

Yu Ji, Wen Wu, Hong Zheng et al.

In recent years, personality has been regarded as a valuable personal factor and incorporated into numerous tasks such as sentiment analysis and product recommendation. This has led to widespread attention to the text-based personality recognition task, which aims to identify an individual's personality based on given text. Considering that ChatGPT has recently exhibited remarkable abilities on various natural language processing tasks, we provide a preliminary evaluation of ChatGPT on the text-based personality recognition task for generating effective personality data. Concretely, we employ a variety of prompting strategies to explore ChatGPT's ability to recognize personality from given text, especially the level-oriented prompting strategy we designed for guiding ChatGPT in analyzing given text at a specified level. The experimental results on two representative real-world datasets reveal that ChatGPT with zero-shot chain-of-thought prompting exhibits impressive personality recognition ability and is capable of providing natural language explanations through text-based logical reasoning. Furthermore, employing the level-oriented prompting strategy to optimize zero-shot chain-of-thought prompting further narrows the performance gap between ChatGPT and the corresponding state-of-the-art model. However, we observe that ChatGPT shows unfairness towards certain sensitive demographic attributes such as gender and age. Additionally, we discover that eliciting the personality recognition ability of ChatGPT helps improve its performance on personality-related downstream tasks such as sentiment classification and stress prediction.

CV · Nov 28, 2025
Convolutional Feature Noise Reduction for 2D Cardiac MR Image Segmentation

Hong Zheng, Nan Mu, Han Su et al.

Noise reduction is a crucial operation in Digital Signal Processing. Regrettably, it is frequently neglected when processing convolutional features in segmentation networks. This oversight could trigger a butterfly effect, impairing subsequent outcomes throughout the entire feature system. To fill this void, we treat convolutional features following Gaussian distributions as feature signal matrices and present a simple and effective feature filter in this study. The proposed filter is fundamentally a low-amplitude pass filter primarily aimed at minimizing noise in feature signal inputs and is named Convolutional Feature Filter (CFF). We conducted experiments on two established 2D segmentation networks and two public cardiac MR image datasets to validate the effectiveness of the CFF, and the experimental findings demonstrated a decrease in noise within the feature signal matrices. To enable a numerical observation and analysis of this reduction, we developed a binarization equation to calculate the information entropy of feature signals.
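As a rough illustration of measuring the information entropy of a binarized feature map, the sketch below thresholds a 2D feature matrix and computes the Shannon entropy of the resulting binary distribution. The abstract does not give the binarization equation, so the fixed threshold and the function name `binary_entropy` are assumptions of this sketch rather than the paper's definition.

```python
import math


def binary_entropy(features, threshold=0.0):
    """Shannon entropy (in bits) of a feature map binarized at `threshold`.

    `features` is a list of rows. Each value becomes 1 if it exceeds the
    threshold and 0 otherwise. The thresholding rule is an assumption;
    the paper's own binarization equation is not given in the abstract.
    """
    flat = [v for row in features for v in row]
    p = sum(1 for v in flat if v > threshold) / len(flat)
    if p in (0.0, 1.0):
        return 0.0  # a constant binary map carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# A map split evenly around the threshold has the maximum entropy of 1 bit.
print(binary_entropy([[1.0, -1.0], [2.0, -2.0]]))
```

Under this definition, entropy is highest when the binarized map is split evenly and zero when every value falls on the same side of the threshold.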

IV · Mar 23, 2025
Multi-Disease-Aware Training Strategy for Cardiac MR Image Segmentation

Hong Zheng, Yucheng Chen, Nan Mu et al.

Accurate segmentation of the ventricles from cardiac magnetic resonance images (CMRIs) is crucial for enhancing the diagnosis and analysis of heart conditions. Deep learning-based segmentation methods have recently garnered significant attention due to their impressive performance. However, these segmentation methods are typically good at partitioning regularly shaped organs, such as the left ventricle (LV) and the myocardium (MYO), whereas they perform poorly on irregularly shaped organs, such as the right ventricle (RV). In this study, we argue that this limitation of segmentation models stems from their insufficient generalization ability to address the distribution shift of segmentation targets across slices, cardiac phases, and disease conditions. To overcome this issue, we present a Multi-Disease-Aware Training Strategy (MTS) and restructure the introduced CMRI datasets into multi-disease datasets. Additionally, we propose a specialized data processing technique for preprocessing input images to support the MTS. To validate the effectiveness of our method, we performed control group experiments and cross-validation tests. The experimental results show that (1) network models trained using our proposed strategy achieved superior segmentation performance, particularly in RV segmentation, and (2) these networks exhibited robust performance even when applied to data from unknown diseases.

CL · Dec 14, 2023
Metacognition-Enhanced Few-Shot Prompting With Positive Reinforcement

Yu Ji, Wen Wu, Yi Hu et al.

Few-shot prompting elicits the remarkable abilities of large language models by equipping them with a few demonstration examples in the input. However, the traditional method of providing large language models with all demonstration input-output pairs at once may not effectively guide large language models to learn the specific input-output mapping relationship. In this paper, inspired by the regulatory and supportive role of metacognition in students' learning, we propose a novel metacognition-enhanced few-shot prompting, which guides large language models to reflect on their thought processes to comprehensively learn the given demonstration examples. Furthermore, considering that positive reinforcement can improve students' learning motivation, we introduce positive reinforcement into our metacognition-enhanced few-shot prompting to promote the few-shot learning of large language models by providing response-based positive feedback. The experimental results on two real-world datasets show that our metacognition-enhanced few-shot prompting with positive reinforcement surpasses traditional few-shot prompting in classification accuracy and macro F1.

CV · Jan 19, 2022
DMF-Net: Dual-Branch Multi-Scale Feature Fusion Network for copy forgery identification of anti-counterfeiting QR code

Zhongyuan Guo, Hong Zheng, Changhui You et al.

Anti-counterfeiting QR codes are widely used in people's work and life, especially in product packaging. However, anti-counterfeiting QR codes risk being copied and forged during circulation. In reality, copying is usually based on genuine anti-counterfeiting QR codes, but the brands and models of copiers are diverse, and it is extremely difficult to determine which individual copier a forged anti-counterfeiting code comes from. In response to these problems, this paper proposes a deep learning-based method for copy forgery identification of anti-counterfeiting QR codes. We first analyze the production principle of anti-counterfeiting QR codes and convert the identification of copy forgery into device category forensics; we then propose a Dual-Branch Multi-Scale Feature Fusion network. During the design of the network, we conducted a detailed analysis of the data preprocessing layer, single-branch design, etc., and determined the specific structure of the dual-branch multi-scale feature fusion network through experiments. The experimental results show that the proposed method achieves high accuracy in copy forgery identification, exceeding a series of current methods in the field of image forensics.

LG · Oct 27, 2021
GACAN: Graph Attention-Convolution-Attention Networks for Traffic Forecasting Based on Multi-granularity Time Series

Sikai Zhang, Hong Zheng, Hongyi Su et al.

Traffic forecasting is an integral part of intelligent transportation systems (ITS). Achieving a high prediction accuracy is a challenging task due to a high level of dynamics and complex spatial-temporal dependency of road networks. For this task, we propose Graph Attention-Convolution-Attention Networks (GACAN). The model uses a novel Att-Conv-Att (ACA) block which contains two graph attention layers and one spectral-based GCN layer sandwiched in between. The graph attention layers are meant to capture temporal features while the spectral-based GCN layer is meant to capture spatial features. The main novelty of the model is the integration of time series of four different time granularities: the original time series, together with hourly, daily, and weekly time series. Unlike previous work that used multi-granularity time series by handling every time series separately, GACAN combines the outcome of processing all time series after each graph attention layer. Thus, the effects of different time granularities are integrated throughout the model. We perform a series of experiments on three real-world datasets. The experimental results verify the advantage of using multi-granularity time series and that the proposed GACAN model outperforms the state-of-the-art baselines.
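To make the idea of four time granularities concrete, the sketch below builds four input windows for predicting one time step: the most recent readings, plus readings at the same position one or more hours, days, and weeks earlier. This periodic-lag reading is an assumption; the abstract does not say how the hourly, daily, and weekly series are constructed, and the function name, the 5-minute sampling rate (`steps_per_hour=12`), and the window length are all hypothetical.

```python
def multi_granularity_inputs(series, t, window=3, steps_per_hour=12):
    """Collect four input windows for predicting series[t].

    Assumes a regularly sampled series. Each window holds `window`
    past values spaced one period apart, ordered oldest first.
    """
    periods = {
        "recent": 1,
        "hourly": steps_per_hour,
        "daily": 24 * steps_per_hour,
        "weekly": 7 * 24 * steps_per_hour,
    }
    inputs = {}
    for name, period in periods.items():
        # Indices t - window*period, ..., t - 2*period, t - period.
        idx = [t - k * period for k in range(window, 0, -1)]
        if idx[0] < 0:
            raise ValueError(f"not enough history for the {name} window")
        inputs[name] = [series[i] for i in idx]
    return inputs


# Toy series whose value equals its index, so the lags are easy to read.
series = list(range(10000))
print(multi_granularity_inputs(series, t=7000))
```

In a model such as the one described, each of these windows would feed a separate input stream whose outputs are combined. The sketch stops at input construction.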

CL · Aug 28, 2020
Cost-Quality Adaptive Active Learning for Chinese Clinical Named Entity Recognition

Tingting Cai, Yangming Zhou, Hong Zheng

Clinical Named Entity Recognition (CNER) aims to automatically identify clinical terminologies in Electronic Health Records (EHRs), which is a fundamental and crucial step for clinical research. Training a high-performance model for CNER usually requires a large number of EHRs with high-quality labels. However, labeling EHRs, especially Chinese EHRs, is time-consuming and expensive. One effective solution is active learning, where a model asks labelers to annotate the data it is uncertain about. Conventional active learning assumes a single labeler who always returns noiseless answers to queried labels. However, in real settings, multiple labelers provide annotations of varying quality at varying costs, and labelers with low overall annotation quality can still assign correct labels to some specific instances. In this paper, we propose a Cost-Quality Adaptive Active Learning (CQAAL) approach for CNER in Chinese EHRs, which maintains a balance between the annotation quality, labeling costs, and the informativeness of selected instances. Specifically, CQAAL selects cost-effective instance-labeler pairs to achieve better annotation quality with lower costs in an adaptive manner. Computational results on the CCKS-2017 Task 2 benchmark dataset demonstrate the superiority and effectiveness of the proposed CQAAL.
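The trade-off between informativeness, annotation quality, and cost can be sketched as a greedy selection of instance-labeler pairs under a budget. The scoring rule (informativeness times estimated quality, divided by cost), the function name `select_pairs`, and the toy numbers are assumptions made for illustration; the abstract does not specify how CQAAL scores or selects pairs.

```python
def select_pairs(informativeness, quality, cost, budget):
    """Greedily pick instance-labeler pairs under a total cost budget.

    informativeness: {instance: float}, e.g. model uncertainty
    quality: {(instance, labeler): float in [0, 1]}, estimated per pair
    cost: {labeler: positive float}
    The score below is a hypothetical choice, not the CQAAL criterion.
    """
    candidates = sorted(
        (
            (informativeness[inst] * q / cost[lab], inst, lab)
            for (inst, lab), q in quality.items()
        ),
        reverse=True,
    )
    chosen, spent, labeled = [], 0.0, set()
    for _score, inst, lab in candidates:
        # Label each instance once and never exceed the budget.
        if inst in labeled or spent + cost[lab] > budget:
            continue
        chosen.append((inst, lab))
        labeled.add(inst)
        spent += cost[lab]
    return chosen


informativeness = {"a": 0.9, "b": 0.5}
quality = {
    ("a", "expert"): 0.95, ("a", "novice"): 0.2,
    ("b", "expert"): 0.90, ("b", "novice"): 0.85,
}
cost = {"expert": 3.0, "novice": 1.0}
print(select_pairs(informativeness, quality, cost, budget=4.0))
```

In this toy setup the cheap labeler is reliable on instance "b" but not on "a", so the sketch sends "b" to the novice and spends the rest of the budget on the expert for "a".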

CL · Aug 22, 2019
NE-LP: Normalized Entropy and Loss Prediction based Sampling for Active Learning in Chinese Word Segmentation on EHRs

Tingting Cai, Zhiyuan Ma, Hong Zheng et al.

Electronic Health Records (EHRs) in hospital information systems contain patients' diagnoses and treatments, so EHRs are essential to clinical data mining. Of all the tasks in the mining process, Chinese Word Segmentation (CWS) is a fundamental and important one, and most state-of-the-art methods rely heavily on large-scale manually annotated data. Since annotation is time-consuming and expensive, efforts have been devoted to techniques, such as active learning, to locate the most informative samples for modeling. In this paper, we follow the trend and present an active learning method for CWS in EHRs. Specifically, a new sampling strategy combining Normalized Entropy with Loss Prediction (NE-LP) is proposed to select the most representative data. Meanwhile, to minimize the computational cost of learning, we propose a joint model including a word segmenter and a loss prediction model. Furthermore, to capture interactions between adjacent characters, bigram features are also applied in the joint model. To illustrate the effectiveness of NE-LP, we conducted experiments on EHRs collected from the Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine. The results demonstrate that NE-LP consistently outperforms conventional uncertainty-based sampling strategies for active learning in CWS.
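A minimal sketch of entropy-based sampling for sequence labeling follows. It averages the per-character tag entropy over the sentence and scales it by the maximum possible entropy, so long and short sentences are comparable. This is one common way to normalize entropy and is an assumption here; the abstract does not give the exact NE formula, and the loss prediction component of NE-LP is not covered. The function names and the toy pool are hypothetical.

```python
import math


def normalized_entropy(tag_probs):
    """Mean per-character entropy of a sentence, scaled to [0, 1].

    tag_probs: one probability distribution over tags per character,
    e.g. over the B/M/E/S segmentation tags.
    """
    n_tags = len(tag_probs[0])
    total = 0.0
    for dist in tag_probs:
        total -= sum(p * math.log(p) for p in dist if p > 0)
    # Divide by sentence length and by the maximum entropy log(n_tags).
    return total / (len(tag_probs) * math.log(n_tags))


def select_most_uncertain(pool, k):
    """Return the names of the k sentences with the highest normalized entropy."""
    ranked = sorted(pool, key=lambda name: normalized_entropy(pool[name]), reverse=True)
    return ranked[:k]


# Toy pool (hypothetical): predicted tag distributions for three sentences.
pool = {
    "s1": [[1.0, 0.0, 0.0, 0.0]],
    "s2": [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]],
    "s3": [[0.7, 0.1, 0.1, 0.1]],
}
print(select_most_uncertain(pool, k=2))
```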

AI · May 2, 2018
From the Periphery to the Center: Information Brokerage in an Evolving Network

Bo Yan, Yiping Liu, Jiamou Liu et al.

Interpersonal ties are pivotal to individual efficacy, status and performance in an agent society. This paper explores three important and interrelated themes in social network theory: the center/periphery partition of the network; network dynamics; and social integration of newcomers. We tackle the question: How would a newcomer harness information brokerage to integrate into a dynamic network going from periphery to center? We model integration as the interplay between the newcomer and the dynamic network and capture information brokerage using a process of relationship building. We analyze theoretical guarantees for the newcomer to reach the center through tactics, proving that a winning tactic always exists for certain types of network dynamics. We then propose three tactics and show their superior performance over alternative methods on four real-world datasets and four network models. In general, our tactics place the newcomer at the center by adding very few new edges on dynamic networks with approximately 14000 nodes.

CV · Jan 18, 2016
A Comparative Study of Object Trackers for Infrared Flying Bird Tracking

Ying Huang, Hong Zheng, Haibin Ling et al.

Bird strikes present a huge risk for aircraft, especially since traditional airport bird surveillance is mainly dependent on inefficient human observation. Computer vision based technology has been proposed to automatically detect birds, determine bird flying trajectories, and predict aircraft takeoff delays. However, the characteristics of bird flight in imagery and the performance of existing methods applied to the flying bird tracking task are not well known. Therefore, we perform infrared flying bird tracking experiments using 12 state-of-the-art algorithms on a real BIRDSITE-IR dataset to obtain useful clues and recommend feature analysis. We also develop a Struck-scale method to demonstrate the effectiveness of multi-scale sampling adaptation in handling flying birds with varying shape and scale. The general analysis can be used to develop specialized bird tracking methods for airport safety, wildness and urban bird population studies.